Cliffs Quick Review
Linear Algebra by Steven A. Leduc
Series Editor Jerry Bobrow, Ph.D.
Wiley Publishing, Inc.
CliffsNotes™ Linear Algebra
Published by: Wiley Publishing, Inc., 909 Third Avenue, New York, NY 10022
www.wiley.com
Copyright © 1996 Wiley Publishing, Inc., New York, New York
ISBN: 0-8220-5331-4
Printed in the United States of America
Published by Wiley Publishing, Inc., New York, NY
Published simultaneously in Canada
CONTENTS

VECTOR ALGEBRA
  The Space R²
    Vectors in R²
    Position vectors
    Vector addition
    Vector subtraction
    Scalar multiplication
    Standard basis vectors in R²
  The Space R³
    Standard basis vectors in R³
    The cross product
  The Space Rⁿ
    The norm of a vector
    Distance between two points
    Unit vectors
    The dot product
    The triangle inequality
    The Cauchy-Schwarz inequality
    Orthogonal projections
    Lines
    Planes

MATRIX ALGEBRA
  Matrices
    Entries along the diagonal
    Square matrices
    Triangular matrices
    The transpose of a matrix
    Row and column matrices
    Zero matrices
  Operations with Matrices
    Matrix addition
    Scalar multiplication
    Matrix multiplication
    Identity matrices
    The inverse of a matrix

LINEAR SYSTEMS
  Solutions to Linear Systems
  Gaussian Elimination
  Gauss-Jordan Elimination
  Using Elementary Row Operations to Determine A⁻¹

REAL EUCLIDEAN VECTOR SPACES
  Subspaces of Rⁿ
  The Nullspace of a Matrix
  Linear Combinations and the Span of a Collection of Vectors
  Linear Independence
  The Rank of a Matrix
  A Basis for a Vector Space
    Orthonormal bases
  Projection onto a Subspace
    The Gram-Schmidt orthogonalization algorithm
  The Row Space and Column Space of a Matrix
    Criteria for membership in the column space
  The Rank Plus Nullity Theorem
  Other Real Euclidean Vector Spaces and the Concept of Isomorphism
    Matrix spaces
    Polynomial spaces
    Function spaces

THE DETERMINANT
  Definitions of the Determinant
    Method 1 for defining the determinant
    Method 2 for defining the determinant
  Laplace Expansions for the Determinant
    Laplace expansions following row-reduction
  Cramer's Rule
  The Classical Adjoint of a Square Matrix

LINEAR TRANSFORMATIONS
  Definition of a Linear Transformation
  Linear Transformations and Basis Vectors
  The Standard Matrix of a Linear Transformation
  The Kernel and Range of a Linear Transformation
    Injectivity and surjectivity
  Composition of Linear Transformations

EIGENVALUES AND EIGENVECTORS
  Definition and Illustration of an Eigenvalue and an Eigenvector
  Determining the Eigenvalues of a Matrix
  Determining the Eigenvectors of a Matrix
  Eigenspaces
  Diagonalization
VECTOR ALGEBRA
It is assumed that at this point in your mathematical education, you are familiar with the basic arithmetic operations and algebraic properties of the real numbers, the set of which is denoted R. Since the set of reals has a familiar geometric depiction, the number line, R is also referred to as the real line and alternatively denoted R¹ ("R one").
The Space R²

Algebraically, the familiar x-y plane is simply the collection of all pairs (x, y) of real numbers. Each such pair specifies a point in the plane as follows. First, construct two copies of the real line, one horizontal and one vertical, which intersect perpendicularly at their origins; these are called the axes. Then, given a pair (x1, x2), the first coordinate, x1, specifies the point's horizontal displacement from the vertical axis, while the second coordinate, x2, gives the vertical displacement from the horizontal axis. See Figure 1. Clearly, then, the order in which the coordinates are written is important, since the point (x1, x2) will not generally coincide with the point (x2, x1). To emphasize this fact, the plane is said to be the collection of ordered pairs of real numbers. Since it takes two real numbers to specify a point in the plane, the collection of ordered pairs (or the plane) is called 2-space, denoted R² ("R two").
■ Figure 1 ■
R² is given an algebraic structure by defining two operations on its points. These operations are addition and scalar multiplication. The sum of two points x = (x1, x2) and x' = (x1', x2') is defined (quite naturally) by the equation

  x + x' = (x1, x2) + (x1', x2') = (x1 + x1', x2 + x2')

and a point x = (x1, x2) is multiplied by a scalar c (that is, by a real number) by the rule

  cx = c(x1, x2) = (cx1, cx2)

Example 1: Let x = (1, 3) and y = (-2, 5). Determine the points x + y, 3x, and 2x - y.

The point x + y is (1, 3) + (-2, 5) = (-1, 8), and the point 3x equals 3(1, 3) = (3, 9). Since -y = (-1)y = (2, -5),

  2x - y = 2x + (-y) = 2(1, 3) + (2, -5) = (2, 6) + (2, -5) = (4, 1)  ■
By defining x - x' to be x + (-x'), the difference of two points can be given directly by the equation

  x - x' = (x1, x2) - (x1', x2') = (x1 - x1', x2 - x2')

Thus, the point 2x - y could also have been calculated as follows:

  2x - y = 2(1, 3) - (-2, 5) = (2, 6) - (-2, 5) = (2 - (-2), 6 - 5) = (4, 1)  ■
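Because these operations act coordinatewise, they can be double-checked with ordinary array arithmetic. The following sketch is an illustration only (NumPy is not part of this book's toolkit); it reproduces the computations of Example 1:

```python
import numpy as np

x = np.array([1, 3])
y = np.array([-2, 5])

print(x + y)      # [-1  8]   matches (-1, 8)
print(3 * x)      # [ 3  9]   matches (3, 9)
print(2 * x - y)  # [ 4  1]   matches (4, 1)
```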
Vectors in R². A geometric vector is a directed line segment from an initial point (the "tail") to a terminal point, or endpoint (the "tip"). It is pictured as an arrow, as in Figure 2.

■ Figure 2 ■
The vector from point a to point b is denoted ab. If a = (a1, a2) is the initial point and b = (b1, b2) is the terminal point, then the signed numbers b1 - a1 and b2 - a2 are called the components of the vector ab. The first component, b1 - a1, indicates the horizontal displacement from a to b, and the second component, b2 - a2, indicates the vertical displacement. See Figure 3. The components are enclosed in parentheses to specify the vector; thus, ab = (b1 - a1, b2 - a2).
■ Figure 3 ■
Example 2: If a = (4, 2) and b = (-5, 6), then the vector from a to b has a horizontal component of -5 - 4 = -9 and a vertical component of 6 - 2 = 4. Therefore, ab = (-9, 4), which is sketched in Figure 4.

■ Figure 4 ■
Example 3: Find the terminal point of the vector xy = (8, -7) if its initial point is x = (-3, 5).

Since the first component of the vector is 8, adding 8 to the first coordinate of its initial point will give the first coordinate of its terminal point. Thus, y1 = x1 + 8 = -3 + 8 = 5. Similarly, since the second component of the vector is -7, adding -7 to the second coordinate of its initial point will give the second coordinate of its endpoint. This gives y2 = x2 + (-7) = 5 + (-7) = -2. The terminal point of the vector xy is therefore y = (y1, y2) = (5, -2); see Figure 5.

■ Figure 5 ■

Two vectors in R² are said to be equivalent (or equal) if they have the same first component and the same second component. For instance, consider the points a = (-1, 1), b = (1, 4), c = (1, -2), and d = (3, 1). The horizontal component of the vector ab is 1 - (-1) = 2, and the vertical component of ab is 4 - 1 = 3; thus, ab = (2, 3). Since the vector cd has a horizontal component of 3 - 1 = 2, and a vertical component of 1 - (-2) = 3, cd = (2, 3) also. Therefore, ab = cd; see Figure 6.

■ Figure 6 ■

To translate a vector means to slide it so as to change its initial and terminal points but not its components. If the vector ab in Figure 6 were translated to begin at the point c = (1, -2), it would coincide with the vector cd. This is another way to say that ab = cd.

Example 4: Is the vector from a = (0, 2) to b = (3, 5) equivalent to the vector from x = (2, -4) to y = (5, 1)?

Since the vector ab equals (3, 3), but the vector xy equals (3, 5), these vectors are not equivalent. Alternatively, if the vector ab were translated to begin at the point x, its terminal point would then be (x1 + 3, x2 + 3) = (2 + 3, -4 + 3) = (5, -1). This is not the point y; thus, ab ≠ xy.  ■
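Since a vector's components are just coordinate differences, the equivalence test of Example 4 reduces to two subtractions. A small illustrative check in NumPy (not part of the original text):

```python
import numpy as np

a, b = np.array([0, 2]), np.array([3, 5])
x, y = np.array([2, -4]), np.array([5, 1])

ab = b - a   # components of the vector from a to b
xy = y - x   # components of the vector from x to y

print(ab, xy)                   # [3 3] [3 5]
print(np.array_equal(ab, xy))   # False: the vectors are not equivalent
```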
Position vectors. If a vector has its initial point at the origin, the point 0 = (0, 0), it is called a position vector. If a position vector has x = (x1, x2) as its endpoint, then the components of the vector 0x are x1 - 0 = x1 and x2 - 0 = x2; so 0x = (x1, x2). If the origin is not explicitly written, then a position vector can be named by simply specifying its endpoint; thus, x = (x1, x2). Note that the position vector x with components x1 and x2 is denoted (x1, x2), just like the point x with coordinates x1 and x2. The context will make it clear which meaning is intended, but often the difference is irrelevant. Furthermore, since a position vector can be translated to begin at any other point in the plane without altering the vector (since translation leaves the components unchanged), even vectors that do not begin at the origin are named by a single letter.

Example 5: If the position vector x = (-4, 2) is translated so that its new initial point is a = (3, 1), find its new terminal point, b.

If b = (b1, b2), then the components of the vector ab are b1 - 3 and b2 - 1. Since ab = x,

  (b1 - 3, b2 - 1) = (-4, 2)  ⇒  (b1, b2) = (-1, 3)

See Figure 7.
■ Figure 7 ■
Vector addition. The operations defined earlier on points (x1, x2) in R² can be recast as operations on vectors in R² (called 2-vectors, because there are 2 components). These operations are called vector addition and scalar multiplication. The sum of two vectors x and x' is defined by the same rule that gave the sum of two points:

  x + x' = (x1, x2) + (x1', x2') = (x1 + x1', x2 + x2')

Figure 8 depicts the sum of two vectors. Geometrically, one of the vectors (x', say) is translated so that its tail coincides with the tip of x. The vector from the tail of x to the tip of the translated x' is the vector sum x + x'. This process is often referred to as adding vectors tip-to-tail.
■ Figure 8 ■

Because the addition of real numbers is commutative, that is, because the order in which numbers are added is irrelevant, it follows that

  (x1 + x1', x2 + x2') = (x1' + x1, x2' + x2)

This implies that the addition of vectors is commutative also:

  x + x' = x' + x

Thus, when adding x and x' geometrically, it doesn't matter whether x' is first translated to begin at the tip of x or x is translated to begin at the tip of x'; the sum will be the same in either case.

Example 6: The sum of the vectors x = (1, 3) and y = (-2, 5) is

  x + y = (1 + (-2), 3 + 5) = (-1, 8)

See Figure 9.
■ Figure 9 ■

Example 7: Consider the position vector a = (1, 3). If b is the point (5, 4), find the vector ab and the vector sum a + ab. Provide a sketch.

Since ab has horizontal component 5 - 1 = 4 and vertical component 4 - 3 = 1, the vector ab equals (4, 1). So a + ab = (1, 3) + (4, 1) = (5, 4), which is the position vector b. Figure 10 clearly shows that a + ab = b.
■ Figure 10 ■
Vector subtraction. The difference of two vectors is defined in precisely the same way as the difference of two points. For any two vectors x and x' in R²,

  x - x' = (x1, x2) - (x1', x2') = (x1 - x1', x2 - x2')

With x and x' starting from the same point, x - x' is the vector that begins at the tip of x' and ends at the tip of x. This observation follows from the identity x' + (x - x') = x and the method of adding vectors geometrically. See Figure 11.
■ Figure 11 ■
In general, it is easy to see that

  ab = b - a

whether the letters on the right-hand side are interpreted as position vectors or as points. Figure 10 showed that a + ab = b, which is equivalent to the statement ab = b - a, where a and b are position vectors. Although this example dealt with a particular case, the identity ab = b - a holds in general.

Example 8: The vector ab from a = (4, -1) to b = (-2, 1) in Figure 12 is

  ab = b - a = (-2, 1) - (4, -1) = (-2 - 4, 1 - (-1)) = (-6, 2)
■ Figure 12 ■

Scalar multiplication. A vector x is multiplied by a scalar c by the rule

  cx = c(x1, x2) = (cx1, cx2)
If the scalar c is 0, then for any x, cx equals (0, 0), the zero vector, denoted 0. If c is positive, the vector cx points in the same direction as x, and it can be shown that its length is c times the length of x. However, if the scalar c is negative, then cx points in the direction exactly opposite to that of the original x, and the length of cx is |c| times the length of x. Some examples are shown in Figure 13.

■ Figure 13 ■
Two vectors are said to be parallel if one is a positive scalar multiple of the other and antiparallel if one is a negative scalar multiple of the other. (Note: Some authors declare two vectors parallel if one is a scalar multiple, positive or negative, of the other.)

Standard basis vectors in R². By invoking the definitions of vector addition and scalar multiplication, any vector x = (x1, x2) in R² can be written in terms of the standard basis vectors
(1, 0) and (0, 1):

  (x1, x2) = (x1, 0) + (0, x2) = x1(1, 0) + x2(0, 1)

The vector (1, 0) is denoted by i (or e1), and the vector (0, 1) is denoted by j (or e2). Using this notation, any vector x in R² can be written in either of the two forms

  x = x1 i + x2 j   or   x = x1 e1 + x2 e2

See Figure 14.

■ Figure 14 ■

Example 9: If x = 2i + 4j and y = i - 3j, determine (and provide a sketch of) the vectors ½x and x + y.

Multiplying the vector x by the scalar ½ yields

  ½x = ½(2i + 4j) = (½ · 2)i + (½ · 4)j = i + 2j

The sum of the vectors x and y is
14
CLIFFS QUICK REVIEW
VECTO R ALGEBR A
  x + y = (2i + 4j) + (i - 3j) = (2 + 1)i + (4 - 3)j = 3i + j

These vectors are shown (together with x and y) in Figure 15.

■ Figure 15 ■
Example 10: Find the scalar coefficients k1 and k2 such that

  k1(1, -3) + k2(-1, 2) = (-1, -2)

The given equation can be rewritten as follows:

  (k1 - k2, -3k1 + 2k2) = (-1, -2)

This implies that both of the following equations must be satisfied:

  k1 - k2 = -1
  -3k1 + 2k2 = -2     (*)

Multiplying the first equation by 3 and then adding the result to the second equation yields
   3k1 - 3k2 = -3
  -3k1 + 2k2 = -2
  ────────────────
         -k2 = -5

Thus, k2 = 5. Substituting this result back into either of the equations in (*) gives k1 = 4.  ■
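Example 10 is a 2 × 2 linear system in k1 and k2, so it can also be solved with a linear-system routine. A hedged NumPy sketch (an illustration, not the book's hand-elimination method):

```python
import numpy as np

# Columns are the vectors (1, -3) and (-1, 2); the right side is (-1, -2).
A = np.array([[1.0, -1.0],
              [-3.0, 2.0]])
b = np.array([-1.0, -2.0])

k1, k2 = np.linalg.solve(A, b)
print(k1, k2)   # 4.0 5.0, matching k1 = 4 and k2 = 5
```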
The Space R³

If three mutually perpendicular copies of the real line intersect at their origins, any point in the resulting space is specified by an ordered triple of real numbers (x1, x2, x3). The set of all ordered triples of real numbers is called 3-space, denoted R³ ("R three"). See Figure 16.

■ Figure 16 ■

The operations of addition and scalar multiplication defined on R² carry over to R³:
  (x1, x2, x3) + (x1', x2', x3') = (x1 + x1', x2 + x2', x3 + x3')
  c(x1, x2, x3) = (cx1, cx2, cx3)

Vectors in R³ are called 3-vectors (because there are 3 components), and the geometric descriptions of addition and scalar multiplication given for 2-vectors also carry over to 3-vectors.

Example 11: If x = (3, 0, 4) and y = (2, 1, -1), then

  3x - 2y = 3(3, 0, 4) - 2(2, 1, -1) = (9, 0, 12) - (4, 2, -2) = (5, -2, 14)  ■
Standard basis vectors in R³. Since for any vector x = (x1, x2, x3) in R³,

  (x1, x2, x3) = (x1, 0, 0) + (0, x2, 0) + (0, 0, x3) = x1(1, 0, 0) + x2(0, 1, 0) + x3(0, 0, 1)

the standard basis vectors in R³ are

  i = e1 = (1, 0, 0),   j = e2 = (0, 1, 0),   and   k = e3 = (0, 0, 1)

Any vector x in R³ may therefore be written as

  x = x1 i + x2 j + x3 k   or   x = x1 e1 + x2 e2 + x3 e3

See Figure 17.
■ Figure 17 ■
Example 12: What vector must be added to a = (1, 3, 1) to yield b = (3, 1, 5)?

Let c be the required vector; then a + c = b. Therefore,

  c = b - a = (3, 1, 5) - (1, 3, 1) = (2, -2, 4)

Note that c is the vector ab; see Figure 18.

■ Figure 18 ■
The cross product. So far, you have seen how two vectors can be added (or subtracted) and how a vector is multiplied by a scalar. Is it possible to somehow "multiply" two vectors? One way to define the product of two vectors, and it is done only with vectors in R³, is to form their cross product. Let x = (x1, x2, x3) and y = (y1, y2, y3) be two vectors in R³. The cross product (or vector product) of x and y is defined as follows:

  x × y = (x1, x2, x3) × (y1, y2, y3) = (x2 y3 - x3 y2, x3 y1 - x1 y3, x1 y2 - x2 y1)

The cross product of two vectors is a vector, and perhaps the most important characteristic of this vector product is that it is perpendicular to both factors. (This will be demonstrated when the dot product is introduced.) That is, the vector x × y will be perpendicular to both x and y; see Figure 19. [There is an ambiguity here: the plane in Figure 19, which contains the vectors x and y, has two perpendicular directions: "up" and "down." Which one does the cross product choose? The answer is given by the right-hand rule: Place the wrist of your right hand at the common initial point of x and y, with your fingers pointing along x; as you curl your fingers toward y, your thumb will point in the direction of x × y. This shows that the cross product is anticommutative: y × x = -(x × y).]
■ Figure 19 ■
The length of a vector x = (x1, x2, x3) in R³, which is denoted ||x||, is given by the equation

  ||x|| = √((x1)² + (x2)² + (x3)²)

a result which follows from the Pythagorean Theorem (see the discussion preceding Figure 22 below). While the direction of the cross product of x and y is determined by orthogonality and the right-hand rule, the magnitude (that is, the length) of x × y is equal to the area of the parallelogram spanned by the vectors x and y.

■ Figure 20 ■
Since the area of the parallelogram in Figure 20 is

  area = base · height = ||x|| · ||y|| sin θ

the following equation holds:

  ||x × y|| = ||x|| ||y|| sin θ

where θ is the angle between x and y.

Example 13: Let x = (2, 3, 0) and y = (-1, 1, 4) be position vectors in R³. Compute the area of the triangle whose vertices are the origin and the endpoints of x and y, and determine the angle between the vectors x and y.

Since the area of the triangle is half the area of the parallelogram spanned by x and y,

  area of Δ = ½ ||x × y||
            = ½ ||(x2 y3 - x3 y2, x3 y1 - x1 y3, x1 y2 - x2 y1)||
            = ½ ||(3·4 - 0·1, 0·(-1) - 2·4, 2·1 - 3·(-1))||
            = ½ ||(12, -8, 5)||
            = ½ √(12² + (-8)² + 5²)
            = ½ √233

Now, since ||x × y|| = ||x|| ||y|| sin θ, the angle between x and y is given by
  sin θ = ||x × y|| / (||x|| ||y||)
        = √233 / (√(2² + 3² + 0²) · √((-1)² + 1² + 4²))
        = √233 / (√13 · √18)
        = √(233/234)

Therefore, θ = sin⁻¹ √(233/234).  ■
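The arithmetic in Example 13, the cross product, the triangle's area, and the angle, can all be verified numerically. The following is an illustrative NumPy check, not part of the original text:

```python
import numpy as np

x = np.array([2.0, 3.0, 0.0])
y = np.array([-1.0, 1.0, 4.0])

cross = np.cross(x, y)
print(cross)                        # [12. -8.  5.]

area = 0.5 * np.linalg.norm(cross)
print(area, 0.5 * np.sqrt(233))     # both ~7.632

sin_theta = np.linalg.norm(cross) / (np.linalg.norm(x) * np.linalg.norm(y))
print(np.degrees(np.arcsin(sin_theta)))   # ~86.25 degrees
```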
The Space Rⁿ

By analogy with the preceding constructions (R² and R³), you can consider the collection of all ordered n-tuples of real numbers (x1, x2, . . ., xn) with the analogous operations of addition and scalar multiplication. This is called n-space (denoted Rⁿ), and vectors in Rⁿ are called n-vectors. The standard basis vectors in Rⁿ are

  e1 = (1, 0, 0, . . ., 0),  e2 = (0, 1, 0, . . ., 0),  . . .,  en = (0, 0, 0, . . ., 1)

where ek has a 1 in the kth place and zeros elsewhere. All the figures above depicted points and vectors in R² and R³. Although it is not possible to draw such diagrams to illustrate geometric figures in Rⁿ if n > 3, it is possible to deal with them algebraically, and therein lies the real power of the algebraic machinery.
Example 14: Consider the vectors a = (1, 2, 0, -3), b = (0, 1, -4, 2), and c = (5, -1, -1, 1) in R⁴. Determine the vector 2a - b + c.

Extend the definitions of scalar multiplication and vector addition in the natural way to vectors in R⁴ to compute

  2a - b + c = 2(1, 2, 0, -3) - (0, 1, -4, 2) + (5, -1, -1, 1)
             = (2, 4, 0, -6) - (0, 1, -4, 2) + (5, -1, -1, 1)
             = (2 - 0 + 5, 4 - 1 - 1, 0 + 4 - 1, -6 - 2 + 1)
             = (7, 2, 3, -7)  ■

Example 15: Determine the sum of the standard basis vectors e1, e3, and e4 in R⁵.

All vectors in R⁵ have five components. Four of the components in each of the standard basis vectors in R⁵ are zero, and one component, the first in e1, the third in e3, and the fourth in e4, has the value 1. Therefore,

  e1 + e3 + e4 = (1, 0, 0, 0, 0) + (0, 0, 1, 0, 0) + (0, 0, 0, 1, 0) = (1, 0, 1, 1, 0)  ■
The norm of a vector. The length (or Euclidean norm) of a vector x is denoted ||x||, and for a vector x = (x1, x2) in R², ||x|| is easy to compute (see Figure 21) by applying the Pythagorean Theorem:

  ||x|| = √((x1)² + (x2)²)
■ Figure 21 ■
The expression for the length of a vector x = (x1, x2, x3) in R³ follows from two applications of the Pythagorean Theorem, as illustrated in Figure 22:

  ||x|| = √((x1)² + (x2)² + (x3)²)

■ Figure 22 ■
In general, the norm of a vector x = (x1, x2, x3, . . ., xn) in Rⁿ is given by the equation

  ||x|| = √((x1)² + (x2)² + ⋯ + (xn)²)

Example 16: The length of the vector x = (3, 1, -5, 1) in R⁴ is

  ||x|| = √(3² + 1² + (-5)² + 1²) = √36 = 6  ■
Example 17: Let x be a vector in Rⁿ. If c is a scalar, how does the norm of cx compare to the norm of x?

If x = (x1, x2, . . ., xn), then cx = (cx1, cx2, . . ., cxn). Therefore,

  ||cx|| = √((cx1)² + (cx2)² + ⋯ + (cxn)²)
         = √(c²[(x1)² + (x2)² + ⋯ + (xn)²])
         = √(c²) · √((x1)² + (x2)² + ⋯ + (xn)²)
         = |c| ||x||

Thus, multiplying a vector by a scalar c multiplies its norm by |c|. Note that this is consistent with the geometric description given earlier for scalar multiplication.  ■

Distance between two points. The distance between two points x and y in Rⁿ, a quantity denoted by d(x, y), is defined to be the length of the vector xy:

  d(x, y) = ||xy||
Example 18: What is the distance between the points p = (3, 1, 4) and q = (1, 3, 2)?

Since pq = q - p = (1, 3, 2) - (3, 1, 4) = (-2, 2, -2), the distance between the points p and q is

  d(p, q) = ||pq|| = ||(-2, 2, -2)|| = √((-2)² + 2² + (-2)²) = √12 = 2√3  ■
Unit vectors. Any vector whose length is 1 is called a unit vector. Let x be a given nonzero vector, and consider the scalar multiple x/||x||. (The zero vector must be excluded from consideration here, for if x were 0, then ||x|| would be 0, and the expression x/||x|| would be undefined.) Applying the result of Example 17 (with c = 1/||x||), the norm of the vector x/||x|| is

  || x/||x|| || = (1/||x||) ||x|| = 1

Thus, for any nonzero vector x, x/||x|| is a unit vector. This vector is denoted x̂ ("x hat") and represents the unit vector in the direction of x. (Indeed, one can go further and actually call x̂ the direction of x.) Note in particular that all the standard basis vectors are unit vectors; they are sometimes written as î, ĵ, etc. (or ê1, ê2, etc.) to emphasize this fact.
Example 19: Find the vector y in R² whose length is 10 and which has the same direction as x = 3i + 4j.

The idea is simple: Find the unit vector in the same direction as 3i + 4j, and then multiply this unit vector by 10. The unit vector in the direction of x is

  x̂ = x/||x|| = (3i + 4j)/√(3² + 4²) = (3i + 4j)/5 = (3/5)i + (4/5)j

Therefore,

  y = 10x̂ = 10((3/5)i + (4/5)j) = 6i + 8j  ■
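Rescaling to a prescribed length, as in Example 19, is just "divide by the norm, multiply by the target length." A small illustrative check in NumPy:

```python
import numpy as np

x = np.array([3.0, 4.0])
y = 10 * x / np.linalg.norm(x)   # unit vector in the direction of x, scaled to length 10
print(y)                         # [6. 8.]
print(np.linalg.norm(y))         # 10.0
```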
The dot product. One way to multiply two vectors, if they lie in R³, is to form their cross product. Another way to form the product of two vectors, from the same space Rⁿ, for any n, is as follows. For any two n-vectors x = (x1, x2, . . ., xn) and y = (y1, y2, . . ., yn), their dot product (or Euclidean inner product) is defined by the equation

  x · y = x1 y1 + x2 y2 + ⋯ + xn yn

(The symbol x · y is read "x dot y.") Note carefully that, unlike the cross product, the dot product of two vectors is a scalar. For this reason, the dot product is also called the scalar product. It can be easily shown that the dot product on Rⁿ satisfies the following useful identities:

  Homogeneity:            (cx) · y = x · (cy) = c(x · y)
  Commutative property:   x · y = y · x
  Distributive property:  x · (y ± z) = x · y ± x · z
Example 20: What is the dot product of the vectors x = (-1, 0, 4) and y = (3, 6, 2) in R³?

By the commutative property, it doesn't matter whether the product is taken to be x · y or y · x; the result is the same in either case. Applying the definition yields

  x · y = (-1)(3) + (0)(6) + (4)(2) = -3 + 0 + 8 = 5  ■
The dot product of a vector x = (x1, x2, . . ., xn) with itself is

  x · x = x1 x1 + x2 x2 + ⋯ + xn xn = (x1)² + (x2)² + ⋯ + (xn)²

Notice that the right-hand side of this equation is also the expression for ||x||²:

  ||x||² = (x1)² + (x2)² + ⋯ + (xn)²

Therefore, for any vector x,

  ||x||² = x · x

This identity is put to use as follows. Since ||a||² = a · a, the distributive and commutative properties of the dot product imply that for any vectors x and y in Rⁿ,
  ||x + y||² = (x + y) · (x + y)
             = (x + y) · x + (x + y) · y
             = x · (x + y) + y · (x + y)
             = (x · x + x · y) + (y · x + y · y)
             = x · x + x · y + x · y + y · y
             = x · x + 2x · y + y · y

Thus,

  ||x + y||² = ||x||² + 2x · y + ||y||²     (*)
Now, if x ⊥ y, then by Figure 23, the Pythagorean Theorem would say

  ||x + y||² = ||x||² + ||y||²     (**)

■ Figure 23 ■
Therefore, if x ⊥ y, equations (*) and (**) imply

  ||x||² + 2x · y + ||y||² = ||x||² + ||y||²

which simplifies to the simple statement x · y = 0. Since this argument is reversible (assuming that it is agreed that the zero vector is orthogonal to every vector), the following fact has
been established : xly
if and only if
xy= 0
This says that two vectors are orthogonal—that is, perpendicular—if and only if their dot product is zero . Example 21 : Use the dot product to verify that the cross pro duct of the vectors x = (2, 3, 0) and y = (–1, 1, 4) from Ex ample 13 is orthogonal to both x and y ; then show that x x y is orthogonal to both x and y for any vectors x and y in R 3 . In Example 13, it was determined that x x y = (12, – 8, 5) . The criterion for orthogonality is the vanishing of the do t product . Since both (xxy) .x=(12, -8, 5) . (2, 3, 0)=12 . 2–8 . 3+5 . 0= 0 and (xxy) .y=(12, —s, s) - (–1, 1, 4)=12•(–1)–8 . 1+5 . 4= 0 the vector x x y is indeed orthogonal to x and to y . In general , (xxy)•y—y•(xxy) _ (Yi, YZ Y3) ' ( xzYs – xsY2 , 9
x3.1'i
– x iYs ,
x iY2 –
X2Y1 )
= Yl (x2Y3 – x 3Yz) + YZ( xsYi – x1Y3) + Ys( x l Yz – XzYi ) (yl x2y3 'Y3 x zYi) + (–y1 x 3 y2 + Yz xs3'i )
+ ( - Y2XlY3 + Y3 X lY2 ) =0+0+ 0 =o
and a similar calculation shows that (x x y) • x = 0 also .
■
The triangle inequality. From elementary geometry, you know that the sum of the lengths of any two sides of a triangle must be greater than the length of the third side. That is, if A, B, and C are the vertices of a triangle, then

  AC < AB + BC

■ Figure 24 ■
The triangle inequality can be generalized to vectors in Rⁿ. If x and y are any two n-vectors, then

  ||x + y|| ≤ ||x|| + ||y||

■ Figure 25 ■
Figure 25 shows that this statement can be interpreted in the same way as the elementary geometric fact about the lengths of the sides of a triangle. [One notable difference, however, is that if x and y happen to be parallel (that is, if y is a positive scalar times x) or if either x or y is the zero vector, then ||x + y|| = ||x|| + ||y||. The generalized triangle inequality must take these degenerate cases into account (hence the weak inequality, ≤), whereas the triangle inequality from elementary geometry does not (and hence the strong (or strict) inequality, <).]

Example 22: Verify the triangle inequality for the vectors x = (-1, 0, 4) and y = (3, 6, 2) from Example 20.

The sum of these vectors is x + y = (2, 6, 6), and the lengths of the vectors x, y, and x + y are

  ||x|| = √((-1)² + 0² + 4²) = √17
  ||y|| = √(3² + 6² + 2²) = √49 = 7
  ||x + y|| = √(2² + 6² + 6²) = √76

With these lengths, the triangle inequality, ||x + y|| ≤ ||x|| + ||y||, becomes √76 ≤ √17 + 7, which is certainly true, since the left-hand side is less than 9, while the right-hand side is greater than 4 + 7 = 11.  ■
The Cauchy-Schwarz inequality. One of the most important inequalities in mathematics is known as the Cauchy-Schwarz inequality. For Rⁿ equipped with an inner product, this inequality states
  |x · y| ≤ ||x|| ||y||

which says the absolute value of the dot product of two vectors is never greater than the product of their norms. Because of this inequality, it must be true that for any two nonzero vectors x and y,

  |x · y| / (||x|| ||y||) ≤ 1

Since both ||x|| and ||y|| are positive, the absolute value signs can be repositioned:

  | x · y / (||x|| ||y||) | ≤ 1

a statement which now directly implies

  -1 ≤ x · y / (||x|| ||y||) ≤ 1

This final inequality says that there is precisely one value of θ between 0 and π (inclusive) such that

  cos θ = x · y / (||x|| ||y||)

This θ is called the angle between the vectors x and y; geometrically, it is the smaller angle between them. (Note: No angle θ is defined if either x or y is the zero vector.) To verify that this θ is indeed the geometric angle between x and y, consider Figure 26, where it is assumed that the angle between x and y is acute.
■ Figure 26 ■

The vector z is orthogonal to x, and the figure shows that y is the sum of z and a positive scalar multiple, cx, of x:

  cx + z = y

Taking the dot product of both sides of this equation with x yields

  cx + z = y  ⇒  x · (cx + z) = x · y  ⇒  c(x · x) + x · z = x · y

Since x and z are orthogonal, the dot product x · z is 0. This reduces the equation above to

  c(x · x) = x · y     (*)

But Figure 26 and the definition of cos θ indicate that ||cx|| = ||y|| cos θ. Now, since c is positive, ||cx|| = |c| ||x|| = c ||x||, so this equation becomes
  c ||x|| = ||y|| cos θ  ⇒  c = ||y|| cos θ / ||x||     (**)

Equations (*) and (**), together with the identity x · x = ||x||², then imply

  x · y = c ||x||² = (||y|| cos θ / ||x||) ||x||²

which becomes

  x · y = ||x|| ||y|| cos θ

This proof can be extended to the case where the angle between x and y is obtuse, thus validating the following alternate, but entirely equivalent, definition of the dot product:

  x · y = ||x|| ||y|| cos θ

Note that this equation is consistent with the observation

  x ⊥ y  ⇒  x · y = 0

since θ = π/2 implies cos θ = 0.

Example 23: Use the dot product to determine the angle between the vectors x = (2, 3, 0) and y = (-1, 1, 4) from Example 13.

From the boxed equation directly above,
  cos θ = x · y / (||x|| ||y||)
        = (2·(-1) + 3·1 + 0·4) / (√(2² + 3² + 0²) · √((-1)² + 1² + 4²))
        = 1/√234

Therefore, θ = cos⁻¹(1/√234).

[Technical note: In Example 13, it was determined that θ = sin⁻¹ √(233/234). Although this is consistent with the present calculation (because cos²θ + sin²θ must always equal 1 for any θ, and this is certainly true here), it is better to use the dot product than the cross product to determine the angle between two vectors in R³. Why? The statement sin θ = √(233/234) implies that θ is either 86.25° or 93.75°, and without further investigation, it is difficult to say which is correct. Even a picture may not help here; the angle is so close to 90° that your drawing will probably not be accurate enough to tell the difference. But the statement cos θ = 1/√234 says that θ is definitely 86.25°, with no ambiguity. Within the range between 0° and 180°, the sine function is entirely positive and cannot differentiate between an acute angle and its supplement. However, the cosine function is positive for acute angles and negative for obtuse angles, so it can, immediately, differentiate between an acute angle and its supplement.]  ■
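As a numerical cross-check of Example 23 (an illustration, not from the text), the same angle comes out of the arccosine of the normalized dot product:

```python
import numpy as np

x = np.array([2.0, 3.0, 0.0])
y = np.array([-1.0, 1.0, 4.0])

cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(np.degrees(np.arccos(cos_theta)))   # ~86.25 degrees, agreeing with Example 13
```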
Orthogonal projections. Consider two nonzero vectors x and y emanating from the origin in R². Dropping a perpendicular from the tip of x to the line containing y gives the (orthogonal) projection of x onto y. This vector is denoted proj_y x.
■ Figure 27 ■
■ Figure 28 ■

If θ < π/2 (Figure 27), then the component of x along y, a positive scalar denoted comp_y x, is equal to the norm of the (vector) projection of x onto y. If θ > π/2 (Figure 28), then the component of x along y is a negative scalar, equal to the negative of the norm of the projection of x onto y. (And if
θ = π/2, then comp_y x = 0, since the orthogonal projection of x onto y is the zero vector.) In any case, the following equation holds:

  comp_y x = ||x|| cos θ

where θ is the angle between x and y. Now, since x · y = ||x|| ||y|| cos θ, this equation for the component of x along y can be rewritten as

  comp_y x = x · y / ||y||

The vector projection of x onto y is equal to this scalar times the unit vector in the direction of y:

  proj_y x = (comp_y x) ŷ = (x · y / ||y||)(y / ||y||)

Or, since ||y||² = y · y,

  proj_y x = (x · y / (y · y)) y

Example 24: Find the projection of x = (2, 2, 4) onto the vector y = (2, 6, 3).

If θ is the angle between x and y, then the component of x along y is given by
  comp_y x = ||x|| cos θ = ||x|| · (x · y / (||x|| ||y||)) = x · y / ||y||
           = ((2)(2) + (2)(6) + (4)(3)) / √(2² + 6² + 3²)
           = 28/7
           = 4

Therefore,

  proj_y x = (comp_y x) ŷ = 4(y / ||y||) = (4/7)(2, 6, 3) = (8/7, 24/7, 12/7)

See Figure 29.
■ Figure 29 ■
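The projection formula proj_y x = (x · y / y · y) y translates directly into code. An illustrative NumPy sketch reproducing Example 24:

```python
import numpy as np

x = np.array([2.0, 2.0, 4.0])
y = np.array([2.0, 6.0, 3.0])

proj = (np.dot(x, y) / np.dot(y, y)) * y
print(proj)      # [1.1428... 3.4285... 1.7142...] = (8/7, 24/7, 12/7)

comp = np.dot(x, y) / np.linalg.norm(y)
print(comp)      # 4.0, the component of x along y
```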
Lines. A line is determined by two distinct points, say p and q. If the vector pq is drawn from p to q, then pq defines the line's direction. The description of a line can therefore be reformulated as follows: A line L is uniquely determined by
  • a given point through which L passes, and
  • a given (nonzero) vector which is parallel to L

Let p be the given point through which the line will pass, and let v be the vector that defines its direction.
■ Figure 30 ■

From Figure 30, it is easy to see that a point x will be on the line if and only if the vector px is parallel (or antiparallel) to v, which happens if px is a scalar multiple of v:

  px = tv

Or, since px = x - p,

  x = p + tv

This is the parametric equation for the line through p parallel to v. The scalar t is the parameter, and every point on the line is given by a particular choice of t.
Example 25: Find the equation of the line L in R³ that passes through the point p = (2, 4, 2) and is parallel to the vector v = (1, 2, 3). Where does this line pierce the x-y plane?

A point x = (x, y, z) is on the line L if and only if the vector px is a scalar multiple of v:

  px = tv  ⇒  x - p = tv  ⇒  x = p + tv
  x = (2, 4, 2) + t(1, 2, 3) = (2 + t, 4 + 2t, 2 + 3t)

Therefore,

  L = {(x, y, z) : x = 2 + t, y = 4 + 2t, z = 2 + 3t, for any t in R}

Now, the line intersects the x-y plane when z = 0. Since

  z = 0  ⇒  2 + 3t = 0  ⇒  t = -2/3

L pierces the x-y plane when the parameter t takes on the value -2/3. For this t,

  x = 2 + t = 2 - 2/3 = 4/3   and   y = 4 + 2t = 4 - 4/3 = 8/3

so the point of intersection of L and the x-y plane is a = (4/3, 8/3, 0). See Figure 31.
■ Figure 31 ■
Example 26: Give the equation of the line in R⁴ that passes through the points a = (-1, 1, 2, 0) and b = (3, 4, 0, -5). Does the point c = (7, 7, -2, -2) lie on this line?

Since the line is parallel to the vector

  v = ab = b - a = (3, 4, 0, -5) - (-1, 1, 2, 0) = (4, 3, -2, -5)

every point x on the line is described by the parametric equation

  x = a + tv = (-1, 1, 2, 0) + t(4, 3, -2, -5) = (-1 + 4t, 1 + 3t, 2 - 2t, -5t)

The point c = (7, 7, -2, -2) will lie on this line if and only if there is a value of the parameter t such that

  (-1 + 4t, 1 + 3t, 2 - 2t, -5t) = (7, 7, -2, -2)     (*)

However, although the first three components in (*) agree when t = 2, the fourth components do not. Therefore, the point c does not lie on the line.  ■
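The membership test in Example 26, solve for t component by component and see whether one value works, can be automated. A hedged NumPy sketch (illustrative only):

```python
import numpy as np

a = np.array([-1.0, 1.0, 2.0, 0.0])
b = np.array([3.0, 4.0, 0.0, -5.0])
c = np.array([7.0, 7.0, -2.0, -2.0])

v = b - a                     # direction vector (4, 3, -2, -5)
t = (c - a) / v               # candidate parameter values, one per component
print(t)                      # [2.  2.  2.  0.4] -> no single t, so c is not on the line
print(np.allclose(t, t[0]))   # False
```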
Planes. A plane in R³ is determined by three noncollinear points. If these points are labeled a, b, and c, then the cross product of the vectors ab and ac will give a vector v perpendicular to the plane. This vector v defines the plane's orientation in space; see Figure 32.

■ Figure 32 ■

The description of a plane can be formulated as follows: A plane P is uniquely determined by

  • a given point through which P passes, and
  • a given nonzero vector which is normal, that is, perpendicular, to P

Let a be the given point on the plane, and let v be the defining normal vector, with its initial point at a. Then, as illustrated in Figure 33, for a point x to lie in P, v must be perpendicular to the vector ax.
■ Figure 33 ■

Since v ⊥ ax ⇔ v · ax = 0, the plane is determined by the equation

  v · ax = 0     (*)

To illustrate, let the point a = (a1, a2, a3) and the vector v = (v1, v2, v3). Since for any point x = (x, y, z), the vector ax = x - a = (x - a1, y - a2, z - a3), equation (*) becomes

  v1(x - a1) + v2(y - a2) + v3(z - a3) = 0

which can also be written as

  v1 x + v2 y + v3 z = d

where d = v1 a1 + v2 a2 + v3 a3. For a plane in R³, this is the standard equation. Note carefully that the coefficients of x, y, and z in the standard equation are precisely the components of a vector normal to the plane.
Example 27: Give the standard equation of the plane P determined by the points p = (2, -1, 2), q = (2, 2, -1), and r = (0, 1, 1). Does this plane contain the origin? If not, give the equation of the plane which is parallel to P that does contain the origin.

The vectors pq = (0, 3, -3) and pr = (-2, 2, -1) lie in P, and their cross product, pq × pr, is normal to P. See Figure 34.

■ Figure 34 ■

Recalling the expression

  x × y = (x1, x2, x3) × (y1, y2, y3) = (x2 y3 - x3 y2, x3 y1 - x1 y3, x1 y2 - x2 y1)

for the cross product, it is determined that

  pq × pr = (0, 3, -3) × (-2, 2, -1)
          = (3·(-1) - (-3)·2, (-3)·(-2) - 0·(-1), 0·2 - 3·(-2))
          = (3, 6, 6)
Since the vector v = (3, 6, 6) is normal to P, the standard equation of P is given by

  3x + 6y + 6z = d

for some constant d. Substituting the coordinates of any of the three given points (p, q, or r) into this equation yields d = 12. Thus, P is given by the equation 3x + 6y + 6z = 12, or more simply,

  x + 2y + 2z = 4

Now, since (0, 0, 0) does not satisfy this equation, P does not contain the origin. However, the equation x + 2y + 2z = 0 specifies a plane parallel to P which is satisfied by 0 = (0, 0, 0). See Figure 35.

■ Figure 35 [P′, given by x + 2y + 2z = 0 and parallel to P, does contain the origin; P, given by x + 2y + 2z = 4, does not.] ■
MATRIX ALGEBRA
Much of the machinery of linear algebra involves matrices, which are rectangular arrays of numbers. In this chapter, the fundamental definitions and operations involving matrices will be stated and illustrated. The rest of the book will then be primarily devoted to applying what is learned here.
Matrices

A rectangular array of numbers, enclosed in a large pair of either parentheses or brackets, such as

  (  1  0 -3 )        [  1  0 -3 ]
  ( -2  4  1 )   or   [ -2  4  1 ]

is called a matrix. The size or dimensions of a matrix are specified by stating the number of rows and the number of columns it contains. If the matrix consists of m rows and n columns, it is said to be an m by n (written m × n) matrix. For example, the matrices above are 2 by 3, since they contain 2 rows and 3 columns:

  row 1 →  [  1  0 -3 ]
  row 2 →  [ -2  4  1 ]
              ↑  ↑  ↑
         columns 1, 2, 3

Note that the rows are counted from top to bottom, and the columns are counted from left to right.
The numbers in the array are called the entries of the matrix, and the location of a particular entry is specified by giving first the row and then the column where it resides. The entry in row i, column j is called the (i, j) entry. For example, since the entry -2 in the matrix above is in row 2, column 1, it is the (2, 1) entry. The (1, 2) entry is 0, the (2, 3) entry is 1, and so forth. In general, the (i, j) entry of a matrix A is written aij, and the statement A = [aij]m×n indicates that A is the m × n matrix whose (i, j) entry is aij.

Example 1: The set of all m × n matrices whose entries are real numbers is denoted Mm×n(R). If A ∈ M2×3(R), how many entries does the matrix A contain?

Since every matrix in M2×3(R) consists of 2 rows and 3 columns, A will contain 2 × 3 = 6 entries. An example of such a matrix is

  A = [  1  0 -3 ]
      [ -2  4  1 ]  ■

Example 2: If B is the 2 × 2 matrix whose (i, j) entry is given by the formula bij = (-1)^(i+j) (i + j), explicitly determine B.

The (1, 1) entry of B is b11 = (-1)^(1+1) (1 + 1) = 2; the (1, 2) entry is b12 = (-1)^(1+2) (1 + 2) = -3; the (2, 1) entry is also -3; and the (2, 2) entry is b22 = (-1)^(2+2) (2 + 2) = 4. Therefore,

  B = [ b11 b12 ]   [  2 -3 ]
      [ b21 b22 ] = [ -3  4 ]  ■
Example 3: Give the 3 × 3 matrix whose (i, j) entry is expressed by the formula

  δij = 1 if i = j,  and  δij = 0 if i ≠ j

(δij is called the Kronecker delta.)

The (1, 1), (2, 2), and (3, 3) entries are each equal to 1, but all other entries are 0. Thus, the matrix is

  [δij]3×3 = [ 1 0 0 ]
             [ 0 1 0 ]
             [ 0 0 1 ]  ■
Entries along the diagonal. Any entry whose column number matches its row number is called a diagonal entry; all other entries are called off-diagonal. The diagonal entries in each of the following matrices are highlighted:

  A = [  1  0 -3 ]      B = [  2 -3 ]      [δij]3×3 = [ 1 0 0 ]
      [ -2  4  1 ]          [ -3  4 ]                 [ 0 1 0 ]
                                                      [ 0 0 1 ]

In the matrix A, the diagonal entries are a11 = 1 and a22 = 4; in B, the diagonal entries are b11 = 2 and b22 = 4; and in the matrix [δij]3×3, the diagonal entries are δ11 = δ22 = δ33 = 1.
If every off-diagonal entry of a matrix equals zero, then the matrix is called a diagonal matrix. For example, the matrix [δij]3×3 above is a diagonal matrix. It is not uncommon in such cases, particularly with large matrices, to simply leave blank any entry that equals zero. For example,

  [ 1 0 0 ]         [ 1     ]
  [ 0 1 0 ]   and   [   1   ]
  [ 0 0 1 ]         [     1 ]

are two ways of writing exactly the same matrix. Blocks of zeros are often left blank in nondiagonal matrices also. An n × n diagonal matrix whose entries, from the upper-left to the lower-right, are a11, a22, . . ., ann is often written Diag(a11, a22, . . ., ann).

Square matrices. Any matrix which has as many columns as rows is called a square matrix. The 2 × 2 matrix in Example 2 and the 3 × 3 matrix in Example 3 are square. If a square matrix has n rows and n columns, that is, if its size is n × n, then the matrix is said to be of order n.

Triangular matrices. If all the entries below the diagonal of a square matrix are zero, then the matrix is said to be upper triangular. The following matrix, U, is an example of an upper triangular matrix of order 3:

  U = [ 1 -3  2 ]
      [ 0  4  1 ]
      [ 0  0 -1 ]
If all the entries above the diagonal of a square matrix ar e zero, then the matrix is said to be lower triangular . The following matrix, L, is an example of a lower triangular matrix o f order 4: _2 -2 0 0 0L=
1 4 0
0
1
4
0 5 -1 1 -2 1 -3
0 5 -1 0 1 -2 1 -3
A matrix is called triangular if it is either upper triangular or lower triangular. A diagonal matrix is one that is both upper and lower triangular.

The transpose of a matrix. One of the most basic operations that can be performed on a matrix is to form its transpose. Let A be a matrix; then the transpose of A, a matrix denoted by Aᵀ, is obtained by writing the rows of A as columns. More precisely, row i of A is column i of Aᵀ (which implies that column j of A is row j of Aᵀ). If A is m × n, then Aᵀ will be n × m. Also, it follows immediately from the definition that (Aᵀ)ᵀ = A.

Example 4: The transpose of the 2 × 3 matrix

  A = [  1  0 -3 ]
      [ -2  4  1 ]

is the 3 × 2 matrix

  Aᵀ = [  1 -2 ]
       [  0  4 ]
       [ -3  1 ]  ■
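Transposition, rows becoming columns, is a built-in array operation. A quick illustrative NumPy check of Example 4:

```python
import numpy as np

A = np.array([[1, 0, -3],
              [-2, 4, 1]])

print(A.T)                  # the 3 x 2 transpose
print(A.T.T)                # transposing twice returns A
print(A.shape, A.T.shape)   # (2, 3) (3, 2)
```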
Example 5: Note that each of the matrices in Examples 2 and 3 is equal to its own transpose:

  [  2 -3 ]ᵀ   [  2 -3 ]         [ 1 0 0 ]ᵀ   [ 1 0 0 ]
  [ -3  4 ]  = [ -3  4 ]   and   [ 0 1 0 ]  = [ 0 1 0 ]
                                 [ 0 0 1 ]    [ 0 0 1 ]
Any matrix which equals its own transpose is called a symmetric matrix.  ■

Row and column matrices. A matrix that consists of precisely one row is called a row matrix, and a matrix that consists of precisely one column is called a column matrix. For example,

  R = [ 1  0 -3 ]

is a row matrix, while

  C = [  1 ]
      [ -2 ]
      [  0 ]
      [  4 ]

is a column matrix. Note that the transpose of a row matrix is a column matrix and vice versa; for example,

  Rᵀ = [  1 ]
       [  0 ]   and   Cᵀ = [ 1 -2  0  4 ]
       [ -3 ]
Row and column matrices provide alternate notations for a vector. For example, the vector v = (2, -1, 6) in R³ can be expressed as either a row matrix or a column matrix:

  v = [ 2 -1  6 ]   or   v = [  2 ]
                             [ -1 ]
                             [  6 ]
It is common to denote such a matrix by a bold, lower-case (rather than an italic, upper-case) letter and to refer to it as either a row vector or a column vector.

Zero matrices. Any matrix all of whose entries are zero is called a zero matrix and is generically denoted 0. If it is important to explicitly indicate the size of a zero matrix, then subscript notation is used. For example, the 2 × 3 zero matrix

  [ 0 0 0 ]
  [ 0 0 0 ]

would be written 0₂ₓ₃. If a zero matrix is a row or column matrix, it is usually denoted 0, which is consistent with the designation of 0 as the zero vector.
Operations with Matrices

As far as linear algebra is concerned, the two most important operations with vectors are vector addition [adding two (or more) vectors] and scalar multiplication (multiplying a vector by a scalar). Analogous operations are defined for matrices.

Matrix addition. If A and B are matrices of the same size, then they can be added. (This is similar to the restriction on adding vectors, namely, only vectors from the same space Rⁿ
can be added; you cannot add a 2-vector to a 3-vector, for example.) If A = [aij] and B = [bij] are both m × n matrices, then their sum, C = A + B, is also an m × n matrix, and its entries are given by the formula

  cij = aij + bij

Thus, to find the entries of A + B, simply add the corresponding entries of A and B.
Example 6: Consider the following matrices:

  F = [  2 -1 ]      G = [ 4  4 -3 ]      H = [  1  6 ]
      [  3  0 ]                               [ -1 -2 ]
      [ -5  2 ]                               [  0 -3 ]

Which two can be added? What is their sum?

Since only matrices of the same size can be added, only the sum F + H is defined (G cannot be added to either F or H). The sum of F and H is

  F + H = [  2 -1 ]   [  1  6 ]   [ 2+1  -1+6 ]   [  3  5 ]
          [  3  0 ] + [ -1 -2 ] = [ 3-1   0-2 ] = [  2 -2 ]
          [ -5  2 ]   [  0 -3 ]   [ -5+0  2-3 ]   [ -5 -1 ]  ■

Since addition of real numbers is commutative, it follows that addition of matrices (when it is defined) is also commutative; that is, for any matrices A and B of the same size, A + B will always equal B + A.
Example 7: If any matrix A is added to the zero matrix of the same size, the result is clearly equal to A:

  A + 0 = 0 + A = A

This is the matrix analog of the statement a + 0 = 0 + a = a, which expresses the fact that the number 0 is the additive identity in the set of real numbers.  ■

Example 8: Find the matrix B such that A + B = C, where

  A = [ 2  0 ]    and    C = [  3 -1 ]
      [ 1  4 ]              [ -2  2 ]

If

  B = [ b11 b12 ]
      [ b21 b22 ]

then the matrix equation A + B = C becomes

  [ 2 + b11   0 + b12 ]   [  3 -1 ]
  [ 1 + b21   4 + b22 ] = [ -2  2 ]

Since two matrices are equal if and only if they are of the same size and their corresponding entries are equal, this last equation implies

  b11 = 1,  b12 = -1,  b21 = -3,  and  b22 = -2

Therefore,

  B = [  1 -1 ]
      [ -3 -2 ]
This example motivates the definition of matrix subtraction: If A and B are matrices of the same size, then the entries of A - B are found by simply subtracting the entries of B from the corresponding entries of A. Since the equation A + B = C is equivalent to B = C - A, employing matrix subtraction above would yield the same result:

  B = C - A = [  3 -1 ]   [ 2  0 ]   [ 3-2  -1-0 ]   [  1 -1 ]
              [ -2  2 ] - [ 1  4 ] = [ -2-1  2-4 ] = [ -3 -2 ]  ■
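Matrix addition and subtraction are entrywise, so Example 8 can be checked directly. An illustrative NumPy sketch (not part of the book):

```python
import numpy as np

A = np.array([[2, 0],
              [1, 4]])
C = np.array([[3, -1],
              [-2, 2]])

B = C - A            # entrywise subtraction solves A + B = C
print(B)             # [[ 1 -1]
                     #  [-3 -2]]
print(np.array_equal(A + B, C))   # True
```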
Scalar multiplication. A matrix can be multiplied by a scalar as follows. If A = [aij] is a matrix and k is a scalar, then

  kA = [k aij]

That is, the matrix kA is obtained by multiplying each entry of A by k.

Example 9: If

  A = [  1  0 -3 ]
      [ -2  4  1 ]

then the scalar multiple 2A is obtained by multiplying every entry of A by 2:

  2A = [  2  0 -6 ]
       [ -4  8  2 ]  ■
Example 10: If A and B are matrices of the same size, then A - B = A + (-B), where -B is the scalar multiple (-1)B. If
  A = [  1  0 -3 ]    and    B = [ 3 -2  2 ]
      [ -2  4  1 ]              [ 1 -1 -5 ]

then

  A - B = A + (-B) = [  1  0 -3 ]   [ -3  2 -2 ]   [ -2  2 -5 ]
                     [ -2  4  1 ] + [ -1  1  5 ] = [ -3  5  6 ]

This definition of matrix subtraction is consistent with the definition illustrated in Example 8.  ■

Example 11: If

  A = [  1  0 -3 ]    and    B = [  1  2 ]
      [ -2  4  1 ]              [ -1 -4 ]
                                [  3 -1 ]

then

  3Aᵀ + 4B = [  3 -6 ]   [  4   8 ]   [  7  2 ]
             [  0 12 ] + [ -4 -16 ] = [ -4 -4 ]
             [ -9  3 ]   [ 12  -4 ]   [  3 -1 ]  ■
Matrix multiplication. By far the most important operation involving matrices is matrix multiplication, the process of multiplying one matrix by another. The first step in defining matrix multiplication is to recall the definition of the dot product of two vectors. Let r and c be two n-vectors. Writing r as a 1 × n row matrix and c as an n × 1 column matrix, the dot product of r and c is
  r · c = [ r1  r2  ⋯  rn ] [ c1 ]
                            [ c2 ]   = r1 c1 + r2 c2 + ⋯ + rn cn
                            [  ⋮ ]
                            [ cn ]
Note that in order for the dot product of r and c to be defined, both must contain the same number of entries. Also, the order in which these matrices are written in this product is important here: The row vector comes first, the column vector second.

Now, for the final step: How are two general matrices multiplied? First, in order to form the product AB, the number of columns of A must match the number of rows of B; if this condition does not hold, then the product AB is not defined. This criterion follows from the restriction stated above for multiplying a row matrix r by a column matrix c, namely that the number of entries in r must match the number of entries in c. If A is m × n and B is n × p, then the product AB is defined, and the size of the product matrix AB will be m × p. The following diagram is helpful in determining if a matrix product is defined, and if so, the dimensions of the product:

  dimensions of matrix:    A       B    =    AB
                         m × n   n × p      m × p
                             └──┬──┘
                         these must match
Thinking of the m × n matrix A as composed of the row vectors r1, r2, . . ., rm from Rⁿ and the n × p matrix B as composed of the column vectors c1, c2, . . ., cp from Rⁿ,
  A = [ ← r1 → ]
      [ ← r2 → ]       and    B = [ col 1   col 2   ⋯   col p ]
      [    ⋮   ]
      [ ← rm → ]
the rule for computing the entries of the matrix product AB is ri · cj = (AB)ij; that is,

  The dot product of row i in A and column j in B gives the (i, j) entry of AB.

Example 12: Given the two matrices

  A = [  1  0 -3 ]    and    B = [  1  0  4  1 ]
      [ -2  4  1 ]              [ -2  3 -1  5 ]
                                [  0 -1  2  1 ]

determine which matrix product, AB or BA, is defined and evaluate it.

Since A is 2 × 3 and B is 3 × 4, the product AB, in that order, is defined, and the size of the product matrix AB will be 2 × 4. The product BA is not defined, since the first factor (B) has 4 columns but the second factor (A) has only 2 rows. The number of columns of the first matrix must match the number of rows of the second matrix in order for their product to be defined.
Taking the dot product of row 1 in A and column 1 in B gives the (1, 1) entry in AB. Since

  [ 1  0 -3 ] · [ 1, -2, 0 ] = (1)(1) + (0)(-2) + (-3)(0) = 1

the (1, 1) entry in AB is 1. The dot product of row 1 in A and column 2 in B gives the (1, 2) entry in AB,

  [ 1  0 -3 ] · [ 0, 3, -1 ] = (1)(0) + (0)(3) + (-3)(-1) = 3

and the dot product of row 1 in A and column 3 in B gives the (1, 3) entry in AB:

  [ 1  0 -3 ] · [ 4, -1, 2 ] = (1)(4) + (0)(-1) + (-3)(2) = -2

The first row of the product is completed by taking the dot product of row 1 in A and column 4 in B, which gives the (1, 4) entry in AB:
  [ 1  0 -3 ] · [ 1, 5, 1 ] = (1)(1) + (0)(5) + (-3)(1) = -2

Now for the second row of AB: The dot product of row 2 in A and column 1 in B gives the (2, 1) entry in AB,

  [ -2  4  1 ] · [ 1, -2, 0 ] = (-2)(1) + (4)(-2) + (1)(0) = -10

and the dot product of row 2 in A and column 2 in B gives the (2, 2) entry in AB:

  [ -2  4  1 ] · [ 0, 3, -1 ] = (-2)(0) + (4)(3) + (1)(-1) = 11

Finally, taking the dot product of row 2 in A with columns 3 and 4 in B gives (respectively) the (2, 3) and (2, 4) entries in AB:

  [ -2  4  1 ] · [ 4, -1, 2 ] = (-2)(4) + (4)(-1) + (1)(2) = -10
  [ -2  4  1 ] · [ 1, 5, 1 ] = (-2)(1) + (4)(5) + (1)(1) = 19
Therefore,

  AB = [  1  0 -3 ] [  1  0  4  1 ]   [   1   3  -2  -2 ]
       [ -2  4  1 ] [ -2  3 -1  5 ] = [ -10  11 -10  19 ]  ■
                    [  0 -1  2  1 ]
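The whole of Example 12 collapses to one call with an array library. An illustrative NumPy check (not the book's notation):

```python
import numpy as np

A = np.array([[1, 0, -3],
              [-2, 4, 1]])
B = np.array([[1, 0, 4, 1],
              [-2, 3, -1, 5],
              [0, -1, 2, 1]])

print(A @ B)   # [[  1   3  -2  -2]
               #  [-10  11 -10  19]]
# B @ A would raise a ValueError: shapes (3, 4) and (2, 3) do not align.
```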
Example 13: Let C be a 4 × 5 matrix whose third row is

  [ -3  2  0 -1 -1 ]

and let D be a 5 × 6 matrix whose fifth column is

  (-7, -2, 9, -1, 8)

Compute the (3, 5) entry of the product CD.

First, note that since C is 4 × 5 and D is 5 × 6, the product CD is indeed defined, and its size is 4 × 6. However, there is no need to compute all twenty-four entries of CD if only one particular entry is desired. The (3, 5) entry of CD is the dot product of row 3 in C and column 5 in D:
  [ -3  2  0 -1 -1 ] · [ -7, -2, 9, -1, 8 ] = (-3)(-7) + (2)(-2) + (0)(9) + (-1)(-1) + (-1)(8) = 10  ■
A
10
3
-2 4
1
and B =
verify tha t [17
AB =
3
7 4]
but 2
BA=
-1-
3 0 -5 2
10
3
-2 4
1
In particular, note that even though both products AB and BA are defined, AB does not equal BA ; indeed, they're not even the same size! ■
The previous example gives one illustration of what i s perhaps the most important distinction between the multiplication of scalars and the multiplication of matrices . For rea l
numbers a and b, the equation ab = ba always holds; that is, multiplication of real numbers is commutative; the order in which the factors are written is irrelevant. However, it is decidedly false that matrix multiplication is commutative. For the matrices A and B given in Example 14, both products AB and BA were defined, but they certainly were not identical. In fact, the matrix AB was 2 × 2, while the matrix BA was 3 × 3. Here is another illustration of the noncommutativity of matrix multiplication: Consider the matrices

  C = [ 1 -2 ]    and    D = [ 2 -1 ]
      [ 0 -3 ]              [ 3  0 ]
      [ 4  1 ]

Since C is 3 × 2 and D is 2 × 2, the product CD is defined, and its size is 3 × 2. The product DC, however, is not defined, since the number of columns of D (which is 2) does not equal the number of rows of C (which is 3). Therefore, CD ≠ DC, since DC doesn't even exist.

Because of the sensitivity to the order in which the factors are written, one does not typically say simply, "Multiply the matrices A and B." It is usually important to indicate which matrix comes first and which comes second in the product. For this reason, the statement "Multiply A on the right by B" means to form the product AB, while "Multiply A on the left by B" means to form the product BA.
Example 15: If

$$A = \begin{bmatrix} 1 & 2 \\ -4 & 5 \end{bmatrix}$$

and x is the vector (−2, 3), show how A can be multiplied on the right by x and compute the product.

Since A is 2 x 2, in order to multiply A on the right by a matrix, that matrix must have 2 rows. Therefore, if x is written as the 2 x 1 column matrix

$$\mathbf{x} = \begin{bmatrix} -2 \\ 3 \end{bmatrix}$$

then the product Ax can be computed, and the result is another 2 x 1 column matrix:

$$A\mathbf{x} = \begin{bmatrix} 1 & 2 \\ -4 & 5 \end{bmatrix}\begin{bmatrix} -2 \\ 3 \end{bmatrix} = \begin{bmatrix} 4 \\ 23 \end{bmatrix} \quad\blacksquare$$
Example 16: Consider the matrices

$$A = \begin{bmatrix} 3 & -2 \\ -1 & 7 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 4 & 2 \\ 0 & 1 \end{bmatrix}$$

If A is multiplied on the right by B, the result is

$$AB = \begin{bmatrix} 3 & -2 \\ -1 & 7 \end{bmatrix}\begin{bmatrix} 4 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 12 & 4 \\ -4 & 5 \end{bmatrix}$$

but if A is multiplied on the left by B, the result is

$$BA = \begin{bmatrix} 4 & 2 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 3 & -2 \\ -1 & 7 \end{bmatrix} = \begin{bmatrix} 10 & 6 \\ -1 & 7 \end{bmatrix}$$
Note that both products are defined and of the same size, but they are not equal. ■

Example 17: If A and B are square matrices such that AB = BA, then A and B are said to commute. Show that any two square diagonal matrices of order 2 commute.

Let

$$A = \begin{bmatrix} a_{11} & 0 \\ 0 & a_{22} \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} b_{11} & 0 \\ 0 & b_{22} \end{bmatrix}$$

be two arbitrary 2 x 2 diagonal matrices. Then

$$AB = \begin{bmatrix} a_{11} & 0 \\ 0 & a_{22} \end{bmatrix}\begin{bmatrix} b_{11} & 0 \\ 0 & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} & 0 \\ 0 & a_{22}b_{22} \end{bmatrix}$$

and

$$BA = \begin{bmatrix} b_{11} & 0 \\ 0 & b_{22} \end{bmatrix}\begin{bmatrix} a_{11} & 0 \\ 0 & a_{22} \end{bmatrix} = \begin{bmatrix} b_{11}a_{11} & 0 \\ 0 & b_{22}a_{22} \end{bmatrix}$$

Since $a_{11}b_{11} = b_{11}a_{11}$ and $a_{22}b_{22} = b_{22}a_{22}$, AB does indeed equal BA, as desired. ■
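A quick numerical spot-check of Example 17's conclusion—a sketch using NumPy (assumed for illustration) with randomly chosen diagonal entries:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two random 2 x 2 diagonal matrices; by Example 17 they must commute.
A = np.diag(rng.integers(-9, 10, size=2))
B = np.diag(rng.integers(-9, 10, size=2))

print(np.array_equal(A @ B, B @ A))   # True
```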
Although matrix multiplication is usually not commutative, it is sometimes commutative; for example, if

$$E = \begin{bmatrix} 0 & 4 \\ 0 & 0 \end{bmatrix} \quad\text{and}\quad F = \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}$$

then

$$EF = \begin{bmatrix} 0 & 4 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix} = \begin{bmatrix} 0 & 12 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 0 & 4 \\ 0 & 0 \end{bmatrix} = FE$$
Despite examples such as these, it must be stated that in general, matrix multiplication is not commutative.

There is another difference between the multiplication of scalars and the multiplication of matrices. If a and b are real numbers, then the equation ab = 0 implies that a = 0 or b = 0. That is, the only way a product of real numbers can equal 0 is if at least one of the factors is itself 0. The analogous statement for matrices, however, is not true. For instance, if

$$G = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \quad\text{and}\quad H = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$$

then

$$GH = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

Note that even though neither G nor H is a zero matrix, the product GH is.

Yet another difference between the multiplication of scalars and the multiplication of matrices is the lack of a general cancellation law for matrix multiplication. If a, b, and c are real numbers with a ≠ 0, then, by canceling out the factor a, the equation ab = ac implies b = c. No such law exists for matrix multiplication; that is, the statement AB = AC does not imply B = C, even if A is nonzero. For example, if

$$A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad\text{and}\quad C = \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix}$$

then both

$$AB = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 4 & 6 \\ 4 & 6 \end{bmatrix} \quad\text{and}\quad AC = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 4 & 6 \\ 4 & 6 \end{bmatrix}$$
Thus, even though AB = AC and A is not a zero matrix, B does not equal C. ■
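Both failures—zero divisors and the missing cancellation law—can be exhibited in a few lines. A minimal NumPy sketch (the library is assumed only for illustration), using the matrices above:

```python
import numpy as np

G = np.array([[1, 1], [1, 1]])
H = np.array([[ 1, -1], [-1,  1]])
print(G @ H)    # [[0 0] [0 0]] -- neither factor is zero, but GH is

A = np.array([[1, 1], [1, 1]])
B = np.array([[1, 2], [3, 4]])
C = np.array([[3, 4], [1, 2]])
print(np.array_equal(A @ B, A @ C))   # True, yet B != C: no cancellation
```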
Example 18: Although matrix multiplication is not always commutative, it is always associative. That is, if A, B, and C are any three matrices such that the product (AB)C is defined, then the product A(BC) is also defined, and

(AB)C = A(BC)

That is, as long as the order of the factors is unchanged, how they are grouped is irrelevant. Verify the associative law for the matrices

$$A = \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix}, \quad B = \begin{bmatrix} -1 & 3 & 0 \\ 4 & 1 & -6 \end{bmatrix}, \quad\text{and}\quad C = \begin{bmatrix} 2 \\ 1 \\ -3 \end{bmatrix}$$

First, since

$$AB = \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} -1 & 3 & 0 \\ 4 & 1 & -6 \end{bmatrix} = \begin{bmatrix} 7 & 5 & -12 \\ -4 & -1 & 6 \end{bmatrix}$$

the product (AB)C is

$$(AB)C = \begin{bmatrix} 7 & 5 & -12 \\ -4 & -1 & 6 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \\ -3 \end{bmatrix} = \begin{bmatrix} 55 \\ -27 \end{bmatrix}$$

Now, since

$$BC = \begin{bmatrix} -1 & 3 & 0 \\ 4 & 1 & -6 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \\ -3 \end{bmatrix} = \begin{bmatrix} 1 \\ 27 \end{bmatrix}$$

the product A(BC) is

$$A(BC) = \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 27 \end{bmatrix} = \begin{bmatrix} 55 \\ -27 \end{bmatrix}$$

Therefore, (AB)C = A(BC), as expected. Note that the associative law implies that the product of A, B, and C (in that order) can be written simply as ABC; parentheses are not needed to resolve any ambiguity, because there is no ambiguity. ■
Example 19: For the matrices

$$A = \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} -1 & 3 & 0 \\ 4 & 1 & -6 \end{bmatrix}$$

verify the equation $(AB)^T = B^T A^T$.

First,

$$AB = \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} -1 & 3 & 0 \\ 4 & 1 & -6 \end{bmatrix} = \begin{bmatrix} 7 & 5 & -12 \\ -4 & -1 & 6 \end{bmatrix}$$

implies

$$(AB)^T = \begin{bmatrix} 7 & -4 \\ 5 & -1 \\ -12 & 6 \end{bmatrix}$$

Now, since

$$B^T A^T = \begin{bmatrix} -1 & 4 \\ 3 & 1 \\ 0 & -6 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 2 & -1 \end{bmatrix} = \begin{bmatrix} 7 & -4 \\ 5 & -1 \\ -12 & 6 \end{bmatrix}$$

$B^T A^T$ does indeed equal $(AB)^T$. In fact, the equation $(AB)^T = B^T A^T$ holds true for any two matrices for which the product AB is defined. This says that if the product AB is defined, then the transpose of the product is equal to the product of the transposes in the reverse order. ■
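The reverse-order transpose rule is a one-line check numerically. A sketch with NumPy (assumed for illustration), using the matrices of Example 19:

```python
import numpy as np

A = np.array([[1,  2], [0, -1]])
B = np.array([[-1, 3,  0], [ 4, 1, -6]])

print(np.array_equal((A @ B).T, B.T @ A.T))   # True
print((A @ B).T)    # [[  7  -4] [  5  -1] [-12   6]]
```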
Identity matrices. The zero matrix $0_{m \times n}$ plays the role of the additive identity in the set of m x n matrices in the same way that the number 0 does in the set of real numbers (recall Example 7). That is, if A is an m x n matrix and $0 = 0_{m \times n}$, then

A + 0 = 0 + A = A

This is the matrix analog of the statement that for any real number a,

a + 0 = 0 + a = a
With an additive identity in hand, you may ask, "What about a multiplicative identity?" In the set of real numbers, the multiplicative identity is the number 1, since

a · 1 = 1 · a = a

Is there a matrix that plays this role? Consider the matrices

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad\text{and}\quad I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

and verify that

$$AI = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = A$$

and

$$IA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = A$$

Thus, AI = IA = A. In fact, it can be easily shown that for this matrix I, both products AI and IA will equal A for any 2 x 2 matrix A. Therefore,

$$I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

is the multiplicative identity in the set of 2 x 2 matrices. Similarly, the matrix

$$I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

is the multiplicative identity in the set of 3 x 3 matrices, and so on. (Note that $I_3$ is the matrix $[\delta_{ij}]_{3 \times 3}$ encountered in Example 3 above.)
In general, the matrix $I_n$—the n x n diagonal matrix with every diagonal entry equal to 1—is called the identity matrix of order n and serves as the multiplicative identity in the set of all n x n matrices. Is there a multiplicative identity in the set of all m x n matrices if m ≠ n? For any matrix A in $M_{m \times n}(\mathbf{R})$, the matrix $I_m$ is the left identity ($I_m A = A$), and $I_n$ is the right identity ($A I_n = A$). Thus, unlike the set of n x n matrices, the set of nonsquare m x n matrices does not possess a unique two-sided identity, because $I_m \neq I_n$ if m ≠ n.

Example 20: If A is a square matrix, then A² denotes the product AA, A³ denotes the product AAA, and so forth. If A is the matrix
$$A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$

show that A³ = −A.

The calculation

$$A^2 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} = -I$$

shows that A² = −I. Multiplying both sides of this equation by A yields A³ = −A, as desired. [Technical note: It can be shown that in a certain precise sense, the collection of matrices of the form

$$\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$$

where a and b are real numbers, is structurally identical to the collection of complex numbers, a + bi. Since the matrix A in this example is of this form (with a = 0 and b = 1), A corresponds to the complex number 0 + 1i = i, and the analog of the matrix equation A² = −I derived above is i² = −1, an equation which defines the imaginary unit, i.] ■

Example 21: Find a nondiagonal matrix that commutes with
$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$

The problem is asking for a nondiagonal matrix B such that AB = BA. Like A, the matrix B must be 2 x 2. One way to produce such a matrix B is to form A², for if B = A², associativity implies

AB = A · A² = A(AA) = (AA)A = A² · A = BA

(This equation proves that A² will commute with A for any square matrix A; furthermore, it suggests how one can prove that every integral power of a square matrix A will commute with A.) In this case,

$$B = A^2 = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix}$$

which is nondiagonal. This matrix B does indeed commute with A, as verified by the calculations

$$AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix} = \begin{bmatrix} 37 & 54 \\ 81 & 118 \end{bmatrix}$$

and

$$BA = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 37 & 54 \\ 81 & 118 \end{bmatrix} \quad\blacksquare$$
Example 22: If

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$$

show that

$$A^n = \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix}$$

for every positive integer n.

A few preliminary calculations illustrate that the given formula does hold true:

$$A^2 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$$

$$A^3 = A^2 \cdot A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix}$$

$$A^4 = A^3 \cdot A = \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 4 \\ 0 & 1 \end{bmatrix}$$

However, to establish that the formula holds for all positive integers n, a general proof must be given. This will be done here using the principle of mathematical induction, which reads as follows. Let P(n) denote a proposition concerning a positive integer n. If it can be shown that

P(1) is true, and
P(n) is true ⟹ P(n + 1) is true

then the statement P(n) is valid for all positive integers n. In the present case, the statement P(n) is the assertion

$$A^n = \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix}$$

Because A¹ = A, the statement P(1) is certainly true, since

$$A^1 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$$

Now, assuming that P(n) is true, that is, assuming

$$A^n = \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix}$$

it is now necessary to establish the validity of the statement P(n + 1), which is

$$A^{n+1} = \begin{bmatrix} 1 & n+1 \\ 0 & 1 \end{bmatrix}$$

But this statement does indeed hold, because

$$A^{n+1} = A^n \cdot A = \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & n+1 \\ 0 & 1 \end{bmatrix}$$

By the principle of mathematical induction, the proof is complete. ■
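An induction proof can be spot-checked (though of course not replaced) by computing a few powers directly. A sketch, assuming NumPy for illustration:

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])

# Spot-check the closed form A^n = [[1, n], [0, 1]] for several n.
for n in range(1, 8):
    expected = np.array([[1, n], [0, 1]])
    assert np.array_equal(np.linalg.matrix_power(A, n), expected)
print("formula verified for n = 1..7")
```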
The inverse of a matrix. Let a be a given real number. Since 1 is the multiplicative identity in the set of real numbers, if a number b exists such that

ab = ba = 1

then b is called the reciprocal or multiplicative inverse of a and denoted $a^{-1}$ (or 1/a). The analog of this statement for square matrices reads as follows. Let A be a given n x n matrix. Since $I = I_n$ is the multiplicative identity in the set of n x n matrices, if a matrix B exists such that

AB = BA = I

then B is called the (multiplicative) inverse of A and denoted $A^{-1}$ (read "A inverse").

Example 23: If

$$A = \begin{bmatrix} 2 & 3 \\ 5 & 8 \end{bmatrix} \quad\text{then}\quad A^{-1} = \begin{bmatrix} 8 & -3 \\ -5 & 2 \end{bmatrix}$$

since

$$AA^{-1} = \begin{bmatrix} 2 & 3 \\ 5 & 8 \end{bmatrix}\begin{bmatrix} 8 & -3 \\ -5 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$$

and

$$A^{-1}A = \begin{bmatrix} 8 & -3 \\ -5 & 2 \end{bmatrix}\begin{bmatrix} 2 & 3 \\ 5 & 8 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I \quad\blacksquare$$
Yet another distinction between the multiplication of scalars and the multiplication of matrices is provided by the existence of inverses. Although every nonzero real number has an inverse, there exist nonzero matrices that have no inverse.

Example 24: Show that the nonzero matrix

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$$

has no inverse.

If this matrix had an inverse, then

$$\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

would equal

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

for some values of a, b, c, and d. However, since the second row of A is a zero row, you can see that the second row of the product must also be a zero row:

$$\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} * & * \\ 0 & 0 \end{bmatrix}$$

(When an asterisk, *, appears as an entry in a matrix, it implies that the actual value of this entry is irrelevant to the present discussion.) Since the (2, 2) entry of the product cannot equal 1, the product cannot equal the identity matrix. Therefore, it is impossible to construct a matrix that can serve as the inverse for A. ■

If a matrix has an inverse, it is said to be invertible. The matrix in Example 23 is invertible, but the one in Example 24
is not. Later, you will learn various criteria for determining whether a given square matrix is invertible.

Example 25: Example 23 showed that

$$A = \begin{bmatrix} 2 & 3 \\ 5 & 8 \end{bmatrix} \implies A^{-1} = \begin{bmatrix} 8 & -3 \\ -5 & 2 \end{bmatrix}$$

Given that

$$B = \begin{bmatrix} 1 & -2 \\ 2 & -3 \end{bmatrix} \implies B^{-1} = \begin{bmatrix} -3 & 2 \\ -2 & 1 \end{bmatrix}$$

verify the equation $(AB)^{-1} = B^{-1}A^{-1}$.

First, compute AB:

$$AB = \begin{bmatrix} 2 & 3 \\ 5 & 8 \end{bmatrix}\begin{bmatrix} 1 & -2 \\ 2 & -3 \end{bmatrix} = \begin{bmatrix} 8 & -13 \\ 21 & -34 \end{bmatrix}$$

Next, compute $B^{-1}A^{-1}$:

$$B^{-1}A^{-1} = \begin{bmatrix} -3 & 2 \\ -2 & 1 \end{bmatrix}\begin{bmatrix} 8 & -3 \\ -5 & 2 \end{bmatrix} = \begin{bmatrix} -34 & 13 \\ -21 & 8 \end{bmatrix}$$

Now, since the product of AB and $B^{-1}A^{-1}$ is I,

$$(AB)(B^{-1}A^{-1}) = \begin{bmatrix} 8 & -13 \\ 21 & -34 \end{bmatrix}\begin{bmatrix} -34 & 13 \\ -21 & 8 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$B^{-1}A^{-1}$ is indeed the inverse of AB. In fact, the equation

$$(AB)^{-1} = B^{-1}A^{-1}$$

holds true for any invertible square matrices of the same size. This says that if A and B are invertible matrices of the same size, then their product AB is also invertible, and the inverse of the product is equal to the product of the inverses in the reverse order. (Compare this equation with the one involving transposes in Example 19 above.) This result can be proved in general by applying the associative law for matrix multiplication. Since

$$(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I$$

and

$$(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I$$

it follows that $(AB)^{-1} = B^{-1}A^{-1}$, as desired. ■
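The reverse-order rule for inverses mirrors the one for transposes, and checks just as easily. A sketch with NumPy (assumed for illustration), reusing Example 25's matrices:

```python
import numpy as np

A = np.array([[2, 3], [5, 8]])
B = np.array([[1, -2], [2, -3]])

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)   # note the reversed order
print(np.allclose(lhs, rhs))                # True
```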
Example 26: The inverse of the matrix

$$B = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 0 & 3 \\ 0 & 1 & -1 \end{bmatrix} \quad\text{is}\quad B^{-1} = \begin{bmatrix} 3 & -1 & 3 \\ -2 & 1 & -1 \\ -2 & 1 & -2 \end{bmatrix}$$

Show that the inverse of $B^T$ is $(B^{-1})^T$.

Form $B^T$ and $(B^{-1})^T$ and multiply:

$$B^T (B^{-1})^T = \begin{bmatrix} 1 & 2 & 0 \\ -1 & 0 & 1 \\ 2 & 3 & -1 \end{bmatrix}\begin{bmatrix} 3 & -2 & -2 \\ -1 & 1 & 1 \\ 3 & -1 & -2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I$$

This calculation shows that $(B^{-1})^T$ is the inverse of $B^T$. [Strictly speaking, it shows only that $(B^{-1})^T$ is the right inverse of $B^T$, that is, when it multiplies $B^T$ on the right, the product is the identity. It is also true that $(B^{-1})^T B^T = I$, which means $(B^{-1})^T$ is the left inverse of $B^T$. However, it is not necessary to explicitly check both equations: If a square matrix has an inverse, there is no distinction between a left inverse and a right inverse.] Thus,

$$(B^T)^{-1} = (B^{-1})^T$$

an equation which actually holds for any invertible square matrix B. This equation says that if a matrix is invertible, then so is its transpose, and the inverse of the transpose is the transpose of the inverse. ■

Example 27: Use the distributive property for matrix multiplication, A(B ± C) = AB ± AC, to answer this question: If a 2 x 2 matrix D satisfies the equation D² − D − 6I = 0, what is an expression for $D^{-1}$?

By the distributive property quoted above, D² − D = D² − DI = D(D − I). Therefore, the equation D² − D − 6I = 0 implies D(D − I) = 6I. Multiplying both sides of this equation by 1/6 gives
$$D\left[\tfrac{1}{6}(D - I)\right] = I$$

which implies

$$D^{-1} = \tfrac{1}{6}(D - I)$$

As an illustration of this result, the matrix

$$D = \begin{bmatrix} 4 & 2 \\ -3 & -3 \end{bmatrix}$$

satisfies the equation D² − D − 6I = 0, as you may verify. Since

$$\tfrac{1}{6}(D - I) = \tfrac{1}{6}\begin{bmatrix} 3 & 2 \\ -3 & -4 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{3} \\ -\tfrac{1}{2} & -\tfrac{2}{3} \end{bmatrix}$$

and

$$D \cdot \left[\tfrac{1}{6}(D - I)\right] = \begin{bmatrix} 4 & 2 \\ -3 & -3 \end{bmatrix}\begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{3} \\ -\tfrac{1}{2} & -\tfrac{2}{3} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$$

the matrix $\tfrac{1}{6}(D - I)$ does indeed equal $D^{-1}$, as claimed. ■
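Both claims in Example 27—that D satisfies its polynomial equation and that (D − I)/6 inverts it—are quick to confirm numerically. A sketch assuming NumPy:

```python
import numpy as np

D = np.array([[ 4,  2], [-3, -3]])
I = np.eye(2)

# D satisfies D^2 - D - 6I = 0 ...
print(np.allclose(D @ D - D - 6 * I, 0))    # True
# ... so (1/6)(D - I) must be its inverse.
print(np.allclose(D @ ((D - I) / 6), I))    # True
```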
Example 28: The equation (a + b)² = a² + 2ab + b² is an identity if a and b are real numbers. Show, however, that (A + B)² = A² + 2AB + B² is not an identity if A and B are 2 x 2 matrices. [Note: The distributive laws for matrix multiplication are A(B ± C) = AB ± AC, given in Example 27, and the companion law, (A ± B)C = AC ± BC.]

The distributive laws for matrix multiplication imply

$$(A + B)^2 = (A + B)(A + B) = (A + B)A + (A + B)B = (AA + BA) + (AB + BB) = A^2 + BA + AB + B^2$$

Since matrix multiplication is not commutative, BA will usually not equal AB, so the sum BA + AB cannot be written as 2AB. In general, then, (A + B)² ≠ A² + 2AB + B². [Any matrices A and B that do not commute (for example, the matrices in Example 16 above) would provide a specific counterexample to the statement (A + B)² = A² + 2AB + B², which would also establish that this is not an identity.] ■

Example 29: Assume that B is invertible. If A commutes with B, show that A will also commute with $B^{-1}$.

Proof. To say "A commutes with B" means AB = BA. Multiply this equation by $B^{-1}$ on the left and on the right and use associativity:

$$B^{-1}(AB)B^{-1} = B^{-1}(BA)B^{-1}$$
$$(B^{-1}A)(BB^{-1}) = (B^{-1}B)(AB^{-1})$$
$$B^{-1}A = AB^{-1} \quad\blacksquare$$
Example 30: The number 0 has just one square root: 0. Show, however, that the (2 by 2) zero matrix has infinitely many square roots by finding all 2 x 2 matrices A such that A² = 0.

In the same way that a number a is called a square root of b if a² = b, a matrix A is said to be a square root of B if A² = B. Let

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

be an arbitrary 2 x 2 matrix. Squaring it and setting the result equal to 0 gives

$$A^2 = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a^2 + bc & b(a+d) \\ c(a+d) & bc + d^2 \end{bmatrix} \overset{\text{set}}{=} \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

The (1, 2) entries in the last equation imply b(a + d) = 0, which holds if (Case 1) b = 0 or (Case 2) d = −a.

Case 1. If b = 0, the diagonal entries then imply a = 0 and d = 0, and the (2, 1) entries imply that c is arbitrary. Thus, for any value of c, every matrix of the form

$$\begin{bmatrix} 0 & 0 \\ c & 0 \end{bmatrix}$$

is a square root of $0_{2 \times 2}$.

Case 2. If d = −a, then the off-diagonal entries will both be 0, and the diagonal entries will both equal a² + bc. Thus, as long as b and c are chosen so that bc = −a², A² will equal 0.

A similar chain of reasoning beginning with the (2, 1) entries leads to either a = c = d = 0 (and b arbitrary) or the same conclusion as before: as long as b and c are chosen so that bc = −a², the matrix A² will equal 0.

All these cases can be summarized as follows. Any matrix of the following form will have the property that its square is the 2 by 2 zero matrix:

$$\begin{bmatrix} a & b \\ c & -a \end{bmatrix} \quad\text{with}\quad bc = -a^2$$

Since there are infinitely many values of a, b, and c such that bc = −a², the zero matrix $0_{2 \times 2}$ has infinitely many square roots. For example, choosing a = 4, b = 2, and c = −8 gives the nonzero matrix

$$S = \begin{bmatrix} 4 & 2 \\ -8 & -4 \end{bmatrix}$$

whose square is

$$S^2 = \begin{bmatrix} 4 & 2 \\ -8 & -4 \end{bmatrix}\begin{bmatrix} 4 & 2 \\ -8 & -4 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = 0 \quad\blacksquare$$
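The parametrized family of square roots can be sampled mechanically. A minimal sketch, assuming NumPy; `zero_root` is an illustrative helper name, not anything from the text:

```python
import numpy as np

def zero_root(a, b):
    """A member of the family [[a, b], [c, -a]] with bc = -a^2."""
    c = -a**2 / b            # requires b != 0
    return np.array([[a, b], [c, -a]])

S = zero_root(4, 2)          # the matrix S of Example 30 (c = -8)
print(S @ S)                 # [[0. 0.] [0. 0.]]
```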
LINEAR SYSTEMS
The basic problem of linear algebra is to solve a system of linear equations. A linear equation in the n variables—or unknowns—$x_1, x_2, \ldots, x_n$ is an equation of the form

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$$

where b and the coefficients $a_i$ are constants. A finite collection of such linear equations is called a linear system. To solve a system means to find all values of the variables that satisfy all the equations in the system simultaneously. For example, consider the following system, which consists of two linear equations in two unknowns:

x₁ + x₂ = 3
3x₁ − 2x₂ = 4

Although there are infinitely many solutions to each equation separately, there is only one pair of numbers x₁ and x₂ which satisfies both equations at the same time. This ordered pair, (x₁, x₂) = (2, 1), is called the solution to the system.
Solutions to Linear Systems

The analysis of linear systems will begin by determining the possibilities for the solutions. Despite the fact that the system can contain any number of equations, each of which can involve any number of unknowns, the result that describes the possible number of solutions to a linear system is simple and definitive. The fundamental ideas will be illustrated in the following examples.
Example 1: Interpret the following system graphically:

x + y = 3
3x − 2y = 4

Each of these equations specifies a line in the x-y plane, and every point on each line represents a solution to its equation. Therefore, the point where the lines cross, (2, 1), satisfies both equations simultaneously; this is the solution to the system. See Figure 36.

Figure 36 [the lines x + y = 3 and 3x − 2y = 4 intersecting at (x, y) = (2, 1)] ■
Example 2: Interpret this system graphically:

x + y = 3
x + y = −2

The lines specified by these equations are parallel and do not intersect, as shown in Figure 37. Since there is no point of intersection, there is no solution to this system. (Clearly, the sum of two numbers cannot be both 3 and −2.) A system which has no solutions—such as this one—is said to be inconsistent.

Figure 37 [the parallel lines x + y = 3 and x + y = −2] ■
Example 3: Interpret the following system graphically:

3x − 2y = 4
6x − 4y = 8

Since the second equation is merely a constant multiple of the first, the lines specified by these equations are identical, as shown in Figure 38. Clearly then, every solution to the first equation is automatically a solution to the second as well, so this system has infinitely many solutions.

Figure 38 [the single line specified by both 3x − 2y = 4 and 6x − 4y = 8] ■
Example 4: Discuss the following system graphically:

x − 2y + z = 0
2x + y − 3z = −5

Each of these equations specifies a plane in R³. Two such planes either coincide, intersect in a line, or are distinct and parallel. Therefore, a system of two equations in three unknowns has either no solutions or infinitely many. For this particular system, the planes do not coincide, as can be seen, for example, by noting that the first plane passes through the origin while the second does not. These planes are not parallel, since v₁ = (1, −2, 1) is normal to the first and v₂ = (2, 1, −3) is normal to the second, and neither of these vectors is a scalar multiple of the other. Therefore, these planes intersect in a line, and the system has infinitely many solutions. ■
Example 5: Interpret the following system graphically:

x + y = 3
3x − 2y = 4
x + 3y = 9

Each of these equations specifies a line in the x-y plane, as sketched in Figure 39. Note that while any two of these lines have a point of intersection, there is no point common to all three lines. This system is inconsistent.

Figure 39 [three lines, each pair intersecting, with no point common to all three] ■

These examples illustrate the three possibilities for the solutions to a linear system:

Theorem A. Regardless of its size or the number of unknowns its equations contain, a linear system will have either no solutions, exactly one solution, or infinitely many solutions.
This will be proved in Example 18 below. Example 4 illustrated the following additional fact about the solutions to a linear system:

Theorem B. If there are fewer equations than unknowns, then the system will have either no solutions or infinitely many.
Gaussian Elimination

The purpose of this section is to describe how the solutions to a linear system are actually found. The fundamental idea is to add multiples of one equation to the others in order to eliminate a variable and to continue this process until only one variable is left. Once this final variable is determined, its value is substituted back into the other equations in order to evaluate the remaining unknowns. This method, characterized by step-by-step elimination of the variables, is called Gaussian elimination.

Example 6: Solve this system:

x + y = 3
3x − 2y = 4

Multiplying the first equation by −3 and adding the result to the second equation eliminates the variable x:

−3x − 3y = −9
 3x − 2y = 4
―――――――――――
     −5y = −5

This final equation, −5y = −5, immediately implies y = 1. Back-substitution of y = 1 into the original first equation, x + y = 3, yields x = 2. (Back-substitution of y = 1 into the original second equation, 3x − 2y = 4, would also yield x = 2.) The solution of this system is therefore (x, y) = (2, 1), as noted in Example 1. ■
Gaussian elimination is usually carried out using matrices. This method reduces the effort in finding the solutions by eliminating the need to explicitly write the variables at each step. The previous example will be redone using matrices.

Example 7: Solve this system:

x + y = 3
3x − 2y = 4

The first step is to write the coefficients of the unknowns in a matrix:

$$\begin{bmatrix} 1 & 1 \\ 3 & -2 \end{bmatrix}$$

This is called the coefficient matrix of the system. Next, the coefficient matrix is augmented by writing the constants that appear on the right-hand sides of the equations as an additional column:

$$\left[\begin{array}{cc|c} 1 & 1 & 3 \\ 3 & -2 & 4 \end{array}\right]$$

This is called the augmented matrix, and each row corresponds to an equation in the given system. The first row, r₁ = (1, 1, 3), corresponds to the first equation, 1x + 1y = 3, and the second row, r₂ = (3, −2, 4), corresponds to the second equation, 3x − 2y = 4. You may choose to include a vertical line—as shown above—to separate the coefficients of the unknowns from the extra column representing the constants.

Now, the counterpart of eliminating a variable from an equation in the system is changing one of the entries in the coefficient matrix to zero. Likewise, the counterpart of adding a multiple of one equation to another is adding a multiple of one row to another row. Adding −3 times the first row of the augmented matrix to the second row yields

$$\left[\begin{array}{cc|c} 1 & 1 & 3 \\ 3 & -2 & 4 \end{array}\right] \longrightarrow \left[\begin{array}{cc|c} 1 & 1 & 3 \\ 0 & -5 & -5 \end{array}\right]$$

The new second row translates into −5y = −5, which means y = 1. Back-substitution into the first row (that is, into the equation that represents the first row) yields x = 2 and, therefore, the solution to the system: (x, y) = (2, 1). ■
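For systems with a unique solution, a library solver reaches the same answer in one call. A sketch, assuming NumPy for illustration:

```python
import numpy as np

A = np.array([[1,  1], [3, -2]])   # coefficient matrix of Example 7
b = np.array([3, 4])

print(np.linalg.solve(A, b))       # [2. 1.]
```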
Gaussian elimination can be summarized as follows. Given a linear system expressed in matrix form, Ax = b, first write down the corresponding augmented matrix:

[A | b]

Then, perform a sequence of elementary row operations, which are any of the following:

Type 1. Interchange any two rows.
Type 2. Multiply a row by a nonzero constant.
Type 3. Add a multiple of one row to another row.
The goal of these operations is to transform—or reduce—the original augmented matrix into one of the form [A′ | b′], where A′ is upper triangular ($a'_{ij} = 0$ for i > j), any zero rows appear at the bottom of the matrix, and the first nonzero entry in any row is to the right of the first nonzero entry in any higher row; such a matrix is said to be in echelon form. The solutions of the system represented by the simpler augmented matrix, [A′ | b′], can be found by inspection of the bottom rows and back-substitution into the higher rows. Since elementary row operations do not change the solutions of the system, the vectors x which satisfy the simpler system A′x = b′ are precisely those that satisfy the original system, Ax = b.

Example 8: Solve the following system using Gaussian elimination:

x − 2y + z = 0
2x + y − 3z = 5
4x − 7y + z = −1

The augmented matrix which represents this system is

$$\left[\begin{array}{ccc|c} 1 & -2 & 1 & 0 \\ 2 & 1 & -3 & 5 \\ 4 & -7 & 1 & -1 \end{array}\right]$$

The first goal is to produce zeros below the first entry in the first column, which translates into eliminating the first variable, x, from the second and third equations. The row operations which accomplish this are as follows:
Adding −2 times row 1 to row 2, and −4 times row 1 to row 3, yields

$$\left[\begin{array}{ccc|c} 1 & -2 & 1 & 0 \\ 0 & 5 & -5 & 5 \\ 0 & 1 & -3 & -1 \end{array}\right]$$

The second goal is to produce a zero below the second entry in the second column, which translates into eliminating the second variable, y, from the third equation. One way to accomplish this would be to add −1/5 times the second row to the third row. However, to avoid fractions, there is another option: first interchange rows two and three. Interchanging two rows merely interchanges the equations, which clearly will not alter the solution of the system:

$$\left[\begin{array}{ccc|c} 1 & -2 & 1 & 0 \\ 0 & 1 & -3 & -1 \\ 0 & 5 & -5 & 5 \end{array}\right]$$

Now, add −5 times the second row to the third row:

$$\left[\begin{array}{ccc|c} 1 & -2 & 1 & 0 \\ 0 & 1 & -3 & -1 \\ 0 & 0 & 10 & 10 \end{array}\right]$$

which is in echelon form. Since the coefficient matrix has been transformed into echelon form, the "forward" part of Gaussian elimination is complete. What remains now is to use the third row to evaluate the third unknown, then to back-substitute into the second row to evaluate the second unknown, and, finally, to back-substitute into the first row to evaluate the first unknown.

The third row of the final matrix translates into 10z = 10, which gives z = 1. Back-substitution of this value into the second row, which represents the equation y − 3z = −1, yields y = 2. Back-substitution of both these values into the first row, which represents the equation x − 2y + z = 0, gives x = 3. The solution of this system is therefore (x, y, z) = (3, 2, 1). ■
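The forward-elimination-then-back-substitution procedure translates almost line for line into code. The following is a bare-bones sketch (in Python with NumPy, assumed for illustration; `gaussian_solve` is an illustrative name): it assumes a unique solution exists and swaps rows only to secure a nonzero pivot.

```python
import numpy as np

def gaussian_solve(A, b):
    """Forward elimination to echelon form, then back-substitution."""
    M = np.hstack([A.astype(float), b.reshape(-1, 1).astype(float)])
    n = len(b)
    for k in range(n):
        # bring a row with a usable (largest-magnitude) pivot into place
        p = k + np.argmax(np.abs(M[k:, k]))
        M[[k, p]] = M[[p, k]]
        for i in range(k + 1, n):
            M[i] -= (M[i, k] / M[k, k]) * M[k]   # add a multiple of row k
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):               # back-substitution
        x[i] = (M[i, -1] - M[i, i+1:n] @ x[i+1:]) / M[i, i]
    return x

# Example 8's system:
A = np.array([[1, -2,  1], [2,  1, -3], [4, -7,  1]])
b = np.array([0, 5, -1])
print(gaussian_solve(A, b))    # [3. 2. 1.]
```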
Example 9: Solve the following system using Gaussian elimination:

2x − 2y = −6
x − y + z = 1
3y − 2z = −5

For this system, the augmented matrix (vertical line omitted) is

$$\begin{bmatrix} 2 & -2 & 0 & -6 \\ 1 & -1 & 1 & 1 \\ 0 & 3 & -2 & -5 \end{bmatrix}$$

First, multiply row 1 by 1/2:

$$\begin{bmatrix} 1 & -1 & 0 & -3 \\ 1 & -1 & 1 & 1 \\ 0 & 3 & -2 & -5 \end{bmatrix}$$

Now, adding −1 times the first row to the second row yields zeros below the first entry in the first column:

$$\begin{bmatrix} 1 & -1 & 0 & -3 \\ 0 & 0 & 1 & 4 \\ 0 & 3 & -2 & -5 \end{bmatrix}$$

Interchanging the second and third rows then gives the desired upper-triangular coefficient matrix:

$$\begin{bmatrix} 1 & -1 & 0 & -3 \\ 0 & 3 & -2 & -5 \\ 0 & 0 & 1 & 4 \end{bmatrix}$$
The third row now says z = 4. Back-substituting this value into the second row gives y = 1, and back-substitution of both these values into the first row yields x = −2. The solution of this system is therefore (x, y, z) = (−2, 1, 4). ■

Gauss-Jordan elimination. Gaussian elimination proceeds by performing elementary row operations to produce zeros below the diagonal of the coefficient matrix to reduce it to echelon form. (Recall that a matrix A′ = [a′ᵢⱼ] is in echelon form when a′ᵢⱼ = 0 for i > j, any zero rows appear at the bottom of the matrix, and the first nonzero entry in any row is to the right of the first nonzero entry in any higher row.) Once this is done, inspection of the bottom row(s) and back-substitution into the upper rows determine the values of the unknowns. However, it is possible to reduce (or eliminate entirely) the computations involved in back-substitution by performing additional row operations to transform the matrix from echelon form to reduced echelon form. A matrix is in reduced echelon form when, in addition to being in echelon form, each column that contains the leading nonzero entry of some row (usually made to be 1) has zeros not just below that entry but also above that entry. Loosely speaking, Gaussian elimination works from the top down, to produce a matrix in echelon form, whereas Gauss-Jordan elimination continues where Gaussian left off by then working from the bottom up to produce a matrix in reduced echelon form.
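A full Gauss-Jordan reduction can also be sketched in a few lines of code. The following (Python/NumPy, assumed for illustration; `gauss_jordan` is an illustrative name) clears each pivot column both above and below the pivot, skipping columns that yield no pivot:

```python
import numpy as np

def gauss_jordan(M):
    """Reduce an augmented matrix to reduced echelon form (a sketch)."""
    M = M.astype(float)
    rows, cols = M.shape
    r = 0
    for c in range(cols - 1):
        p = r + np.argmax(np.abs(M[r:, c]))     # pick a pivot row
        if np.isclose(M[p, c], 0):
            continue                            # free column: skip it
        M[[r, p]] = M[[p, r]]
        M[r] /= M[r, c]                         # make the pivot 1
        for i in range(rows):                   # clear above AND below
            if i != r:
                M[i] -= M[i, c] * M[r]
        r += 1
        if r == rows:
            break
    return M
```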
The technique will be illustrated in the following example.

Example 10: The height, y, of an object thrown into the air is known to be given by a quadratic function of t (time) of the form y = at² + bt + c. If the object is at height y = 23/4 at time t = 1/2, at y = 7 at time t = 1, and at y = 2 at t = 2, determine the coefficients a, b, and c.

Since t = 1/2 gives y = 23/4,

$$\tfrac{23}{4} = a\left(\tfrac{1}{2}\right)^2 + b\left(\tfrac{1}{2}\right) + c = \tfrac{1}{4}a + \tfrac{1}{2}b + c$$

or, after multiplying through by 4, a + 2b + 4c = 23, while the other two conditions, y(t = 1) = 7 and y(t = 2) = 2, give the following equations for a, b, and c:

7 = a + b + c
2 = 4a + 2b + c

Therefore, the goal is to solve the system

a + 2b + 4c = 23
a + b + c = 7
4a + 2b + c = 2

The augmented matrix for this system is reduced as follows:
$$\left[\begin{array}{ccc|c} 1 & 2 & 4 & 23 \\ 1 & 1 & 1 & 7 \\ 4 & 2 & 1 & 2 \end{array}\right]$$

Adding −1 times row 1 to row 2, and −4 times row 1 to row 3, gives

$$\left[\begin{array}{ccc|c} 1 & 2 & 4 & 23 \\ 0 & -1 & -3 & -16 \\ 0 & -6 & -15 & -90 \end{array}\right]$$

Next, adding −6 times row 2 to row 3 and then multiplying row 2 by −1 yields

$$\left[\begin{array}{ccc|c} 1 & 2 & 4 & 23 \\ 0 & 1 & 3 & 16 \\ 0 & 0 & 3 & 6 \end{array}\right]$$

At this point, the forward part of Gaussian elimination is finished, since the coefficient matrix has been reduced to echelon form. However, to illustrate Gauss-Jordan elimination, the following additional elementary row operations are performed. Multiplying row 3 by 1/3 and then adding −3 times row 3 to row 2 gives

$$\left[\begin{array}{ccc|c} 1 & 2 & 4 & 23 \\ 0 & 1 & 0 & 10 \\ 0 & 0 & 1 & 2 \end{array}\right]$$

Then, adding −4 times row 3 to row 1 followed by −2 times row 2 to row 1 produces the reduced echelon form:

$$\left[\begin{array}{ccc|c} 1 & 0 & 0 & -5 \\ 0 & 1 & 0 & 10 \\ 0 & 0 & 1 & 2 \end{array}\right]$$

This final matrix immediately gives the solution: a = −5, b = 10, and c = 2. ■
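The fitted coefficients can be recovered in one call and checked against the three observed heights. A sketch assuming NumPy:

```python
import numpy as np

coeffs = np.linalg.solve(np.array([[1, 2, 4],
                                   [1, 1, 1],
                                   [4, 2, 1]]),
                         np.array([23, 7, 2]))
a, b, c = coeffs
print(coeffs)                          # [-5. 10.  2.]

# The recovered height function reproduces the three observations:
for t in (0.5, 1, 2):
    print(t, a * t**2 + b * t + c)     # 5.75, 7.0, 2.0
```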
Example 11: Solve the following system using Gaussian elimination:

x + y − 3z = 4
2x + y − z = 2
3x + 2y − 4z = 7

The augmented matrix for this system is

$$\left[\begin{array}{ccc|c} 1 & 1 & -3 & 4 \\ 2 & 1 & -1 & 2 \\ 3 & 2 & -4 & 7 \end{array}\right]$$

Multiples of the first row are added to the other rows to produce zeros below the first entry in the first column; adding −2 times row 1 to row 2, and −3 times row 1 to row 3, gives

$$\left[\begin{array}{ccc|c} 1 & 1 & -3 & 4 \\ 0 & -1 & 5 & -6 \\ 0 & -1 & 5 & -5 \end{array}\right]$$

Next, −1 times the second row is added to the third row:

$$\left[\begin{array}{ccc|c} 1 & 1 & -3 & 4 \\ 0 & -1 & 5 & -6 \\ 0 & 0 & 0 & 1 \end{array}\right]$$

The third row now says 0x + 0y + 0z = 1, an equation that cannot be satisfied by any values of x, y, and z. The process stops: this system has no solutions. ■
The previous example shows how Gaussian elimination reveals an inconsistent system. A slight alteration of that system (for example, changing the constant term "7" in the third equation to a "6") will illustrate a system with infinitely many solutions.

Example 12: Solve the following system using Gaussian elimination:
x + y − 3z = 4
2x + y − z = 2
3x + 2y − 4z = 6

The same operations applied to the augmented matrix of the system in Example 11 are applied to the augmented matrix for the present system:

$$\left[\begin{array}{ccc|c} 1 & 1 & -3 & 4 \\ 2 & 1 & -1 & 2 \\ 3 & 2 & -4 & 6 \end{array}\right] \longrightarrow \left[\begin{array}{ccc|c} 1 & 1 & -3 & 4 \\ 0 & -1 & 5 & -6 \\ 0 & -1 & 5 & -6 \end{array}\right] \longrightarrow \left[\begin{array}{ccc|c} 1 & 1 & -3 & 4 \\ 0 & -1 & 5 & -6 \\ 0 & 0 & 0 & 0 \end{array}\right]$$

Here, the third row translates into 0x + 0y + 0z = 0, an equation which is satisfied by any x, y, and z. Since this offers no constraint on the unknowns, there are not three conditions on the unknowns, only two (represented by the two nonzero rows in the final augmented matrix). Since there are 3 unknowns but only 2 constraints, 3 − 2 = 1 of the unknowns, z say, is arbitrary; this is called a free variable. Let z = t, where t is any real number. Back-substitution of z = t into the second row (−y + 5z = −6) gives

−y + 5t = −6 ⟹ y = 6 + 5t

Back-substituting z = t and y = 6 + 5t into the first row (x + y − 3z = 4) determines x:

x + (6 + 5t) − 3t = 4 ⟹ x = −2 − 2t

Therefore, every solution of the system has the form

(x, y, z) = (−2 − 2t, 6 + 5t, t) = (−2t, 5t, t) + (−2, 6, 0)   (*)

where t is any real number. There are infinitely many solutions, since every real value of t gives a different particular solution. For example, choosing t = 1 gives (x, y, z) = (−4, 11, 1), while t = −3 gives (x, y, z) = (4, −9, −3), and so on. Geometrically, this system represents three planes in R³ that intersect in a line, and (*) is a parametric equation for this line. ■
Example 12 provided an illustration of a system with infinitely many solutions, how this case arises, and how the solution is written. Every linear system that possesses infinitely many solutions must contain at least one arbitrary parameter (free variable). Once the augmented matrix has been reduced to echelon form, the number of free variables is equal to the total number of unknowns minus the number of nonzero rows:

# free variables = # unknowns − # nonzero rows in echelon form

This agrees with Theorem B above, which states that a linear system with fewer equations than unknowns, if consistent, has infinitely many solutions. The condition "fewer equations than unknowns" means that the number of rows in the coefficient matrix is less than the number of unknowns.
Therefore, the boxed equation above implies that there must be at least one free variable. Since such a variable can, by definition, take on infinitely many values, the system will have infinitely many solutions.

Example 13: Find all solutions to the system

w − x + y − z = 1
2w + x − 3y = 2
5w − 2x − 3z = 5
First, note that there are four unknowns, but only three equations. Therefore, if the system is consistent, it is guaranteed to have infinitely many solutions, a condition characterized by at least one parameter in the general solution. After the corresponding augmented matrix is constructed, Gaussian elimination yields

$$\left[\begin{array}{cccc|c} 1 & -1 & 1 & -1 & 1 \\ 2 & 1 & -3 & 0 & 2 \\ 5 & -2 & 0 & -3 & 5 \end{array}\right] \longrightarrow \left[\begin{array}{cccc|c} 1 & -1 & 1 & -1 & 1 \\ 0 & 3 & -5 & 2 & 0 \\ 0 & 3 & -5 & 2 & 0 \end{array}\right] \longrightarrow \left[\begin{array}{cccc|c} 1 & -1 & 1 & -1 & 1 \\ 0 & 3 & -5 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]$$

(first adding −2 times row 1 to row 2 and −5 times row 1 to row 3, then adding −1 times row 2 to row 3). The fact that only two nonzero rows remain in the echelon form of the augmented matrix means that 4 − 2 = 2 of the variables are free:

# free variables = # unknowns − # nonzero rows in echelon form = 4 − 2 = 2

Therefore, selecting y and z as the free variables, let y = t₁ and z = t₂. The second row of the reduced augmented matrix implies

3x − 5t₁ + 2t₂ = 0 ⟹ x = ⅓(5t₁ − 2t₂)

and the first row then gives

w − ⅓(5t₁ − 2t₂) + t₁ − t₂ = 1 ⟹ w = 1 + ⅓(2t₁ + t₂)

Thus, the solutions of the system have the form

(w, x, y, z) = (1 + ⅓(2t₁ + t₂), ⅓(5t₁ − 2t₂), t₁, t₂)

where t₁ and t₂ are allowed to take on any real values. ■
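A two-parameter family of solutions like this one can be validated by substituting arbitrary parameter values back into the original equations. A sketch assuming NumPy:

```python
import numpy as np

A = np.array([[1, -1,  1, -1],
              [2,  1, -3,  0],
              [5, -2,  0, -3]])
b = np.array([1, 2, 5])

rng = np.random.default_rng(1)
for _ in range(3):
    t1, t2 = rng.uniform(-10, 10, size=2)    # arbitrary parameter values
    v = np.array([1 + (2*t1 + t2)/3,         # w
                  (5*t1 - 2*t2)/3,           # x
                  t1,                        # y
                  t2])                       # z
    print(np.allclose(A @ v, b))             # True each time
```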
Example 14: Let b = (b₁, b₂, b₃)ᵀ and let A be the matrix

$$A = \begin{bmatrix} 2 & 1 & -1 \\ -1 & -3 & 1 \\ 1 & 8 & -2 \end{bmatrix}$$

For what values of b₁, b₂, and b₃ will the system Ax = b be consistent?

The augmented matrix for the system Ax = b reads

$$\left[\begin{array}{ccc|c} 2 & 1 & -1 & b_1 \\ -1 & -3 & 1 & b_2 \\ 1 & 8 & -2 & b_3 \end{array}\right]$$

which Gaussian elimination reduces as follows. First interchange rows 1 and 2, then add 2 times row 1 to row 2 and 1 times row 1 to row 3:

$$\left[\begin{array}{ccc|c} -1 & -3 & 1 & b_2 \\ 2 & 1 & -1 & b_1 \\ 1 & 8 & -2 & b_3 \end{array}\right] \longrightarrow \left[\begin{array}{ccc|c} -1 & -3 & 1 & b_2 \\ 0 & -5 & 1 & b_1 + 2b_2 \\ 0 & 5 & -1 & b_2 + b_3 \end{array}\right]$$

Finally, adding row 2 to row 3 gives

$$\left[\begin{array}{ccc|c} -1 & -3 & 1 & b_2 \\ 0 & -5 & 1 & b_1 + 2b_2 \\ 0 & 0 & 0 & b_1 + 3b_2 + b_3 \end{array}\right]$$

The bottom row now implies that b₁ + 3b₂ + b₃ must be zero if this system is to be consistent. Therefore, the given system has solutions (infinitely many, in fact) only for those column vectors b = (b₁, b₂, b₃)ᵀ for which b₁ + 3b₂ + b₃ = 0. ■

Example 15: Solve the following system (compare to Example 12):

x + y − 3z = 0
2x + y − z = 0
3x + 2y − 4z = 0

A system such as this one, where the constant term on the right-hand side of every equation is 0, is called a homogeneous system. In matrix form it reads Ax = 0. Since every homogeneous system is consistent—because x = 0 is always a solution—a homogeneous system has either exactly one solution (the trivial solution, x = 0) or infinitely many.
The row-reduction of the coefficient matrix for this system has already been performed in Example 12. It is not necessary to explicitly augment the coefficient matrix with the column b = 0, since no elementary row operation can affect these zeros. That is, if A′ is an echelon form of A, then elementary row operations will transform [A | 0] into [A′ | 0]. From the result of Example 12,

$$[A' \mid \mathbf{0}] = \left[\begin{array}{ccc|c} 1 & 1 & -3 & 0 \\ 0 & -1 & 5 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$$

Since the last row again implies that z can be taken as a free variable, let z = t, where t is any real number. Back-substitution of z = t into the second row (−y + 5z = 0) gives

−y + 5t = 0 ⟹ y = 5t

and back-substitution of z = t and y = 5t into the first row (x + y − 3z = 0) determines x:

x + 5t − 3t = 0 ⟹ x = −2t

Therefore, every solution of this system has the form (x, y, z) = (−2t, 5t, t), where t is any real number. There are infinitely many solutions, since every real value of t gives a unique particular solution.

Note carefully the difference between the set of solutions to the system in Example 12 and the one here. Although both had the same coefficient matrix A, the system in Example 12 was nonhomogeneous (Ax = b, where b ≠ 0), while the one here is the corresponding homogeneous system, Ax = 0. Placing their solutions side by side,
general solution to Ax = 0: (x, y, z) = (−2t, 5t, t)
general solution to Ax = b: (x, y, z) = (−2t, 5t, t) + (−2, 6, 0)

illustrates an important fact:

Theorem C. The general solution to a consistent nonhomogeneous linear system, Ax = b, is equal to the general solution of the corresponding homogeneous system, Ax = 0, plus a particular solution of the nonhomogeneous system. That is, if x = x_h represents the general solution of Ax = 0, then x = x_h + x̃ represents the general solution of Ax = b, where x̃ is any particular solution of the (consistent) nonhomogeneous system Ax = b.

[Technical note: Theorem C, which concerns a linear system, has a counterpart in the theory of linear differential equations. Let L be a linear differential operator; then the general solution of a solvable nonhomogeneous linear differential equation, L(y) = d (where d ≠ 0), is equal to the general solution of the corresponding homogeneous equation, L(y) = 0, plus a particular solution of the nonhomogeneous equation. That is, if y = y_h represents the general solution of L(y) = 0, then y = y_h + ỹ represents the general solution of L(y) = d, where ỹ is any particular solution of the (solvable) nonhomogeneous linear equation L(y) = d. (This is Theorem B on page 68 in the present author's Differential Equations, © 1995 Cliffs Notes, Inc.)] ■

Example 16: Determine all solutions of the system
x − 3y + 4z = 1
2w − 2x + y = −1
2w − x − 2y + 4z = 0
−6w + 4x + 3y − 8z = 1
Write down the augmented matrix and perform the following sequence of operations. First interchange rows 1 and 2:

$$\left[\begin{array}{cccc|c} 0 & 1 & -3 & 4 & 1 \\ 2 & -2 & 1 & 0 & -1 \\ 2 & -1 & -2 & 4 & 0 \\ -6 & 4 & 3 & -8 & 1 \end{array}\right] \longrightarrow \left[\begin{array}{cccc|c} 2 & -2 & 1 & 0 & -1 \\ 0 & 1 & -3 & 4 & 1 \\ 2 & -1 & -2 & 4 & 0 \\ -6 & 4 & 3 & -8 & 1 \end{array}\right]$$

Then add −1 times row 1 to row 3 and 3 times row 1 to row 4:

$$\left[\begin{array}{cccc|c} 2 & -2 & 1 & 0 & -1 \\ 0 & 1 & -3 & 4 & 1 \\ 0 & 1 & -3 & 4 & 1 \\ 0 & -2 & 6 & -8 & -2 \end{array}\right]$$

Finally, add −1 times row 2 to row 3 and 2 times row 2 to row 4:

$$\left[\begin{array}{cccc|c} 2 & -2 & 1 & 0 & -1 \\ 0 & 1 & -3 & 4 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]$$

Since only 2 nonzero rows remain in this final (echelon) matrix, there are only 2 constraints, and, consequently, 4 − 2 = 2 of the unknowns—y and z, say—are free variables. Let y = t₁ and z = t₂. Back-substitution of y = t₁ and z = t₂ into the second row (x − 3y + 4z = 1) gives

x − 3t₁ + 4t₂ = 1 ⟹ x = 1 + 3t₁ − 4t₂

Finally, back-substituting x = 1 + 3t₁ − 4t₂, y = t₁, and z = t₂ into the first row (2w − 2x + y = −1) determines w:

2w − 2(1 + 3t₁ − 4t₂) + t₁ = −1 ⟹ w = ½ + (5/2)t₁ − 4t₂

Therefore, every solution of this system has the form

(w, x, y, z) = (½ + (5/2)t₁ − 4t₂, 1 + 3t₁ − 4t₂, t₁, t₂)

where t₁ and t₂ are any real numbers. Another way to write the solution is as follows:

(w, x, y, z) = t₁(5/2, 3, 1, 0) + t₂(−4, −4, 0, 1) + (½, 1, 0, 0)

where t₁, t₂ ∈ R. ■
Example 17: Determine the general solution of

x − 3y + 4z = 0
2w − 2x + y = 0
2w − x − 2y + 4z = 0
−6w + 4x + 3y − 8z = 0

which is the homogeneous system corresponding to the nonhomogeneous one in Example 16 above.

Since the solution to the nonhomogeneous system in Example 16 is

(w, x, y, z) = t₁(5/2, 3, 1, 0) + t₂(−4, −4, 0, 1) + (½, 1, 0, 0)   (*)

where the first two terms make up x_h and the last term is a particular solution x̃, Theorem C implies that the solution of the corresponding homogeneous system is

(w, x, y, z) = t₁(5/2, 3, 1, 0) + t₂(−4, −4, 0, 1)

(where t₁, t₂ ∈ R), which is obtained from (*) by simply discarding the particular solution, x̃ = (½, 1, 0, 0), of the nonhomogeneous system. ■
Example 18: Prove Theorem A: Regardless of its size or the number of unknowns its equations contain, a linear system will have either no solutions, exactly one solution, or infinitely many solutions.

Proof. Let the given linear system be written in matrix form, Ax = b. The theorem really comes down to this: if Ax = b has more than one solution, then it actually has infinitely many. To establish this, let x₁ and x₂ be two distinct solutions of Ax = b. It will now be shown that for any real value of t, the vector x₁ + t(x₁ − x₂) is also a solution of Ax = b; because t can take on infinitely many different values, the desired conclusion will follow. Since Ax₁ = b and Ax₂ = b,

$$A[\mathbf{x}_1 + t(\mathbf{x}_1 - \mathbf{x}_2)] = A\mathbf{x}_1 + A[t(\mathbf{x}_1 - \mathbf{x}_2)] = A\mathbf{x}_1 + tA(\mathbf{x}_1 - \mathbf{x}_2) = A\mathbf{x}_1 + t(A\mathbf{x}_1 - A\mathbf{x}_2) = \mathbf{b} + t(\mathbf{b} - \mathbf{b}) = \mathbf{b}$$

Therefore, x₁ + t(x₁ − x₂) is indeed a solution of Ax = b, and the theorem is proved. ■
Using Elementary Row Operations to Determine A⁻¹

A linear system is said to be square if the number of equations matches the number of unknowns. The systems in Examples 1 and 9 were square, for example. If the system Ax = b is square, then the coefficient matrix, A, is square. If A has an inverse, then the solution to the system Ax = b can be found by multiplying both sides by A⁻¹:

$$A\mathbf{x} = \mathbf{b} \implies A^{-1}A\mathbf{x} = A^{-1}\mathbf{b} \implies \mathbf{x} = A^{-1}\mathbf{b}$$

This calculation establishes the following result:

Theorem D. If A is an invertible n by n matrix, then the system Ax = b has a unique solution for every n-vector b, and this solution equals A⁻¹b.

Since the determination of A⁻¹ typically requires more calculation than performing Gaussian elimination and back-substitution, this is not necessarily an improved method of solving Ax = b. (And, of course, if A is not square, then it has no inverse, so this method is not even an option for nonsquare systems.) However, if the coefficient matrix A is square, and if A⁻¹ is known or the solution of Ax = b is required for several different b's, then this method is indeed useful, from both a theoretical and a practical point of view. The purpose of this section is to show how the elementary row operations that characterize Gauss-Jordan elimination can be applied to compute the inverse of a square matrix.

First, a definition: If an elementary row operation (the interchange of two rows, the multiplication of a row by a nonzero constant, or the addition of a multiple of one row to another) is applied to the identity matrix, I, the result is called an elementary matrix. To illustrate, consider the 3 by 3 identity matrix. If the first and third rows are interchanged,

$$I \xrightarrow{\;r_1 \leftrightarrow r_3\;} \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$

or if the second row of I is multiplied by −2,

$$I \xrightarrow{\;-2r_2\;} \begin{bmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

or if −2 times the first row is added to the second row,

$$I \xrightarrow{\;-2r_1 \text{ added to } r_2\;} \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
all of these resulting matrices are examples of elementary matrices. The first fact that will be needed to compute A⁻¹ reads as follows: If E is the elementary matrix that results when a particular elementary row operation is performed on I, then the product EA is equal to the matrix that would result if that same elementary row operation were applied to A. In other words, an elementary row operation on a matrix A can be performed by multiplying A on the left by the corresponding elementary matrix. For example, consider the matrix

$$A = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 0 & 3 \\ 0 & 1 & -1 \end{bmatrix}$$

Adding −2 times the first row to the second row yields

$$A' = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 2 & -1 \\ 0 & 1 & -1 \end{bmatrix}$$

If this same elementary row operation is applied to I,

$$I \xrightarrow{\;-2r_1 \text{ added to } r_2\;} E = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

then the result above guarantees that EA should equal A′. You may verify that

$$EA = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 & 2 \\ 2 & 0 & 3 \\ 0 & 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 2 & -1 \\ 0 & 1 & -1 \end{bmatrix} = A'$$

is indeed true.

If A is an invertible matrix, then some sequence of elementary row operations will transform A into the identity matrix, I. Since each of these operations is equivalent to left multiplication by an elementary matrix, the first step in the reduction of A to I would be given by the product E₁A, the second step would be given by E₂E₁A, and so on. Thus, there exist elementary matrices E₁, E₂, ..., E_k such that

$$E_k \cdots E_2 E_1 A = I$$

But this equation makes it clear that

$$A^{-1} = E_k \cdots E_2 E_1$$

Since E_k ⋯ E₂E₁ = E_k ⋯ E₂E₁I, where the right-hand side explicitly denotes the elementary row operations applied to the identity matrix I, the same elementary row operations that transform A into I will transform I into A⁻¹. For n by n matrices A with n ≥ 3, this describes the most efficient method for determining A⁻¹.
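The "left multiplication performs the row operation" fact is easy to witness directly. A sketch assuming NumPy, using the matrices above:

```python
import numpy as np

A = np.array([[1, -1,  2],
              [2,  0,  3],
              [0,  1, -1]])

# E: the identity with -2*(row 1) added to row 2
E = np.eye(3)
E[1, 0] = -2

# Left-multiplying by E performs that same row operation on A.
print(E @ A)    # [[ 1. -1.  2.]
                #  [ 0.  2. -1.]
                #  [ 0.  1. -1.]]
```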
Example 19: Determine the inverse of the matrix

$$A = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 0 & 3 \\ 0 & 1 & -1 \end{bmatrix}$$

Since the elementary row operations that will be applied to A will be applied to I as well, it is convenient here to augment the matrix A with the identity matrix I:

$$[A \mid I] = \left[\begin{array}{ccc|ccc} 1 & -1 & 2 & 1 & 0 & 0 \\ 2 & 0 & 3 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 & 0 & 1 \end{array}\right]$$

Then, as A is transformed into I, I will be transformed into A⁻¹:

[A | I] → [I | A⁻¹]

Now for a sequence of elementary row operations that will effect this transformation. Adding −2 times row 1 to row 2 gives

$$\left[\begin{array}{ccc|ccc} 1 & -1 & 2 & 1 & 0 & 0 \\ 0 & 2 & -1 & -2 & 1 & 0 \\ 0 & 1 & -1 & 0 & 0 & 1 \end{array}\right]$$

Interchanging rows 2 and 3 and then adding −2 times (the new) row 2 to row 3 yields

$$\left[\begin{array}{ccc|ccc} 1 & -1 & 2 & 1 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 & 1 \\ 0 & 0 & 1 & -2 & 1 & -2 \end{array}\right]$$

Next, adding row 3 to row 2 and −2 times row 3 to row 1 gives

$$\left[\begin{array}{ccc|ccc} 1 & -1 & 0 & 5 & -2 & 4 \\ 0 & 1 & 0 & -2 & 1 & -1 \\ 0 & 0 & 1 & -2 & 1 & -2 \end{array}\right]$$

and, finally, adding row 2 to row 1 completes the reduction:

$$\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 3 & -1 & 3 \\ 0 & 1 & 0 & -2 & 1 & -1 \\ 0 & 0 & 1 & -2 & 1 & -2 \end{array}\right]$$

Since the transformation [A | I] → [I | A⁻¹] is now complete, the inverse of the given matrix A is

$$A^{-1} = \begin{bmatrix} 3 & -1 & 3 \\ -2 & 1 & -1 \\ -2 & 1 & -2 \end{bmatrix} \quad\blacksquare$$
Example 20: What condition must the entries of a general 2 by 2 matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

satisfy in order for A to be invertible? What is the inverse of A in this case?

The goal is to effect the transformation [A | I] → [I | A⁻¹]. First, augment A with the 2 by 2 identity matrix:

$$[A \mid I] = \left[\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right]$$

Now, if a = 0, switch the rows. If c is also 0, then the process of reducing A to I cannot even begin. So, one necessary condition for A to be invertible is that the entries a and c are not both 0. Assume that a ≠ 0. Adding −c/a times row 1 to row 2 gives

$$\left[\begin{array}{cc|cc} a & b & 1 & 0 \\ 0 & \frac{ad-bc}{a} & -\frac{c}{a} & 1 \end{array}\right]$$

Next, assuming that ad − bc ≠ 0, multiply row 1 by 1/a and row 2 by a/(ad − bc):

$$\left[\begin{array}{cc|cc} 1 & \frac{b}{a} & \frac{1}{a} & 0 \\ 0 & 1 & \frac{-c}{ad-bc} & \frac{a}{ad-bc} \end{array}\right]$$

Finally, adding −b/a times row 2 to row 1 yields

$$\left[\begin{array}{cc|cc} 1 & 0 & \frac{d}{ad-bc} & \frac{-b}{ad-bc} \\ 0 & 1 & \frac{-c}{ad-bc} & \frac{a}{ad-bc} \end{array}\right]$$

Therefore, if ad − bc ≠ 0, then the matrix A is invertible, and its inverse is given by

$$A^{-1} = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

(The requirement that a and c are not both 0 is automatically included in the condition ad − bc ≠ 0.) In words, the inverse is obtained from the given matrix by interchanging the diagonal entries, changing the signs of the off-diagonal entries, and then dividing by the quantity ad − bc. This formula for the inverse of a 2 x 2 matrix should be memorized.

To illustrate, consider the matrix

$$A = \begin{bmatrix} -2 & -3 \\ 4 & 5 \end{bmatrix}$$

Since ad − bc = (−2)(5) − (−3)(4) = 2 ≠ 0, the matrix is invertible, and its inverse is

$$A^{-1} = \frac{1}{2}\begin{bmatrix} 5 & 3 \\ -4 & -2 \end{bmatrix} = \begin{bmatrix} \tfrac{5}{2} & \tfrac{3}{2} \\ -2 & -1 \end{bmatrix}$$

You may verify that

$$AA^{-1} = \begin{bmatrix} -2 & -3 \\ 4 & 5 \end{bmatrix}\begin{bmatrix} \tfrac{5}{2} & \tfrac{3}{2} \\ -2 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

and that A⁻¹A = I also. ■
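The memorized 2 x 2 formula translates into a tiny function. A sketch assuming NumPy (`inv2` is an illustrative name, not a library routine):

```python
import numpy as np

def inv2(M):
    """Inverse of a 2 x 2 matrix from the ad - bc formula (Example 20)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[-2, -3], [4, 5]])
print(inv2(A))                               # [[ 2.5  1.5] [-2.  -1. ]]
print(np.allclose(A @ inv2(A), np.eye(2)))   # True
```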
Example 21: Let A be the matrix

$$A = \begin{bmatrix} 2 & 1 & -1 \\ -1 & -3 & 1 \\ 1 & 8 & -2 \end{bmatrix}$$

in Example 14 above. Is A invertible?

No. Recall that row reduction of A produced the matrix

$$A' = \begin{bmatrix} -1 & -3 & 1 \\ 0 & -5 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

The row of zeros signifies that A cannot be transformed to the identity matrix by a sequence of elementary row operations; A is noninvertible. Another argument for the noninvertibility of A follows from the result of Example 14 and Theorem D. If A were invertible, then Theorem D would guarantee the existence of a solution to Ax = b for every column vector b = (b₁, b₂, b₃)ᵀ. But Example 14 showed that Ax = b is consistent only for those vectors b for which b₁ + 3b₂ + b₃ = 0. Clearly, then, there exist (infinitely many) vectors b for which Ax = b is inconsistent; thus, A cannot be invertible. ■

Example 22: What can you say about the solutions of the homogeneous system Ax = 0 if the matrix A is invertible?

Theorem D guarantees that for an invertible matrix A, the system Ax = b is consistent for every possible choice of the column vector b and that the unique solution is given by A⁻¹b. In the case of a homogeneous system, the vector b is 0, so the system has only the trivial solution: x = A⁻¹0 = 0. ■

Example 23: Solve the matrix equation AX = B, where
1 4 -2
A= -1 1 - 1 3 0
1
and
B= -7
17
2 3
LINEAR ALGEBRA 117
LINEA R SYSTEMS
Solution 1 . Since A is 3 x 3 and B is 3 x 2, if a matrix X exists such that AX = B, then X must be 3 x 2. If A is invertible, one way to find X is to determine A -' and then to compute X = A -' B . The algorithm [A 11] . [I ~ K1] to find A-' yields
1 4 -2 1 0 0 -1 1 -1 0 1 0 30 1 0 0 1 1 0
1 0 011 0 0 -12 7 -3 0 1
r, added to r2 -3r, added to r3
2 r2 added to r3
4 -2 5 -3
1 4 -2 > 0 5 -3 0 -2
2r3 added to r2
^1 > 0
1 -1 2
4 -2 10 1 -1 -1 5
0 -2 2 r2added to r3
added to r2 - 2r3 added to r,
o 2
1 0
0
> 0 1 -1 -1 5 0 0 -1 -3 12
2 5
1 4 0
7 -24 -1 0
> 0 1 0 2 -7 - 3 _
- r3
1_
1
—r3
- 4r2 added to r,
0 0
1 -1 2
1 4 -2
10 11
0 0 -1 -3 12
5
r
1 0 0 -1 4 2 > 0 1 0 2 -7 -3 0 0 1
3 -12 - 5
CLIFFS QUICK REVIEW 118
LINEA R SYSTEM S
Therefore, -1 A-1
=
4 2-
2 -7 - 3 3 -12 - 5
so -1
X= A -1 B =
4 2
1 12
2 -7 -3 -7
2
3 -12 -5 17 3
5
2
0 1 2 -3
Solution 2. Let b₁ and b₂ denote, respectively, column 1 and column 2 of the matrix B. If the solution to Ax = b₁ is x₁ and the solution to Ax = b₂ is x₂, then the solution to AX = B = [b₁ b₂] is X = [x₁ x₂]. That is, the elimination procedure can be performed on the two systems (Ax = b₁ and Ax = b₂) simultaneously:

$$\left[\begin{array}{ccc|cc} 1 & 4 & -2 & 1 & 12 \\ -1 & 1 & -1 & -7 & 2 \\ 3 & 0 & 1 & 17 & 3 \end{array}\right] \longrightarrow \left[\begin{array}{ccc|cc} 1 & 4 & -2 & 1 & 12 \\ 0 & 5 & -3 & -6 & 14 \\ 0 & -12 & 7 & 14 & -33 \end{array}\right]$$

(adding row 1 to row 2 and −3 times row 1 to row 3). Adding 2 times row 2 to row 3, then 2 times row 3 to row 2, and then 2 times row 2 to row 3 gives

$$\left[\begin{array}{ccc|cc} 1 & 4 & -2 & 1 & 12 \\ 0 & 1 & -1 & -2 & 4 \\ 0 & 0 & -1 & -2 & 3 \end{array}\right]$$

Gauss-Jordan elimination completes the evaluation of the components of x₁ and x₂: adding −1 times row 3 to row 2 and −2 times row 3 to row 1, then multiplying row 3 by −1 and adding −4 times row 2 to row 1, yields

$$\left[\begin{array}{ccc|cc} 1 & 0 & 0 & 5 & 2 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 2 & -3 \end{array}\right]$$

It follows immediately from this final augmented matrix that

$$X = \begin{bmatrix} 5 & 2 \\ 0 & 1 \\ 2 & -3 \end{bmatrix}$$

as before.

It is easy to verify that the matrix X does indeed satisfy the equation AX = B:

$$AX = \begin{bmatrix} 1 & 4 & -2 \\ -1 & 1 & -1 \\ 3 & 0 & 1 \end{bmatrix}\begin{bmatrix} 5 & 2 \\ 0 & 1 \\ 2 & -3 \end{bmatrix} = \begin{bmatrix} 1 & 12 \\ -7 & 2 \\ 17 & 3 \end{bmatrix} = B$$

Note that the transformation in Solution 1 was [A | I] → [I | A⁻¹], from which A⁻¹B was computed to give X. However, the transformation in Solution 2, [A | B] → [I | X], gave X directly. ■
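Solution 2's idea—handling every right-hand side at once—is exactly how a library solver is used. A sketch assuming NumPy:

```python
import numpy as np

A = np.array([[ 1, 4, -2],
              [-1, 1, -1],
              [ 3, 0,  1]])
B = np.array([[ 1, 12],
              [-7,  2],
              [17,  3]])

# np.linalg.solve accepts a matrix right-hand side, solving for both
# columns of B simultaneously (the idea behind Solution 2).
X = np.linalg.solve(A, B)
print(X)                        # [[ 5.  2.] [ 0.  1.] [ 2. -3.]]
print(np.allclose(A @ X, B))    # True
```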
REAL EUCLIDEAN VECTOR SPACES
The concept of a vector space is of fundamental importance throughout much of mathematics and physics. Although the most general definition of a vector space is not needed in an introduction to linear algebra, the particular type of vector space that will be studied here—the Euclidean vector space—is the one most frequently used in applications of the subject. Since all scalars in this book are real, the resulting structure is called a real Euclidean vector space.

Subspaces of Rⁿ

Consider the collection of vectors

V = {(x, 3x) : x ∈ R}

The endpoints of all such vectors lie on the line y = 3x in the x-y plane. Now, choose any two vectors from V, say, u = (1, 3) and v = (−2, −6). Note that the sum of u and v,

u + v = (−1, −3)

is also a vector in V, because its second component is three times the first. In fact, it can be easily shown that the sum of any two vectors in V will produce a vector that again lies in V. The set V is therefore said to be closed under addition. Next, consider a scalar multiple of u, say,

5u = 5(1, 3) = (5, 15)

It, too, is in V. In fact, every scalar multiple of any vector in V is itself an element of V. The set V is therefore said to be closed under scalar multiplication.
Thus, the elements in V enjoy the following two properties:

(1) Closure under addition: The sum of any two elements in V is an element of V.
(2) Closure under scalar multiplication: Every scalar multiple of an element in V is an element of V.

Any subset of Rⁿ that satisfies these two properties—with the usual operations of addition and scalar multiplication—is called a subspace of Rⁿ or a Euclidean vector space. The set V = {(x, 3x) : x ∈ R} is a Euclidean vector space, a subspace of R².

Example 1: Is the following set a subspace of R²?

A = {(x, 3x + 1) : x ∈ R}

To establish that A is a subspace of R², it must be shown that A is closed under addition and scalar multiplication. If a counterexample to even one of these properties can be found, then the set is not a subspace. In the present case, it is very easy to find such a counterexample. For instance, both u = (1, 4) and v = (2, 7) are in A, but their sum, u + v = (3, 11), is not. In order for a vector v = (v₁, v₂) to be in A, the second component (v₂) must be 1 more than three times the first component (v₁). Since 11 ≠ 3(3) + 1, (3, 11) ∉ A. Therefore, the set A is not closed under addition, so A cannot be a subspace. [You could also show that this particular set is not a subspace of R² by exhibiting a counterexample to closure under scalar multiplication. For example, although u = (1, 4) is in A, the scalar multiple 2u = (2, 8) is not.] ■

Example 2: Is the following set a subspace of R³?

B = {(x, x², x³) : x ∈ R}

In order for a subset of R³ to be a subspace of R³, both closure properties (1) and (2) must be satisfied. However, note that while u = (1, 1, 1) and v = (2, 4, 8) are both in B, their sum, (3, 5, 9), clearly is not. Since B is not closed under addition, B is not a subspace of R³. ■
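Counterexamples of this kind are mechanical to check. A sketch assuming NumPy, testing the membership conditions of Examples 1 and 2 directly:

```python
import numpy as np

# The set A = {(x, 3x + 1)} fails closure under addition:
u = np.array([1, 4])             # 4 = 3(1) + 1, so u is in A
v = np.array([2, 7])             # 7 = 3(2) + 1, so v is in A
s = u + v
print(s, s[1] == 3 * s[0] + 1)   # [ 3 11] False -- the sum leaves A

# The set B = {(x, x^2, x^3)} fails the same test:
u = np.array([1, 1, 1])
v = np.array([2, 4, 8])
s = u + v
print(s, s[1] == s[0] ** 2)      # [3 5 9] False
```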
Example 3: Is the following set a subspace of R⁴?

C = {(x₁, 0, x₃, −5x₁) : x₁, x₃ ∈ R}

For a 4-vector to be in C, exactly two conditions must be satisfied: Namely, its second component must be zero, and its fourth component must be −5 times the first. Choosing particular vectors in C and checking closure under addition and scalar multiplication would lead you to conjecture that C is indeed a subspace. However, no matter how many specific examples you provide showing that the closure properties are satisfied, the fact that C is a subspace is established only when a general proof is given. So let u = (u₁, 0, u₃, −5u₁) and v = (v₁, 0, v₃, −5v₁) be arbitrary vectors in C. Then their sum,

u + v = (u₁ + v₁, 0, u₃ + v₃, −5(u₁ + v₁))

satisfies the conditions for membership in C, verifying closure under addition. Finally, if k is a scalar, then

ku = (ku₁, 0, ku₃, −5(ku₁))

is in C, establishing closure under scalar multiplication. This proves that C is a subspace of R⁴. ■

Example 4: Show that if V is a subspace of Rⁿ, then V must contain the zero vector.

First, choose any vector v in V. Since V is a subspace, it must be closed under scalar multiplication. By selecting 0 as the scalar, the vector 0v, which equals 0, must be in V. [Another method proceeds like this: If v is in V, then the scalar multiple (−1)v = −v must also be in V. But then the sum of these two vectors, v + (−v) = 0, must be in V, since V is closed under addition.]

This result can provide a quick way to conclude that a particular set is not a Euclidean vector space. If the set does not contain the zero vector, then it cannot be a subspace. For example, the set A in Example 1 above could not be a subspace of R² because it does not contain the vector 0 = (0, 0). It is important to realize that containing the zero vector is a necessary condition for a set to be a Euclidean vector space, not a sufficient one. That is, just because a set contains the zero vector does not guarantee that it is a Euclidean vector space (for example, consider the set B in Example 2); the guarantee is that if the set does not contain 0, then it is not a Euclidean vector space. ■

As always, the distinction between vectors and points can be blurred, and sets consisting of points in Rⁿ can be considered for classification as subspaces.
CLIFFS QUICK REVIE W
126
REA L EUCLIDEA N VECTOR SPACE S
Example 5 : Is the following set a subspace of R Z? D = {(x, y) : x 0 and y 0 } As illustrated in Figure 40, this set consists of all points i n the first quadrant, including the points (x, 0) on the x axis wit h x 0 and the points (0, y) on the y axis with y 0:
■ Figure 40 ■ The set D is closed under addition since the sum of nonnegative numbers is nonnegative. That is, if (x l , y i ) and (x2 , y2 ) are in D, then x i, x 2 , y 1 , and y2 are all greater than or equal to 0, s o both sums x i + x2 and y 1 + y2 are greater than or equal to 0 . This implies that ( x1, yl)
+ (x2 , Y2) _ (xl -f x2, Yi + Y2) E D
However, D is not closed under scalar multiplication . If x and y are both positive, then (x, y) is in D, but for any negativ e scalar k, k( x , y) = (kx, ky)
D
LINEAR ALGEBRA 127
REA L EUCLIDEA N VECTOR SPACE S
since kx < 0 (and ky < 0) . Therefore, D is not a subspace o f R2 . ■ Example 6 : Is the following set a subspace of R Z ? E = 1(x, y) : xy 0} As illustrated in Figure 41, this set consists of all points i n the first and third quadrants, including the axes :
■ Figure 41
■
The set E is closed under scalar multiplication, since if k is an y scalar, then k(x, y) = (kx, ky) is in E. The proof of this las t statement follows immediately from the condition for membership in E . A point is in E if the product of its two coordinates is nonnegative. Since k2 0 for any real k,
(x,
y) E E ~ xy 0 ~ k2 x3' =
(lcx)(ky) >_ 0
(k,ky)
E
CLIFFS QUICK REVIEW
128
REA L EUCLIDEA N VECTO R SPACES
However, although E is closed under scalar multiplication, it i s not closed under addition. For example, although u = (4, 1 ) and v = (—2, -6) are both in E, their sum, (2, 5), is not. Thus , E is not a subspace of R2. ■ Example 7 : Does the plane P given by the equation 2x + y – 3z = 0 form a subspace of R 3? One way to characterize P is to solve the given equatio n for y, y=3z—2x
and write P = {(x, 3z — 2x, z) : x, z ER}
If p 1 = (x l, 3zl – 2x 1, z l ) and p2 = (x2 , 3z2 – 2x2 , z2 ) are points in P, then their sum, PI + P2 = (x1
+x2 , 3(z1 +z2 )-2(x 1 +x2), zl +z2 )
is also in P, so P is closed under addition . Furthermore, if p = (x, 3z — 2x, z) is a point in P, then any scalar multiple, kp = (la, 3(kz) — 2(/), kz) is also in P, so P is also closed under scalar multiplication . Therefore, P does indeed form a subspace of R 3 . Note that P contains the origin . By contrast, the plane 2x + y — 3z = 1, although parallel to P, is not a subspace of R3 because it doe s not contain (0, 0, 0) ; recall . Example 4 above . In fact, a plane in R 3 is a subspace of R3 if and only if it contains the origin . ■
LINEAR ALGEBRA
129
REA L EUCLIDEA N VECTO R SPACES
The Nullspace of a Matri x The solution sets of homogeneous linear systems provide a n important source of vector spaces . Let A be an m by n matrix , and consider the homogeneous syste m Ax= 0 Since A is m by n, the set of all vectors x which satisfy thi s equation forms a subset of R . (This subset is nonempty, sinc e it clearly contains the zero vector : x = 0 always satisfies Ax = 0 .) This subset actually forms a subspace of called th e nullspace of the matrix A and denoted N(A) . To prove that N(A) is a subspace of closure under both addition an d scalar multiplication must be established . If x and x 2 are in N(A), then, by definition, A xi = 0 and A x2 = 0 . Adding these equations yields Axe+Ax2 =0 ~ A(x l +x2 )=0 ~ z,+z 2 E N(A) which verifies closure under addition . Next, if x is in N(A) , then Ax = 0, so if k is any scalar, k(Ax) = 0 ~ A(kg) = 0 ~ lcg E N(A) verifying closure under scalar multiplication . Thus, the solution set of a homogeneous linear system forms a vector space . Note carefully that if the system is not homogeneous, then the set of solutions is not a vector space since the set will not contain the zero vector. Example 8 : The plane P in Example 7, given by 2x + y — 3z = 0, was shown to be a subspace of R 3 . Another proof that this defines a subspace of R3 follows from the observation that 2x + y — 3z = 0 is equivalent to the homogeneous syste m
CLIFFS QUICK REVIEW
130
REA L EUCLIDEA N VECTOR SPACE S
where A is the 1 x 3 matrix [2 1 -3] . P is the nullspace o f A. ■ Example 9 : The set of solutions of the homogeneous syste m
1- i 2 0 1 -7] x
]
[0 forms a subspace of Rn for some n . State the value of n and explicitly determine this subspace . Since the coefficient matrix is 2 by 4, x must be a 4 vector. Thus, n = 4 : The nullspace of this matrix is a subspac e of R4 . To determine this subspace, the equation is solved b y first row-reducing the given matrix : [—1 1 2 4 2 0 1 -7
2rl added to r2
(-or, 1)rl
1 -1 -2 0 2
5
-4 1
Therefore, the system is equivalent to 1
-1
-2
0 2 5
LINEAR ALGEBRA
4 x1
0
1] x2 — [ 0]
131
REA L EUCLIDEA N VECTOR SPACE S
that is,
x l — x2 — 2 x3 — 4x4 = 0 2x2 +5x3 +x4 = 0
If you let x3 and x4 be free variables, the second equation directly above implies XZ = - 21 (SX
3 + Xq )
Substituting this result into the other equation determines x 1 : x i — [— +(5x3 + x4 )] — 2x3 — 4x4 = 0 xl
=— 2t (x3 —7z4 )
Therefore, the set of solutions of the given homogeneous system can be written as (x3 — 7x4 )-
--1(5x3 +x4 ) 2
x3
x4 which is a subspace of R 4 . This is the nullspace of the matrix [—1 1 2 4 2 0 1 -7
~
Example 10 : Find the nullspace of the matrix
A= 2 1
L1 2J
CLIFFS QUICK REVIE W
132
REA L EUCLIDEA N VECTO R SPACE S
By definition, the nullspace of A consists of all vectors x such that Ax = 0. Perform the following elementary row operations on A,
[2 1 12
rl Hr2
[1
2
added to r2
21
1
2
0 -3
to conclude that Ax = 0 is equivalent to the simpler system
1 2 xl 0 -3 x 2
0 0
The second row implies that x2 = 0, and back-substituting this into the first row implies that x l = 0 also . Since the only solution of A x = 0 is x = 0, the nullspace of A consists of the zer o vector alone. This subspace, {0}, is called the trivial subspac e ■ (of R 2). Example 11 : Find the nullspace of the matri x B – [–42 -2] To solve Bx = 0, begin by row-reducing B: [2
_ 2 rl added to r2
>
0
0' J
The system Bx = 0 is therefore equivalent to the simpler system
] Co o~ Cx2 ~ = C~
LINEAR ALGEBRA
133
REA L EUCLIDEA N VECTOR SPACE S
Since the bottom row of this coefficient matrix contains onl y zeros, x 2 can be taken as a free variable . The first row then gives 2xl +x2 =0 = xl =- 2 x 2 so any vector of the form
satisfies Bx = O . The collection of all such vectors is the null space of B, a subspace of R2 : N(B) =
4x] x
: x ER} ■
Linear Combinations and the Span of a Collection of Vector s Let v l , v 2, . . ., y r be vectors in R n. A linear combination o f these vectors is any expression of the for m klv l +k2 v2 + . . . + kr vr where the coefficients k 1 , k2 , . . ., kr are scalars. Example 12 : The vector v = (–7, -6) is a linear combinatio n of the vectors v 1 = (–2, 3) and v 2 = (1, 4), since v = 2v 1 – 3v 2 . The zero vector is also a linear combination of v 1 and v 2 , since 0 = Ov 1 + Ov 2 . In fact, it is easy to see that the zero vector in Rn is always a linear combination of any collection of vectors v 1 , v 2 , . . ., y r from R n. ■
CLIFFS QUICK REVIEW 134
REA L EUCL/DEA N VECTO R SPACES
The set of all linear combinations of a collection of vectors v1 , v2, . . ., yr from Rn is called the span of {v,, v2, . . ., vr} . This set, denoted span {v,, v2, . . ., vr} , is always a subspace o f Rn, since it is clearly closed under addition and scalar multiplication (because it contains all linear combinations of v I , v2, . . v,.). If V = span{v,, v2, . . ., yr }, then V is said to be spanned by v l , v2, . . ., vr. Example 13 : The span of the set 1(2, 5, 3), (1, 1, 1) } is th e subspace of R 3 consisting of all linear combinations of th e vectors v 1 = (2, 5, 3) and v2 = (1, 1, 1) . This defines a plane in R3. Since a normal vector to this plane is n = v, x v2 = (2, 1, 3 ), the equation of this plane has the form 2x + y -- 3z = d for some constant d . Since the plane must contain the origin—it's a subspace—d must be 0 . This is the plane in Example 7. ■ Example 14 : The subspace of R 2 spanned by the vectors i = (1, 0) and j = (0, 1) is all of R2, because every vector in R 2 can be written as a linear combination of i and j : span{i, j} = R2
■
Let v l , v 2, . . ., v, -1, yr be vectors in R . If yr is a linear combination of v 1, v2, . . ., then span{v 1, v2 , . . .,
vr_1,
yr} = span{v 1 , v2 , . . .,
v r_1 }
That is, if any one of the vectors in a given collection is a linear combination of the others, then it can be discarded without affecting the span . Therefore, to arrive at the most "efficient "
LINEAR ALGEBRA
135
REA L EUCLIDEA N VECTO R SPACES
spanning set, seek out and eliminate any vectors that depen d on (that is, can be written as a linear combination of) the others. Example 15 : Let v l = (2, 5, 3), v2 7) . Since v3 = 4v1 — 5v 2,
=
(1, 1, 1), and v 3
=
(3, 15 ,
span{vl, v 2, v3 } = span{v 1 , v2 } That is, because v 3 is a linear combination of v 1 and v 2, it ca n be eliminated from the collection without affecting the span . Geometrically, the vector (3, 15, 7) lies in the plane spanne d by v 1 and v2 (see Example 7 above), so adding multiples of v 3 to linear combinations of v l and v2 would yield no vectors off this plane . Note that v 1 is a linear combination of v 2 and v 3 (since v 1 = 5/4v 2 + 1/4v3), and v 2 is a linear combination of v 1 and v3 (since v 2 = 4/5v l — 1/5v3). Therefore, any one of these vectors can be discarded without affecting the span : span{v 1 , v2, v 3 } = span{v 1, v2 } = span{v 2 , v3 } = span{v 1 ,
v3 }
•
Example 16 : Let v l = (2, 5, 3), v 2 = (1, 1, 1), and v 3 = (4, 2, 0). Because there exist no constants k l and k2 such that v3 = k1v 1 + k2 V 2 , V 3 is not a linear combination of v 1 and V 2. There fore, V3 does not lie in the plane spanned by v 1 and v2, as shown in Figure 42 :
CLIFFS QUICK REVIEW
136
REA L EUCLIDEA N VECTO R SPACES
plane =span{vi , v2 }
■ Figure 42 ■ Consequently, the span of v1 , v2, and v3 contains vectors not in the span of v 1 and v2 alone. In fact, span{ v1 , v2 } = (the plane 2x + y -- 3z = 0) span{v i , v 2 , v3 } = all of R3
■
Linear Independenc e Let A = {v l , v 2, . . ., NO be a collection of vectors from R ''. If r >_ 2 and at least one of the vectors in A can be written as a linear combination of the others, then A is said to be linearly dependent . The motivation for this description is simple : At least one of the vectors depends (linearly) on the others . On the other hand, if no vector in A is equal to a linear combination of the others, then A is said to be a linearly independen t set . It is also quite common to say that "the vectors are linearly dependent (or independent)" rather than "the set containing these vectors is linearly dependent (or independent) ." Example 17 : Are the vectors v 1 = (2, 5, 3), v 2 = (1, 1, 1), an d v3 = (4, 2, 0) linearly independent ?
LINEAR ALGEBRA
137
REA L EUCLIDEA N VECTO R SPACES
If none of these vectors can be expressed as a linear combination of the other two, then the vectors are independent ; otherwise, they are dependent . If, for example, v 3 were a linear combination of v 1 and v2 , then there would exist scalars k l and k2 such that k 1 v 1 + k2v2 = v3 . This equation reads kl (2, 5, 3) + k2 (1, 1, 1) = (4, 2, 0) which is equivalent to 2k1 + k2 = 4 5k1 +k2 = -2 Ski + k2 = 0 However, this is an inconsistent system . For instance, subtracting the first equation from the third yields kl = -4, and substituting this value into either the first or third equation give s k2 = 12. However, (k1 , k2 ) = (-4, 12) does not satisfy the second equation . The conclusion is that v 3 is not a linear combination of v 1 and v 2, as stated in Example 16 . A similar argument would show that v 1 is not a linear combination of v 2 and v 3 and that v 2 is not a linear combination of v 1 and v 3 . Thus, these three vectors are indeed linearly independent . ■
An alternative—but entirely equivalent and often simpler—definition of linear independence reads as follows . A collection of vectors v i , v2 , . . ., v,. from R" is linearly independent if the only scalars that satisfy k1 v 1 + k2 v 2 + . . . + krv,. = 0 are k 1 = k2 = • • • = kr = 0. This is called the trivial linear combination . If, on the other hand, there exists a nontrivial linear combination that gives the zero vector, then the vectors ar e dependent .
CLIFFS QUICK REVIEW
138
REA L EUCLIDEA N VECTO R SPACES
Example 18 : Use this second definition to show that the vectors from Example 17 v 1 = (2, 5, 3), v2 = (1, 1, 1), and v 3 = (4, -2, 0)—are linearly independent . These vectors are linearly independent if the only scalar s that satisfy /qv, + k2 v2 + k3 v 3 = 0 (* ) are k, = k2 = k3 = 0. But (*) is equivalent to the homogeneou s syste m
V1
V2
V3
iii
Row-reducing the coefficient matrix yield s 2 2ri added to r2
ri H r2
4
1 -1 -1 0 1 0 _3
,
1 -1 -10 0 3 24 0 4 30
2ri added to r2 added to r3
(—4/3)r2 added to r3
1
~
1 -1 -10 0 3 24 0 0 -2
LINEAR ALGEBRA
139
REA L EUCLIDEA N VECTOR SPACES
This echelon form of the matrix makes it easy to see that k3 = 0, from which follow k2 = 0 and k l = 0 . Thus, equation (**)_ and therefore (*)__is satisfied only by kl = k2 = k3 = 0, whic h ■ proves that the given vectors are linearly independent . Example 19 : Are the vectors v 1 = (4, 1, 2), v2 and v3 = (1, 2, 1) linearly independent ?
=
(—3, 0, 1) ,
The equation k1 V 1 + k2v2 + k3v 3 = 0 is equivalent to th e homogeneous syste m -I
I
VI
V2
V3
kl k2 = 0 (* )
Row-reduction of the coefficient matrix produces a row o f zeros : 4 -3 1 vl
v2 V3
= 1 O -2
ri H r2
>
1 0 -2 4 -3 1
-4rl added to r2 2 rl added to r3
(1/3)r2 added to r3
1 0
-2-
0 -3
9
0 0
0
CLIFFS QUICK REVIEW 140
REA L EUCLIDEA N VECTOR SPACES
Since the general solution will contain a free variable, the homogeneous system (*) has nontrivial solutions . This show s that there exists a nontrivial linear combination of the vector s v 1 , v 2 , and v3 that gives the zero vector: v,, v2 , and v3 are dependent. ■ Example 20 : There is exactly one value of c such that th e vectors v l = (1, 0, 0, 1), v2 v3
=
=
(0, 1, -1, 0) ,
(-1, 0, -1, 0), and v4 = (1, 1, 1, c)
are linearly dependent . Find this value of c and determine a nontrivial linear combination of these vectors that equals the zero vector. As before, consider the homogeneous system iii v, v2
I
V3 V4
kk2
=0
k4
and perform the following elementary row operations on th e coefficient matrix :
LINEAR ALGEBRA 141
REA L EUCLIDEA N VECTO R SPACE S
1
0
-1
1
0
1
0
1
1
0
-1
-1
1
c
0
1
1
1
0
-1
1-
0
1
0
1
0
-1
-1
1
1
0
—r1 added to r4
r2 added to r3 —r2 added to r4
r3 added to r4
>
c— 1
1
0
-1
1
0
1
0
0 0
0 0
-1 1
1 2 c— 2
1 0 0
0 1 0
-1
1-
0 -1
1 2
0
0
0
c
In order to obtain nontrivial solutions, there must be at leas t one row of zeros in this echelon form of the matrix . If c is 0 , this condition is satisfied . Since c = 0, the vector v 4 equals (1 , 1, 1, 0) . Now, to find a nontrivial linear combination of th e vectors v 1, v 2, v3, and v4 that gives the zero vector, a particular nontrivial solution to the matrix equation V1
V2
V3
V4
k2 k3
1 0 -1 1 k, 0 1 0 1 k2 0 -1 -1 1 k 3
k4 _
1
1 0 0_ k4
is needed . From the row operations performed above, thi s equation is equivalent to
CLIFFS QUICK REVIE W 142
REA L EUCLIDEA N VECTO R SPACES
0 1 0
1
0 0 -1 2 0 0 0 0 The last row implies that k4 can be taken as a free variable ; let k4 = t. The third row then says –k3 +2k4 =0 = k3 =2k4 =2t The second row implies k2 + k4 = 0 ~ k2 = –k4 = –t and, finally, the first row give s k,–k3 +k4 =0 ~ k,–2k4 +k4 =0
k,=k4 = t
Thus, the general solution of the homogeneous system (* * ) — and (*)_is (k1 , k2, k3, k4)T = (t, t, 2t, t) T (***) for any t in R . Choosing t = 1, for example, gives (k1 , k2, k3, k4) T = (1, -1, 2, 1)T , so klv1 +k2v2 +k3 V 3 +k4 V 4
= V 1 -V2
+2V 3
+V 4
is a linear combination of the vectors v 1, v2, v3, and v 4 that equals the zero vector. To verify tha t v 1 -V2 +2v 3 +v4 = 0 simply substitute and simplify :
LINEAR ALGEBRA 143
REA L EUCLIDEA N VECTO R SPACE S
vl — v 2 +2v 3 +v 4 = (1, 0, 0, 1)—(0, 1, -1, 1) + 2(—1, 0, -1, 0)+0, 1, 1, 0 )
=(1—0—2+1, 0—1+0+1 , 0+1—2+1, 1—1+0+0) _ (0, 0, 0, 0) =0
3
Infinitely many other nontrivial linear combinations of v l , v 2, v3 , and v4 that equal the zero vector can be found by simpl y choosing any other nonzero value of t in (***) and substituting the resulting values of k l , k2, k3 , and k4 in the expression k l v l + k2 V 2 + k3 V3 + k4v4 . If a collection of vectors from R n contains more than n vectors, the question of its linear independence is easily answered . If C = {v,, v2, . . ., v n } is a collection of vectors fro m Rn and m > n, then C must be linearly dependent . To see why this is so, note that the equation kl v l + k2 v 2 + . . . + km v m = 0 (* ) is equivalent to the matrix equation Vl
V2
.. .
Vm
_kl _ k2
=0
km
Since each vector v./ contains n components, this matrix equation describes a system with m unknowns and n equations . Any homogeneous system with more unknowns than equations has nontrivial solutions (see Theorem B, page 90), a re -
CLIFFS QUICK REVIEW 144
REA L EUCLIDEA N VECTOR SPACES
suit which applies here since m > n . Because equation (*) ha s nontrivial solutions, the vectors in C cannot be independent . Example 21 : The collection of vectors {2i — j, i + j, — i + 4 j } from R Z is linearly dependent because any collection of 3 (o r more) vectors from R 2 must be dependent. Similarly, the collection {i + j — k, 2i — 3j + k, i — 4k, Zj, —Si + j — 3k} of vec tors from R3 cannot be independent, because any collection o f 4 or more vectors from R3 is dependent . ■ Example 22 : Any collection of vectors from R" that contain s the zero vector is automatically dependent, for if {v,, v2, . . ., 0} is such a collection, then for any k ~ 0 , 0v1 +0v 2 +•••+Ov r_1 +k0 is a nontrivial linear combination that gives the zero vector. ■
The Rank of a Matrix The maximum number of linearly independent rows in a matrix A is called the row rank of A, and the maximum number of linearly independent columns in A is called the colum n rank of A . If A is an m by n matrix, that is, if A has m rows and n columns, then it is obvious that row rank ofA m column rank of A 5 n
(*)
What is not so obvious, however, is that for any matrix A , the row rank of A = the column rank of A
LINEAR ALGEBRA 145
REA L EUCLIDEA N VECTOR SPACES
Because of this fact, there is no reason to distinguish betwee n row rank and column rank ; the common value is simply called the rank of the matrix . Therefore, if A is m x n, it follows from the inequalities in (*) that rank(A m X,, )
min (m, n)
(* *)
where min(m, n) denotes the smaller of the two numbers m and n (or their common value if m = n). For example, the rank of a 3 x 5 matrix can be no more than 3, and the rank o f a 4 x 2 matrix can be no more than 2 . A 3 x 5 matrix , -* * * * * -
can be thought of as composed of three 5-vectors (the rows ) or five 3-vectors (the columns) . Although three 5-vectors could be linearly independent, it is not possible to have five 3 vectors that are independent . Any collection of more than three 3-vectors is automatically dependent. Thus, the colum n rank and therefore the rank—of such a matrix can be n o greater than 3 . So, if A is a 3 x 5 matrix, this argument shows that rank(A3 x s) S 3 =min (3, 5 ) in accord with (**) . The process by which the rank of a matrix is determine d can be illustrated by the following example . Suppose A is the 4 x 4 matrix
CLIFFS QUICK REVIEW 146
REA L EUCLIDEA N VECTO R SPACES
1 -2 0
4-
3
1
0
-1 -5 -1
8
1
3 8 2 -1 2
The four row vectors, r1 = (1, — 2, 0, 4 ) r2 =(3,1,1,0 ) r3 = (—1, — 5, -1, 8)
r4 =(3,8,2,—12 ) are not independent, since, for example , r3 = 2 r1 — r2
and r4 = -3 r1 + 2 r2
The fact that the vectors r3 and r 4 can be written as linear combinations of the other two (r1 and r2 , which are independent) means that the maximum number of independent rows i s 2 . Thus, the row rank and therefore the rank—of this matri x is 2 . The equations in (***) can be rewritten as follows : -2r1 +r2 +r3 =0 and 3r1 —2r2 +r4 = 0 The first equation here implies that if -2 times the first row i s added to the third and then the second row is added to th e (new) third row, the third row will be become 0, a row of zeros. The second equation above says that similar operation s performed on the fourth row can produce a row of zeros ther e also. If after these operations are completed, -3 times the first row is then added to the second row (to clear out all entries below the entry a ll = 1 in the first column), these elementar y
LINEAR ALGEBRA
147
REA L EUCLIDEA N VECTO R SPACES
row operations reduce the original matrix A to the echelo n form 1 0 0 0
-2 0 7 1 0 0
0 0
4-1 2 0 0
The fact that there are exactly 2 nonzero rows in the reduce d form of the matrix indicates that the maximum number o f linearly independent rows is 2; hence, rank A = 2, in agreement with the conclusion above . In general, then, to compute the rank of a matrix, perform elementary row operations unti l the matrix is left in echelon form; the number of nonzero row s remaining in the reduced matrix is the rank . [Note: Since column rank = row rank, only two of the four columns in A—c 1, c2 , c3 , and c 4 are linearly independent . Show that this is in deed the case by verifying the relations c2 = -2cl + 7c3 and c4 = 4c 1 — 12c3 (and checking that c l and c 3 are independent) . The reduce d form of A makes these relations especially easy to see .] Example 23 : Find the rank of the matrix -2 -1 31 1 0 B=
0 2 -1 1 1 4
CLIFFS QUICK REVIE W 148
REA L EUCLIDEA N VECTOR SPACE S
First, because the matrix is 4 x 3, its rank can be no greater than 3 . Therefore, at least one of the four rows will become a row of zeros. Perform the following row operations : 2
-1
3-
1
0
1
r1 H
r2
1 0 2 -1
13
0 2 -1
0
2 -1
1
4
1
1
- 2r, added to r2 —r1 added to r4
1 0 0 -1
1
2r2 added to r3 r2 added to r4
-4r3 added to r4
11
0
2 -1
0
1
3
1 0 10 -1 1 0 0 0 0
(—1)r2
4
1 4
10 10 1 -1 00 00
1 0
Since there are 3 nonzero rows remaining in this echelon for m of B, rank B = 3
■
LINEAR ALGEBRA 149
REA L EUCLIDEA N VECTOR SPACE S
Example 24 : Determine the rank of the 4 by 4 checkerboard matri x
C=
1
-1
1
-1 -
-1
1
-1
1
1
-1
1
-1
-1
1
-1
1
Since r2 = r4 = -r 1 and r3 = r1, all rows but the first vanish upon row-reduction : -1
1 -1
1
rl added to r2 —rl added to r3 r1 added to r4
1 -1 1 -1 -1 1 -1 1
1 -1 1 -1 ^ 0 0 0 0 0 0 0 0 0 0
Since only 1 nonzero row remains, rank C = 1 .
0 0
■
A Basis for a Vector Space Let V be a subspace of R n for some n. A collection B = {v l , v2, . . ., vr} of vectors from V is said to be a basis for V if B is linearly independent and spans V. If either one of these criteria is not satisfied, then the collection is not a basis for V. If a collection of vectors spans V, then it contains enough vector s so that every vector in V can be written as a linear combinatio n of those in the collection . If the collection is linearly independent, then it doesn't contain so many vectors that some become dependent on the others . Intuitively, then, a basis has just the right size : It's big enough to span the space but not s o big as to be dependent .
150
CLIFFS QUICK REVIEW
REA L EUCLIDEA N VECTOR SPACE S
Example 25 : The collection {i, j} is a basis for R2, since it spans R2 (Example 14) and the vectors i and j are linearly independent (because neither is a multiple of the other) . This is called the standard basis for R 2. Similarly, the set {i, j, k} i s called the standard basis for R3 , and, in general, {e l = (1, 0, 0, . . ., 0), e 2 = (0, 1, 0, . . ., 0), . . e,, _ (0, 0, . . ., 0, 1) } is the standard basis for W .
■
Example 26 : The collection {i, i + j, 2 ,0 is not a basis for R 2. Although it spans R2, it is not linearly independent . No collection of 3 or more vectors from R 2 can be independent. ■ Example 27 : The collection 1i+ j, j + k} is not a basis for W . Although it is linearly independent, it does not span all of W . For example, there exists no linear combination of i + j and j + k that equals i + j + k . ■ Example 28 : The collection {i+ j, i— j} is a basis for R 2. First, it is linearly independent, since neither i + j nor i — j is a multiple of the other . Second, it spans all of R 2 because every vector in R2 can be expressed as a linear combination of i + j and i — j . Specifically, if ai + bj is any vector in R2, the n k~(i+ j)+k Z (i— j) =
if kl = 2 (a + b) and k2 = i (a — b) .
ai+bj
■
LINEAR ALGEBRA
151
REA L EUCLIDEA N VECTOR SPACES
Examples 25 and 28 showed that a space may have man y different bases . For example, both {i, al} and { i+ j, i — j } are bases for R 2. In fact, any collection containing exactly two linearly independent vectors from R2 is a basis for R 2 . Similarly, any collection containing exactly three linearly independent vectors from R3 is a basis for R3, and so on . Although no nontrivial subspace of Rn has a unique basis, there is something that all bases for a given space must have in common . Let V be a subspace of Rn for some n. If V has a basi s containing exactly r vectors, then every basis for V contain s exactly r vectors . That is, the choice of basis vectors for a given space is not unique, but the number of basis vectors is unique. This fact permits the following notion to be well de fined: The number of vectors in a basis for a vector spac e V Rn is called the dimension of V, denoted dim V. Example 29 : Since the standard basis for R 2, { i, j }, contain s exactly 2 vectors, every basis for R2 contains exactly 2 vectors, so dim R2 = 2 . Similarly, since { i, j, k } is a basis for R 3 that contains exactly 3 vectors, every basis for R 3 contains exactly 3 vectors, so dim R 3 = 3 . In general, dim Rn = n for every natural number n . ■ Example 30 : In R3, the vectors i and k span a subspace of dimension 2 . It is the x-z plane, as shown in Figure 43 .
CLIFFS QUICK REVIEW
152
REA L EUCLIDEA N VECTO R SPACES
/ ~z plane
~r
y
■ Figure 43
■
Example 31 : The one-element collection O. + j = (1, 1)} is a basis for the 1-dimensional subspace V of R2 consisting of th e line y = x . See Figure 44 .
■ Figure 44
■
LINEAR ALGEBRA
153
REA L EUCLIDEA N VECTOR SPACE S
Example 32 : The trivial subspace, {0} , of R n is said to hav e dimension 0 . To be consistent with the definition of dimension, then, a basis for {0} must be a collection containing zer o elements; this is the empty set, 0 . ■
The subspaces of R ', R2, and R 3 , some of which have bee n illustrated in the preceding examples, can be summarized as follows : subspaces of R '
subspaces of R2
subspaces of R3
dim = 0:
{0}
{0}
{0}
dim = 1 :
R'
lines through the origin
lines through the origin
R2
planes through the origin
dim = 2: dim = 3 :
R3
Example 33 : Find the dimension of the subspace V of R 4 spanned by the vector s vl = (1, -2, 0, 4) v2 = (3, 1, 1, 0) v3 = (—1, — 5, -1, 8 ) v4 = (3, 8, 2, -12 ) The collection {v l, v 2, v3, v4} is not a basis for V —and di m V is not 4—because {v,, v2, v3, v4} is not linearly independent ; see the calculation preceding Example 23 above . Discarding v3 and v4 from this collection does not diminish the span o f
154
CLIFFS QUICK REVIEW
REA L EUCLIDEA N VECTO R SPACES
{v,, v2 , v3 , v4 }, but the resulting collection, {v,, v 2}, is linearl y independent . Thus, {v,, v2} is a basis for V, so dim V = 2 . ■
Example 34 : Find the dimension of the span of the vector s w,=(1,2,3,4,5), w2 =(-2, 1, 3, 5,--4), and w 3 = (—1,8,3,2 ;7) Since these vectors are in R 5, their span, S, is a subspace o f R5. It is not, however, a 3-dimensional subspace of R 5, since the three vectors w 1, w2, and w3 are not linearly independent . In fact, since w3 = 3w1 + 2w2, the vector w3 can be discarde d from the collection without diminishing the span . Since the vectors w 1 and w 2 are independent—neither is a scalar multiple of the other—the collection { w 1 , w2 } serves as a basis for S, so its dimension is 2 . ■
The most important attribute of a basis is the ability t o write every vector in the space in a unique way in terms of th e basis vectors . To see why this is so, let B = { v 1 , v2 , . . ., V r } be a basis for a vector space V. Since a basis must span V, every vector v in V can be written in at least one way as a linea r combination of the vectors in B . That is, there exist scalars kl, k2, . . ., k,. such that k1V1+k2v2+ . . .+krVr =V
(* )
To show that no other choice of scalar multiples could give v , assume that (** ) k1V1 +k2v2 + . . .+krVr = v is also a linear combination of the basis vectors that equals v .
LINEAR ALGEBRA
155
REA L EUCLIDEA N VECTO R SPACES
Subtracting (*) from (**) yields (k1'—k1 )v1 +(kZ —kZ )vZ +•••+(kr —kt )V r = O This expression is a linear combination of the basis vectors that gives the zero vector. Since the basis vectors must be linearly independent, each of the scalars in (***) must be zero : k; —k1 =0, k2 —k2 =0, . . ., kr — kr = 0 Therefore, lc; = kl, k2 = k2, . . ., and kr = kr, so the representation in (*) is indeed unique . When v is written as the linea r combination (*) of the basis vectors v 1, v2, . . ., vr, the uniquely determined scalar coefficients k 1, k2, . . ., kr are called the components of v relative to the basis B. The row vector (k1, k2, . . k,.) is called the component vector of v relative to B and is denoted (v) B. Sometimes, it is convenient to write th e component vector as a column vector; in this case, the compo nent vector (k 1 , k2, . . ., kr)T is denoted [v] B. Example 35 : Consider the collection C = {i, i + j, 2j) of vec tors in R 2. Note that the vector v = 3i + 4j can be written as a linear combination of the vectors in C as follows : 3i + 4j = 1(i) + 2(i + j) + 1(2j ) and 3i + 4j = 3(i) + 0(i + j) + 2(2j ) The fact that there is more than one way to express the vecto r v in R2 as a linear combination of the vectors in C provide s another indication (besides the simpler one stated in Exampl e 26) that C cannot be a basis for R 2. If C were a basis, the vector v could be written as a linear combination of the vectors i n C in one and only one way . ■
CLIFFS QUICK REVIE W
156
REA L EUCLIDEA N VECTO R SPACES
Example 36 : Consider the basis B = {i+ j, 2i — j} of R 2. Determine the components of the vector v = 2i — 7j relative to B . The components of v relative to B are the scalar coefficients kl and k2 which satisfy the equatio n kl (i + j) + k2(2i — j) = 2i — 7j This equation is equivalent to the syste m k, + 2k2 = 2 — k2 = - 7 The solution to this system is k l = -4 and k2 = 3, s o (v)B = (-4, 3) II
Example 37 : Relative to the standard basis {i, j, k} = (e l , e2 , e 3 } for R3 , the component vector of any vector v i n R3 is equal to v itself: (v)B = v . This same result holds for th e ■ standard basis { e l , e 2 , . . ., e n } for every Rn. Orthonormal bases . If B = {v 1, v2 , . . ., vn } is a basis for a vector space V, then every vector v in V can be written as a linear combination of the basis vectors in one. and only one way : /,v, +k2v2 +•••+kn v n = v Finding the components of v relative to the basis B —the scalar coefficients k1, k2, . . ., kn in the representation abovegenerally involves solving a system of equations, as in Example 36 above . However, if the basis vectors are orthonormal , that is, mutually orthogonal unit vectors, then the calculatio n
LINEAR ALGEBRA
157
REA L EUCLIDEA N VECTO R SPACES
of the components is especially easy. Here's why . Assume that B = { v l , v2 , . • •, in } is an orthonormal basis . Starting with the equation above—with vl , v2 , . . ., i n replacing v 1 , v2 , . . ., v n to emphasize that the basis vectors are now assumed to be uni t vectors—take the dot product of both sides with vl : (kivl + k2v2 + .+knin) .il = v . vl By the linearity of the dot product, the left-hand side becomes kl(Vl . Vl)+k2(v2 .Vl)+ . . .+kn(vn . Vl) = v . vl Now, by the orthogonality of the basis vectors, vi • vl = 0 for i = 2 through n . Furthermore, because vl is a unit vector, vi • v, = 0 1 11 2 = 12 =1 . Therefore, the equation above simplifies to the statement kl = v .vl In general, if B = { v l , v2 , . . ., i n ) is an orthonormal basis fo r a vector space V, then the components, kl, of any vector v relative to B are found from the simple formul a kl = component i of (v )B
=v.
Example 38 : Consider the vectors v1 (—2 , 2 , 1 ) , v2 = (1, -1, 4), and v 3
=
(1, 1, 0)
from R 3 . These vectors are mutually orthogonal, as you may easily verify by checking that v l . v2 = vl . v3 = v2 • v3 = O . Normalize these vectors, thereby obtaining an orthonorma l basis for R3 and then find the components of the vector v = (1, 2, 3) relative to this basis .
158
CLIFFS QUICK REVIEW
REA L EUCLIDEA N VECTOR SPACES
A nonzero vector is normalized—made into a unit vector—by dividing it by its length . Therefore , Vi -
yl
(-2, 2, 1)
1lv 1 ii
3
v2 " 2- 1 1211 v3 11311
- 32' 23 ' 31 lI
(1, -1, 4 ),( 1 Vig
– 1 3~ ' 3~ 3~ 3~ '
(1,1,O)( -If -
4
~~
Since B = 0'7 0 v2 , v 3 } is an orthonormal basis for R3, the result stated above guarantees that the components of v relativ e to B are found by simply taking the following dot products : = v . il = (1, 2, 3) . (--i, 19
=i
1 _ 1 4 ) k2 = V • V 2 = (1, 2, 3) .( 3,a ' 3,a ' 3~
11 3~
k3 =vv3=(1, 2, 3) - ( 1 , *, 0 – Therefore, (v) B = (5 / 3, 11/(31i), 3/-Ti), which means that the unique representation of v as a linear combination of the basi s vectors reads v = 5 / 3v i + 11 /(31i)v 2 + 3 / ,/i- v3, as you may verify . ■ Example 39 : Prove that a set of mutually orthogonal, nonzer o vectors is linearly independent.
Proof Let { v 1, v 2 , . . ., yr } be a set of nonzero vector s from some R" which are mutually orthogonal, which mean s that no vi = 0 and v i . v1 = 0 for i ~ j . Let /1 V, + k2v2 + . . . + kr v r = 0 (*)
LINEAR ALGEBRA
159
REA L EUCLIDEA N VECTO R SPACES
be a linear combination of the vectors in this set that gives the zero vector. The goal is to show that kl = k2 = • • • = kr = 0. T o this end, take the dot product of both sides of the equatio n with v l : (k1v1 +k2v2 +•••-FkrVr)•Vl = 0•V l k,(v, •v,)+kZ(v2 •v,)+•••+kn(Vr•v,)= 0
k1 Ilv l II2 +k2(0)+•••+k,(0)= 0 =0 The second equation follows from the first by the linearity o f the dot product, the third equation follows from the second b y the orthogonality of the vectors, and the final equation is a consequence of the fact that 11 v 111 2 ~ 0 (since v 1 0). It is now easy to see that taking the dot product of both sides of (* ) with vi yields kl = 0, establishing that every scalar coefficient in (*) must be zero, thus confirming that the vectors v l, v2, . . ., yr are indeed independent . ■
Projection onto a Subspac e Let S be a nontrivial subspace of a vector space V and assume that v is a vector in V that does not lie in S. Then the vector v can be uniquely written as a sum, vlls + vls, where vlls is parallel to S and v ls is orthogonal to S; see Figure 45 .
160
CLIFFS QUICK REVIE W
REA L EUCLIDEA N VECTO R SPACE S
■ Figure 45
■
The vector vlls, which actually lies in S, is called the projectio n of v onto S, also denoted projsv . If v l , v 2 , . . ., y r form an orthogonal basis for S, then the projection of v onto S is the sum of the projections of v onto the individual basis vectors, a fact that depends critically on the basis vectors being orthogonal: projsv = proL l v + proj, 2 v + • • • + proj, r v v•vl vl ' Vl
V1
+
v .v2 V2 . V 2
v2 + . . .+
V•vr
yr
(* )
Vr • yr
Figure 46 shows geometrically why this formula is true in th e case of a 2-dimensional subspace S in R 3 .
LINEAR ALGEBRA
161
REA L EUCLIDEA N VECTOR SPACES
■ Figure 46 ■
Example 40 : Let S be the 2-dimensional subspace of R 3 spanned by the orthogonal vectors v1 = (1, 2, 1) and v2 = (1 , -1, 1) . Write the vector v = (–2, 2, 2) as the sum of a vector i n S and a vector orthogonal to S. From (*), the projection of v onto S is the vecto r projsv = proLl v + proj, 2 v vi •vl
vl
vZ •v2
s
(–2)(1)+(2)(2)+(2)(1) ) (1, 2, 1 (1)(1) + (2)(2) + (1)(1 ) (—2)(l)+(2)(—l)+(2)(l) (I, _1, I ) (1)(1) + (—1)(—1) + (1)(1 ) = -t(1, 2, 1)—0, -1, 1 ) = (0, 2, 0 ) Therefore, v = vll s + vls, where vll s = projsv = (0, 2, 0) and
CLIFFS QUICK REVIEW 162
REA L EUCLIDEA N VECTO R SPACE S
vl s = v — v~~ s _ (2 , 2 , 2) — (0 , 2 , 0 ) _ (—2 , 0, 2 )
That vls = (—2, 0, 2) truly is orthogonal to S is proved b y noting that it is orthogonal to both v 1 and v2: vls •v 1 = (—2, 0, 2) - (1, 2, 1) = 0 vls - v 2 = (—2, 0, 2)•(1, -1, 1) = 0 In summary, then, the unique representation of the vector v a s the sum of a vector in S and a vector orthogonal to S reads a s follows : (—2, 2, 2) = (0, 2, 0)+(—2, 0, 2) V
Vls
VIIS
See Figure 47 .
■ Figure 47
■
LINEAR ALGEBRA 163
REA L EUCLIDEA N VECTO R SPACES
Example 41 : Let S be a subspace of a Euclidean vector spac e V. The collection of all vectors in V that are orthogonal to every vector in S is called the orthogonal complement of S: Sl = {v
E
V : v1 s for every s e S}
_ {v
E
V : v• s= 0 for every s e S}
(S1 is read "S perp.") Show that Sl is also a subspace of V. Proof First, note that S I is nonempty, since 0 E S1 . In order to prove that S1 is a subspace, closure under vector addition and scalar multiplication must be established . Let v , and v2 be vectors in S I ; since v l • s = v 2 • s = 0 for every vecto r s in S, (vl + v2 ) . s = vl s + v2 s = 0 for every s e S proving that vl + v2 E Sl . Therefore, Sl is closed under vecto r addition . Finally, if k is a scalar, then for any v in S 1 , (kv) • s = k(v • s) = k(0) = 0 for every vector s in S, which shows that Sl is also closed under scalar multiplication . This completes the proof. ■ Example 42 : Find the orthogonal complement of the x -y plane in R3. At first glance, it might seem that the x-z plane is the orthogonal complement of the x-y plane, just as a wall is perpendicular to the floor . However, not every vector in the x- z plane is orthogonal to every vector in the x-y plane: for example, the vector v = (1, 0, 1) in the x-z plane is not orthogonal
164
CLIFFS QUICK REVIEW
REA L EUCLIDEA N VECTO R SPACE S
to the vector w = (1, 1, 0) in the x-y plane, since v • w = 1 0 . See Figure 48 . The vectors that are orthogonal to every vecto r in the x-y plane are only those along the z axis ; this is the orthogonal complement in R 3 of the x -y plane . In fact, it can b e shown that if S is a k-dimensional subspace of R n , then dim S l = n —k ; thus, dim S + dim Sl = n, the dimension of the entire space . Since the x-y plane is a 2-dimensional subspac e of R3 , its orthogonal complement in R 3 must have dimension 3 — 2 = 1 . This result would remove the x-z plane, which is 2 dimensional, from consideration as the orthogonal complement of the x -y plane .
Sl = z axis
x-z plane
S = xy plane
x.
■ Figure 48
■
LINEAR ALGEBRA
165
REA L EUCLIDEA N VECTOR SPACE S
Example 43 : Let P be the subspace of R 3 specified by the equation 2x + y — 2z = 0 . Find the distance between P and the point q = (3, 2, 1) . The subspace P is clearly a plane in R 3, and q is a poin t that does not lie in P . From Figure 49, it is clear that the distance from q to P is the length of the component of q orthogonal to P . distance fro m q to P = IIg 1 4 = comp o
q
,,,j, q1 P
■ Figure 49 ■
One way to find the orthogonal component cli p is to find an orthogonal basis for P, use these vectors to project the vector q onto P, and then form the difference q — proj pq to obtai n q1 p . A simpler method here is to project q onto a vector tha t is known to be orthogonal to P. Since the coefficients of x, y, and z in the equation of the plane provide the components o f a normal vector to P, n = (2, 1, -2) is orthogonal to P. Now, since comp,, q -
q n II I'
(3)(2) + (2)(l) + (1)(—2 ) —2 V2 2 + 1 2 +(_2)2
the distance between P and the point q is 2 .
■
CLIFFS QUICK REVIE W 166
REA L EUCLIDEA N VECTO R SPACES
The Gram-Schmidt orthogonalization algorithm . The advantage of an orthonormal basis is clear. As Example 38 showed , the components of a vector relative to an orthonormal basi s are very easy to determine : A simple dot product calculation i s all that is required. The question is, how do you obtain such a basis? In particular, if B is a basis for a vector space V, how can you transform B into an orthonormal basis for V? The process of projecting a vector v onto a subspace S--the n forming the difference v — p ro j sv to obtain a vector, v 1s, orthogonal to S—is the key to the algorithm .
2
Example 44 : Transform the basis B = {v, = (4, 2), v2 for R into an orthonormal one .
=
(1, 2) }
The first step is to keep v i ; it will be normalized later . Th e second step is to project v 2 onto the subspace spanned by v , and then form the difference v2 — proj,, I v2 = v l, . Sinc e (1)(4)+(2)(2) (4 pro~v~ ~2 = v 2 ' vl l ~~ = (4)(4)+ (2)(2) , 2) _ (1, 5vi the vector component of v2 orthogonal to v~ is vl , =vZ —proL ► vz = (1, 2)—(s'
0 — ( —s> s )
as illustrated in Figure 50 .
LINEAR ALGEBRA
167
REA L EUCLIDEA N VECTO R SPACES
VZ
~
x'11
vl
■ Figure 50 ■
l
The vectors v, and v , are now normalized: A
v
V
v
l
= - =
ll
Dv
`'ll
lviii
(4, 2)
-(-)-
V2
(2
' 15-
1
i ,
5 V3'
l 2 B' - Iii = (*, *), V
Thus, the basis B = { v = (4, 2), v into the orthonormal basis
=
.a ' 1
2
(1, 2) } is transforme d
11 =(—*,
*) 1
shown in Figure 51 .
CLIFFS QUICK REVIE W
168
REA L EUCLIDEA N VECTOR SPACES
V1 1
■ Figure 51
■
The preceding example illustrates the Gram-Schmidt orthogonalization algorithm for a basis B consisting of two vectors . It is important to understand that this process not only produces an orthogonal basis B' for the space, but also preserves the subspaces . That is, the subspace spanned by the first vector in B' is the same as the subspace spanned by the firs t vector in B, and the space spanned by the two vectors in B' is the same as the subspace spanned by the two vectors in B. In general, the Gram-Schmidt orthogonalization algorithm, which transforms a basis, B = {v,, v2, . . ., Vr }, for a vector space V into an orthogonal basis, B' = {w,, w2, . . ., wr }, for V —while preserving the subspaces along the way—proceed s as follows : Step 1 . Set w, equal to v , Step 2. Project v 2 onto S,, the space spanned by w 1; then, for m the difference v2 — projs 1v2 This is w 2.
LINEAR ALGEBRA
169
REA L EUCLIDEA N VECTOR SPACE S
Step 3 . Project v 3 onto S2, the space spanned by w 1 and w2 ; then, form the difference v 3 – projs5v3 . This is w3 . Step i. Project vi onto St_l , the space spanned by w 1, . . ., w i_1 ; then, form the difference v i – projst_1 v i . This is w i.
This process continues until Step r, when wr is formed, and th e orthogonal basis is complete . If an orthonormal basis is de sired, normalize each of the vectors w i . Example 45 : Let H be the 3-dimensional subspace of R 4 with basis B= {v 1
=
(0, 1, -1, 0), v 2
=
(0, 1, 0, 1), v 3
= (1,
-1, 0, 0)}
Find an orthogonal basis for H and then-by normalizin g these vectors an orthonormal basis for H. What are the components of the vector x = (1, 1, -1, 1) relative to this orthonormal basis? What happens if you attempt to find th e components of the vector y = (1, 1, 1, 1) relative to the orthonormal basis? The first step is to set w, equal to v l . The second step is t o project v 2 onto the subspace spanned by w l and then form the difference v 2 – proj w1 v 2 = w 2 . Sinc e A rolw, v2 = vZ W ' w' l w l 'w , (0)(0) + (1)(1) + (0)(–1) + (1)(0) (0, 1, (0)(0) + (1)(1) + (–1)(–1) + (0)(0) ,
1, 0 )
0)
CLIFFS QUICK REVIE W 170
REA L EUCLIDEA N VECTO R SPACE S
the vector component of v2 orthogonal to w 1 is w 2 = v 2 — pro)w l v 2 =(0,1,0,1)—(0,
-2, 0
=(0, 1, 21 ~ 1 ) Now, for the last step : Project v3 onto the subspace S2 spanned by w, and w2 (which is the same as the subspace spanned b y v, and v 2 ) and form the difference v3 — projS2v3 to give th e vector, w3, orthogonal to this subspace . Since Pro lw, v3 — V3 W~ w I W I .W l (1)(0)+(—1)(1)+(0)(—1)+(0)(0) (0, 1, -1, 0) (0)(0) + (1)(1) + (—1)(—1) + (0)(0 ) Zs 4-,
s
0)
and V3
prOlw2
W2 w W2 .W 2 2 (1)(0) + (- I )(+) +MI) +(0)(1) (0)M +(I)(I)+(+)(+)+ (DM (0' 1' 1' 1) _ ll
and {w 1 , w2} is an orthogonal basis for SZ, the projection of v3 onto SZ is proj S2 v3
=
proj w, v3 + proj R,z V3 ( 0,
2
,
2,
°) + (o'
6,
-1)
LINEAR ALGEBRA 171
REA L EUCLIDEA N VECTOR SPACES
This gives W 3 = V3 — projs2 v 3 1
= (1, -1, 0, o)—(o , - 1,
-
1,
-
1 3) ,
Therefore, the Gram-Schmidt process produces from B th e following orthogonal basis for H : B' {w 1
(0, 1,
1, 0), w 2
(0, l 2,
1,
1), w3
1,
-1,
3,
You may verify that these vectors are indeed orthogonal b y checking that w i ' w2 = w, ' w 3 = w2 ' w3 = 0 and that the sub spaces are preserved along the way: span{w, } =span{v l } span{w l , w 2 } = span{v„ v 2 } span{w l , w 2 , w3 } = span{v„ v 2 , v3 } An orthonormal basis for H is obtained by normalizing the vectors w 1 , w2 , and w3 : wl
wl
w
W2 =
I)
=
(0' 2' 2'1) — (0, 1 ,-
1 72
2 ~6
2II
W3 _ (1 W3
= 0,
1
1
1
—
7
,
l
W2 W
(0, 1, -1, 0)
l w3ll ^
9
3,
39 3
12
1 'V1' o
3 2,5 '
'V 0
'
2 'V o
1 1 1 2,5 ' 24P 2,5
CLIFFS QUICK REVIEW
I)}
REA L EUCLIDEA N VECTO R SPACES
Relative to the orthonormal basis B" = { w 1 , w 2 , w 3 } , the vec tor x = (1, 1, -1, 1) has components x•w 1 = (1,1, -1,1)•0,
,0 = 'If
1 , — 1
1 1 2 _ 2 x•w2 =(1,1, — l,l•0, ., -, 7 - 7 ) 2 ~, 3 1 1 , 1_ 2 ~, — 2 X • W 3 -- (1, 1, -1, 1
These calculations imply that
X=If W1+
2 W 2 + 2 W3
a result that is easily verified . If the components of y = (1, 1, 1, 1) relative to this basi s are desired, you might proceed exactly as above, findin g y•w 1 = (1, 1, 1, 1)• 0,, — *, 0 = 0 1 1 Y' W2 — (1, 1, 1, 1) .(o, *, *
w3
2
4
,~
3 1 ' 1 (1, 1 1 1)• 2~' — 2 , , — 2,13 , 2~ —
These calculations seen to imply that y=ow l + * w 2 + - W 3 = * "2 +
W3
The problem, however, is that this equation is not true, as th e following calculation shows :
LINEAR ALGEBRA 173
REA L EUCLIDEA N VECTO R SPACE S
4
w+ 2 1 W3
0 16* '
= 4
+
1
1
1fl2 3 ' 4
2
-a 2,5' 24P 2,5
=(o, l 3 ' 3 ' 1)+(1, _ 1 21
1
1
29 29
_ 1 _ 1
6'
1
6' 6
3) 2
y
What went wrong? The problem is that the vector y is not in H, so no linear combination of the vectors in any basis for H ca n give y . The linear combination
1 4 ^ W2+,f"S w3
gives only the projection of y onto H .
■
Example 46 : If the rows of a matrix form an orthonorma l basis for then the matrix is said to be orthogonal . (The term orthonormal would have been better, but the terminology is now too well established.) If A is an orthogonal matrix, show that A -1 = A T . Let B = {it , v 2, . . ., 'in ) be an orthonormal basis for R " and consider the matrix A whose rows are these basis vectors :
A= v,,
The matrix A T has these basis vectors as its columns :
174
CLIFFS QUICK REVIEW
REA L EUCLIDEA N VECTO R SPACE S
A T = V~ V 2
y
...
Vn
y
Since the vectors v 1 , i29 . . ., in are orthonormal ,
10 if i j 1 ifi= j Now, because the (i, j) entry of the product AAT is the dot product of row i in A and column j in AT, AA T = [vl . fl . [Sl; ] = I Thus, A-' = A T. [In fact, the statement A-1 = AT is sometimes taken as the definition of an orthogonal matrix (from which i t is then shown that the rows of A form an orthonormal basi s for R").] An additional fact now follows easily. Assume that A is orthogonal, so A -1 = A T. Taking the inverse of both sides o f this equation give s (AT)T=(AT)_ 1 (A-')-' = (AT)-i ~ A= ( A r)-1 which implies that A T is orthogonal (because its transpos e equals its inverse) . The conclusion A orthogonal ~ A T orthogonal means that if the rows of a matrix form an orthonormal basi s for R ", then so do the columns . ■
LINEAR ALGEBRA
175
REA L EUCLIDEA N VECTO R SPACES
The Row Space and Column Space of a Matri x Let A be an m by n matrix . The space spanned by the rows o f A is called the row space of A, denoted RS(A) ; it is a subspace of Rn. The space spanned by the columns of A is called th e column space of A, denoted CS(A); it is a subspace of Rm. The collection {r,, r2, . . ., r,„} consisting of the rows of A may not form a basis for RS(A), because the collection ma y not be linearly independent . However, a maximal linearly in dependent subset of {r,, r2, . . ., r,,,} does give a basis for the row space. Since the maximum number of linearly independent rows of A is equal to the rank of A , dim RS(A) = rank A (* ) Similarly, if c o c2, . . ., en denote the columns of A, then a maximal linearly independent subset of {c o c2, . . ., cn} gives a basis for the column space of A . But the maximum number o f linearly independent columns is also equal to the rank of th e matrix, so dim CS(A) = rank A (** ) Therefore, although RS(A) is a subspace of R" and CS(A) is a subspace of R'", equations (*) and (**) imply tha t dim RS(A)
= dim CS(A )
even if m ~ n .
CLIFFS QUICK REVIE W
176
REA L EUCLIDEA N VECTOR SPACE S
Example 47 : Determine the dimension of, and a basis for, th e row space of the matrix 2 -1 B=
0
3 1
0 2
-1
1 1
1
4
In Example 23 above, a sequence of elementary row operations reduced this matrix to the echelon matri x 1
0
1-
0
1 -1
0 0
1
0 0
0
The rank of B is 3, so dim RS(B) = 3 . A basis for RS(B) consists of the nonzero rows in the reduced matrix : 1(1, o, I), (o, 1,—1), (o, o , Another basis for RS(B), one consisting of some of the original rows of B, i s {r1 r2 , r3 } = {(2, -1, 3), (1, 0, 1), (0, 2, -1 ) Note that since the row space is a 3-dimensional subspace o f R3 , it must be all of R3. n Criteria for membership in the column space . If A is an m x n matrix and x is an n-vector, written as a column matrix, the n the product Ax is equal to a linear combination of the columns of A :
LINEAR ALGEBRA
177
REA L EUCLIDEA N VECTOR SPACE S
AX = C l _I
CZ I
x1 x2 = x1 C 1 + x2 C 2
• • • Cn
+ . . . + x,z cn
(*)
(_
xn By definition, a vector b in R m is in the column space of A if i t can be written as a linear combination of the columns of A . That is, b E CS(A) precisely when there exist scalars x1 , x2, . . xn such that 4. ) x1 C 1 + x2c2 + . . . + xncn = b Combining (*) and (**), then, leads to the following conclusion : b E CS(A)
q
Ax = b is consistent
Example 48 : For what value of b is the vector b = (1, 2, 3, b) T in the column space of the following matrix ? -2 3 30 -4 -5 A= 6 3 0 1 1 3
Form the augmented matrix [Al b] and reduce :
CLIFFS QUICK REVIEW 178
REA L EUCLIDEA N VECTOR SPACE S
2 3 3 0 -4 -5 [AI b] 6 3 0 1 1 3
12 3 b 1
rl H r4
>
added to r3 added to r4
3r2 added to r3 4r2 added to r4
— 2 r3 added to r4
3
3
b-
2 3 1
3 0 -4 -5 1
b
1
2 0 -3 -18 3—6b 0 1 -3 1— 2 b -1
r2H r4
3
0 -4 -5 6 3 0 2
- 6rl -2r1
1
1
3
b-
0
1 -3 1— 2 b 0 -3 -18 3—6 b 0 -4 -5 2 -1 1 01
3 -3
b1— 2 b
0 0 -27 0 0 -17
6—12 b 6—8 b
11 3 b0 1 -3 1—2b = 0 0 -27 6 -12b 00 0 (6 — 8b) — 27 (6 -12b )
b']
LINEAR ALGEBRA 179
REA L EUCLIDEA N VECTO R SPACES
Because of the bottom row of zeros in A' (the reduced form of A), the bottom entry in the last column must also be 0— giving a complete row of zeros at the bottom of [A' lb' ]—in order for the system Ax = b to have a solution . Setting (6— 8b) — (17/27)(6 -12b) equal to 0 and solving for b yield s (6—8b)—(6—12b)= 0 27(6 — 8b)=17(6 -12b) 162 — 216b =102 — 204b -12b = -60 b= 5 Therefore, b = (1, 2, 3, b)T is in CS(A) if and only if b = 5 .
■
Since elementary row operations do not change the rank of a matrix, it is clear that in the calculation above, rank A = rank A' and rank [ A ~ b] = rank [ A' ~ b'] . (Since the botto m row of A ' consisted entirely of zeros, rank A ' = 3, implying rank A = 3 also .) With b = 5, the bottom row of [A'lb'] als o consists entirely of zeros, giving rank [ A' ~ b' ] = 3 . However, i f b were not equal to 5, then the bottom row of [A'lb'I woul d not consist entirely of zeros, and the rank of [A' b'] would have been 4, not 3 . This example illustrates the following general fact : When b is in CS(A), the rank of [ A ~ b] is the same as the rank of A ; and, conversely, when b is not in CS(A), the rank of [Al b] is not the same as (it's strictly greater than) th e rank of A . Therefore, an equivalent criterion for membershi p in the column space of a matrix reads as follows : b
E CS(A)
q
rank A = rank [A ~ b]
CLIFFS QUICK REVIEW
180
REA L EUCLIDEA N VECTOR SPACE S
Example 49 : Determine the dimension of, and a basis for, th e column space of the matrix
B=
2 -1 31 0 1 0 2 -1 1 1 4
from Example 47 above . Because the dimension of the column space of a matri x always equals the dimension of its row space, CS(B) must also have dimension 3: CS(B) is a 3-dimensional subspace of R4. Since B contains only 3 columns, these columns must be linearly independent and therefore form a basis :
basis for CS(B) =
2 1
3 0 2
1 -1
1
4
■
Example 50 : Find a basis for the column space of the matri x 1 2 -1 3 1A= 2 0 -6 2 - 2 -3 1 10 -2 4 Since the column space of A consists precisely of thos e vectors b such that A x = b is a solvable system, one way t o determine a basis for CS(A) would be to first find the space o f all vectors b such that Ax = b is consistent, then constructing a basis for this space . However, an elementary observation sug -
LINEAR ALGEBRA 181
REA L EUCLIDEA N VECTO R SPACES
gests a simpler approach : Since the columns of A are the rows ofA T, finding a basis for CS(A) is equivalent to finding a basi s for RS(A T) . Row-reducing A T yields 1 2
2 0
-3 1
A T = -1 3
-6
10
1
-2
2 -2
-2 r1 added to r2 rl added to r3 -3r1 added to r4 —rl added to r5
4 —r2 added to r3 —r2 added to r4 —r2 added tor5
1 0
2 -4
-37
0
-4
0 0
-4 -4
7 7 7
-1 2 -30 -4 7 0 0 0 0
0 0
0 0 0_ Since there are two nonzero rows left in the reduced form o f AT, the rank of A T is 2, so dim RS(A T) = 2 = dim CS(A ) Furthermore, since { vl , v2 } for RS(A T), the collection
=
{(1, 2, 3), (0, -4, 7)} is a basis 1 ,
{VT, VI) = -3
is a basis for CS(A), a 2-dimensional subspace of R 3.
■
CLIFFS QUICK REVIEW
182
REA L EUCLIDEA N VECTO R SPACES
The Rank Plus Nullity Theorem Let A be a matrix . Recall that the dimension of its colum n space (and row space) is called the rank of A . The dimension of its nullspace is called the nullity of A . The connection between these dimensions is illustrated in the following example . Example 51 : Find the nullspace of the matri x 0 1 2 -1 1 -1 -3 1
3-
1 A= 4 0 0 1 -2 2 3 8 -2 - 1 The nullspace of A is the solution set of the homogeneou s equation Ax = 0 . To solve this equation, the following elementary row operations are performed to reduce A to echelo n form: 0 1 1 -1 A= 4 0 2 3
2
-1
-3 0
1 1
8
-2
31 -2
rl Hr2
-1 -4r1 added to r3 -2rl added to r4
1
-1
0
1
-3 2
4 2
0 3
0 8
1 0 0
-1 1 4
-3 2 12
0
5
14
1 -1 1 -2
13 -2 -1
1 -1 -3 -4
13 -6 -3
LINEAR ALGEBRA
18 3
REA L EUCLIDEA N VECTO R SPACES
-4r2 added to r3 -5r2 added to r4
-r3 added to r4
1
-1
-3
1
1-
0
1
2
-1
3
0
0
0
0
4 4
1 1
-1 8 -1 8
1
-1
-2
1
0
1
2
-1
3
0
0
0
0
4 0
1 0
-18 0
1-
= A'
Therefore, the solution set of A x = 0 is the same as the solution set of A'x = 0 : xl -0 1 -1 -2 1 10
1
2
-1
3
0
0
4
1
-18
0
0
0
0
x2
0
x3 = 0
0_ x4
x5-
0
0
With only three nonzero rows in the coefficient matrix, ther e are really only three constraints on the variables, leaving 5 — 3 = 2 of the variables free . Let x4 and x 5 be the free variables . Then the third row of A' implie s
4x3 + x4 - 18x5 = 0 = x3 =
184
x4 + 2 x 5
CLIFFS QUICK REVIEW
REA L EUCLIDEA N VECTO R SPACE S
The second row now yields x2 +2(—4x4 +x5 )—x4 +3x5 = 0 2x2 — x4 +18x 5 — 2x4 + 6x5 = 0 2x2 — 3x4 + 24x 5 = 0 x2 =2x4 —12x5 from which the first row give s x~—(2x4 -12x5 )—2(—+x4 + 2xs )+x4 +xs = 0 2x1 — 3x 4 + 24x5 +x4 -18x5 + 2x4 + 2x5 = 0 2x 1 + 8x5 = 0 xl = -4x5 Therefore, the solutions of the equation A x = 0 are those vectors of the form (x1 , x 2 , x3 , x4 , x5 )T = (—4x5 , x4 -12x 5 , - 4X4 + 2x5 , x4 , x5 )T To clear this expression of fractions, let t1 = 4x4 and t2 = 2 x5 ; then, those vectors x in R 5 that satisfy the homogeneous system A x = 0 have the form x = (-8t2 , 6t 1 - 24t2 , - tl + 9t2 , 4t1 , 2t2 )T T = t l (0, 6, -1, 4, 4)T + t2 (—8, — 24, 9, 0 , 2)
Note in particular that the number of free variables—th e number of parameters in the general solution—is the dimension of the nullspace (which is 2 in this case) . Also, the rank of this matrix, which is the number of nonzero rows in its echelon form, is 3 . The sum of the nullity and the rank, 2 + 3, i s equal to the number of columns of the matrix . ■
LINEAR ALGEBRA
185
REA L EUCLIDEA N VECTO R SPACES
The connection between the rank and nullity of a matrix , illustrated in the preceding example, actually holds for any matrix : The Rank Plus Nullity Theorem . Let A be an m by n matrix , with rank r and nullity £ . Then r + £ = n ; that is, rank A + nullity A = the number of columns of A Proof Consider the matrix equation A x = 0 and assume that A has been reduced to echelon form, A' . First, note that the elementary row operations which reduce A to A' do not change the row space or, consequently, the rank of A . Second, it is clear that the number of components in x is n, the number o f columns of A and of A' . Since A' has only r nonzero rows (because its rank is r), n —r of the variables x l , x2, . . ., x,, in x are free . But the number of free variables—that is, the number of parameters in the general solution of A x = 0—is the nullit y ofA . Thus, nullity A = n — r, and the statement of the theorem , ■ r + £ = r + (n — r) = n, follows immediately . Example 52 : If A is a 5 x 6 matrix with rank 2, what is the dimension of the nullspace of A ? Since the nullity is the difference between the number o f columns of A and the rank of A, the nullity of this matrix is 6 — 2 = 4 . Its nullspace is a 4-dimensional subspace of R 6. ■
Example 53: Find a basis for the nullspace of the matrix

$$A = \begin{bmatrix} 1 & -2 & 0 & 4 \\ 3 & 1 & 1 & 0 \\ -1 & -5 & -1 & 8 \end{bmatrix}$$

Recall that for a given m by n matrix A, the set of all solutions of the homogeneous system Ax = 0 forms a subspace of Rn called the nullspace of A. To solve Ax = 0, the matrix A is row reduced:

$$A \xrightarrow[r_1\ \text{added to}\ r_3]{-3r_1\ \text{added to}\ r_2} \begin{bmatrix} 1 & -2 & 0 & 4 \\ 0 & 7 & 1 & -12 \\ 0 & -7 & -1 & 12 \end{bmatrix} \xrightarrow{r_2\ \text{added to}\ r_3} \begin{bmatrix} 1 & -2 & 0 & 4 \\ 0 & 7 & 1 & -12 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
Clearly, the rank of A is 2. Since A has 4 columns, the rank plus nullity theorem implies that the nullity of A is 4 − 2 = 2. Let x3 and x4 be the free variables. The second row of the reduced matrix gives

$$7x_2 + x_3 - 12x_4 = 0 \implies x_2 = \tfrac{1}{7}(-x_3 + 12x_4)$$

and the first row then yields

$$x_1 - 2\left[\tfrac{1}{7}(-x_3 + 12x_4)\right] + 4x_4 = 0 \implies x_1 = -\tfrac{1}{7}(2x_3 + 4x_4)$$

Therefore, the vectors x in the nullspace of A are precisely those of the form
$$\mathbf{x} = \begin{bmatrix} -\tfrac{1}{7}(2x_3 + 4x_4) \\ \tfrac{1}{7}(-x_3 + 12x_4) \\ x_3 \\ x_4 \end{bmatrix}$$

which can be expressed as follows:

$$\mathbf{x} = x_3 \begin{bmatrix} -\tfrac{2}{7} \\ -\tfrac{1}{7} \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} -\tfrac{4}{7} \\ \tfrac{12}{7} \\ 0 \\ 1 \end{bmatrix}$$

If t1 = ⅐x3 and t2 = ⅐x4, then x = t1(−2, −1, 7, 0)^T + t2(−4, 12, 0, 7)^T. Since the two vectors in this collection are linearly independent (because neither is a multiple of the other), they form a basis for N(A):

$$N(A) = \operatorname{span}\left\{ (-2, -1, 7, 0)^T,\ (-4, 12, 0, 7)^T \right\}$$
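A quick numerical check (again a sketch using NumPy, not part of the original text) confirms that both basis vectors really are annihilated by A:

```python
import numpy as np

A = np.array([[ 1, -2,  0,  4],
              [ 3,  1,  1,  0],
              [-1, -5, -1,  8]])

# The two basis vectors for N(A) found above
v1 = np.array([-2, -1, 7, 0])
v2 = np.array([-4, 12, 0, 7])

print(A @ v1)   # [0 0 0]
print(A @ v2)   # [0 0 0]
```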
Other Real Euclidean Vector Spaces and the Concept of Isomorphism

The idea of a vector space can be extended to include objects that you would not initially consider to be ordinary vectors.

Matrix spaces. Consider the set M2×3(R) of 2 by 3 matrices with real entries. This set is closed under addition, since the sum of a pair of 2 by 3 matrices is again a 2 by 3 matrix, and when such a matrix is multiplied by a real scalar, the resulting matrix is in the set also. Since M2×3(R), with the usual algebraic operations, is closed under addition and scalar multiplication, it is a real Euclidean vector space. The objects in the space—the "vectors"—are now matrices.

Since M2×3(R) is a vector space, what is its dimension? First, note that any 2 by 3 matrix is a unique linear combination of the following six matrices:

$$E_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},\quad E_2 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix},\quad E_3 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

$$E_4 = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix},\quad E_5 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix},\quad E_6 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Therefore, they span M2×3(R). Furthermore, these "vectors" are linearly independent: none of these matrices is a linear combination of the others. (Alternatively, the only way k1E1 + k2E2 + k3E3 + k4E4 + k5E5 + k6E6 will give the 2 by 3 zero matrix is if each scalar coefficient, ki, in this combination is zero.) These six "vectors" therefore form a basis for M2×3(R), so dim M2×3(R) = 6.
If the entries in a given 2 by 3 matrix are written out in a single row (or column), the result is a vector in R6. For example,

$$\begin{bmatrix} 1 & 3 & -4 \\ -2 & -1 & 0 \end{bmatrix} \quad\text{gives}\quad (1, 3, -4, -2, -1, 0)$$

The rule here is simple: Given a 2 by 3 matrix, form a 6-vector by writing the entries in the first row of the matrix followed by the entries in the second row. Then, to every matrix in M2×3(R) there corresponds a unique vector in R6, and vice versa. This one-to-one correspondence between M2×3(R) and R6,

$$\varphi:\ \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \longleftrightarrow (a, b, c, d, e, f)$$

is compatible with the vector space operations of addition and scalar multiplication. This means that

φ(A + B) = φ(A) + φ(B) and φ(kA) = kφ(A)

The conclusion is that the spaces M2×3(R) and R6 are structurally identical, that is, isomorphic, a fact which is denoted M2×3(R) ≅ R6. One consequence of this structural identity is that under the mapping φ—the isomorphism—each basis "vector" Ei given above for M2×3(R) corresponds to the standard basis vector ei for R6. The only real difference between the spaces R6 and M2×3(R) is in the notation: The six entries denoting an element in R6 are written as a single row (or column), while the six entries denoting an element in M2×3(R) are written in two rows of three entries each.
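The isomorphism φ is literally a row-by-row flattening. The following sketch (NumPy assumed; the matrix B is made up for the demonstration) checks the two compatibility conditions:

```python
import numpy as np

# phi flattens a 2x3 matrix row by row into a vector in R^6
phi = lambda M: M.reshape(6)

A = np.array([[1, 3, -4], [-2, -1, 0]])
B = np.array([[2, 0,  1], [ 5,  3, 7]])   # hypothetical second matrix
k = 3

# phi is compatible with the vector space operations:
print(np.array_equal(phi(A + B), phi(A) + phi(B)))   # True
print(np.array_equal(phi(k * A), k * phi(A)))        # True
```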
This example can be generalized further. If m and n are any positive integers, then the set of real m by n matrices, Mm×n(R), is isomorphic to Rmn, which implies that dim Mm×n(R) = mn.

Example 54: Consider the subset S3×3(R) ⊂ M3×3(R) consisting of the symmetric matrices, that is, those which equal their transpose. Show that S3×3(R) is actually a subspace of M3×3(R) and then determine the dimension and a basis for this subspace. What is the dimension of the subspace Sn×n(R) of symmetric n by n matrices?

Since M3×3(R) is a Euclidean vector space (isomorphic to R9), all that is required to establish that S3×3(R) is a subspace is to show that it is closed under addition and scalar multiplication. If A = A^T and B = B^T, then (A + B)^T = A^T + B^T = A + B, so A + B is symmetric; thus, S3×3(R) is closed under addition. Furthermore, if A is symmetric, then (kA)^T = kA^T = kA, so kA is symmetric, showing that S3×3(R) is also closed under scalar multiplication.

As for the dimension of this subspace, note that the 3 entries on the diagonal and the 2 + 1 entries above the diagonal can be chosen arbitrarily, but the other 1 + 2 entries below the diagonal are then completely determined by the symmetry of the matrix.
Therefore, there are only 3 + 2 + 1 = 6 degrees of freedom in the selection of the nine entries in a 3 by 3 symmetric matrix. The conclusion, then, is that dim S3×3(R) = 6. A basis for S3×3(R) consists of the six 3 by 3 matrices

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

In general, there are n + (n − 1) + ⋯ + 2 + 1 = ½n(n + 1) degrees of freedom in the selection of entries in an n by n symmetric matrix, so dim Sn×n(R) = ½n(n + 1). ■

Polynomial spaces. A polynomial of degree n is an expression of the form

$$a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$
where the coefficients ai are real numbers. The set of all such polynomials of degree ≤ n is denoted Pn. With the usual algebraic operations, Pn is a vector space, because it is closed under addition (the sum of any two polynomials of degree ≤ n is again a polynomial of degree ≤ n) and scalar multiplication (a scalar times a polynomial of degree ≤ n is still a polynomial of degree ≤ n). The "vectors" are now polynomials.
There is a simple isomorphism between Pn and Rn+1:

$$\varphi:\ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \ \longleftrightarrow\ (a_0, a_1, a_2, \ldots, a_n) \in \mathbf{R}^{n+1}$$

This mapping is clearly a one-to-one correspondence and compatible with the vector space operations. Therefore, Pn ≅ Rn+1, which immediately implies dim Pn = n + 1. The standard basis for Pn, {1, x, x², …, xⁿ}, comes from the standard basis for Rn+1, {e1, e2, e3, …, en+1}, under the mapping

$$1 \longleftrightarrow (1, 0, 0, \ldots, 0) = \mathbf{e}_1,\quad x \longleftrightarrow (0, 1, 0, \ldots, 0) = \mathbf{e}_2,\quad x^2 \longleftrightarrow (0, 0, 1, \ldots, 0) = \mathbf{e}_3,\quad \ldots,\quad x^n \longleftrightarrow (0, 0, 0, \ldots, 1) = \mathbf{e}_{n+1}$$
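In the same spirit, a polynomial can be handled on a computer as its coefficient vector. The sketch below (plain NumPy; the sample polynomials are chosen for illustration) shows that polynomial arithmetic in Pn is just vector arithmetic in Rn+1:

```python
import numpy as np

# Represent a0 + a1*x + ... + an*x^n by its coefficient vector (a0, a1, ..., an)
p = np.array([2, -1, 0])   # 2 - x        (in P2)
q = np.array([1,  1, 1])   # 1 + x + x^2  (in P2)

# Adding polynomials and scaling them is just vector arithmetic in R^3:
print(p + q)    # [3 0 1]  <->  3 + x^2
print(2 * p)    # [4 -2 0] <->  4 - 2x
```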
Example 55: Are the polynomials p1 = 2 − x, p2 = 1 + x + x², and p3 = 3x − 2x² from P2 linearly independent?

One way to answer this question is to recast it in terms of R3, since P2 is isomorphic to R3. Under the isomorphism given above, p1 corresponds to the vector v1 = (2, −1, 0), p2 corresponds to v2 = (1, 1, 1), and p3 corresponds to v3 = (0, 3, −2). Therefore, asking whether the polynomials p1, p2, and p3 are independent in the space P2 is exactly the same as asking whether the vectors v1, v2, and v3 are independent in the space R3. Put yet another way, does the matrix
$$\begin{bmatrix} \mathbf{v}_1 \\ \mathbf{v}_2 \\ \mathbf{v}_3 \end{bmatrix} = \begin{bmatrix} 2 & -1 & 0 \\ 1 & 1 & 1 \\ 0 & 3 & -2 \end{bmatrix}$$

have full rank (that is, rank 3)? A few elementary row operations reduce this matrix to an echelon form with three nonzero rows:

$$\begin{bmatrix} 2 & -1 & 0 \\ 1 & 1 & 1 \\ 0 & 3 & -2 \end{bmatrix} \xrightarrow{r_1 \leftrightarrow r_2} \begin{bmatrix} 1 & 1 & 1 \\ 2 & -1 & 0 \\ 0 & 3 & -2 \end{bmatrix} \xrightarrow{-2r_1\ \text{added to}\ r_2} \begin{bmatrix} 1 & 1 & 1 \\ 0 & -3 & -2 \\ 0 & 3 & -2 \end{bmatrix} \xrightarrow{r_2\ \text{added to}\ r_3} \begin{bmatrix} 1 & 1 & 1 \\ 0 & -3 & -2 \\ 0 & 0 & -4 \end{bmatrix}$$

Thus, the vectors—either v1, v2, v3 in R3 or p1, p2, p3 in P2—are indeed independent. ■
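The row reduction above can be delegated to a rank computation; a minimal sketch, assuming NumPy:

```python
import numpy as np

# Rows are the coefficient vectors of p1, p2, p3 under the isomorphism P2 = R^3
M = np.array([[2, -1,  0],
              [1,  1,  1],
              [0,  3, -2]])

print(np.linalg.matrix_rank(M))   # 3, so the polynomials are independent
```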
Function spaces. Let A be a subset of the real line and consider the collection of all real-valued functions f defined on A. This collection of functions is denoted R^A. It is certainly closed under addition (the sum of two such functions is again such a function) and scalar multiplication (a real scalar multiple of a function in this set is also a function in this set), so R^A is a vector space; the "vectors" are now functions. Unlike each of the matrix and polynomial spaces described above,
this vector space has no finite basis (for example, R^A contains Pn for every n); R^A is infinite-dimensional. The real-valued functions which are continuous on A, or those which are bounded on A, are subspaces of R^A which are also infinite-dimensional.

Example 56: Are the functions f1 = sin²x, f2 = cos²x, and f3 = 3 linearly independent in the space of continuous functions defined everywhere on the real line?

Does there exist a nontrivial linear combination of f1, f2, and f3 that gives the zero function? Yes: 3f1 + 3f2 − f3 = 0, since sin²x + cos²x = 1 for every x. This establishes that these three functions are not independent. ■

Example 57: Let C²(R) denote the vector space of all real-valued functions defined everywhere on the real line that possess a continuous second derivative. Show that the set of solutions of the differential equation y″ + y = 0 is a 2-dimensional subspace of C²(R).

From the theory of homogeneous differential equations with constant coefficients, it is known that the equation y″ + y = 0 is satisfied by y1 = cos x and y2 = sin x and, more generally, by any linear combination, y = c1 cos x + c2 sin x, of these functions. Since y1 = cos x and y2 = sin x are linearly independent (neither is a constant multiple of the other) and they span the space S of solutions, a basis for S is {cos x, sin x}, which contains two elements. Thus,

dim S = dim {y ∈ C²(R) : y″ + y = 0} = 2

as desired.
■
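The dependence found in Example 56 can also be checked numerically by sampling; a small sketch, assuming NumPy:

```python
import numpy as np

x = np.linspace(-10, 10, 1001)
f1, f2, f3 = np.sin(x)**2, np.cos(x)**2, 3.0

# The nontrivial combination 3*f1 + 3*f2 - f3 vanishes at every sample point
print(np.max(np.abs(3*f1 + 3*f2 - f3)))   # ~0 (up to rounding error)
```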
THE DETERMINANT
Associated with each square matrix A is a number called its determinant, denoted by det A or by the symbol |A|. The purpose here is to give the definition of the determinant, to illustrate how it is evaluated, and to discuss some of its applications. Throughout this section, all matrices are square; the determinant of a nonsquare matrix is meaningless.
Definitions of the Determinant

The determinant function can be defined by essentially two different methods. The advantage of the first definition—one which uses permutations—is that it provides an actual formula for det A, a fact of theoretical importance. The disadvantage is that, quite frankly, no one actually computes a determinant by this method.

Method 1 for defining the determinant. If n is a positive integer, then a permutation of the set S = {1, 2, …, n} is defined to be a bijective function—that is, a one-to-one correspondence—σ, from S to S. For example, let S = {1, 2, 3} and define a permutation σ of S as follows:

σ(1) = 3, σ(2) = 1, σ(3) = 2

Since σ(1) = 3, σ(2) = 1, and σ(3) = 2, the permutation σ maps the elements 1, 2, 3 into 3, 1, 2. Intuitively, then, a permutation of the set S = {1, 2, …, n} provides a rearrangement of the numbers 1, 2, …, n. Another permutation, σ′, of the set S is defined as follows:
σ′(1) = 2, σ′(2) = 1, σ′(3) = 3

This permutation maps the elements 1, 2, 3 into 2, 1, 3, respectively. This result is written

σ′: (1, 2, 3) ↦ (2, 1, 3)

Example 1: In all, there are six possible permutations of the 3-element set S = {1, 2, 3}:

σ1: (1, 2, 3) ↦ (1, 2, 3)    σ4: (1, 2, 3) ↦ (2, 3, 1)
σ2: (1, 2, 3) ↦ (1, 3, 2)    σ5: (1, 2, 3) ↦ (3, 1, 2)
σ3: (1, 2, 3) ↦ (2, 1, 3)    σ6: (1, 2, 3) ↦ (3, 2, 1)

In general, for the set S = {1, 2, …, n}, there are n! (n factorial) possible permutations. ■
To transpose two adjacent elements simply means to interchange them; for example, the transposition (or inversion) of the pair 2, 3 is the pair 3, 2. Every permutation can be obtained by a sequence of transpositions. For example, consider the permutation σ5 of S = {1, 2, 3} defined in Example 1 above. The result of this permutation can be achieved by two successive transpositions of the original set:

(1, 2, 3) → [transpose 2 and 3] → (1, 3, 2) → [transpose 1 and 3] → (3, 1, 2)
Three transpositions are needed to give the permutation σ6 of Example 1:

(1, 2, 3) → [transpose 1 and 2] → (2, 1, 3) → [transpose 1 and 3] → (2, 3, 1) → [transpose 2 and 3] → (3, 2, 1)
The number of transpositions needed to recover a given permutation is not unique. For example, you could always intersperse two successive transpositions, the second one of which simply undoes the first. However, what is unique is whether the number of transpositions is even or odd. If the number of transpositions that define a permutation is even, then the permutation is said to be even, and its sign is +1. If the number of transpositions that define a permutation is odd, then the permutation is said to be odd, and its sign is −1. The notation is as follows:

sgn σ = +1 if σ is even, −1 if σ is odd

Note that sgn σ can be defined as (−1)^t, where t is the number of transpositions that give σ.

Example 2: Determine the sign of the following permutation of the set S = {1, 2, 3, 4}:

σ: (1, 2, 3, 4) ↦ (3, 4, 1, 2)
The "brute-force" method is to explicitly determine th e number of transpositions : (1, 2, 3, 4) --~ (1,3,2, 4) --+ (3, 1, 2,4 ) -3 (3,
L:i' I
2) -4 (3, 4, 1, 2 )
Since a can be achieved by 4 successive transpositions, a i s even, so its sign is +1 . A faster method proceeds as follows : Determine ho w many pairs within the permutation have the property that a larger number precedes a smaller one . For example, in th e permutation (3, 4, 1, 2) there are four such pairs : 3 precedes 1 , 3 precedes 2, 4 precedes 1, and 4 precedes 2 . The fact that the number of such pairs is even means the permutation itself i s even, and its sign is +1 . [Note : The number of pairs of elements that have the property that a larger number precedes a smaller one is the minimum number of transpositions that de fine the permutation . For example, since this number is fou r for the permutation (3, 4, 1, 2), at least four transpositions ar e needed to convert (1, 2, 3, 4) into (3, 4, 1, 2) ; the specific sequence of these four transpositions is shown above .] ■
For every integer n ≥ 2, the total number of permutations, n!, of the set S = {1, 2, …, n} is even. Exactly half of these permutations are even; the other half are odd.
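Both of these facts are easy to confirm by machine. The sketch below (plain Python; the function name `sgn` is ours) computes the sign by the inversion-counting method of Example 2:

```python
from itertools import permutations

def sgn(perm):
    """Sign of a permutation, computed by counting the pairs in which
    a larger number precedes a smaller one (the inversions)."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return +1 if inversions % 2 == 0 else -1

print(sgn((3, 4, 1, 2)))                          # +1, as in Example 2
signs = [sgn(p) for p in permutations((1, 2, 3))]
print(signs.count(+1), signs.count(-1))           # 3 3: half even, half odd
```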
Example 3: For the 6 = 3! permutations of the set S = {1, 2, 3} given in Example 1, verify that the three permutations σ1, σ4, and σ5 are even (and, therefore, each has sign +1), while the other three permutations, σ2, σ3, and σ6, are odd (and each has sign −1). ■
Now that the concepts of a permutation and its sign have been defined, the definition of the determinant of a matrix can be given. Let A = [aij] be an n by n matrix, and let Sn denote the collection of all permutations of the set S = {1, 2, …, n}. The determinant of A is defined to be the following sum:

$$\det A = \sum_{\sigma \in S_n} (\operatorname{sgn}\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} \qquad (*)$$
Example 4: Use definition (*) to derive an expression for the determinant of the general 2 by 2 matrix

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$

Since n = 2, there are 2! = 2 permutations of the set {1, 2}, namely,

σ1: (1, 2) ↦ (1, 2) and σ2: (1, 2) ↦ (2, 1)
The identity permutation, σ1, is (always) even, so sgn σ1 = +1, and the permutation σ2 is odd, so sgn σ2 = −1. Therefore, the sum (*) becomes

$$\det A = (\operatorname{sgn}\sigma_1)\, a_{1\sigma_1(1)} a_{2\sigma_1(2)} + (\operatorname{sgn}\sigma_2)\, a_{1\sigma_2(1)} a_{2\sigma_2(2)} = (+1)a_{11}a_{22} + (-1)a_{12}a_{21} = a_{11}a_{22} - a_{12}a_{21}$$

This formula is one you should memorize: To obtain the determinant of a 2 by 2 matrix, subtract the product of the off-diagonal entries from the product of the diagonal entries:

$$\det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$$
To illustrate,

$$\det \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = (1)(4) - (2)(3) = -2 \qquad \blacksquare$$

Example 5: Use definition (*) to derive an expression for the determinant of the general 3 by 3 matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
Since n = 3, there are 3! = 6 permutations of {1, 2, 3}, and, therefore, six terms in the sum (*):
$$\det A = \sum_{k=1}^{6} (\operatorname{sgn}\sigma_k)\, a_{1\sigma_k(1)}\, a_{2\sigma_k(2)}\, a_{3\sigma_k(3)}$$

Using the notation for these permutations given in Example 1, as well as the evaluation of their signs in Example 3, the sum above becomes

det A = (+1)a11a22a33 + (−1)a11a23a32 + (−1)a12a21a33 + (+1)a12a23a31 + (+1)a13a21a32 + (−1)a13a22a31

or, more simply,

det A = a11a22a33 + a12a23a31 + a13a21a32 − a11a23a32 − a12a21a33 − a13a22a31   (**)
As you can see, there is quite a bit of work involved in computing a determinant of an n by n matrix directly from definition (*), particularly for large n. In applying the definition to evaluate the determinant of a 7 by 7 matrix, for example, the sum (*) would contain more than five thousand terms. This is why no one ever actually evaluates a determinant by this laborious method. ■
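For small matrices, though, definition (*) can be implemented directly. The following sketch (plain Python; `det_by_definition` is a made-up name) mirrors the formula term by term:

```python
from itertools import permutations

def det_by_definition(A):
    """Determinant computed directly from definition (*): a sum over
    all n! permutations -- fine for small n, hopeless for large n."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        sign, term = 1, 1
        # sign via inversion count; term = a_{1,s(1)} * ... * a_{n,s(n)}
        for i in range(n):
            term *= A[i][perm[i]]
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        total += sign * term
    return total

print(det_by_definition([[1, 2], [3, 4]]))   # -2, matching Example 4
```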
A simple way to produce the expansion (**) for the determinant of a 3 by 3 matrix is first to copy the first and second columns and place them after the matrix as follows:
$$\begin{matrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{matrix}\ \begin{matrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{matrix}$$

Then, multiply down along the three diagonals that start with the first row of the original matrix, and multiply up along the three diagonals that start with the bottom row of the original matrix. Keep the signs of the three "down" products, reverse the signs of the three "up" products, and add all six resulting terms; this gives (**). Note: This method works only for 3 by 3 matrices.

Here's a helpful way to interpret definition (*). Note that in each of the products involved in the sum

$$\det A = \sum_{\sigma \in S_n} (\operatorname{sgn}\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}$$

there are n factors, no two of which come from the same row or column, a consequence of the bijectivity of every permutation. Using the 3 by 3 case above as a specific example, each of the six terms in the sum (**) can be illustrated as follows:
(+) a11a22a33   (+) a12a23a31   (+) a13a21a32
(−) a11a23a32   (−) a12a21a33   (−) a13a22a31

Each of these six products selects exactly one entry from each row and exactly one from each column of the matrix. These six products account for all possible ways of choosing three entries, no two of which reside in the same row or column. In general, then, the determinant is the sum of all possible products of n factors, no two of which come from the
same row or column of the matrix, with the sign of each product, a_{1j1} a_{2j2} ⋯ a_{njn}, determined by the sign of the corresponding permutation σ: (1, 2, …, n) ↦ (j1, j2, …, jn).

Method 2 for defining the determinant. The second definition for the determinant follows from stating certain properties that the determinant function is to satisfy, which, it turns out, uniquely define the function. These properties will then lead to an efficient method for actually computing the determinant of a given matrix. There exists a unique real-valued function—the determinant function (denoted det)—which is defined for n by n matrices and satisfies the following three properties:

Property 1: The determinant of a matrix is linear in each row.

Property 2: The determinant reverses sign if two rows are interchanged.

Property 3: The determinant of the identity matrix is equal to 1.

Property 1 deserves some explanation. Linearity of a function f means that f(x + y) = f(x) + f(y) and, for any scalar k, f(kx) = kf(x). Linearity of the determinant function in each row means, for example, that

$$\det \begin{bmatrix} \mathbf{r}_1 + \mathbf{r}_1' \\ \mathbf{r}_2 \\ \vdots \\ \mathbf{r}_n \end{bmatrix} = \det \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \vdots \\ \mathbf{r}_n \end{bmatrix} + \det \begin{bmatrix} \mathbf{r}_1' \\ \mathbf{r}_2 \\ \vdots \\ \mathbf{r}_n \end{bmatrix}$$
and

$$\det \begin{bmatrix} k\mathbf{r}_1 \\ \mathbf{r}_2 \\ \vdots \\ \mathbf{r}_n \end{bmatrix} = k \det \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \vdots \\ \mathbf{r}_n \end{bmatrix}$$
Although these two equations illustrate linearity in the first row, linearity of the determinant function can be applied to any row.

Property 2 can be used to derive another important property of the determinant function:

Property 4: The determinant of a matrix with two identical rows is equal to 0.

The proof of this fact is easy: Assume that for the matrix A, Row i = Row j. By interchanging these two rows, the determinant changes sign (by Property 2). However, since these two rows are the same, interchanging them obviously leaves the matrix and, therefore, the determinant unchanged. Since 0 is the only number which equals its own opposite, det A = 0.

One of the most important matrix operations is adding a multiple of one row to another row. How the determinant reacts to this operation is a key property in evaluating it:

Property 5: Adding a multiple of one row to another row leaves the determinant unchanged.

The idea of the general proof will be illustrated by the following specific illustration. Suppose the matrix A is 4 by 4, and k times Row 2 is added to Row 3:
$$A = \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 \\ \mathbf{r}_4 \end{bmatrix} \longrightarrow \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 + k\mathbf{r}_2 \\ \mathbf{r}_4 \end{bmatrix} = A'$$

By linearity applied to the third row,

$$\det A' = \det \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 \\ \mathbf{r}_4 \end{bmatrix} + \det \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ k\mathbf{r}_2 \\ \mathbf{r}_4 \end{bmatrix} = \det \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 \\ \mathbf{r}_4 \end{bmatrix} + k \det \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_2 \\ \mathbf{r}_4 \end{bmatrix}$$

But the second term in this last equation is zero, because the matrix contains two identical rows (Property 4). Therefore,

$$\det A' = \det \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 + k\mathbf{r}_2 \\ \mathbf{r}_4 \end{bmatrix} = \det \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 \\ \mathbf{r}_4 \end{bmatrix} = \det A$$
The purpose of adding a multiple of one row to another row is to simplify a matrix (when solving a linear system, for example). For a square matrix, the goal of these operations is to reduce the given matrix to an upper triangular one. So the natural question at this point is: What is the determinant of an upper triangular matrix?

Property 6: The determinant of an upper triangular (or diagonal) matrix is equal to the product of the diagonal entries.

To prove this property, assume that the given matrix A has been reduced to upper triangular form by adding multiples of rows to other rows and assume that none of the resulting diagonal entries is equal to 0. (The case of a 0 diagonal entry will be discussed later.) This upper triangular matrix can be transformed into a diagonal one by adding multiples of lower rows to higher ones. (Recall Example 10, page 97.) At each step of this transformation, the determinant is left unchanged, by Property 5. Therefore, the problem of evaluating the determinant of the original matrix has been reduced to evaluating the determinant of an upper triangular matrix, which in turn has been reduced to evaluating the determinant of a diagonal matrix. By factoring out each diagonal entry and using Property 1 (linearity in each row), Property 3 (det I = 1) gives the desired result:
$$\det \begin{bmatrix} a_{11} & & & \\ & a_{22} & & \\ & & \ddots & \\ & & & a_{nn} \end{bmatrix} = a_{11} \det \begin{bmatrix} 1 & & & \\ & a_{22} & & \\ & & \ddots & \\ & & & a_{nn} \end{bmatrix} = \cdots = a_{11} a_{22} \cdots a_{nn} \det I = a_{11} a_{22} \cdots a_{nn}$$
Now, to handle the case of a zero diagonal entry, the following property will be established:

Property 7: A matrix with a row of zeros has determinant zero.

This is also easy to prove. As in the proof of Property 5, the essential idea of this proof will also be illustrated by a specific example. Consider the 3 by 3 matrix

$$A = \begin{bmatrix} * & * & * \\ 0 & 0 & 0 \\ * & * & * \end{bmatrix}$$

(Recall that each * indicates an entry whose value is irrelevant to the present discussion.)
Since for any scalar k,

$$A = \begin{bmatrix} * & * & * \\ 0 & 0 & 0 \\ * & * & * \end{bmatrix} = \begin{bmatrix} * & * & * \\ k \cdot 0 & k \cdot 0 & k \cdot 0 \\ * & * & * \end{bmatrix}$$

linearity of the determinant implies

$$\det \begin{bmatrix} * & * & * \\ 0 & 0 & 0 \\ * & * & * \end{bmatrix} = \det \begin{bmatrix} * & * & * \\ k \cdot 0 & k \cdot 0 & k \cdot 0 \\ * & * & * \end{bmatrix} = k \det \begin{bmatrix} * & * & * \\ 0 & 0 & 0 \\ * & * & * \end{bmatrix}$$
But, if det A is equal to k det A for any scalar k, then det A must be 0.

Now, to complete the discussion of Property 6: If a diagonal entry in an upper triangular matrix is equal to 0, then the process of adding a multiple of one row to another can produce a row of zeros. For example,
-3
0 0
5 2
0 0 3
^1 —ire added to r3
-3
5
> 0 0
2
0 0
0
This step does not change the determinant (Property 3), so th e determinant of the original matrix is equal to the determinan t of a matrix with a row of zeros, which is zero (Property 4) . But in this case at least one of the diagonal entries of the uppe r triangular matrix is 0, so the determinant does indeed equa l the product of the diagonal entries . Generalizing these arguments fully establishes Property 6 .
LINEAR ALGEBRA 211
THE DETERM//NAN T
Example 6 : Evaluate the determinant o f A=
Reduce the matrix to an upper triangular one , 1
-3
7
-2r,added to r2
4rl added to r3
2
2
1
-4
-4
3
1 >
2r2 added to r3 >
0
-3
7
0
8 -1 3 -16 31
1
-3
7
0
8
-1 3
0
0
5
in order to exploit Property 6—that none of these operations changes the determinant and Property 7—that the determinant of an upper triangular matrix is equal to the product o f the diagonal entries . The result i s 1 det A = det 0 0
-3 8
7-13 = (1)(8)(5) = 4 0
0
5
■
Example 7: Evaluate the determinant of

$$A = \begin{bmatrix} 2 & 1 & 0 & -1 \\ 1 & 0 & -1 & 2 \\ 0 & -1 & 2 & 1 \\ -1 & 2 & 1 & 0 \end{bmatrix}$$

The following elementary row operations reduce A to an upper triangular matrix:

$$A \xrightarrow{r_1 \leftrightarrow r_2} \begin{bmatrix} 1 & 0 & -1 & 2 \\ 2 & 1 & 0 & -1 \\ 0 & -1 & 2 & 1 \\ -1 & 2 & 1 & 0 \end{bmatrix} \xrightarrow[r_1\ \text{added to}\ r_4]{-2r_1\ \text{added to}\ r_2} \begin{bmatrix} 1 & 0 & -1 & 2 \\ 0 & 1 & 2 & -5 \\ 0 & -1 & 2 & 1 \\ 0 & 2 & 0 & 2 \end{bmatrix}$$

$$\xrightarrow[-2r_2\ \text{added to}\ r_4]{r_2\ \text{added to}\ r_3} \begin{bmatrix} 1 & 0 & -1 & 2 \\ 0 & 1 & 2 & -5 \\ 0 & 0 & 4 & -4 \\ 0 & 0 & -4 & 12 \end{bmatrix} \xrightarrow{r_3\ \text{added to}\ r_4} \begin{bmatrix} 1 & 0 & -1 & 2 \\ 0 & 1 & 2 & -5 \\ 0 & 0 & 4 & -4 \\ 0 & 0 & 0 & 8 \end{bmatrix}$$

None of these operations alters the determinant, except for the row exchange in the first step, which reverses its sign. Since the determinant of the final upper triangular matrix is (1)(1)(4)(8) = 32, the determinant of the original matrix A is −32. ■
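This row-reduction strategy is exactly how determinants are computed in practice. A minimal sketch, assuming NumPy (the pivot choice differs from the hand computation above, but Properties 2, 5, and 6 guarantee the same answer):

```python
import numpy as np

def det_by_reduction(A):
    """Reduce A to upper triangular form, tracking row exchanges;
    the determinant is then (+/-1) times the product of the diagonal."""
    U = np.array(A, dtype=float)
    n, sign = len(U), 1.0
    for j in range(n):
        pivot = j + np.argmax(np.abs(U[j:, j]))   # choose a nonzero pivot
        if np.isclose(U[pivot, j], 0.0):
            return 0.0                            # zero column => det 0
        if pivot != j:
            U[[j, pivot]] = U[[pivot, j]]         # Property 2: swap flips sign
            sign = -sign
        for i in range(j + 1, n):                 # Property 5: no change
            U[i] -= (U[i, j] / U[j, j]) * U[j]
    return sign * np.prod(np.diag(U))             # Property 6

A7 = [[2, 1, 0, -1], [1, 0, -1, 2], [0, -1, 2, 1], [-1, 2, 1, 0]]
print(det_by_reduction(A7))   # ~ -32.0, matching Example 7
```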
Example 8: Let C be a square matrix. What does the rank of C say about its determinant?

Let C be n × n and first assume that the rank of C is less than n. This means that if C is reduced to echelon form by a sequence of elementary row operations, at least one row of zeros appears at the bottom of the reduced matrix. But a square matrix with a row of zeros has determinant zero. Since no elementary row operation can turn a nonzero-determinant matrix into a zero-determinant one, the original matrix C had to have determinant zero also.

On the other hand, if rank C = n, then all the rows are independent, and the echelon form of C will be upper triangular with no zeros on the diagonal. Thus, the determinant of the reduced matrix is nonzero. Since no elementary row operation can transform a zero-determinant matrix into a nonzero-determinant one, the original matrix C had to have a nonzero determinant. To summarize, then, if C is n × n,

rank C < n ⟺ det C = 0 and rank C = n ⟺ det C ≠ 0
3
A= 4 5
6
7 8 9
CLIFFS QUICK REVIEW
214
THE DETERMINAN T
None of the following row operations affects the determinant of A : 1 4
2 5
7
8
3 6 9
added to r2 -7rl added to r3
>
1
-2 r2 added to r3
~
2
3
0 -3 -6 0 -6 -1 2 1
2
0 0
-3 0
3-6 0
Because this final matrix has a zero row, its determinant i s zero, which implies det A = 0 . ■ Example 10 : What is the rank of the following matrix ? 1 2 3A= 4 5 78
6 9
Since the third row is a linear combination, r 3 = —r 1 + 2r2 , of the first two rows, a row of zeros results when A is reduce d to echelon form, as in Example 9 above . Since just 2 nonzer o rows remain, rank A = 2 . ■
The three preceding examples illustrate the following important theorem : Theorem E . Consider a collection {v l , v 2 , . . ., v1z } of n vectors from R . Then this collection is linearly independent if an d only if the determinant of the matrix whose rows are v 1 , v2, . . ., v,, is not zero .
LINEAR ALGEBRA
21 5
THE DETERMINAN T
In fact, Theorem E can be amended : If a collection of n vectors from Rn is linearly independent, then it also spans R n (and conversely) ; therefore, the collection is a basis for W . Example 11 : Let A be a real 5 by 5 matrix such that the su m of the entries in each row is zero . What can you say about the determinant of A ? Solution 1 . The equation x i + x2 + x3 + x4 + x5 = 0 de scribes a 4-dimensional subspace of R 5 , since every point i n this subspace has the form (x1, x 2, x39 x4 , x 1 - x 2 ` x3 - x4) which contains 4 independent parameters . Since every row o f the matrix A has this form, A contains 5 vectors all lying in a 4-dimensional subspace. Since such a space can contain a t most 4 linearly independent vectors, the 5 row vectors of A must be dependent. Thus, det A = 0 . Solution 2. If x o is the column vector (1, 1, 1, 1, 1) T, then the product A x o equals the zero vector. Since the homogeneous system A x = 0 has a nontrivial solution, A must have determinant zero (Theorem G, page 239) . ■ Example 12 : Do the matrices in MZXZ(R) with determinant 1 form a subspace of M2x2( R ) ? No . The determinant function is incompatible with th e usual vector space operations : The set of 2 x 2 matrices with determinant 1 is not closed under addition or scalar multiplication, and, therefore, cannot form a subspace of M2x2 (R). A counterexample to closure under addition is provided by th e
CLIFFS QUICK REVIEW
216
TH E DETERMINAN T
matrices I and —I; although each has determinant 1, their sum , I + (—I) = 0, clearly does not . ■ Example 13 : Given that 1
-3
7
det 2 2 1 = 4 0 -4 -4 3 (see Example 6), compute the determinant of the matri x 2 -6 14 4 4 2 -8 -8
6
obtained by multiplying every entry of the first matrix by 2 . This question is asking for det (2A) in terms of det A . If just one row of A were multiplied by 2, the determinant woul d be multiplied by 2, by Property 1 above . But, in this case, all three rows have been multiplied by 2, so the determinant i s multiplied by three factors of 2 : det(2A) = 2 3 det A This gives det (2A) = 8 . 40 = 320 . In general, if A is an n by n matrix and k is a scalar, then det (kA) = k" det A
■
Example 14 : If A and B are square matrices of the same size, is the equation det (A + B) = det A + det B always true?
LINEAR ALGEBRA 217
TH E DETERMINAN T
Let A and B be the following 2 by 2 matrices : A=
[3
4
and B= 5 l L7 g
J
Then det A = det B = 2, but
12 I=—8~--4=detA+det B
det(A + B) = det ~
ll
Thus, det (A + B) = det A + det B is not an identity . [Note : This does not mean that this equation never holds . It certainl y is an identity for 1 x 1 matrices, and, making just one chang e in the entries of the matrices above (namely, changing the en try b 22 from 8 to 12), A= [3 4
J
and B=
[57 1
yields a pair of matrices that does satisfy det (A + B) = det A + det B, as you may check .] ■ Example 15 : One of the most important properties of the determinant function is that the determinant of the product o f two square matrices (of the same size) is equal to the produc t of the individual determinants . That is , (AB) = (det A)(det B) 2det is an identity for all matrices A and B for which both sides ar e defined .
218
CLIFFS QUICK REVIEW
THE DETERMINAN T
(a) Verify this identity for the matrices [—2 A=
5]
and
B=
31
L4
J
(b) Assuming that A is an invertible matrix, what is the relationship between the determinant of A and the determinan t of A'? (c) IfA is a square matrix and k is an integer greater than 1 , what relationship exists between det (A') and det A ? The solutions are as follows : (a) It is easy to see that det A = 7 and det B = -10 . The product of A and B , 2 AB = [-3
6
6
5][ 4 3] — [—138
21
has determinant (—16)(21) — (38)(—7) = -336 + 266 = -70 . Thus, det (AB) = -70 = (7)(—10) = (det A)(det B) as expected. (b) Taking the determinant of both sides of the equation = I yields
AA '
det(AA -l ) = det(l) (det A)(det A -1 ) = 1 det A -1 = (det Note that the identity (det A)(det A -' ) = 1 implies that a necessary condition for A-' to exist is that det A is nonzero. (In fact, this condition is also sufficient; see Theorem H, page 243 . )
LINEAR ALGEBRA
219
TH E DETERMIINAN T
(c) Let k = 2; then det (A Z ) = det (AA) = (det A)(det A) = (det A) 2 . If k = 3, then det (A 3) = det (A ZA) = det (A Z )(det A) = (det A)Z(det A) = (det A)3 . The pattern is clear: det (A"`) = (det A)k. [You may find it instructive to give a more rigorous proof of this statement by a straightforward induction argument.] ■
Laplace Expansions for the Determinant Using the definition of the determinant, the following expression was derived in Example 5 : ll
a 12
a1 3
a21
a22
a23
a31
a32
a33
a
al 1 a22 a33
+ a 12a23 a 31 + a13 a21 a3 2
— a 11 a23 a32 — a 12 a21 a 33 — a13 a22 a3 1
This equation can be rewritten as follows : a ll a12
an
a21
a22
a2 3
a31
a32
a33
= al 1(an an — a23 a32 ) + a12 ( a23a31 — a2 1(13 3 + Q 13 ( a21 a32 — a22 a3 1
=al l
a22
a23
a21
a2 3
a21
a22
a32
a33
a31
a3
a31
a3 2
Each term on the right has the following form : determinant of the matrix that d (entry in the first row • ± remains when the row an column containing that entry are removed from the original matrix,
CLIFFS QUICK REVIEW
220
TH E DETERMINAN T
In particular, note that
gives the term
all
a22
a2 3
a32
a3 3
13
a2 1
2
an
an
gives the term
a lz~
—
a33_
gives the term
a 13
a21
a23
a31
a33
a21
a2 2
a31
a3 2
If A = [a u] is an n x n matrix, then the determinant of the (n — 1) x (n — 1) matrix that remains once the row and colum n containing the entry a u are deleted is called the a u minor, de noted mnr(a y.). If the a ;~ minor is multiplied by (—1) i, the result is called the au cofactor, denoted cof(au). That is , cof(au) = (—l)j+j mnr(au ) Using this terminology, the equation given above for the determinant of the 3 x 3 matrix A is equal to the sum of th e products of the entries in the first row and their cofactors : all
a12
a1 3
a21
a22
a23 =
a 31
a32
a33
cogan ) + a iz cof(au ) + a13 cof(ao )
LINEAR ALGEBRA
221
TH E DETERMINAN T
This is called the Laplace expansion by the first row . It can also be shown that the determinant is equal to the Laplace expansion by the second row, a ll a12 an a ll
a22
a23 = aZ ~ cof(a21 ) + a22 cof(an ) + a23 cof(a23 )
a31
a32
a33
or by the third row, a ll
al2
a1 3
a 21
a22
a23
a31
a32
a33
= Q31 C Oga31)
+ a32 cof(a32 ) + a33 Co'1a33
Even more is true . The determinant is also equal to th e Laplace expansion by the first column , al 1
au
a1 3
a 21
a22
a2 3 = air cogai~)+a2l cof(ail ) +as p cof(asi )
a31
a32
a33
by the second column, or by the third column . Although the Laplace expansion formula for the determinant has been explicitly verified only for a 3 x 3 matrix and only for the first row, it can be proved that the determinant of any n x n matri x is equal to the Laplace expansion by any row or any column .
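The Laplace expansion translates directly into a recursive procedure. A sketch in plain Python (the function name is ours; this runs in exponential time and is only sensible for small n):

```python
def det_laplace(A):
    """Determinant by Laplace expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # minor: delete row 0 and column j
        minor = [row[:j] + row[j+1:] for row in A[1:]]
        # the factor (-1)**j realizes the first row of the checkerboard
        total += (-1) ** j * A[0][j] * det_laplace(minor)
    return total

print(det_laplace([[1, -3, 7], [2, 2, 1], [-4, -4, 3]]))   # 40, as in Example 6
```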
Example 16: Evaluate the determinant of the following matrix using the Laplace expansion by the second column:

$$A = \begin{bmatrix} 2 & -1 & -1 \\ -3 & 2 & 1 \\ 5 & 0 & -2 \end{bmatrix}$$

The entries in the second column are a12 = −1, a22 = 2, and a32 = 0. The minors of these entries, mnr(a12), mnr(a22), and mnr(a32), are computed as follows:

$$\operatorname{mnr}(a_{12}) = \begin{vmatrix} -3 & 1 \\ 5 & -2 \end{vmatrix} = (-3)(-2) - (5)(1) = 1$$

$$\operatorname{mnr}(a_{22}) = \begin{vmatrix} 2 & -1 \\ 5 & -2 \end{vmatrix} = (2)(-2) - (5)(-1) = 1$$

$$\operatorname{mnr}(a_{32}) = \begin{vmatrix} 2 & -1 \\ -3 & 1 \end{vmatrix} = (2)(1) - (-3)(-1) = -1$$

Since the cofactors of the second-column entries are

cof(a12) = (−1)^(1+2) mnr(a12) = −mnr(a12) = −1
cof(a22) = (−1)^(2+2) mnr(a22) = mnr(a22) = 1
cof(a32) = (−1)^(3+2) mnr(a32) = −mnr(a32) = −(−1) = 1

the Laplace expansion by the second column becomes
det A = a12 cof(a12) + a22 cof(a22) + a32 cof(a32) = (−1)(−1) + (2)(1) + (0)(1) = 3

Note that it was unnecessary to compute the minor or the cofactor of the (3, 2) entry in A, since that entry was 0. In general, then, when computing a determinant by the Laplace expansion method, choose the row or column with the most zeros. The minors of those entries need not be evaluated, because they will contribute nothing to the determinant. ■

The factor (−1)^(i+j) which multiplies the aij minor to give the cofactor leads to a checkerboard pattern of signs; each sign gives the value of this factor when computing the aij cofactor from the aij minor. For example, the checkerboard pattern for a 3 × 3 matrix looks like this:

$$\begin{bmatrix} + & - & + \\ - & + & - \\ + & - & + \end{bmatrix}$$

For a 4 × 4 matrix, the checkerboard has the form

$$\begin{bmatrix} + & - & + & - \\ - & + & - & + \\ + & - & + & - \\ - & + & - & + \end{bmatrix}$$

and so on.
Example 17: Compute the determinant of the following matrix:

$$A = \begin{bmatrix} 0 & 7 & 1 & -5 \\ -3 & 2 & -1 & 1 \\ 1 & 0 & 0 & 2 \\ 2 & -4 & -2 & 0 \end{bmatrix}$$

First, find the row or column with the most zeros. Here, it's the third row, which contains two zeros; the Laplace expansion by this row will contain only two nonzero terms. The checkerboard pattern displayed above for a 4 by 4 matrix implies that the minor of the entry a31 = 1 will be multiplied by +1, and the minor of the entry a34 = 2 will be multiplied by −1, to give the respective cofactors:

$$\operatorname{cof}(a_{31}) = + \begin{vmatrix} 7 & 1 & -5 \\ 2 & -1 & 1 \\ -4 & -2 & 0 \end{vmatrix} \qquad \operatorname{cof}(a_{34}) = - \begin{vmatrix} 0 & 7 & 1 \\ -3 & 2 & -1 \\ 2 & -4 & -2 \end{vmatrix}$$

Now, each of these cofactors—which are themselves determinants—can be evaluated by a Laplace expansion. Expanding by the third column,

$$\operatorname{cof}(a_{31}) = -5 \begin{vmatrix} 2 & -1 \\ -4 & -2 \end{vmatrix} - 1 \begin{vmatrix} 7 & 1 \\ -4 & -2 \end{vmatrix} = -5(-4 - 4) - (-14 + 4) = 50$$

The other cofactor is evaluated by expanding along its first row:

$$\operatorname{cof}(a_{34}) = -\left[ -7 \begin{vmatrix} -3 & -1 \\ 2 & -2 \end{vmatrix} + 1 \begin{vmatrix} -3 & 2 \\ 2 & -4 \end{vmatrix} \right] = -[-7(6 + 2) + (12 - 4)] = 48$$

Therefore, evaluating det A by the Laplace expansion along A's third row yields

det A = a31 cof(a31) + a34 cof(a34) = (1)(50) + (2)(48) = 146 ■
_(1)(50) + (2)(48) =146 ■ Example 18 : The cross product of two 3-vectors, x = x 1 i + x2j + x 3 k and y = y 1 i + y2j + y3k, is most easily evaluated by per forming the Laplace expansion along the first row of th e symbolic determinant i
j
k
xl
x2
x3
.Y1 Y2
Y3
CLIFFS QUICK REVIE W
226
THE DETERMINAN T
This expansion gives i xxy= xl Yl
J x2
k x3 =i
x2
x1
x3
Y2 Y3
—J
x3
+k
Y1 Y3
xl
x2
Y1 Y2
Y2 Y3
To illustrate, the cross product of the vectors x = 3j — 3k an d y = 2i+2j—kis xxy=
=i
3 -3 2 -1 — J
+k
0 -2
3 2
= 4—3 + 6) — j(O — 6) + k(O + 6 ) = 3i+6j+6 k ■
This calculation was performed in Example 27, page 45 .
Example 19: Is there a connection between the determinant of A^T and the determinant of A?

In the 2 by 2 case, it is easy to see that det (A^T) = det A:

$$\det A = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \qquad \det A^T = \begin{vmatrix} a & c \\ b & d \end{vmatrix} = ad - bc$$

In the 3 by 3 case, the Laplace expansion along the first row of A gives the same result as the Laplace expansion along the first column of A^T, implying that det (A^T) = det A:

$$\det A = \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = a \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix}$$

$$\det A^T = \begin{vmatrix} a & d & g \\ b & e & h \\ c & f & i \end{vmatrix} = a \begin{vmatrix} e & h \\ f & i \end{vmatrix} - b \begin{vmatrix} d & g \\ f & i \end{vmatrix} + c \begin{vmatrix} d & g \\ e & h \end{vmatrix}$$
Starting with the expansion

$$\det A = \sum_{\sigma \in S_n} (\operatorname{sgn}\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}$$

for the determinant, it is not difficult to give a general proof that det (A^T) = det A. ■

Example 20: Apply the result det (A^T) = det A to evaluate

$$\det \begin{bmatrix} e & r & a \\ g & a & p \\ o & n & e \end{bmatrix} \qquad\text{given that}\qquad \det \begin{bmatrix} a & p & e \\ e & g & o \\ r & a & n \end{bmatrix} = 20$$

(where a, e, g, n, o, p, and r are scalars).
Since one row exchange reverses the sign of the determinant (Property 2), two row exchanges,

$$\begin{bmatrix} a & p & e \\ e & g & o \\ r & a & n \end{bmatrix} \xrightarrow{r_1 \leftrightarrow r_2} \begin{bmatrix} e & g & o \\ a & p & e \\ r & a & n \end{bmatrix} \xrightarrow{r_2 \leftrightarrow r_3} \begin{bmatrix} e & g & o \\ r & a & n \\ a & p & e \end{bmatrix}$$

will leave the determinant unchanged:

$$\det \begin{bmatrix} e & g & o \\ r & a & n \\ a & p & e \end{bmatrix} = \det \begin{bmatrix} a & p & e \\ e & g & o \\ r & a & n \end{bmatrix} = 20$$

But the determinant of a matrix is equal to the determinant of its transpose, so

$$\det \begin{bmatrix} e & r & a \\ g & a & p \\ o & n & e \end{bmatrix} = \det \begin{bmatrix} e & g & o \\ r & a & n \\ a & p & e \end{bmatrix} = 20 \qquad \blacksquare$$
Example 21: Given that the numbers 1547, 2329, 3893, and 4471 are all divisible by 17, prove that the determinant of

$$A = \begin{bmatrix} 1 & 5 & 4 & 7 \\ 2 & 3 & 2 & 9 \\ 3 & 8 & 9 & 3 \\ 4 & 4 & 7 & 1 \end{bmatrix}$$

is also divisible by 17 without actually evaluating it.
Because of the result det (A^T) = det A, every property of the determinant which involves the rows of A implies another property of the determinant involving the columns of A. For example, the determinant is linear in each column, reverses sign if two columns are interchanged, is unaffected if a multiple of one column is added to another column, and so on.

To begin, multiply the first column of A by 1000, the second column by 100, and the third column by 10. The determinant of the resulting matrix will be 1000 · 100 · 10 times greater than the determinant of A:

$$\det \begin{bmatrix} 1000 & 500 & 40 & 7 \\ 2000 & 300 & 20 & 9 \\ 3000 & 800 & 90 & 3 \\ 4000 & 400 & 70 & 1 \end{bmatrix} = 1000 \cdot 100 \cdot 10 \cdot \det \begin{bmatrix} 1 & 5 & 4 & 7 \\ 2 & 3 & 2 & 9 \\ 3 & 8 & 9 & 3 \\ 4 & 4 & 7 & 1 \end{bmatrix}$$

Next, add the second, third, and fourth columns of this new matrix to its first column. None of these column operations changes the determinant; thus,

$$10^6 \det \begin{bmatrix} 1 & 5 & 4 & 7 \\ 2 & 3 & 2 & 9 \\ 3 & 8 & 9 & 3 \\ 4 & 4 & 7 & 1 \end{bmatrix} = \det \begin{bmatrix} 1547 & 500 & 40 & 7 \\ 2329 & 300 & 20 & 9 \\ 3893 & 800 & 90 & 3 \\ 4471 & 400 & 70 & 1 \end{bmatrix}$$
Since each entry in the first column of this latest matrix is divisible by 17, every term in the Laplace expansion by the first column will be divisible by 17, and thus the sum of these terms—which gives the determinant—will be divisible by 17. Since 17 divides 10⁶ det A, 17 must divide det A, because 17 is prime and doesn't divide 10⁶. ■

Example 22: A useful concept in higher-dimensional calculus (in connection with the change-of-variables formula for multiple integrals, for example) is that of the Jacobian of a mapping. Let x and y be given as functions of the independent variables u and v:

x = x(u, v), y = y(u, v)

The Jacobian of the map (u, v) ↦ (x, y), a quantity denoted by the symbol ∂(x, y)/∂(u, v), is defined to be the following determinant:

$$\frac{\partial(x, y)}{\partial(u, v)} = \det \begin{bmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{bmatrix}$$

To illustrate, consider the polar coordinate transformation,

x = r cos θ, y = r sin θ   (*)

The Jacobian of this mapping, (r, θ) ↦ (x, y), is
$$\frac{\partial(x, y)}{\partial(r, \theta)} = \det \begin{bmatrix} \partial x/\partial r & \partial x/\partial \theta \\ \partial y/\partial r & \partial y/\partial \theta \end{bmatrix} = \det \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix} = r(\cos^2\theta + \sin^2\theta) = r$$

The fact that the Jacobian of this transformation is equal to r accounts for the factor of r in the familiar formula

$$\iint_R f(x, y)\, dx\, dy = \iint_{R'} f(r\cos\theta,\ r\sin\theta)\, r\, dr\, d\theta$$

where R′ is the region in the r-θ plane mapped by (*) to the region of integration R in the x-y plane.

The Jacobian can also be extended to three variables. For example, a point in 3-space can be specified by giving its spherical coordinates ρ, φ, and θ—which are related to the usual rectangular coordinates x, y, and z—by the equations

x = ρ sin φ cos θ, y = ρ sin φ sin θ, z = ρ cos φ

See Figure 52.
■ Figure 52 ■

The Jacobian of the mapping (ρ, φ, θ) ↦ (x, y, z) is

$$\frac{\partial(x, y, z)}{\partial(\rho, \phi, \theta)} = \det \begin{bmatrix} \sin\phi\cos\theta & \rho\cos\phi\cos\theta & -\rho\sin\phi\sin\theta \\ \sin\phi\sin\theta & \rho\cos\phi\sin\theta & \rho\sin\phi\cos\theta \\ \cos\phi & -\rho\sin\phi & 0 \end{bmatrix}$$

By a Laplace expansion along the third row,

$$\frac{\partial(x, y, z)}{\partial(\rho, \phi, \theta)} = \cos\phi \begin{vmatrix} \rho\cos\phi\cos\theta & -\rho\sin\phi\sin\theta \\ \rho\cos\phi\sin\theta & \rho\sin\phi\cos\theta \end{vmatrix} - (-\rho\sin\phi) \begin{vmatrix} \sin\phi\cos\theta & -\rho\sin\phi\sin\theta \\ \sin\phi\sin\theta & \rho\sin\phi\cos\theta \end{vmatrix}$$

$$= \cos\phi\left[\rho^2\sin\phi\cos\phi(\cos^2\theta + \sin^2\theta)\right] + \rho\sin\phi\left[\rho\sin^2\phi(\cos^2\theta + \sin^2\theta)\right] = \rho^2\sin\phi\cos^2\phi + \rho^2\sin^3\phi = \rho^2\sin\phi$$
The fact that the Jacobian of this transformation is equal to ρ² sin φ accounts for the factor of ρ² sin φ in the formula for changing the variables in a triple integral from rectangular to spherical coordinates:

$$\iiint_R f(x, y, z)\, dx\, dy\, dz = \iiint_{R'} f(\rho\sin\phi\cos\theta,\ \rho\sin\phi\sin\theta,\ \rho\cos\phi)\, \rho^2\sin\phi\, d\rho\, d\phi\, d\theta \qquad \blacksquare$$
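Jacobians of this kind can be checked symbolically. The sketch below assumes SymPy, which is not part of this book's text:

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)
polar = sp.Matrix([r * sp.cos(theta), r * sp.sin(theta)])
J = polar.jacobian([r, theta])
print(sp.trigsimp(J.det()))      # r

rho, phi = sp.symbols('rho phi', positive=True)
spherical = sp.Matrix([rho * sp.sin(phi) * sp.cos(theta),
                       rho * sp.sin(phi) * sp.sin(theta),
                       rho * sp.cos(phi)])
print(sp.trigsimp(spherical.jacobian([rho, phi, theta]).det()))   # rho**2*sin(phi)
```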
Laplace expansions following row-reduction. The utility of the Laplace expansion method for evaluating a determinant is enhanced when it is preceded by elementary row operations. If such operations are performed on a matrix, the number of zeros in a given column can be increased, thereby decreasing the number of nonzero terms in the Laplace expansion along that column.

Example 23: Evaluate the determinant of the matrix

$$A = \begin{bmatrix} 1 & -2 & 4 \\ 2 & -1 & 1 \\ -3 & 1 & -2 \end{bmatrix}$$

The following row-reduction operations, because they simply involve adding a multiple of one row to another, do not alter the value of the determinant:

$$\begin{bmatrix} 1 & -2 & 4 \\ 2 & -1 & 1 \\ -3 & 1 & -2 \end{bmatrix} \xrightarrow[3r_1\ \text{added to}\ r_3]{-2r_1\ \text{added to}\ r_2} \begin{bmatrix} 1 & -2 & 4 \\ 0 & 3 & -7 \\ 0 & -5 & 10 \end{bmatrix}$$

Now, when the determinant of this latter matrix is computed using the Laplace expansion by the first column, only one nonzero term remains:

$$\det A = \begin{vmatrix} 1 & -2 & 4 \\ 0 & 3 & -7 \\ 0 & -5 & 10 \end{vmatrix} = 1 \begin{vmatrix} 3 & -7 \\ -5 & 10 \end{vmatrix} = (3)(10) - (-5)(-7) = -5$$

Therefore, det A = −5. ■
Example 24: Evaluate the determinant of the matrix

$$A = \begin{bmatrix} 5 & 1 & 1 \\ -3 & 3 & -8 \\ 2 & 0 & 4 \end{bmatrix}$$

In order to avoid generating many noninteger entries during the row-reduction process, a factor of 2 is first divided out of the bottom row. Since multiplying a row by a scalar multiplies the determinant by that scalar,

$$\det \begin{bmatrix} 5 & 1 & 1 \\ -3 & 3 & -8 \\ 2 & 0 & 4 \end{bmatrix} = 2 \det \begin{bmatrix} 5 & 1 & 1 \\ -3 & 3 & -8 \\ 1 & 0 & 2 \end{bmatrix}$$

Now, because the elementary row operations

$$\begin{bmatrix} 5 & 1 & 1 \\ -3 & 3 & -8 \\ 1 & 0 & 2 \end{bmatrix} \xrightarrow[3r_3\ \text{added to}\ r_2]{-5r_3\ \text{added to}\ r_1} \begin{bmatrix} 0 & 1 & -9 \\ 0 & 3 & -2 \\ 1 & 0 & 2 \end{bmatrix}$$

do not change the determinant, Laplace expansion by the first column of this latter matrix completes the evaluation of the determinant of A:

$$\det A = 2 \det \begin{bmatrix} 0 & 1 & -9 \\ 0 & 3 & -2 \\ 1 & 0 & 2 \end{bmatrix} = 2 \cdot 1 \cdot \begin{vmatrix} 1 & -9 \\ 3 & -2 \end{vmatrix} = 2(-2 + 27) = 50 \qquad \blacksquare$$
Cramer's Rule

Consider the general 2 by 2 linear system

a11x + a12y = b1
a21x + a22y = b2

Multiplying the first equation by a22, the second by −a12, and adding the results eliminates y and permits evaluation of x:
a11a22x + a12a22y = a22b1
−a12a21x − a12a22y = −a12b2
x(a11a22 − a12a21) = a22b1 − a12b2

$$x = \frac{a_{22}b_1 - a_{12}b_2}{a_{11}a_{22} - a_{12}a_{21}}$$

assuming that a11a22 − a12a21 ≠ 0. Similarly, multiplying the first equation by −a21, the second by a11, and adding the results eliminates x and determines y:

−a11a21x − a12a21y = −a21b1
a11a21x + a11a22y = a11b2
y(a11a22 − a12a21) = a11b2 − a21b1

$$y = \frac{a_{11}b_2 - a_{21}b_1}{a_{11}a_{22} - a_{12}a_{21}}$$

again assuming that a11a22 − a12a21 ≠ 0. These expressions for x and y can be written in terms of determinants as follows:

$$x = \frac{\begin{vmatrix} b_1 & a_{12} \\ b_2 & a_{22} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}} \qquad\text{and}\qquad y = \frac{\begin{vmatrix} a_{11} & b_1 \\ a_{21} & b_2 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}}$$
If the original system is written in matrix form,

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$

then the denominators in the above expressions for the unknowns x and y are both equal to the determinant of the coefficient matrix. Furthermore, the numerator in the expression for the first unknown, x, is equal to the determinant of the matrix that results when the first column of the coefficient matrix is replaced by the column of constants, and the numerator in the expression for the second unknown, y, is equal to the determinant of the matrix that results when the second column of the coefficient matrix is replaced by the column of constants. This is Cramer's Rule for a 2 by 2 linear system.

Extending the pattern to a 3 by 3 linear system,

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$

Cramer's Rule says that if the determinant of the coefficient matrix is nonzero, then expressions for the unknowns x, y, and z take on the following form:
$$x = \frac{\begin{vmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}} \qquad y = \frac{\begin{vmatrix} a_{11} & b_1 & a_{13} \\ a_{21} & b_2 & a_{23} \\ a_{31} & b_3 & a_{33} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}} \qquad z = \frac{\begin{vmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ a_{31} & a_{32} & b_3 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}$$
The general form of Cramer's Rule reads as follows: A system of n linear equations in n unknowns, written in matrix form Ax = b as

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$

will have a unique solution if det A ≠ 0, and in this case, the value of the unknown xj is given by the expression

$$x_j = \frac{\det A_j}{\det A}$$

where Aj is the matrix that results when column j of the coefficient matrix A is replaced by the column matrix b.
Two important theoretical results about square systems follow from Cramer's Rule:

Theorem F. A square system Ax = b will have a unique solution for every column matrix b if and only if det A ≠ 0.

Theorem G. A homogeneous square system Ax = 0 will have only the trivial solution x = 0 if and only if det A ≠ 0.

Although Cramer's Rule is of theoretical importance because it gives a formula for the unknowns, it is generally not an efficient solution method, especially for large systems. Gaussian elimination is still the method of choice. However, Cramer's Rule can be useful when, for example, the value of only one unknown is needed.
Example 25: Use Cramer's Rule to find the value of y given that

x + y − 2z = −10
2x − y + 3z = −1
4x + 6y + z = 2

Since this linear system is equivalent to the matrix equation

$$\begin{bmatrix} 1 & 1 & -2 \\ 2 & -1 & 3 \\ 4 & 6 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} -10 \\ -1 \\ 2 \end{bmatrix}$$

Cramer's Rule implies that the second unknown, y, is given by the expression

$$y = \frac{\begin{vmatrix} 1 & -10 & -2 \\ 2 & -1 & 3 \\ 4 & 2 & 1 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & -2 \\ 2 & -1 & 3 \\ 4 & 6 & 1 \end{vmatrix}} \qquad (*)$$

assuming that the denominator—the determinant of the coefficient matrix—is not zero. Row-reduction (adding −2r1 to r2 and −4r1 to r3), followed by Laplace expansion along the first column, evaluates these determinants:

$$\begin{vmatrix} 1 & -10 & -2 \\ 2 & -1 & 3 \\ 4 & 2 & 1 \end{vmatrix} = \begin{vmatrix} 1 & -10 & -2 \\ 0 & 19 & 7 \\ 0 & 42 & 9 \end{vmatrix} = \begin{vmatrix} 19 & 7 \\ 42 & 9 \end{vmatrix} = 171 - 294 = -123$$

$$\begin{vmatrix} 1 & 1 & -2 \\ 2 & -1 & 3 \\ 4 & 6 & 1 \end{vmatrix} = \begin{vmatrix} 1 & 1 & -2 \\ 0 & -3 & 7 \\ 0 & 2 & 9 \end{vmatrix} = \begin{vmatrix} -3 & 7 \\ 2 & 9 \end{vmatrix} = -27 - 14 = -41$$

With these calculations, (*) implies

$$y = \frac{-123}{-41} = 3$$
■
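Cramer's Rule is a one-liner once determinants are available. A sketch of Example 25's computation, assuming NumPy:

```python
import numpy as np

A = np.array([[1.0,  1.0, -2.0],
              [2.0, -1.0,  3.0],
              [4.0,  6.0,  1.0]])
b = np.array([-10.0, -1.0, 2.0])

A2 = A.copy()
A2[:, 1] = b                      # replace column 2 by the constants
y = np.linalg.det(A2) / np.linalg.det(A)
print(y)                          # 3.0  (that is, -123 / -41)
```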
The Classical Adjoint of a Square Matrix

Let A = [aij] be a square matrix. The transpose of the matrix whose (i, j) entry is the aij cofactor is called the classical adjoint of A:

Adj A = [cof(aij)]^T

Example 26: Find the adjoint of the matrix

$$A = \begin{bmatrix} 1 & -1 & 2 \\ 4 & 0 & 6 \\ 0 & 1 & -1 \end{bmatrix}$$
The first step is to evaluate the cofactor of every entry:

$$\operatorname{cof}(a_{11}) = +\begin{vmatrix} 0 & 6 \\ 1 & -1 \end{vmatrix} = -6 \qquad \operatorname{cof}(a_{12}) = -\begin{vmatrix} 4 & 6 \\ 0 & -1 \end{vmatrix} = 4 \qquad \operatorname{cof}(a_{13}) = +\begin{vmatrix} 4 & 0 \\ 0 & 1 \end{vmatrix} = 4$$

$$\operatorname{cof}(a_{21}) = -\begin{vmatrix} -1 & 2 \\ 1 & -1 \end{vmatrix} = 1 \qquad \operatorname{cof}(a_{22}) = +\begin{vmatrix} 1 & 2 \\ 0 & -1 \end{vmatrix} = -1 \qquad \operatorname{cof}(a_{23}) = -\begin{vmatrix} 1 & -1 \\ 0 & 1 \end{vmatrix} = -1$$

$$\operatorname{cof}(a_{31}) = +\begin{vmatrix} -1 & 2 \\ 0 & 6 \end{vmatrix} = -6 \qquad \operatorname{cof}(a_{32}) = -\begin{vmatrix} 1 & 2 \\ 4 & 6 \end{vmatrix} = 2 \qquad \operatorname{cof}(a_{33}) = +\begin{vmatrix} 1 & -1 \\ 4 & 0 \end{vmatrix} = 4$$

Therefore,

$$\operatorname{Adj} A = [\operatorname{cof}(a_{ij})]^T = \begin{bmatrix} -6 & 4 & 4 \\ 1 & -1 & -1 \\ -6 & 2 & 4 \end{bmatrix}^T = \begin{bmatrix} -6 & 1 & -6 \\ 4 & -1 & 2 \\ 4 & -1 & 4 \end{bmatrix} \qquad \blacksquare$$

Why form the adjoint matrix? First, verify the following calculation, where the matrix A above is multiplied by its adjoint:

$$A \cdot \operatorname{Adj} A = \begin{bmatrix} 1 & -1 & 2 \\ 4 & 0 & 6 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} -6 & 1 & -6 \\ 4 & -1 & 2 \\ 4 & -1 & 4 \end{bmatrix} = \begin{bmatrix} -2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -2 \end{bmatrix} = -2I \qquad (*)$$
Now, since a Laplace expansion by the first column of A gives

$$\det A = \begin{vmatrix} 1 & -1 & 2 \\ 4 & 0 & 6 \\ 0 & 1 & -1 \end{vmatrix} = 1\begin{vmatrix} 0 & 6 \\ 1 & -1 \end{vmatrix} - 4\begin{vmatrix} -1 & 2 \\ 1 & -1 \end{vmatrix} = -6 - 4(-1) = -2$$

equation (*) becomes

A · Adj A = (det A) I

This result gives the following equation for the inverse of A:

$$A^{-1} = \frac{1}{\det A}\operatorname{Adj} A$$

By generalizing these calculations to an arbitrary n by n matrix, the following theorem can be proved:

Theorem H. A square matrix A is invertible if and only if its determinant is not zero, and its inverse is obtained by multiplying the adjoint of A by (det A)⁻¹. [Note: A matrix whose determinant is 0 is said to be singular; therefore, a matrix is invertible if and only if it is nonsingular.]

Example 27: Determine the inverse of the following matrix by first computing its adjoint:

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{bmatrix}$$

First, evaluate the cofactor of each entry in A:
$$\operatorname{cof}(a_{11}) = +\begin{vmatrix} 5 & 6 \\ 8 & 10 \end{vmatrix} = 2 \qquad \operatorname{cof}(a_{12}) = -\begin{vmatrix} 4 & 6 \\ 7 & 10 \end{vmatrix} = 2 \qquad \operatorname{cof}(a_{13}) = +\begin{vmatrix} 4 & 5 \\ 7 & 8 \end{vmatrix} = -3$$

$$\operatorname{cof}(a_{21}) = -\begin{vmatrix} 2 & 3 \\ 8 & 10 \end{vmatrix} = 4 \qquad \operatorname{cof}(a_{22}) = +\begin{vmatrix} 1 & 3 \\ 7 & 10 \end{vmatrix} = -11 \qquad \operatorname{cof}(a_{23}) = -\begin{vmatrix} 1 & 2 \\ 7 & 8 \end{vmatrix} = 6$$

$$\operatorname{cof}(a_{31}) = +\begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix} = -3 \qquad \operatorname{cof}(a_{32}) = -\begin{vmatrix} 1 & 3 \\ 4 & 6 \end{vmatrix} = 6 \qquad \operatorname{cof}(a_{33}) = +\begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix} = -3$$

These computations imply that

$$\operatorname{Adj} A = [\operatorname{cof}(a_{ij})]^T = \begin{bmatrix} 2 & 2 & -3 \\ 4 & -11 & 6 \\ -3 & 6 & -3 \end{bmatrix}^T = \begin{bmatrix} 2 & 4 & -3 \\ 2 & -11 & 6 \\ -3 & 6 & -3 \end{bmatrix}$$

Now, since Laplace expansion along the first row gives

$$\det A = \begin{vmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{vmatrix} = 1\begin{vmatrix} 5 & 6 \\ 8 & 10 \end{vmatrix} - 2\begin{vmatrix} 4 & 6 \\ 7 & 10 \end{vmatrix} + 3\begin{vmatrix} 4 & 5 \\ 7 & 8 \end{vmatrix} = 2 - 2(-2) + 3(-3) = -3$$
244
THE DETERMINAN T
the inverse of A is 2 A
_ 1_
Adj_ — 1 det A 3
4 -3-
-
3 —3
2 --11 6 = - 32 -3
6
-3
1
11
3 -2
1-2 1
which may be verified by checking that AA-1 = A -1A = I .
■
Example 28: If A is an invertible n by n matrix, compute the determinant of Adj A in terms of det A.

Because A is invertible, the equation A⁻¹ = Adj A/det A implies

Adj A = (det A) · A⁻¹

Recall from Example 13 that if B is n × n and k is a scalar, then det (kB) = kⁿ det B. Applying this formula with k = det A and B = A⁻¹ gives

det [(det A) · A⁻¹] = (det A)ⁿ · (det A⁻¹) = (det A)ⁿ · (det A)⁻¹ = (det A)ⁿ⁻¹

Thus, det (Adj A) = (det A)ⁿ⁻¹.
■
Example 29: Show that the adjoint of the adjoint of A is guaranteed to equal A if A is an invertible 2 by 2 matrix, but not if A is an invertible square matrix of higher order.
First, the equation A · Adj A = (det A)I can be rewritten Adj A = (det A) · A⁻¹, which implies

Adj (Adj A) = det (Adj A) · (Adj A)⁻¹   (*)

Next, the equation A · Adj A = (det A)I also implies

$$(\operatorname{Adj} A)^{-1} = \frac{A}{\det A}$$

This expression, along with the result of Example 28, transforms (*) into

$$\operatorname{Adj}(\operatorname{Adj} A) = \det(\operatorname{Adj} A) \cdot (\operatorname{Adj} A)^{-1} = (\det A)^{n-1} \cdot \frac{A}{\det A} = (\det A)^{n-2} A$$

where n is the size of the square matrix A. If n = 2, then (det A)^(n−2) = (det A)⁰ = 1—since det A ≠ 0—which implies Adj (Adj A) = A, as desired. However, if n > 2, then (det A)^(n−2) will not equal 1 for every nonzero value of det A, so Adj (Adj A) will not necessarily equal A. Yet this proof does show that, whatever the size of the matrix, Adj (Adj A) will equal A if det A = 1. ■

Example 30: Consider the vector space C²(a, b) of functions which have a continuous second derivative on the interval (a, b) ⊂ R. If f, g, and h are functions in this space, then the following determinant,
CLIFFS QUICK REVIEW
TH E DETERMINAN T
_f g h det f' g' h ' f g►, h ?P
is called the Wronskian off g, and h . What does the value o f the Wronskian say about the linear independence of the functions f, g, and h? The functions J, g, and h are linearly independent if th e only scalars cl, c2, and c3 which satisfy the equation cif + c2g + c3h = 0 (* ) are c 1 = c2 = c3 = O. One way to obtain three equations t o solve for the three unknowns c1, c2, and c 3 is to differentiat e (*) and then to differentiate it again . The result is the system clf+c2g+c3h= 0 c1f'+c2g'+c3h'= 0 clf " + c2g" + c3 h" = 0 which can be written in matrix form a s - f. g h f' g' fit g
►,
h'
c=0
(** )
h
where c = (cl, c2, c3)T. A homogeneous square system suc h as this one—has only the trivial solution if and only if the determinant of the coefficient matrix is nonzero . But if c = 0 i s the only solution to (**), then c 1 = c, = c3 = 0 is the only solution to (*), and the functions J, g, and h are linearly independent . Therefore ,
LINEAR ALGEBRA 247
THE DETERMINAN T
f g
f, g, and h are linearly independent
h
det f' g' h'
0
f►► g►► h "
To illustrate this result, consider the functions J, g, and h defined by the equation s f(x)=sin2 x,
g(x) = cost x,
and h(x) = 3
Since the Wronskian of these functions i s sine x cost x 3 sin2x —sin2 x det sin2x —sin2x 0 =3 2 cos2x -2 cos 2x 2cos2x -2cos2x 0 = 3(—2 sin 2x cos 2x + 2 sin 2x cos 2x ) 0
these functions are linearly dependent . This same result was demonstrated in Example 56, page 195 by writing the following nontrivial linear . combination of these functions tha t gave zero : 3f + 3g — h = 0 . Here's another illustration . Consider the functions f, g, and h in the space C2(1/2, oo) defined by the equation s f (x) = ex ,
g(x) = x,
and h(x) = In x
By a Laplace expansion along the second column, the Wronskian of these functions is
CLIFFS QUICK REVIE W
248
TH E DETERMINAN T
ex W(x) = det e" ex
x 1
Inx 1/x
0 -1x2
= —x
ex
1/ x
ex
-1/x2
=e x (1 +1— x
ex + ex
In x -1/x 2
2 —ln x x
Since this function is not identically zero on the interval (1/2 , oo)--for example, when x = 1, W(x) = W(1) = e ~ 0—th e functions f, g, and h are linearly independent. ■
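The Wronskian test can be applied numerically at a single point, since one nonzero value suffices. A sketch assuming NumPy, with the derivatives of this second illustration supplied by hand:

```python
import numpy as np

def wronskian_at(fs, dfs, d2fs, x):
    """Wronskian of three functions at x, given their first and
    second derivatives (supplied by hand here)."""
    W = np.array([[f(x) for f in fs],
                  [f(x) for f in dfs],
                  [f(x) for f in d2fs]])
    return np.linalg.det(W)

fs   = [np.exp, lambda x: x,   np.log]
dfs  = [np.exp, lambda x: 1.0, lambda x: 1/x]
d2fs = [np.exp, lambda x: 0.0, lambda x: -1/x**2]

print(wronskian_at(fs, dfs, d2fs, 1.0))   # e = 2.718..., nonzero
```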
LINEAR TRANSFORMATIONS
Given two sets A and B, a function f which accepts as input an element in A and produces as output an element in B is written f: A → B. This is read "f maps A into B," and the words map and function are used interchangeably. The statement "f maps the element a in A to the element b in B" is symbolized f(a) = b and read "f of a equals b." The set A is called the domain of f, and b is the image of a by f. The purpose here is to study a certain class of functions between finite-dimensional vector spaces. Throughout this section, vectors will be written as column matrices.
Definition of a Linear Transformation

Let V and W be vector spaces. A function f: V → W is said to be a linear map or linear transformation if both of the following conditions hold for all vectors x and y in V and any scalar k:

f(x + y) = f(x) + f(y)
f(kx) = kf(x)
These equations express the fact that a map is linear if and only if it is compatible with the vector space operations of vector addition and scalar multiplication.
Example 1: Consider the function f: R² → R³ defined by the equation

$$f\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 \\ x_1 + x_2 \\ 2x_1 - 3x_2 \end{bmatrix}$$

Since

$$f(\mathbf{x} + \mathbf{y}) = f\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + y_1 \\ (x_1 + y_1) + (x_2 + y_2) \\ 2(x_1 + y_1) - 3(x_2 + y_2) \end{bmatrix} = \begin{bmatrix} x_1 \\ x_1 + x_2 \\ 2x_1 - 3x_2 \end{bmatrix} + \begin{bmatrix} y_1 \\ y_1 + y_2 \\ 2y_1 - 3y_2 \end{bmatrix} = f(\mathbf{x}) + f(\mathbf{y})$$

and
$$f(k\mathbf{x}) = f\begin{bmatrix} kx_1\\ kx_2\end{bmatrix} = \begin{bmatrix} kx_1\\ kx_1+kx_2\\ 2(kx_1)-3(kx_2)\end{bmatrix} = k\begin{bmatrix} x_1\\ x_1+x_2\\ 2x_1-3x_2\end{bmatrix} = kf(\mathbf{x})$$

the two conditions for linearity are satisfied; thus, f is a linear transformation. ■

Example 2: Consider the function g: R² → R³ defined by the equation

$$g\begin{bmatrix} x_1\\ x_2\end{bmatrix} = \begin{bmatrix} x_1\\ x_1+x_2\\ 2x_1-3x_2+1\end{bmatrix}$$

This function is not linear. For example, if k = 2 and x = (1, 1)ᵀ, then

$$g(k\mathbf{x}) = g\left(2\begin{bmatrix}1\\1\end{bmatrix}\right) = g\begin{bmatrix}2\\2\end{bmatrix} = \begin{bmatrix}2\\4\\-1\end{bmatrix}$$
but

$$kg(\mathbf{x}) = 2g\begin{bmatrix}1\\1\end{bmatrix} = 2\begin{bmatrix}1\\2\\0\end{bmatrix} = \begin{bmatrix}2\\4\\0\end{bmatrix}$$

Thus, g(kx) ≠ kg(x); this map is not compatible with scalar multiplication. For this particular function, nonlinearity could also have been established by showing that g is incompatible with vector addition. Let y = (2, 2)ᵀ; then

$$g(\mathbf{x}+\mathbf{y}) = g\left(\begin{bmatrix}1\\1\end{bmatrix} + \begin{bmatrix}2\\2\end{bmatrix}\right) = g\begin{bmatrix}3\\3\end{bmatrix} = \begin{bmatrix}3\\6\\-2\end{bmatrix}$$

but

$$g(\mathbf{x}) + g(\mathbf{y}) = g\begin{bmatrix}1\\1\end{bmatrix} + g\begin{bmatrix}2\\2\end{bmatrix} = \begin{bmatrix}1\\2\\0\end{bmatrix} + \begin{bmatrix}2\\4\\-1\end{bmatrix} = \begin{bmatrix}3\\6\\-1\end{bmatrix}$$

The fact that g(x + y) ≠ g(x) + g(y) also establishes that the function g is not linear. ■

Example 3: If f: V → W is linear, then f(0) = 0; that is, a linear transformation always maps the zero vector (in V) to the zero vector (in W). One way to prove this is to use the fact that f must be compatible with vector addition. Since 0 = 0 + 0,

$$f(\mathbf{0}) = f(\mathbf{0}+\mathbf{0}) = f(\mathbf{0}) + f(\mathbf{0}) = 2f(\mathbf{0})$$
Now, subtracting f(0) from both sides of the equation f(0) = 2f(0) yields f(0) = 0, as desired.

This observation provides a quick way to determine that a given function is not linear: If f does not map the zero vector to the zero vector, then f cannot be linear. To illustrate, the function g: R² → R³ defined in Example 2 above could have been proven to be nonlinear by simply noting that g maps the zero vector 0 = (0, 0)ᵀ in R² to the nonzero vector (0, 0, 1)ᵀ in R³. ■
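A tiny illustration of this quick test, sketched in Python with NumPy (the helper g below simply encodes the formula from Example 2; the code itself is not part of the original text):

```python
import numpy as np

# If a map does not send the zero vector to the zero vector, it cannot be linear.
def g(x):
    x1, x2 = x
    return np.array([x1, x1 + x2, 2*x1 - 3*x2 + 1])

zero = np.zeros(2)
print(g(zero))                  # [0. 0. 1.], not the zero vector
print(np.allclose(g(zero), 0))  # False, so g is certainly not linear
```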
Example 4: Let h: R² → R² be defined by the equation h(x₁, x₂)ᵀ = (x₁² − x₂², x₁x₂)ᵀ. Is h linear?

First, check whether the function maps 0 to 0; if the answer is no, then no further analysis is necessary; the function is nonlinear by the result proved in the preceding example. Although the answer is yes in this case (h does map 0 to 0), this does not mean that the function is linear. The guarantee is that if the function does not map the zero vector to the zero vector, then it is not linear; the condition that 0 be mapped to 0 is necessary, but not sufficient, to establish linearity. In fact, h is not linear, as seen by the following calculation: If x = (1, 1)ᵀ and k = 2, then

$$h(k\mathbf{x}) = h\left(2\begin{bmatrix}1\\1\end{bmatrix}\right) = h\begin{bmatrix}2\\2\end{bmatrix} = \begin{bmatrix}0\\4\end{bmatrix}$$

But

$$kh(\mathbf{x}) = 2h\begin{bmatrix}1\\1\end{bmatrix} = 2\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}0\\2\end{bmatrix}$$
Thus, h(kx) ≠ kh(x). Incompatibility with scalar multiplication proves that h is nonlinear. ■

Example 5: Consider the matrix

$$A = \begin{bmatrix} 1 & 0\\ 1 & 1\\ 2 & -3\end{bmatrix}$$

and define a function T: R² → R³ by the equation T(x) = Ax. Is T linear?

Clearly, T(0) = A · 0 = 0, so you cannot immediately say that T is nonlinear. In fact, checking the two conditions reveals that T actually is a linear transformation:

$$T(\mathbf{x}+\mathbf{y}) = A(\mathbf{x}+\mathbf{y}) = \begin{bmatrix} 1 & 0\\ 1 & 1\\ 2 & -3\end{bmatrix}\left(\begin{bmatrix} x_1\\ x_2\end{bmatrix} + \begin{bmatrix} y_1\\ y_2\end{bmatrix}\right) = \begin{bmatrix} x_1+y_1\\ (x_1+y_1)+(x_2+y_2)\\ 2(x_1+y_1)-3(x_2+y_2)\end{bmatrix}$$
$$= \begin{bmatrix} x_1\\ x_1+x_2\\ 2x_1-3x_2\end{bmatrix} + \begin{bmatrix} y_1\\ y_1+y_2\\ 2y_1-3y_2\end{bmatrix} = \begin{bmatrix} 1 & 0\\ 1 & 1\\ 2 & -3\end{bmatrix}\begin{bmatrix} x_1\\ x_2\end{bmatrix} + \begin{bmatrix} 1 & 0\\ 1 & 1\\ 2 & -3\end{bmatrix}\begin{bmatrix} y_1\\ y_2\end{bmatrix} = A\mathbf{x} + A\mathbf{y} = T(\mathbf{x}) + T(\mathbf{y})$$

and

$$T(k\mathbf{x}) = A(k\mathbf{x}) = \begin{bmatrix} 1 & 0\\ 1 & 1\\ 2 & -3\end{bmatrix}\begin{bmatrix} kx_1\\ kx_2\end{bmatrix} = \begin{bmatrix} kx_1\\ kx_1+kx_2\\ 2(kx_1)-3(kx_2)\end{bmatrix} = k\begin{bmatrix} x_1\\ x_1+x_2\\ 2x_1-3x_2\end{bmatrix} = k(A\mathbf{x}) = kT(\mathbf{x})$$
Since T is compatible with both vector addition and scalar multiplication, T is linear. ■

Example 6: Consider the polynomial space P₃ and define a map D: P₃ → P₂ by the equation D(p) = p′. Explicitly, D(a₀ + a₁x + a₂x² + a₃x³) = a₁ + 2a₂x + 3a₃x²; this is the differentiation map. Is D linear?

Because the derivative of a sum is equal to the sum of the derivatives,

$$D(p+q) = (p+q)' = p' + q' = D(p) + D(q)$$

and because the derivative of a scalar times a polynomial is equal to the scalar times the derivative of the polynomial,

$$D(kp) = (kp)' = kp' = kD(p)$$

the map D is indeed linear. ■

Example 7: Is the map T: M2×3 → M3×2 given by T(A) = Aᵀ a linear transformation?
The process of taking the transpose is compatible with addition,

$$T(A+B) = (A+B)^T = A^T + B^T = T(A) + T(B)$$
and with scalar multiplication,

$$T(kA) = (kA)^T = kA^T = kT(A)$$

Thus, T is linear. ■
Example 8: Is the determinant function on 2 by 2 matrices, det: M2×2 → R, a linear transformation?

No. The determinant function is incompatible with addition, as illustrated by the following example. If

$$A = \begin{bmatrix}1 & 0\\ 0 & 0\end{bmatrix} \qquad\text{and}\qquad B = \begin{bmatrix}0 & 0\\ 0 & 1\end{bmatrix}$$

then

$$\det(A+B) = \det\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} = 1$$

but

$$\det A + \det B = \det\begin{bmatrix}1 & 0\\ 0 & 0\end{bmatrix} + \det\begin{bmatrix}0 & 0\\ 0 & 1\end{bmatrix} = 0 + 0 = 0$$

Thus, det(A + B) ≠ det A + det B, so det is not a linear map. You may also show that det is nonlinear by providing a counterexample to the statement det(kA) = k det A. ■
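A numerical version of this counterexample, sketched with NumPy (the matrices are the ones assumed in the reconstruction of Example 8 above):

```python
import numpy as np

# det is not additive: det(A + B) differs from det A + det B.
A = np.array([[1.0, 0.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [0.0, 1.0]])

print(np.linalg.det(A + B))                 # det(I) = 1
print(np.linalg.det(A) + np.linalg.det(B))  # 0 + 0 = 0
```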
Linear Transformations and Basis Vectors

Theorem I. Let T: V → W be a linear transformation between the finite-dimensional vector spaces V and W. Then the image of every vector in V is completely specified, and therefore the action of T is determined, once the images of the vectors in a basis for V are known.

Proof. Let V be n-dimensional with basis B = {v₁, v₂, ..., vₙ} and assume that the images of these basis vectors, T(v₁) = w₁, T(v₂) = w₂, ..., T(vₙ) = wₙ, are known. Recall that if v is an arbitrary vector in V, then v can be written uniquely in terms of the basis vectors; for example,

$$\mathbf{v} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_n\mathbf{v}_n$$

The calculation

$$T(\mathbf{v}) = T(k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_n\mathbf{v}_n) = T(k_1\mathbf{v}_1) + \cdots + T(k_n\mathbf{v}_n) = k_1T(\mathbf{v}_1) + \cdots + k_nT(\mathbf{v}_n) = k_1\mathbf{w}_1 + k_2\mathbf{w}_2 + \cdots + k_n\mathbf{w}_n$$

where the first two steps follow from the linearity of T, shows that T(v) is uniquely determined. In short, once you know what T does to the basis vectors, you know what it does to every vector. ■

Example 9: Let T: R² → R² be the linear transformation that maps (1, 0)ᵀ to (2, −3)ᵀ and (0, 1)ᵀ to (−1, 4)ᵀ. Where does T map the vector (3, 5)ᵀ?
Since (1, 0)ᵀ and (0, 1)ᵀ form a basis for R², and the images of these vectors are known, the image of every vector in R² can be determined. In particular,

$$T\begin{bmatrix}3\\5\end{bmatrix} = T\left(3\begin{bmatrix}1\\0\end{bmatrix} + 5\begin{bmatrix}0\\1\end{bmatrix}\right) = 3T\begin{bmatrix}1\\0\end{bmatrix} + 5T\begin{bmatrix}0\\1\end{bmatrix} = 3\begin{bmatrix}2\\-3\end{bmatrix} + 5\begin{bmatrix}-1\\4\end{bmatrix} = \begin{bmatrix}6\\-9\end{bmatrix} + \begin{bmatrix}-5\\20\end{bmatrix} = \begin{bmatrix}1\\11\end{bmatrix}$$ ■
Example 10: Let T: R² → R² be the linear transformation that maps (1, 1)ᵀ to itself and (1, −3)ᵀ to (5, −15)ᵀ. Where does T map the vector (3, 5)ᵀ?

First, note that the vectors v₁ = (1, 1)ᵀ and v₂ = (1, −3)ᵀ form a basis for R², since they are linearly independent and span all of R². In order to use the information about the images of these basis vectors to compute the image of v = (3, 5)ᵀ, the components of v relative to this basis must first be determined. The coefficients k₁ and k₂ that satisfy k₁v₁ + k₂v₂ = v are evaluated as follows. The equation
$$k_1\begin{bmatrix}1\\1\end{bmatrix} + k_2\begin{bmatrix}1\\-3\end{bmatrix} = \begin{bmatrix}3\\5\end{bmatrix} \qquad\Longrightarrow\qquad \begin{bmatrix}1 & 1\\ 1 & -3\end{bmatrix}\begin{bmatrix}k_1\\k_2\end{bmatrix} = \begin{bmatrix}3\\5\end{bmatrix}$$

leads to the augmented matrix

$$\left[\begin{array}{cc|c}1 & 1 & 3\\ 1 & -3 & 5\end{array}\right] \xrightarrow{-r_1\text{ added to }r_2} \left[\begin{array}{cc|c}1 & 1 & 3\\ 0 & -4 & 2\end{array}\right]$$

from which it is easy to see that k₂ = −1/2 and k₁ = 7/2. Therefore,

$$\mathbf{v} = \tfrac{7}{2}\mathbf{v}_1 - \tfrac{1}{2}\mathbf{v}_2$$

so

$$T(\mathbf{v}) = T\left(\tfrac{7}{2}\mathbf{v}_1 - \tfrac{1}{2}\mathbf{v}_2\right) = \tfrac{7}{2}T(\mathbf{v}_1) - \tfrac{1}{2}T(\mathbf{v}_2) = \tfrac{7}{2}\begin{bmatrix}1\\1\end{bmatrix} - \tfrac{1}{2}\begin{bmatrix}5\\-15\end{bmatrix} = \begin{bmatrix}1\\11\end{bmatrix}$$

The transformation here is, in fact, the same as the one in Example 9. ■
The Standard Matrix of a Linear Transformation

The details of the verification of the conditions for linearity in Example 5 above were unnecessary. If A is any m by n matrix, and x and y are n-vectors (written as column matrices), then A(x + y) = Ax + Ay, and for any scalar k, A(kx) = k(Ax). Therefore, if T: Rⁿ → Rᵐ is a map defined by the equation T(x) = Ax, where A is an m by n matrix (that is, if T is given by multiplication by an m by n matrix), then T is automatically a linear transformation. Note in particular that the matrix A given in Example 5 generates the function described in Example 1. This is an illustration of the following general result:

Theorem J. Every linear transformation from Rⁿ to Rᵐ is given by multiplication by some m by n matrix.

This result raises this question: Given a linear transformation T: Rⁿ → Rᵐ, how is a representative matrix for T computed? That is, how do you find a matrix A such that T(x) = Ax for every x in Rⁿ?

The solution begins by recalling the observation that the action of a linear transformation is completely specified once its action on the basis vectors of the domain space is known. Let T: Rⁿ → Rᵐ be a given linear map and assume that both Rⁿ and Rᵐ are being considered with their standard bases, B = {e₁, e₂, ..., eₙ} ⊂ Rⁿ and B′ = {e′₁, e′₂, ..., e′ₘ} ⊂ Rᵐ. The goal is to find a matrix A, which must necessarily be m by n, such that T(x) = Ax for every x in Rⁿ. In particular, T(x) must equal Ax for every x in the basis B for Rⁿ. Since
$$T(\mathbf{e}_1) = A\mathbf{e}_1 = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn}\end{bmatrix}\begin{bmatrix}1\\0\\\vdots\\0\end{bmatrix} = \begin{bmatrix}a_{11}\\a_{21}\\\vdots\\a_{m1}\end{bmatrix}$$

$$T(\mathbf{e}_2) = A\mathbf{e}_2 = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn}\end{bmatrix}\begin{bmatrix}0\\1\\\vdots\\0\end{bmatrix} = \begin{bmatrix}a_{12}\\a_{22}\\\vdots\\a_{m2}\end{bmatrix}$$

and, in general,

$$T(\mathbf{e}_n) = A\mathbf{e}_n = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn}\end{bmatrix}\begin{bmatrix}0\\0\\\vdots\\1\end{bmatrix} = \begin{bmatrix}a_{1n}\\a_{2n}\\\vdots\\a_{mn}\end{bmatrix}$$
it is clear that the images of the basis vectors are the column s of A :
$$A = \begin{bmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \cdots & T(\mathbf{e}_n)\end{bmatrix}$$
When Rⁿ and Rᵐ are equipped with their standard bases, the matrix representative, A, for a linear transformation T: Rⁿ → Rᵐ is called the standard matrix for T; this is symbolized A = [T].
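The recipe "the columns of [T] are the images of the standard basis vectors" is easy to automate. Here is a minimal sketch in Python with NumPy, using the map f of Example 1 (the helper f and the sample vector are illustrative, not from the original text):

```python
import numpy as np

def f(x):
    x1, x2 = x
    return np.array([x1, x1 + x2, 2*x1 - 3*x2])

# Assemble the standard matrix column by column from the images of e1, e2.
columns = [f(e) for e in np.eye(2)]
A = np.column_stack(columns)
print(A)                         # [[1 0], [1 1], [2 -3]] (as floats)

x = np.array([4.0, -1.0])
print(np.allclose(A @ x, f(x)))  # True: T(x) = Ax for this sample x
```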
Example 11: A linear map that maps a vector space into itself is called a linear operator. Let T: R² → R² be the linear operator (given in Example 9) that maps (1, 0)ᵀ to (2, −3)ᵀ and (0, 1)ᵀ to (−1, 4)ᵀ. Verify the calculation T(v) = (1, 11)ᵀ, where v = (3, 5)ᵀ, by first computing the standard matrix for T.

The images of the standard basis vectors (1, 0)ᵀ and (0, 1)ᵀ form the columns of the standard matrix. Thus,

$$[T] = \begin{bmatrix}2 & -1\\ -3 & 4\end{bmatrix}$$

from which it follows that

$$T(\mathbf{v}) = [T]\mathbf{v} = \begin{bmatrix}2 & -1\\ -3 & 4\end{bmatrix}\begin{bmatrix}3\\5\end{bmatrix} = \begin{bmatrix}1\\11\end{bmatrix}$$

as calculated in Example 9. ■

Example 12: Let T: R³ → R⁴ be the linear map defined by the equation

$$T\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} = \begin{bmatrix}x_1 + 2x_2 + 3x_3\\ -x_1 + x_2\\ 4x_2 - x_3\\ 2x_1 + x_2 - 2x_3\end{bmatrix}$$

Find the standard matrix for T and use it to compute the image of the vector v = (1, 2, 3)ᵀ.
The images of the standard basis vectors are the columns of the standard matrix. Since

$$T\begin{bmatrix}1\\0\\0\end{bmatrix} = \begin{bmatrix}1\\-1\\0\\2\end{bmatrix}, \qquad T\begin{bmatrix}0\\1\\0\end{bmatrix} = \begin{bmatrix}2\\1\\4\\1\end{bmatrix}, \qquad\text{and}\qquad T\begin{bmatrix}0\\0\\1\end{bmatrix} = \begin{bmatrix}3\\0\\-1\\-2\end{bmatrix}$$

the standard matrix for T is

$$[T] = \begin{bmatrix}1 & 2 & 3\\ -1 & 1 & 0\\ 0 & 4 & -1\\ 2 & 1 & -2\end{bmatrix}$$

Since the standard matrix is constructed so that T(v) = [T]v, the image of v = (1, 2, 3)ᵀ is given by the matrix product

$$T(\mathbf{v}) = [T]\mathbf{v} = \begin{bmatrix}1 & 2 & 3\\ -1 & 1 & 0\\ 0 & 4 & -1\\ 2 & 1 & -2\end{bmatrix}\begin{bmatrix}1\\2\\3\end{bmatrix} = \begin{bmatrix}14\\1\\5\\-2\end{bmatrix}$$ ■
Example 13: Fix a vector v₀ = (a, b, c)ᵀ in R³ and define a linear operator T: R³ → R³ by the formula T(x) = v₀ × x. Find the standard matrix for T.

The columns of the standard matrix are the images of the standard basis vectors e₁ = i, e₂ = j, and e₃ = k under T. Since
$$T(\mathbf{e}_1) = T(\mathbf{i}) = \mathbf{v}_0\times\mathbf{i} = \det\begin{bmatrix}\mathbf{i} & \mathbf{j} & \mathbf{k}\\ a & b & c\\ 1 & 0 & 0\end{bmatrix} = c\mathbf{j} - b\mathbf{k}$$

$$T(\mathbf{e}_2) = T(\mathbf{j}) = \mathbf{v}_0\times\mathbf{j} = \det\begin{bmatrix}\mathbf{i} & \mathbf{j} & \mathbf{k}\\ a & b & c\\ 0 & 1 & 0\end{bmatrix} = -c\mathbf{i} + a\mathbf{k}$$

$$T(\mathbf{e}_3) = T(\mathbf{k}) = \mathbf{v}_0\times\mathbf{k} = \det\begin{bmatrix}\mathbf{i} & \mathbf{j} & \mathbf{k}\\ a & b & c\\ 0 & 0 & 1\end{bmatrix} = b\mathbf{i} - a\mathbf{j}$$

the standard matrix for T is

$$[T] = \begin{bmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & T(\mathbf{e}_3)\end{bmatrix} = \begin{bmatrix}0 & -c & b\\ c & 0 & -a\\ -b & a & 0\end{bmatrix}$$ ■
Example 14: Consider the linear operator Tθ on R² defined by Tθ(x) = Aθx, where

$$A_\theta = \begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix}$$

What does Tθ do to the standard basis vectors of R²? Compute Tθ(1, 1)ᵀ for θ = 60°.

Since Tθ is already a matrix transformation (that is, a linear map of the form T(x) = Ax), the standard matrix for Tθ is Aθ itself. Since [Tθ] = Aθ, and the images of the standard basis vectors are the columns of the standard matrix for Tθ,
$$T_\theta(\mathbf{e}_1) = \text{column 1 of } A_\theta = \begin{bmatrix}\cos\theta\\ \sin\theta\end{bmatrix} \qquad\qquad T_\theta(\mathbf{e}_2) = \text{column 2 of } A_\theta = \begin{bmatrix}-\sin\theta\\ \cos\theta\end{bmatrix}$$
Figure 53 shows that the effect of this map is to rotate the basis vectors through the angle θ; this transformation is therefore called a rotation.
■ Figure 53 ■
If θ = 60°, then

$$A_\theta = \begin{bmatrix}\cos 60^\circ & -\sin 60^\circ\\ \sin 60^\circ & \cos 60^\circ\end{bmatrix} = \begin{bmatrix}1/2 & -\sqrt{3}/2\\ \sqrt{3}/2 & 1/2\end{bmatrix}$$

and the image of the vector v = (1, 1)ᵀ is

$$T_\theta(\mathbf{v}) = A_\theta\mathbf{v} = \begin{bmatrix}1/2 & -\sqrt{3}/2\\ \sqrt{3}/2 & 1/2\end{bmatrix}\begin{bmatrix}1\\1\end{bmatrix} = \frac{1}{2}\begin{bmatrix}1-\sqrt{3}\\ 1+\sqrt{3}\end{bmatrix}$$

a result that can be verified geometrically. ■
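The same computation can be checked numerically; a sketch assuming Python with NumPy (not part of the original text):

```python
import numpy as np

# The rotation operator of Example 14 with theta = 60 degrees.
theta = np.deg2rad(60)
A_theta = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta),  np.cos(theta)],
])

v = np.array([1.0, 1.0])
print(A_theta @ v)                                     # image of (1, 1)^T
print(np.array([1 - np.sqrt(3), 1 + np.sqrt(3)]) / 2)  # the closed form above
```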
Example 15: Consider the differentiation map D: P₃ → P₂ given in Example 6 above. Determine the representative matrix for D relative to the standard bases for P₃ and P₂.

Because P₃ is isomorphic to R⁴ and P₂ is isomorphic to R³, the map D: P₃ → P₂ can be regarded as a map D̂: R⁴ → R³; furthermore, the matrix for D relative to the standard bases for P₃ and P₂ is the same as the standard matrix for D̂: R⁴ → R³. Since

$$D(a_0 + a_1x + a_2x^2 + a_3x^3) = a_1 + 2a_2x + 3a_3x^2$$

the corresponding map D̂ is given by the equation

$$\hat{D}\begin{bmatrix}a_0\\a_1\\a_2\\a_3\end{bmatrix} = \begin{bmatrix}a_1\\2a_2\\3a_3\end{bmatrix}$$

The columns of the standard matrix for D are the images of the basis vectors 1, x, x², and x³ of the domain. Because D(1) = 0, D(x) = 1, D(x²) = 2x, and D(x³) = 3x², or, in terms of D̂,

$$\hat{D}\begin{bmatrix}1\\0\\0\\0\end{bmatrix} = \begin{bmatrix}0\\0\\0\end{bmatrix}, \quad \hat{D}\begin{bmatrix}0\\1\\0\\0\end{bmatrix} = \begin{bmatrix}1\\0\\0\end{bmatrix}, \quad \hat{D}\begin{bmatrix}0\\0\\1\\0\end{bmatrix} = \begin{bmatrix}0\\2\\0\end{bmatrix}, \quad\text{and}\quad \hat{D}\begin{bmatrix}0\\0\\0\\1\end{bmatrix} = \begin{bmatrix}0\\0\\3\end{bmatrix}$$

the standard matrix for this differentiation map is
$$[D] = [\hat{D}] = \begin{bmatrix}0 & 1 & 0 & 0\\ 0 & 0 & 2 & 0\\ 0 & 0 & 0 & 3\end{bmatrix}$$

Here's an illustration of this result. Consider the polynomial p = 4 − 5x + x² − 2x³, whose derivative is p′ = −5 + 2x − 6x². In terms of the matrix above,

$$D(p) = \begin{bmatrix}0 & 1 & 0 & 0\\ 0 & 0 & 2 & 0\\ 0 & 0 & 0 & 3\end{bmatrix}\begin{bmatrix}4\\-5\\1\\-2\end{bmatrix} = \begin{bmatrix}-5\\2\\-6\end{bmatrix}$$

which corresponds to the polynomial −5 + 2x − 6x², as expected. ■

Example 16: Let T: Rⁿ → Rⁿ be a linear operator and assume that Rⁿ is being considered with a nonstandard basis. While it is still true that the transformation can be written in terms of multiplication by a matrix, this matrix will not be the standard matrix. In fact, one of the reasons for choosing a nonstandard basis for Rⁿ in the first place is to obtain a simple representative matrix, possibly a diagonal one.

Let B = {b₁, b₂, ..., bₙ} be a basis for Rⁿ; then, the representative matrix of an operator T has this property: It multiplies the component vector of x to give the component vector of T(x):

$$[T]_B[\mathbf{x}]_B = [T(\mathbf{x})]_B \qquad (*)$$

The matrix [T]_B is called the matrix of T relative to the basis B. The columns of [T]_B are the component vectors of the images of the basis vectors:
$$[T]_B = \begin{bmatrix} [T(\mathbf{b}_1)]_B & [T(\mathbf{b}_2)]_B & \cdots & [T(\mathbf{b}_n)]_B \end{bmatrix}$$
The matrix for T will be the standard matrix only when the basis for Rⁿ is the standard basis.

In Example 10, R² was considered with the basis B = {b₁ = (1, 1)ᵀ, b₂ = (1, −3)ᵀ}. Because T mapped b₁ to itself, [T(b₁)]_B = (1, 0)ᵀ, and, because T mapped b₂ to 5b₂, [T(b₂)]_B = (0, 5)ᵀ. Therefore, the matrix for T relative to the nonstandard basis B is the simple, diagonal matrix

$$[T]_B = \begin{bmatrix}1 & 0\\ 0 & 5\end{bmatrix}$$

Recall that the vector v = (3, 5)ᵀ has components k₁ = 7/2 and k₂ = −1/2 relative to B; that is, [v]_B = (7/2, −1/2)ᵀ. Equation (*) above then gives

$$[T(\mathbf{v})]_B = [T]_B[\mathbf{v}]_B = \begin{bmatrix}1 & 0\\ 0 & 5\end{bmatrix}\begin{bmatrix}7/2\\ -1/2\end{bmatrix} = \begin{bmatrix}7/2\\ -5/2\end{bmatrix}$$

Since [T(v)]_B = (7/2, −5/2)ᵀ, the image of v under T is

$$T(\mathbf{v}) = \tfrac{7}{2}\mathbf{b}_1 - \tfrac{5}{2}\mathbf{b}_2 = \tfrac{7}{2}\begin{bmatrix}1\\1\end{bmatrix} - \tfrac{5}{2}\begin{bmatrix}1\\-3\end{bmatrix} = \begin{bmatrix}1\\11\end{bmatrix}$$

which agrees with the result of Example 10. ■
The Kernel and Range of a Linear Transformation

Let T: Rⁿ → Rᵐ be a linear transformation; then, the set of all vectors v in Rⁿ which T sends to the zero vector in Rᵐ is called the kernel of T and denoted ker T:

$$\ker T = \{\mathbf{v}\in\mathbf{R}^n : T(\mathbf{v}) = \mathbf{0}\}$$

Since every linear transformation from Rⁿ to Rᵐ is given by multiplication by some m by n matrix A (that is, there exists an m by n matrix A such that T(x) = Ax for every x in Rⁿ), the condition for v to be in the kernel of T can be rewritten Av = 0. Those vectors v such that Av = 0 form the nullspace of A; therefore, the kernel of T is the same as the nullspace of A:

$$\ker T = N(A)$$

Thus, for T: Rⁿ → Rᵐ, ker T is a subspace of Rⁿ, and its dimension is called the nullity of T. This agrees with the definition of the term nullity as the dimension of the nullspace of A.

The range of a linear transformation T: Rⁿ → Rᵐ is the collection of all images of T:

$$\operatorname{range}(T) = \{\mathbf{w}\in\mathbf{R}^m : \mathbf{w} = T(\mathbf{v})\ \text{for some}\ \mathbf{v}\ \text{in}\ \mathbf{R}^n\}$$

The range of T also has a description in terms of its matrix representative. A vector w in Rᵐ is in the range of T precisely when there exists a vector v such that T(v) = w. If T(v) is always equal to Av for some matrix A, then w is in the range of T if and only if the equation Av = w can be solved for v. But this is precisely the condition for w to be in the column space of A. Therefore, if T: Rⁿ → Rᵐ, then the range of T, denoted
R(T), is a subspace of Rᵐ and, for T(v) = Av, is the same as the column space of A:

$$R(T) = CS(A)$$
Thus, for T: Rⁿ → Rᵐ, R(T) is a subspace of Rᵐ, and its dimension is called the rank of T. This agrees with the definition of the term rank as the dimension of the column space of A.

Example 17: Determine the kernel, nullity, range, and rank of the linear operator T: R² → R² defined by the equation

$$T\begin{bmatrix}x_1\\x_2\end{bmatrix} = \begin{bmatrix}2x_1 - x_2\\ -3x_1 + 4x_2\end{bmatrix}$$

Since

$$\begin{bmatrix}2x_1 - x_2\\ -3x_1 + 4x_2\end{bmatrix} = \begin{bmatrix}2 & -1\\ -3 & 4\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}$$

it is clear that the standard matrix for T is

$$A = \begin{bmatrix}2 & -1\\ -3 & 4\end{bmatrix}$$

Now, because A is a square matrix with nonzero determinant, it is invertible, and there are two consequences:

(1) The equation Ax = 0 has only the trivial solution, so the nullspace of A, which is the kernel of T, is just the trivial subspace, {0}. Thus, nullity T = 0.
(2) The equation Ax = b has a (unique) solution for every b in R², so the column space of A, which is the range of T, is all of R². Thus, rank T = 2. ■
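These two consequences can be confirmed numerically; a sketch assuming Python with NumPy (not part of the original text):

```python
import numpy as np

# The standard matrix of Example 17: square with nonzero determinant.
A = np.array([[2.0, -1.0], [-3.0, 4.0]])

rank = np.linalg.matrix_rank(A)
print(np.linalg.det(A))       # 5.0, nonzero, so A is invertible
print(rank)                   # 2 = rank T, so the range is all of R^2
print(A.shape[1] - rank)      # 0 = nullity T, so ker T = {0}
```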
The observations made in the solution of the preceding example are completely general:

Theorem K. Let T: Rⁿ → Rⁿ be a linear operator. Then ker T = {0} and R(T) = Rⁿ if and only if the standard matrix for T is invertible.

Example 18: Determine the kernel, nullity, range, and rank of the linear operator T: R³ → R³ given by the equation T(x) = Ax, where

$$A = \begin{bmatrix}1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9\end{bmatrix}$$

Since the determinant of A is zero, the preceding theorem guarantees that the kernel of T is nontrivial, and the range of T is not all of R³. The augmented matrix [A|b] may be row-reduced as follows:
$$[A|\mathbf{b}] = \left[\begin{array}{ccc|c}1 & 2 & 3 & b_1\\ 4 & 5 & 6 & b_2\\ 7 & 8 & 9 & b_3\end{array}\right] \xrightarrow[\,-7r_1\text{ added to }r_3\,]{\,-4r_1\text{ added to }r_2\,} \left[\begin{array}{ccc|c}1 & 2 & 3 & b_1\\ 0 & -3 & -6 & -4b_1 + b_2\\ 0 & -6 & -12 & -7b_1 + b_3\end{array}\right]$$

$$\xrightarrow{\,-2r_2\text{ added to }r_3\,} \left[\begin{array}{ccc|c}1 & 2 & 3 & b_1\\ 0 & -3 & -6 & -4b_1 + b_2\\ 0 & 0 & 0 & b_1 - 2b_2 + b_3\end{array}\right] \xrightarrow{\,(-1/3)r_2\,} \left[\begin{array}{ccc|c}1 & 2 & 3 & b_1\\ 0 & 1 & 2 & \tfrac13(4b_1 - b_2)\\ 0 & 0 & 0 & b_1 - 2b_2 + b_3\end{array}\right] = [A'|\mathbf{b}']$$
The row of zeros implies that T(x) = Ax = b has a solution only for those vectors b = (b₁, b₂, b₃)ᵀ such that b₁ − 2b₂ + b₃ = 0; this describes the column space of A, which is the range of T:

$$R(T) = \{\mathbf{b} = (b_1, b_2, b_3)^T \in \mathbf{R}^3 : b_1 - 2b_2 + b_3 = 0\}$$

and it follows from the two nonzero rows in the echelon form of A obtained above that rank A = rank T = 2.

It also follows from the row reduction above that the set of solutions of Ax = 0 is identical to the set of solutions of A′x = 0, where
$$A' = \begin{bmatrix}1 & 2 & 3\\ 0 & 1 & 2\\ 0 & 0 & 0\end{bmatrix}$$

Let x = (x₁, x₂, x₃)ᵀ. The two nonzero rows in A′ imply that 3 − 2 = 1 of the variables is free; let x₃ be the free variable. Then back-substitution into the second row yields x₂ = −2x₃, and back-substitution into the first row gives x₁ = x₃. The vectors x that satisfy Ax = 0 are those of the form x = (x₃, −2x₃, x₃)ᵀ = x₃(1, −2, 1)ᵀ. Therefore, the nullspace of A, the kernel of T, is

$$\ker T = N(A) = \{\mathbf{x}\in\mathbf{R}^3 : \mathbf{x} = t(1, -2, 1)^T\ \text{for any}\ t\ \text{in}\ \mathbf{R}\}$$

This is a 1-dimensional subspace of R³, so nullity T = 1. ■

Note that the rank plus nullity theorem continues to hold in this setting of linear maps T: Rⁿ → Rᵐ; that is,

$$\operatorname{rank}(T) + \operatorname{nullity}(T) = n = \dim(\text{domain of } T)$$

Example 19: Determine the kernel, nullity, range, and rank of the linear map T: R³ → R² defined by the equation T(x) = Ax, where

$$A = \begin{bmatrix}1 & -2 & 1\\ 2 & -3 & -4\end{bmatrix}$$

Theorem K does not apply here, since T is not a linear operator; the matrix for T is not square. One elementary row operation reduces A to echelon form:
$$A = \begin{bmatrix}1 & -2 & 1\\ 2 & -3 & -4\end{bmatrix} \xrightarrow{\,-2r_1\text{ added to }r_2\,} \begin{bmatrix}1 & -2 & 1\\ 0 & 1 & -6\end{bmatrix} = A'$$
To find the kernel of T, the equation A′x = 0 must be solved. Let x = (x₁, x₂, x₃)ᵀ. The two nonzero rows in A′ imply 3 − 2 = 1 of the variables is free; let x₃ be the free variable. The second row implies x₂ = 6x₃, and back-substitution into the first row gives x₁ = 11x₃. The vectors x that satisfy Ax = 0 are those of the form x = (11x₃, 6x₃, x₃)ᵀ = x₃(11, 6, 1)ᵀ. Therefore, the nullspace of A, the kernel of T, is

$$\ker T = N(A) = \{\mathbf{x}\in\mathbf{R}^3 : \mathbf{x} = t(11, 6, 1)^T\ \text{for any}\ t\ \text{in}\ \mathbf{R}\}$$

This is a 1-dimensional subspace of R³, so nullity T = 1. Now, by the rank plus nullity theorem,

$$\operatorname{rank}(T) + \operatorname{nullity}(T) = n = \dim(\text{domain } T)$$
$$\operatorname{rank}(T) + 1 = 3$$
$$\operatorname{rank}(T) = 2$$

Because T maps vectors into R², the range of T is a subspace of dimension 2 (since rank T = 2) of R². Since the only 2-dimensional subspace of R² is R² itself, R(T) = R². ■

Example 19 illustrates an important fact about some linear maps that are not linear operators:

Theorem L. Let T: Rⁿ → Rᵐ be a linear map with m < n. Since the standard matrix of T, which is m by n, has fewer rows than columns, the kernel of T cannot be the trivial subspace of Rⁿ. In fact, nullity(T) ≥ n − m > 0.
Example 20: Determine the kernel, nullity, range, and rank of the linear map T: R² → R³ defined by the equation T(x) = Ax, where

$$A = \begin{bmatrix}1 & -2\\ 2 & -5\\ -1 & 0\end{bmatrix}$$

Gaussian elimination transforms A into echelon form:
$$A = \begin{bmatrix}1 & -2\\ 2 & -5\\ -1 & 0\end{bmatrix} \xrightarrow[\,r_1\text{ added to }r_3\,]{\,-2r_1\text{ added to }r_2\,} \begin{bmatrix}1 & -2\\ 0 & -1\\ 0 & -2\end{bmatrix} \xrightarrow{\,-2r_2\text{ added to }r_3\,} \begin{bmatrix}1 & -2\\ 0 & -1\\ 0 & 0\end{bmatrix} = A'$$
Since there are just two columns, the fact that there are two nonzero rows in A′ implies that there are 2 − 2 = 0 free variables in the solution of Ax = 0. Since a homogeneous system has either infinitely many solutions or just the trivial solution, the absence of any free variables means that the only solution is x = 0. Thus, the kernel of T is trivial: ker T = {0} ⊂ R², and nullity T = 0. Now, by the rank plus nullity theorem,

$$\operatorname{rank}(T) + \operatorname{nullity}(T) = n = \dim(\text{domain } T)$$
$$\operatorname{rank}(T) + 0 = 2$$
$$\operatorname{rank}(T) = 2$$
Since T maps vectors into R³, the range of T is a subspace of R³ of dimension 2. It follows directly from the definition of T that

$$R(T) = \{T(\mathbf{x}) : \mathbf{x}\in\mathbf{R}^2\} = \{A\mathbf{x} : \mathbf{x}\in\mathbf{R}^2\} = \left\{\begin{bmatrix}1 & -2\\ 2 & -5\\ -1 & 0\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix} : x_1, x_2\in\mathbf{R}\right\}$$

Since every 2-dimensional subspace of R³ is a plane through the origin, the range of T can be expressed as such a plane. Since R(T) contains w₁ = (1, 2, −1)ᵀ = i + 2j − k and w₂ = (−2, −5, 0)ᵀ = −2i − 5j + 0k, a normal vector to this plane is

$$\mathbf{n} = \mathbf{w}_1\times\mathbf{w}_2 = \det\begin{bmatrix}\mathbf{i} & \mathbf{j} & \mathbf{k}\\ 1 & 2 & -1\\ -2 & -5 & 0\end{bmatrix} = -5\mathbf{i} + 2\mathbf{j} - \mathbf{k}$$

The standard equation for the plane is therefore −5x + 2y − z = d for some constant d. Since this plane must contain the origin (it's a subspace), d must be 0. Thus, the range of T can also be written in the form
$$R(T) = \{(x, y, z)^T : 5x - 2y + z = 0\}$$ ■
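The normal-vector computation above is a one-liner with NumPy; a sketch (not part of the original text):

```python
import numpy as np

# The range of T is spanned by the columns of A; their cross product is
# a normal vector to that plane.
A = np.array([[1.0, -2.0], [2.0, -5.0], [-1.0, 0.0]])
w1, w2 = A[:, 0], A[:, 1]

n = np.cross(w1, w2)
print(n)               # [-5.  2. -1.], the plane -5x + 2y - z = 0
print(n @ w1, n @ w2)  # both 0.0: n is orthogonal to the range
```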
Injectivity and surjectivity. A linear transformation T: Rⁿ → Rᵐ is said to be one to one (or injective) if no two vectors in Rⁿ are mapped to the same vector in Rᵐ. That is, T is one to one if and only if

$$T(\mathbf{v}_1) = T(\mathbf{v}_2) \implies \mathbf{v}_1 = \mathbf{v}_2 \qquad (*)$$
A map T: Rⁿ → Rᵐ is said to be onto (or surjective) if the range of T is all of Rᵐ.

Theorem M. A linear map T: Rⁿ → Rᵐ is one to one if and only if ker T = {0}.

Proof. (⟹) First, assume that T is one to one. If v is in ker T, then T(v) = 0. Since T(0) = 0 also, T(v) = T(0) = 0, which, applying (*), implies v = 0. Therefore, ker T contains only the zero vector.

(⟸) Now, assume that T is not one to one, that is, assume there exist distinct vectors v₁ and v₂ in Rⁿ such that T(v₁) = T(v₂). Then T(v₁) − T(v₂) = 0, which, by the linearity of T, implies T(v₁ − v₂) = 0. Since v₁ ≠ v₂, T maps the nonzero vector v₁ − v₂ to 0; therefore, ker T ≠ {0}. ■

Example 21: Consider the linear map T: R³ → R² defined by the equation T(x) = Ax, where

$$A = \begin{bmatrix}1 & -2 & 1\\ 2 & -3 & -4\end{bmatrix}$$
(This is the map given in Example 19.) Is T one to one? Is it onto?

In Example 19, it was shown that

$$\ker T = N(A) = \{\mathbf{x}\in\mathbf{R}^3 : \mathbf{x} = t(11, 6, 1)^T\ \text{for any}\ t\ \text{in}\ \mathbf{R}\}$$

which is a line through the origin. Since ker T contains vectors other than 0, T is not one to one. However, since rank T = rank A = 2, the range is all of R², so T is onto. ■
Assume that T: Rⁿ → Rⁿ is a linear operator which is one to one. Then ker T = {0}, so the nullity of T is 0. The rank plus nullity theorem then guarantees that rank T is n − 0 = n, which means the range of T is an n-dimensional subspace of Rⁿ. But the only n-dimensional subspace of Rⁿ is Rⁿ itself, which means that T is onto. This argument justifies the following result:

Theorem N. A linear operator T: Rⁿ → Rⁿ is one to one if and only if it is onto, and this case arises precisely when the standard matrix for T is invertible, that is, when det [T] ≠ 0.

Example 22: Since det A ≠ 0, Theorem N guarantees that the linear operator T: R² → R² defined by the equation T(x) = Ax, where

$$A = \begin{bmatrix}2 & -1\\ -3 & 4\end{bmatrix}$$

is both one to one and onto; therefore, ker T = {0} and R(T) = R².
The linear operator T: R³ → R³ defined by the equation T(x) = Ax, where

$$A = \begin{bmatrix}1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9\end{bmatrix}$$

is neither one to one nor onto, since det A does equal zero. The range (which is not all of R³) and the nontrivial kernel of this operator were explicitly determined in Example 18. ■

Example 23: Define an operator P: R³ → R³ by the formula P(x, y, z)ᵀ = (x, y, 0)ᵀ. Is P one to one? Is it onto?

The effect of this map is to project every point in R³ onto the x-y plane; see Figure 54. Intuitively, then, P cannot be one to one: To illustrate, both (1, 2, 3)ᵀ and (1, 2, 4)ᵀ get mapped to (1, 2, 0)ᵀ. Furthermore, it cannot be onto R³, since the image of every point lies in the x-y plane only; this is the range of P. Note that this is consistent with Theorem N, since an operator is one to one if and only if it is onto, and the standard matrix for P,

$$[P] = \begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 0\end{bmatrix}$$

has determinant 0.
P(x): projection onto the x-y plane ■ Figure 54 ■
Example 24: Let T: Rⁿ → Rⁿ be a linear operator which is both one to one and onto. Then T has an inverse, T⁻¹: Rⁿ → Rⁿ, which is defined as follows: If T(x) = y, then T⁻¹(y) = x. Since T is both one to one and onto, its standard matrix, [T], is invertible; its inverse, [T]⁻¹, is the standard matrix for T⁻¹:

$$[T^{-1}] = [T]^{-1}$$

Consider the linear operator T: R² → R² defined by the equation

$$T(\mathbf{x}) = \begin{bmatrix}2 & -1\\ -3 & 4\end{bmatrix}\mathbf{x}$$

This operator appeared in Example 11, where it was calculated that T(3, 5)ᵀ = (1, 11)ᵀ. Obtain a formula for the inverse of T and verify that T⁻¹(1, 11)ᵀ = (3, 5)ᵀ.
Since the standard matrix for T has a nonzero determinant, it is invertible, and its inverse,

$$[T]^{-1} = \begin{bmatrix}2 & -1\\ -3 & 4\end{bmatrix}^{-1} = \frac{1}{5}\begin{bmatrix}4 & 1\\ 3 & 2\end{bmatrix}$$

is the standard matrix for the inverse operator T⁻¹. Therefore,

$$T^{-1}(\mathbf{x}) = \frac{1}{5}\begin{bmatrix}4 & 1\\ 3 & 2\end{bmatrix}\mathbf{x}$$

and

$$T^{-1}\begin{bmatrix}1\\11\end{bmatrix} = \frac{1}{5}\begin{bmatrix}4 & 1\\ 3 & 2\end{bmatrix}\begin{bmatrix}1\\11\end{bmatrix} = \frac{1}{5}\begin{bmatrix}15\\25\end{bmatrix} = \begin{bmatrix}3\\5\end{bmatrix}$$

as expected. ■
Example 25: Find the inverse of the rotation operator Tθ given in Example 14.

Since Tθ rotates every vector through the angle θ, the inverse operator should rotate it back; that is, (Tθ)⁻¹ should rotate every vector through an angle of −θ. Since the standard matrix for Tθ is Aθ, the standard matrix for (Tθ)⁻¹ is (Aθ)⁻¹, which, by the argument just given, is equal to A₋θ:

$$(A_\theta)^{-1} = A_{-\theta} = \begin{bmatrix}\cos(-\theta) & -\sin(-\theta)\\ \sin(-\theta) & \cos(-\theta)\end{bmatrix} = \begin{bmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{bmatrix}$$

Note that this intuitive, geometric argument gives the same result as would be obtained by formally taking the inverse of the 2 by 2 matrix Aθ. ■
Composition of Linear Transformations

Let V, W, and Z be vector spaces and let T₁: V → W and T₂: W → Z be linear transformations. The composition of T₁ and T₂, denoted T₂ ∘ T₁, is defined to be the linear transformation from V to Z given by the equation

$$(T_2\circ T_1)(\mathbf{v}) = T_2(T_1(\mathbf{v}))$$

The notation T₂ ∘ T₁ is meant to be read from right to left; that is, first apply T₁ and then T₂. The composition T₂ ∘ T₁ maps V all the way to Z; see Figure 55.
■ Figure 55 (T₁ maps V to W, T₂ maps W to Z, and T₂ ∘ T₁ maps V directly to Z) ■

Example 26: Consider the linear transformations T₁: R³ → R² and T₂: R² → R⁴ defined by the equations
$$T_1\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix} = \begin{bmatrix}v_1 - 2v_2\\ -3v_1 + v_2 - v_3\end{bmatrix} \qquad\text{and}\qquad T_2\begin{bmatrix}w_1\\w_2\end{bmatrix} = \begin{bmatrix}4w_2\\ 2w_1 - w_2\\ -w_1 + w_2\\ 5w_1\end{bmatrix}$$

Find a formula for the composition T₂ ∘ T₁: R³ → R⁴.
From the definitions of T₁ and T₂,

$$(T_2\circ T_1)(\mathbf{v}) = T_2(T_1(\mathbf{v})) = T_2\begin{bmatrix}v_1 - 2v_2\\ -3v_1 + v_2 - v_3\end{bmatrix} = \begin{bmatrix}4(-3v_1 + v_2 - v_3)\\ 2(v_1 - 2v_2) - (-3v_1 + v_2 - v_3)\\ -(v_1 - 2v_2) + (-3v_1 + v_2 - v_3)\\ 5(v_1 - 2v_2)\end{bmatrix} = \begin{bmatrix}-12v_1 + 4v_2 - 4v_3\\ 5v_1 - 5v_2 + v_3\\ -4v_1 + 3v_2 - v_3\\ 5v_1 - 10v_2\end{bmatrix}$$

Therefore,

$$(T_2\circ T_1)\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix} = \begin{bmatrix}-12v_1 + 4v_2 - 4v_3\\ 5v_1 - 5v_2 + v_3\\ -4v_1 + 3v_2 - v_3\\ 5v_1 - 10v_2\end{bmatrix}$$ ■
Example 27: For the linear transformations T₁ and T₂ in Example 26 above, find the standard matrix representatives for T₁, T₂, and T₂ ∘ T₁; then show that [T₂ ∘ T₁] = [T₂][T₁].

The standard matrix for T₁: R³ → R² is the 2 × 3 matrix whose columns are the images of the basis vectors of the domain space, R³. Since
$$T_1\begin{bmatrix}1\\0\\0\end{bmatrix} = \begin{bmatrix}1\\-3\end{bmatrix}, \qquad T_1\begin{bmatrix}0\\1\\0\end{bmatrix} = \begin{bmatrix}-2\\1\end{bmatrix}, \qquad\text{and}\qquad T_1\begin{bmatrix}0\\0\\1\end{bmatrix} = \begin{bmatrix}0\\-1\end{bmatrix}$$

the standard matrix for T₁ is

$$[T_1] = \begin{bmatrix}1 & -2 & 0\\ -3 & 1 & -1\end{bmatrix}$$

The standard matrix for T₂: R² → R⁴ is the 4 × 2 matrix whose columns are the images of the basis vectors of R². Since

$$T_2\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}0\\2\\-1\\5\end{bmatrix} \qquad\text{and}\qquad T_2\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}4\\-1\\1\\0\end{bmatrix}$$

the standard matrix for T₂ is

$$[T_2] = \begin{bmatrix}0 & 4\\ 2 & -1\\ -1 & 1\\ 5 & 0\end{bmatrix}$$

Finally, the standard matrix for T₂ ∘ T₁: R³ → R⁴ is the 4 × 3 matrix whose columns are the images of the standard basis vectors of R³. From the result of Example 26,
$$(T_2\circ T_1)\begin{bmatrix}1\\0\\0\end{bmatrix} = \begin{bmatrix}-12\\5\\-4\\5\end{bmatrix}, \qquad (T_2\circ T_1)\begin{bmatrix}0\\1\\0\end{bmatrix} = \begin{bmatrix}4\\-5\\3\\-10\end{bmatrix}, \qquad\text{and}\qquad (T_2\circ T_1)\begin{bmatrix}0\\0\\1\end{bmatrix} = \begin{bmatrix}-4\\1\\-1\\0\end{bmatrix}$$

the standard matrix for T₂ ∘ T₁ is

$$[T_2\circ T_1] = \begin{bmatrix}-12 & 4 & -4\\ 5 & -5 & 1\\ -4 & 3 & -1\\ 5 & -10 & 0\end{bmatrix}$$
Now, verify the following matrix multiplication:

$$[T_2][T_1] = \begin{bmatrix}0 & 4\\ 2 & -1\\ -1 & 1\\ 5 & 0\end{bmatrix}\begin{bmatrix}1 & -2 & 0\\ -3 & 1 & -1\end{bmatrix} = \begin{bmatrix}-12 & 4 & -4\\ 5 & -5 & 1\\ -4 & 3 & -1\\ 5 & -10 & 0\end{bmatrix}$$

This calculation shows that the standard matrix for T₂ ∘ T₁ is the product of the standard matrices for T₂ and T₁:

$$[T_2\circ T_1] = [T_2][T_1]$$

It can be shown that this equation holds true for any linear transformations T₁ and T₂ for which T₂ ∘ T₁ is defined, and is, in fact, the motivation behind the rather involved definition of matrix multiplication. ■
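The identity [T₂ ∘ T₁] = [T₂][T₁] is easy to verify numerically; a sketch assuming Python with NumPy, using the matrices computed above:

```python
import numpy as np

T1 = np.array([[ 1, -2,  0],
               [-3,  1, -1]])   # standard matrix of T1
T2 = np.array([[ 0,  4],
               [ 2, -1],
               [-1,  1],
               [ 5,  0]])       # standard matrix of T2

composed = T2 @ T1              # standard matrix of T2 o T1
print(composed)                 # matches [T2 o T1] computed above

v = np.array([1, 2, 3])
print(np.allclose(composed @ v, T2 @ (T1 @ v)))  # True for this sample v
```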
Example 28: Let T₁ and T₂ be the linear operators on R² defined by the equations

$$T_1\begin{bmatrix}x_1\\x_2\end{bmatrix} = \begin{bmatrix}2x_1 - 3x_2\\ -x_1 + 4x_2\end{bmatrix} \qquad\text{and}\qquad T_2\begin{bmatrix}x_1\\x_2\end{bmatrix} = \begin{bmatrix}-4x_2\\ x_1 + x_2\end{bmatrix}$$

Compute the compositions T₁ ∘ T₂ and T₂ ∘ T₁. Does T₁ ∘ T₂ = T₂ ∘ T₁?
The composition T₁ ∘ T₂ is given by the formula

$$(T_1\circ T_2)(\mathbf{x}) = T_1(T_2(\mathbf{x})) = T_1\begin{bmatrix}-4x_2\\ x_1+x_2\end{bmatrix} = \begin{bmatrix}2(-4x_2) - 3(x_1+x_2)\\ -(-4x_2) + 4(x_1+x_2)\end{bmatrix} = \begin{bmatrix}-3x_1 - 11x_2\\ 4x_1 + 8x_2\end{bmatrix}$$

while the composition T₂ ∘ T₁ is given by the formula

$$(T_2\circ T_1)(\mathbf{x}) = T_2(T_1(\mathbf{x})) = T_2\begin{bmatrix}2x_1 - 3x_2\\ -x_1 + 4x_2\end{bmatrix} = \begin{bmatrix}-4(-x_1 + 4x_2)\\ (2x_1 - 3x_2) + (-x_1 + 4x_2)\end{bmatrix} = \begin{bmatrix}4x_1 - 16x_2\\ x_1 + x_2\end{bmatrix}$$
Another method to determine the compositions T₁ ∘ T₂ and T₂ ∘ T₁ is to compute the matrix products [T₁][T₂] and [T₂][T₁]. Since

$$[T_1] = \begin{bmatrix}2 & -3\\ -1 & 4\end{bmatrix} \qquad\text{and}\qquad [T_2] = \begin{bmatrix}0 & -4\\ 1 & 1\end{bmatrix}$$

the matrix products are

$$[T_1][T_2] = \begin{bmatrix}2 & -3\\ -1 & 4\end{bmatrix}\begin{bmatrix}0 & -4\\ 1 & 1\end{bmatrix} = \begin{bmatrix}-3 & -11\\ 4 & 8\end{bmatrix}$$

and

$$[T_2][T_1] = \begin{bmatrix}0 & -4\\ 1 & 1\end{bmatrix}\begin{bmatrix}2 & -3\\ -1 & 4\end{bmatrix} = \begin{bmatrix}4 & -16\\ 1 & 1\end{bmatrix}$$

But

$$[T_1][T_2] = \begin{bmatrix}-3 & -11\\ 4 & 8\end{bmatrix} \implies (T_1\circ T_2)(\mathbf{x}) = \begin{bmatrix}-3x_1 - 11x_2\\ 4x_1 + 8x_2\end{bmatrix}$$

and

$$[T_2][T_1] = \begin{bmatrix}4 & -16\\ 1 & 1\end{bmatrix} \implies (T_2\circ T_1)(\mathbf{x}) = \begin{bmatrix}4x_1 - 16x_2\\ x_1 + x_2\end{bmatrix}$$

as above. Clearly, T₁ ∘ T₂ ≠ T₂ ∘ T₁. The noncommutativity of linear-map composition is reflected in the noncommutativity of matrix multiplication: in general, T₁ ∘ T₂ ≠ T₂ ∘ T₁, since, in general, [T₁][T₂] ≠ [T₂][T₁]. ■
Example 29: If a linear operator T is composed with itself, the resulting operator is written T², rather than T ∘ T. If Tθ is the rotation operator on R² defined in Example 14, find a formula for Tθ².

Since Tθ rotates a vector through the angle θ, applying the operator again should rotate the vector through another angle of θ; that is, the effect of Tθ² is to rotate every vector through an angle of 2θ. Since the standard matrix for Tθ is Aθ, the standard matrix for Tθ² is A₂θ:

$$[T_\theta^2] = A_{2\theta} = \begin{bmatrix}\cos 2\theta & -\sin 2\theta\\ \sin 2\theta & \cos 2\theta\end{bmatrix}$$

Now, of course, the standard matrix for T² is the square of the standard matrix for T: [T²] = [T ∘ T] = [T][T] = [T]². So, this same result could have been obtained by squaring the matrix Aθ:

$$A_\theta^2 = \begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix}\begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix} = \begin{bmatrix}\cos^2\theta - \sin^2\theta & -2\sin\theta\cos\theta\\ 2\sin\theta\cos\theta & \cos^2\theta - \sin^2\theta\end{bmatrix} = \begin{bmatrix}\cos 2\theta & -\sin 2\theta\\ \sin 2\theta & \cos 2\theta\end{bmatrix}$$

where the last equation is a consequence of the trigonometric identities cos 2θ = cos²θ − sin²θ and sin 2θ = 2 sinθ cosθ. Alternatively, you may look at this as a proof of these identities. ■
EIGENVALUES AND EIGENVECTORS
Although the process of applying a linear operator T to a vector gives a vector in the same space as the original, the resulting vector usually points in a completely different direction from the original; that is, T(x) is neither parallel nor antiparallel to x. However, it can happen that T(x) is a scalar multiple of x, even when x ≠ 0, and this phenomenon is so important that it deserves to be explored.

Definition and Illustration of an Eigenvalue and an Eigenvector

If T: Rⁿ → Rⁿ is a linear operator, then T must be given by T(x) = Ax for some n × n matrix A. If x ≠ 0 and T(x) = Ax is a scalar multiple of x, that is, if

$$A\mathbf{x} = \lambda\mathbf{x}$$

for some scalar λ, then λ is said to be an eigenvalue of T (or, equivalently, of A). Any nonzero vector x which satisfies this equation is said to be an eigenvector of T (or of A) corresponding to λ.

To illustrate these definitions, consider the linear operator T: R² → R² defined by the equation

$$T(\mathbf{x}) = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}\mathbf{x}$$

That is, T is given by left multiplication by the matrix

$$A = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}$$
Consider, for example, the image of the vector x = (1, 3)ᵀ under the action of T:

$$T\begin{bmatrix}1\\3\end{bmatrix} = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}\begin{bmatrix}1\\3\end{bmatrix} = \begin{bmatrix}-5\\-9\end{bmatrix}$$

Clearly, T(x) is not a scalar multiple of x, and this is what typically occurs. However, now consider the image of the vector x = (2, 3)ᵀ under the action of T:

$$T\begin{bmatrix}2\\3\end{bmatrix} = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}\begin{bmatrix}2\\3\end{bmatrix} = \begin{bmatrix}-4\\-6\end{bmatrix}$$

Here, T(x) is a scalar multiple of x, since T(x) = (−4, −6)ᵀ = −2(2, 3)ᵀ = −2x. Therefore, −2 is an eigenvalue of T, and (2, 3)ᵀ is an eigenvector corresponding to this eigenvalue. The question now is, how do you determine the eigenvalues and associated eigenvectors of a linear operator?
Determining the Eigenvalues of a Matrix
Since every linear operator is given by left multiplication by some square matrix, finding the eigenvalues and eigenvectors of a linear operator is equivalent to finding the eigenvalues and eigenvectors of the associated square matrix; this is the terminology that will be followed. Furthermore, since eigenvalues and eigenvectors make sense only for square matrices, throughout this section all matrices are assumed to be square.
Given a square matrix A, the condition that characterizes an eigenvalue, λ, is the existence of a nonzero vector x such that Ax = λx; this equation can be rewritten as follows:

$$A\mathbf{x} = \lambda\mathbf{x}$$
$$A\mathbf{x} - \lambda\mathbf{x} = \mathbf{0}$$
$$A\mathbf{x} - \lambda I\mathbf{x} = \mathbf{0}$$
$$(A - \lambda I)\mathbf{x} = \mathbf{0}$$

This final form of the equation makes it clear that x is the solution of a square, homogeneous system. If nonzero solutions are desired, then the determinant of the coefficient matrix, which in this case is A − λI, must be zero; if not, then the system possesses only the trivial solution x = 0. Since eigenvectors are, by definition, nonzero, in order for x to be an eigenvector of a matrix A, λ must be chosen so that

$$\det(A - \lambda I) = 0$$

When the determinant of A − λI is written out, the resulting expression is a monic polynomial in λ. [A monic polynomial is one in which the coefficient of the leading (highest-degree) term is 1.] It is called the characteristic polynomial of A and will be of degree n if A is n × n. The zeros of the characteristic polynomial of A, that is, the solutions of the characteristic equation, det(A − λI) = 0, are the eigenvalues of A.

Example 1: Determine the eigenvalues of the matrix

$$A = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}$$
First, form the matrix A − λI:

$$A - \lambda I = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix} - \lambda\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} = \begin{bmatrix}1-\lambda & -2\\ 3 & -4-\lambda\end{bmatrix}$$

a result which follows by simply subtracting λ from each of the entries on the main diagonal. Now, take the determinant of A − λI:

$$\det(A - \lambda I) = \det\begin{bmatrix}1-\lambda & -2\\ 3 & -4-\lambda\end{bmatrix} = (1-\lambda)(-4-\lambda) - (3)(-2) = \lambda^2 + 3\lambda + 2$$

This is the characteristic polynomial of A, and the solutions of the characteristic equation, det(A − λI) = 0, are the eigenvalues of A:

$$\lambda^2 + 3\lambda + 2 = 0 \implies (\lambda+1)(\lambda+2) = 0 \implies \lambda = -1, -2$$ ■

In some texts, the characteristic polynomial of A is written det(λI − A), rather than det(A − λI). For matrices of even dimension, these polynomials are precisely the same, while for square matrices of odd dimension, these polynomials are additive inverses. The distinction is merely cosmetic, because the solutions of det(λI − A) = 0 are precisely the same as the solutions of det(A − λI) = 0. Therefore, whether you write the characteristic polynomial of A as det(λI − A) or as det(A − λI) will have no effect on the determination of the eigenvalues or their corresponding eigenvectors.
Example 2: Find the eigenvalues of the 3 by 3 checkerboard matrix

$$C = \begin{bmatrix}1 & -1 & 1\\ -1 & 1 & -1\\ 1 & -1 & 1\end{bmatrix}$$

The determinant

$$\det(C - \lambda I) = \det\begin{bmatrix}1-\lambda & -1 & 1\\ -1 & 1-\lambda & -1\\ 1 & -1 & 1-\lambda\end{bmatrix}$$

is evaluated by first adding the second row to the third and then performing a Laplace expansion by the first column:

$$\det\begin{bmatrix}1-\lambda & -1 & 1\\ -1 & 1-\lambda & -1\\ 0 & -\lambda & -\lambda\end{bmatrix} = (1-\lambda)\det\begin{bmatrix}1-\lambda & -1\\ -\lambda & -\lambda\end{bmatrix} + \det\begin{bmatrix}-1 & 1\\ -\lambda & -\lambda\end{bmatrix} = -\lambda^2(\lambda - 3)$$

The roots of the characteristic equation, −λ²(λ − 3) = 0, are λ = 0 and λ = 3; these are the eigenvalues of C. ■
Determining the Eigenvectors of a Matrix
In order to determine the eigenvectors of a matrix, you must first determine the eigenvalues. Substitute one eigenvalue λ into the equation Ax = λx (or, equivalently, into (A − λI)x = 0) and solve for x; the resulting nonzero solutions form the set of eigenvectors of A corresponding to the selected eigenvalue. This process is then repeated for each of the remaining eigenvalues.

Example 3: Determine the eigenvectors of the matrix

$$A = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}$$

In Example 1, the eigenvalues of this matrix were found to be λ = −1 and λ = −2. Therefore, there are nonzero vectors x such that Ax = −x (the eigenvectors corresponding to the eigenvalue λ = −1), and there are nonzero vectors x such that Ax = −2x (the eigenvectors corresponding to the eigenvalue λ = −2). The eigenvectors corresponding to the eigenvalue λ = −1 are the solutions of the equation Ax = −x:

$$\begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix} = \begin{bmatrix}-x_1\\-x_2\end{bmatrix}$$

This is equivalent to the pair of equations

$$\begin{aligned}x_1 - 2x_2 &= -x_1\\ 3x_1 - 4x_2 &= -x_2\end{aligned}$$
which simplifies to

$$\begin{aligned}2x_1 - 2x_2 &= 0\\ 3x_1 - 3x_2 &= 0\end{aligned}$$

[Note that these equations are not independent. If they were independent, then only (x₁, x₂)ᵀ = (0, 0)ᵀ would satisfy them; this would signal that an error was made in the determination of the eigenvalues. If the eigenvalues are calculated correctly, then there must be nonzero solutions to each system Ax = λx.]

The equations above are satisfied by all vectors x = (x₁, x₂)ᵀ such that x₂ = x₁. Any such vector has the form (x₁, x₁)ᵀ and is therefore a multiple of the vector (1, 1)ᵀ. Consequently, the eigenvectors of A corresponding to the eigenvalue λ = −1 are precisely the vectors

$$\mathbf{x} = t\begin{bmatrix}1\\1\end{bmatrix}$$

where t is any nonzero scalar.

The eigenvectors corresponding to the eigenvalue λ = −2 are the solutions of the equation Ax = −2x:

$$\begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix} = \begin{bmatrix}-2x_1\\-2x_2\end{bmatrix}$$

This is equivalent to the "pair" of equations

$$\begin{aligned}3x_1 - 2x_2 &= 0\\ 3x_1 - 2x_2 &= 0\end{aligned}$$
Again, note that these equations are not independent. They are satisfied by any vector x = (x₁, x₂)ᵀ that is a multiple of the vector (2, 3)ᵀ; that is, the eigenvectors of A corresponding to the eigenvalue λ = −2 are the vectors

$$\mathbf{x} = t\begin{bmatrix}2\\3\end{bmatrix}$$

where t is any nonzero scalar. ■
Example 4: Consider the general 2 × 2 matrix

$$A = \begin{bmatrix}a & b\\ c & d\end{bmatrix}$$

(a) Express the eigenvalues of A in terms of a, b, c, and d. What can you say about the eigenvalues if b = c (that is, if the matrix A is symmetric)?

(b) Verify that the sum of the eigenvalues is equal to the sum of the diagonal entries in A.

(c) Verify that the product of the eigenvalues is equal to the determinant of A.

(d) What can you say about the matrix A if one of its eigenvalues is 0?

The solutions are as follows:

(a) The eigenvalues of A are found by solving the characteristic equation, det(A − λI) = 0:

$$\det(A - \lambda I) = \det\begin{bmatrix}a-\lambda & b\\ c & d-\lambda\end{bmatrix} = 0$$
$$(a-\lambda)(d-\lambda) - bc = 0$$
$$\lambda^2 - (a+d)\lambda + (ad - bc) = 0 \qquad (*)$$

The solutions of this equation, which are the eigenvalues of A, are found by using the quadratic formula:

$$\lambda = \frac{(a+d) \pm \sqrt{(a+d)^2 - 4(ad-bc)}}{2} \qquad (**)$$

The discriminant in (**) can be rewritten as follows:

$$(a+d)^2 - 4(ad-bc) = a^2 + 2ad + d^2 - 4ad + 4bc = a^2 - 2ad + d^2 + 4bc = (a-d)^2 + 4bc$$

Therefore, if b = c, the discriminant becomes (a − d)² + 4b² = (a − d)² + (2b)². Being the sum of two squares, this expression is nonnegative, so (**) implies that the eigenvalues are real. In fact, it can be shown that the eigenvalues of any real, symmetric matrix are real.
(b) The sum of the eigenvalues can be found by adding the two values expressed in (**) above:

$$\lambda_1 + \lambda_2 = \frac{(a+d) + \sqrt{(a+d)^2 - 4(ad-bc)}}{2} + \frac{(a+d) - \sqrt{(a+d)^2 - 4(ad-bc)}}{2} = \frac{a+d}{2} + \frac{a+d}{2} = a + d$$
which does indeed equal the sum of the diagonal entries of A. (The sum of the diagonal entries of any square matrix is called the trace of the matrix.) Another method for determining the sum of the eigenvalues, and one which works for any size matrix, is to examine the characteristic equation. From the theory of polynomial equations, it is known that if p(λ) is a monic polynomial of degree n, then the sum of the roots of the equation p(λ) = 0 is the opposite of the coefficient of the λⁿ⁻¹ term in p(λ). The sum of the roots of equation (*) is therefore −[−(a + d)] = a + d, as desired. This second method can be used to prove that the sum of the eigenvalues of any (square) matrix is equal to the trace of the matrix.
(c) The product of the eigenvalues can be found by multiplying the two values expressed in (**) above:

$$\lambda_1\lambda_2 = \left[\frac{(a+d) + \sqrt{(a+d)^2 - 4(ad-bc)}}{2}\right]\left[\frac{(a+d) - \sqrt{(a+d)^2 - 4(ad-bc)}}{2}\right] = \frac{(a+d)^2 - \left[(a+d)^2 - 4(ad-bc)\right]}{4} = ad - bc$$

which is indeed equal to the determinant of A. Another proof that the product of the eigenvalues of any (square) matrix is equal to its determinant proceeds as follows. If A is an n × n matrix, then its characteristic polynomial, p(λ), is monic of
degree n. The equation p(λ) = 0 therefore has n roots: λ₁, λ₂, ..., λₙ (which may not be distinct); these are the eigenvalues. Consequently, the polynomial p(λ) = det(A − λI) can be expressed in factored form as follows:

$$\det(A - \lambda I) = (\lambda_1 - \lambda)(\lambda_2 - \lambda)\cdots(\lambda_n - \lambda)$$

Substituting λ = 0 into this identity gives the desired result: det A = λ₁λ₂⋯λₙ.

(d) If 0 is an eigenvalue of a matrix A, then the equation Ax = λx = 0x = 0 must have nonzero solutions, which are the eigenvectors associated with λ = 0. But if A is square and Ax = 0 has nonzero solutions, then A must be singular, that is, det A must be 0. This observation establishes the following fact: Zero is an eigenvalue of a matrix if and only if the matrix is singular. ■

Example 5: Determine the eigenvalues and eigenvectors of the identity matrix I without first calculating its characteristic equation.

The equation Ax = λx characterizes the eigenvalues and associated eigenvectors of any matrix A. If A = I, this equation becomes x = λx. Since x ≠ 0, this equation implies λ = 1; then, from x = 1x, every (nonzero) vector is an eigenvector of I. Remember the definition: x is an eigenvector of a matrix A if Ax is a scalar multiple of x and x ≠ 0. Since multiplication by I leaves x unchanged, every (nonzero) vector must be an eigenvector of I, and the only possible scalar multiple (eigenvalue) is 1. ■
Example 6: The Cayley-Hamilton Theorem states that any square matrix satisfies its own characteristic equation; that is, if A has characteristic polynomial p(λ), then p(A) = 0. To illustrate, consider the matrix

$$A = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}$$

from Example 1. Since its characteristic polynomial is p(λ) = λ² + 3λ + 2, the Cayley-Hamilton Theorem states that p(A) should equal the zero matrix, 0. This is verified as follows:

$$p(A) = A^2 + 3A + 2I = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}^2 + 3\begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix} + 2\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} = \begin{bmatrix}-5 & 6\\ -9 & 10\end{bmatrix} + \begin{bmatrix}3 & -6\\ 9 & -12\end{bmatrix} + \begin{bmatrix}2 & 0\\ 0 & 2\end{bmatrix} = \begin{bmatrix}0 & 0\\ 0 & 0\end{bmatrix} = \mathbf{0}$$
If A is an n by n matrix, then its characteristic polynomial has degree n. The Cayley-Hamilton Theorem then provides a way to express every integer power Aᵏ in terms of a polynomial in A of degree less than n. For example, for the 2 × 2 matrix above, the fact that A² + 3A + 2I = 0 implies A² = −3A − 2I. Thus, A² is expressed in terms of a polynomial of degree 1 in A. Now, by repeated applications, every positive integer power of this 2 by 2 matrix A can be expressed as a polynomial of degree less than 2. To illustrate, note the following calculation for expressing A⁵ in terms of a linear polynomial
in A; the key is to consistently replace A² by −3A − 2I and simplify:

$$\begin{aligned} A^5 &= A^2\cdot A^2\cdot A\\ &= (-3A - 2I)\cdot(-3A - 2I)\cdot A\\ &= (9A^2 + 12A + 4I)\cdot A\\ &= [9(-3A - 2I) + 12A + 4I]\cdot A\\ &= (-15A - 14I)\cdot A\\ &= -15A^2 - 14A\\ &= -15(-3A - 2I) - 14A\\ &= 31A + 30I \end{aligned}$$

This result yields

$$A^5 = 31A + 30I = 31\begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix} + 30\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} = \begin{bmatrix}61 & -62\\ 93 & -94\end{bmatrix}$$

a calculation which you are welcome to verify by performing the repeated multiplication

$$A^5 = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}\begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}\begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}\begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}\begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}$$

The Cayley-Hamilton Theorem can also be used to express the inverse of an invertible matrix A as a polynomial in A. For example, for the 2 by 2 matrix A above,

$$\begin{aligned} A^2 + 3A + 2I &= \mathbf{0}\\ A^2 + 3A &= -2I\\ A(A + 3I) &= -2I\\ A\cdot\left[-\tfrac{1}{2}(A + 3I)\right] &= I\\ A^{-1} &= -\tfrac{1}{2}(A + 3I) \qquad (*) \end{aligned}$$
This result can be easily verified. The inverse of an invertible 2 by 2 matrix is found by first interchanging the entries on the diagonal, then taking the opposite of each off-diagonal entry, and, finally, dividing by the determinant of A. Since det A = 2,

$$A = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix} \implies A^{-1} = \frac{1}{2}\begin{bmatrix}-4 & 2\\ -3 & 1\end{bmatrix} = \begin{bmatrix}-2 & 1\\ -\tfrac{3}{2} & \tfrac{1}{2}\end{bmatrix}$$

but

$$-\tfrac{1}{2}(A + 3I) = -\frac{1}{2}\left(\begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix} + 3\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}\right) = -\frac{1}{2}\begin{bmatrix}4 & -2\\ 3 & -1\end{bmatrix} = \begin{bmatrix}-2 & 1\\ -\tfrac{3}{2} & \tfrac{1}{2}\end{bmatrix}$$

validating the expression in (*) for A⁻¹. The same ideas used to express any positive integer power of an n by n matrix A in terms of a polynomial of degree less than n can also be used to express any negative integer power of (an invertible matrix) A in terms of such a polynomial. ■
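All three Cayley-Hamilton consequences derived in this example can be checked in a few lines; a sketch assuming Python with NumPy (not part of the original text):

```python
import numpy as np

A = np.array([[1.0, -2.0], [3.0, -4.0]])
I = np.eye(2)

print(A @ A + 3*A + 2*I)                                       # the zero matrix
print(np.allclose(np.linalg.matrix_power(A, 5), 31*A + 30*I))  # True: A^5 = 31A + 30I
print(np.allclose(np.linalg.inv(A), -(A + 3*I) / 2))           # True: A^-1 = -(1/2)(A + 3I)
```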
Example 7: Let A be a square matrix. How do the eigenvalues and associated eigenvectors of A² compare with those of A? Assuming that A is invertible, how do the eigenvalues and associated eigenvectors of A⁻¹ compare with those of A?

Let λ be an eigenvalue of the matrix A, and let x be a corresponding eigenvector. Then Ax = λx, and it follows from this equation that

$$A^2\mathbf{x} = A(A\mathbf{x}) = A(\lambda\mathbf{x}) = \lambda(A\mathbf{x}) = \lambda(\lambda\mathbf{x}) = \lambda^2\mathbf{x}$$

Therefore, λ² is an eigenvalue of A², and x is the corresponding eigenvector. Now, if A is invertible, then A has no zero eigenvalues, and the following calculations are justified:
$$A\mathbf{x} = \lambda\mathbf{x} \implies A^{-1}(A\mathbf{x}) = A^{-1}(\lambda\mathbf{x}) \implies \mathbf{x} = \lambda(A^{-1}\mathbf{x}) \implies \lambda^{-1}\mathbf{x} = A^{-1}\mathbf{x}$$

so λ⁻¹ is an eigenvalue of A⁻¹ with corresponding eigenvector x. ■
Eigenspaces

Let A be an n × n matrix and consider the set E = {x ∈ Rⁿ : Ax = λx}. If x ∈ E, then so is tx for any scalar t, since

$$A(t\mathbf{x}) = t(A\mathbf{x}) = t(\lambda\mathbf{x}) = \lambda(t\mathbf{x}) \implies t\mathbf{x}\in E$$

Furthermore, if x₁ and x₂ are in E, then

$$A(\mathbf{x}_1 + \mathbf{x}_2) = A\mathbf{x}_1 + A\mathbf{x}_2 = \lambda\mathbf{x}_1 + \lambda\mathbf{x}_2 = \lambda(\mathbf{x}_1 + \mathbf{x}_2) \implies \mathbf{x}_1 + \mathbf{x}_2\in E$$

These calculations show that E is closed under scalar multiplication and vector addition, so E is a subspace of Rⁿ. Clearly, the zero vector belongs to E; but more notably, the nonzero elements in E are precisely the eigenvectors of A corresponding to the eigenvalue λ. When the zero vector is adjoined to the collection of eigenvectors corresponding to a particular eigenvalue, the resulting collection,

$$\{\text{eigenvectors of } A \text{ corresponding to the eigenvalue } \lambda\}\cup\{\mathbf{0}\}$$
forms a vector space called the eigenspace of A corresponding to the eigenvalue λ. Since it depends on both A and the selection of one of its eigenvalues, the notation

$$E_\lambda(A) = \{\mathbf{x} : A\mathbf{x} = \lambda\mathbf{x}\}$$

will be used to denote this space. Since the equation Ax = λx is equivalent to (A − λI)x = 0, the eigenspace E_λ(A) can also be characterized as the nullspace of A − λI:

$$E_\lambda(A) = \{\mathbf{x} : A\mathbf{x} = \lambda\mathbf{x}\} = \{\mathbf{x} : (A - \lambda I)\mathbf{x} = \mathbf{0}\} = N(A - \lambda I)$$

This observation provides an immediate proof that E_λ(A) is a subspace of Rⁿ.

Recall the matrix

$$A = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}$$

given in Example 3 above. The determination of the eigenvectors of A shows that its eigenspaces are

$$E_{-1}(A) = \left\{\mathbf{x}\in\mathbf{R}^2 : \mathbf{x} = t\begin{bmatrix}1\\1\end{bmatrix},\ t\in\mathbf{R}\right\} \qquad\text{and}\qquad E_{-2}(A) = \left\{\mathbf{x}\in\mathbf{R}^2 : \mathbf{x} = t\begin{bmatrix}2\\3\end{bmatrix},\ t\in\mathbf{R}\right\}$$

E₋₁(A) is the line in R² through the origin and the point (1, 1), and E₋₂(A) is the line through the origin and the point (2, 3). Both of these eigenspaces are 1-dimensional subspaces of R².
Example 8: Determine the eigenspaces of the matrix

$$B = \begin{bmatrix}1 & 0 & 2\\ 0 & 3 & 0\\ 2 & 0 & 1\end{bmatrix}$$

First, form the matrix

$$B - \lambda I = \begin{bmatrix}1-\lambda & 0 & 2\\ 0 & 3-\lambda & 0\\ 2 & 0 & 1-\lambda\end{bmatrix} \qquad (*)$$

The determinant will be computed by performing a Laplace expansion along the second row:

$$\det(B - \lambda I) = (3-\lambda)\det\begin{bmatrix}1-\lambda & 2\\ 2 & 1-\lambda\end{bmatrix} = (3-\lambda)\left[(1-\lambda)^2 - 2^2\right] = (3-\lambda)\left[(1-\lambda)+2\right]\left[(1-\lambda)-2\right] = (3-\lambda)(3-\lambda)(-1-\lambda)$$

The roots of the characteristic equation, (3 − λ)(3 − λ)(−1 − λ) = 0, are clearly λ = −1 and 3, with 3 being a double root; these are the eigenvalues of B. The associated eigenvectors can now be found. Substituting λ = −1 into the matrix B − λI in (*) gives
$$(B - \lambda I)\big|_{\lambda=-1} = B + I = \begin{bmatrix}2 & 0 & 2\\ 0 & 4 & 0\\ 2 & 0 & 2\end{bmatrix}$$

which is the coefficient matrix for the equation (B − λI)x = 0 with λ = −1, which determines the eigenvectors corresponding to the eigenvalue λ = −1. These eigenvectors are the nonzero solutions of

$$\begin{bmatrix}2 & 0 & 2\\ 0 & 4 & 0\\ 2 & 0 & 2\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} = \begin{bmatrix}0\\0\\0\end{bmatrix} \qquad\Longrightarrow\qquad \begin{aligned}2x_1 + 2x_3 &= 0\\ 4x_2 &= 0\\ 2x_1 + 2x_3 &= 0\end{aligned}$$

The identical first and third equations imply that x₁ + x₃ = 0, that is, x₃ = −x₁, and the second equation says x₂ = 0. Therefore, the eigenvectors of B associated with the eigenvalue λ = −1 are all vectors of the form (x₁, 0, −x₁)ᵀ = x₁(1, 0, −1)ᵀ for x₁ ≠ 0. Removing the restriction that the scalar multiple be nonzero includes the zero vector and gives the full eigenspace:

$$E_{-1}(B) = \left\{\mathbf{x}\in\mathbf{R}^3 : \mathbf{x} = t\begin{bmatrix}1\\0\\-1\end{bmatrix},\ t\in\mathbf{R}\right\}$$

Now, since
$$(B - \lambda I)\big|_{\lambda=3} = B - 3I = \begin{bmatrix}1-3 & 0 & 2\\ 0 & 3-3 & 0\\ 2 & 0 & 1-3\end{bmatrix} = \begin{bmatrix}-2 & 0 & 2\\ 0 & 0 & 0\\ 2 & 0 & -2\end{bmatrix}$$

the eigenvectors corresponding to the eigenvalue λ = 3 are the nonzero solutions of
$$\begin{bmatrix}-2 & 0 & 2\\ 0 & 0 & 0\\ 2 & 0 & -2\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} = \begin{bmatrix}0\\0\\0\end{bmatrix} \qquad\Longrightarrow\qquad \begin{aligned}-2x_1 + 2x_3 &= 0\\ 2x_1 - 2x_3 &= 0\end{aligned}$$

These equations imply that x₃ = x₁, and since there is no restriction on x₂, this component is arbitrary. Therefore, the eigenvectors of B associated with λ = 3 are all nonzero vectors of the form (x₁, x₂, x₁)ᵀ = x₁(1, 0, 1)ᵀ + x₂(0, 1, 0)ᵀ. The inclusion of the zero vector gives the eigenspace:

$$E_3(B) = \left\{\mathbf{x}\in\mathbf{R}^3 : \mathbf{x} = t_1\begin{bmatrix}1\\0\\1\end{bmatrix} + t_2\begin{bmatrix}0\\1\\0\end{bmatrix},\ t_1, t_2\in\mathbf{R}\right\}$$

Note that dim E₋₁(B) = 1 and dim E₃(B) = 2. ■
Diagonalization

First, a theorem:

Theorem O. Let A be an n by n matrix. If the n eigenvalues of A are distinct, then the corresponding eigenvectors are linearly independent.

Proof. The proof of this theorem will be presented explicitly for n = 2; the proof in the general case can be constructed based on the same method. Therefore, let A be 2 by 2, and denote its eigenvalues by λ₁ and λ₂ and the corresponding eigenvectors by v₁ and v₂ (so that Av₁ = λ₁v₁ and Av₂ = λ₂v₂). The goal is to prove that if λ₁ ≠ λ₂, then v₁ and v₂ are linearly independent. Assume that

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 = \mathbf{0} \qquad (*)$$

is a linear combination of v₁ and v₂ that gives the zero vector; the goal is to show that the above equation implies that c₁ and c₂ must be zero. First, multiply both sides of (*) by the matrix A:

$$A(c_1\mathbf{v}_1 + c_2\mathbf{v}_2) = c_1(A\mathbf{v}_1) + c_2(A\mathbf{v}_2) = \mathbf{0}$$

Next, use the fact that Av₁ = λ₁v₁ and Av₂ = λ₂v₂ to write

$$c_1(\lambda_1\mathbf{v}_1) + c_2(\lambda_2\mathbf{v}_2) = \mathbf{0} \qquad (**)$$

Now, multiply both sides of (*) by λ₂ and subtract the resulting equation, c₁λ₂v₁ + c₂λ₂v₂ = 0, from (**):

$$c_1(\lambda_1 - \lambda_2)\mathbf{v}_1 = \mathbf{0}$$

Since the eigenvalues are distinct, λ₁ − λ₂ ≠ 0, and since v₁ ≠ 0 (v₁ is an eigenvector), this last equation implies that c₁ = 0. Multiplying both sides of (*) by λ₁ and subtracting the resulting equation from (**) leads to c₂(λ₂ − λ₁)v₂ = 0 and then, by the same reasoning, to the conclusion that c₂ = 0 also. ■
Using the same notation as in the proof of Theorem O, assume that A is a 2 by 2 matrix with distinct eigenvalues and form the matrix

$$V = \begin{bmatrix}\mathbf{v}_1 & \mathbf{v}_2\end{bmatrix}$$

whose columns are the eigenvectors of A. Now consider the product AV; since Av₁ = λ₁v₁ and Av₂ = λ₂v₂,

$$AV = A\begin{bmatrix}\mathbf{v}_1 & \mathbf{v}_2\end{bmatrix} = \begin{bmatrix}A\mathbf{v}_1 & A\mathbf{v}_2\end{bmatrix} = \begin{bmatrix}\lambda_1\mathbf{v}_1 & \lambda_2\mathbf{v}_2\end{bmatrix} \qquad (*)$$

This last matrix can be expressed as the following product:

$$\begin{bmatrix}\lambda_1\mathbf{v}_1 & \lambda_2\mathbf{v}_2\end{bmatrix} = \begin{bmatrix}\mathbf{v}_1 & \mathbf{v}_2\end{bmatrix}\begin{bmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{bmatrix} \qquad (**)$$
If Λ denotes the diagonal matrix whose entries are the eigenvalues of A,

$$\Lambda = \begin{bmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{bmatrix}$$

then equations (*) and (**) together imply AV = VΛ. If v₁ and v₂ are linearly independent, then the matrix V is invertible. Form the matrix V⁻¹ and left multiply both sides of the equation AV = VΛ by V⁻¹:

$$V^{-1}AV = \Lambda = \begin{bmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{bmatrix}$$

(Although this calculation has been shown for n = 2, it clearly can be applied to an n by n matrix of any size.) This process of forming the product V⁻¹AV, resulting in the diagonal matrix Λ of its eigenvalues, is known as the diagonalization of the matrix A, and the matrix of eigenvectors, V, is said to diagonalize A. The key to diagonalizing an n by n matrix A is the ability to form the n by n eigenvector matrix V and its inverse; this requires a full set of n linearly independent eigenvectors. A sufficient (but not necessary) condition that will
guarantee that this requirement is fulfilled is provided by Theorem O: if the n by n matrix A has n distinct eigenvalues.

One useful application of diagonalization is to provide a simple way to express integer powers of the matrix A. If A can be diagonalized, then V⁻¹AV = Λ, which implies

$$A = V\Lambda V^{-1}$$

When expressed in this form, it is easy to form integer powers of A. For example, if k is a positive integer, then

$$A^k = (V\Lambda V^{-1})^k = \underbrace{(V\Lambda V^{-1})(V\Lambda V^{-1})\cdots(V\Lambda V^{-1})}_{k\ \text{factors}} = V\Lambda(V^{-1}V)\Lambda(V^{-1}V)\cdots\Lambda V^{-1} = V\Lambda^k V^{-1}$$

The power Λᵏ is trivial to compute: If λ₁, λ₂, ..., λₙ are the entries of the diagonal matrix Λ, then Λᵏ is diagonal with entries λ₁ᵏ, λ₂ᵏ, ..., λₙᵏ. Therefore,

$$A^k = V\begin{bmatrix}\lambda_1^k & & \\ & \ddots & \\ & & \lambda_n^k\end{bmatrix}V^{-1}$$

Example 9: Compute A¹⁰ for the matrix

$$A = \begin{bmatrix}1 & -2\\ 3 & -4\end{bmatrix}$$
This is the matrix of Example 1. Its eigenvalues are λ₁ = −1 and λ₂ = −2, with corresponding eigenvectors v₁ = (1, 1)ᵀ and v₂ = (2, 3)ᵀ. Since these eigenvectors are linearly independent (which was to be expected, since the eigenvalues are distinct), the eigenvector matrix V has an inverse,

$$V = \begin{bmatrix}1 & 2\\ 1 & 3\end{bmatrix} \qquad\Longrightarrow\qquad V^{-1} = \begin{bmatrix}3 & -2\\ -1 & 1\end{bmatrix}$$

Thus, A can be diagonalized, and the diagonal matrix Λ = V⁻¹AV is

$$\Lambda = \begin{bmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{bmatrix} = \begin{bmatrix}-1 & 0\\ 0 & -2\end{bmatrix}$$

Therefore,

$$A^{10} = (V\Lambda V^{-1})^{10} = V\Lambda^{10}V^{-1} = \begin{bmatrix}1 & 2\\ 1 & 3\end{bmatrix}\begin{bmatrix}(-1)^{10} & 0\\ 0 & (-2)^{10}\end{bmatrix}\begin{bmatrix}3 & -2\\ -1 & 1\end{bmatrix} = \begin{bmatrix}1 & 2\\ 1 & 3\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & 1024\end{bmatrix}\begin{bmatrix}3 & -2\\ -1 & 1\end{bmatrix} = \begin{bmatrix}3 - 2\cdot 1024 & -2 + 2\cdot 1024\\ 3 - 3\cdot 1024 & -2 + 3\cdot 1024\end{bmatrix} = \begin{bmatrix}-2045 & 2046\\ -3069 & 3070\end{bmatrix}$$
■
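The diagonalization shortcut for powers is easy to test; a sketch assuming Python with NumPy (not part of the original text):

```python
import numpy as np

A = np.array([[1.0, -2.0], [3.0, -4.0]])
V = np.array([[1.0, 2.0], [1.0, 3.0]])   # columns: eigenvectors (1,1) and (2,3)
L = np.diag([-1.0, -2.0])                # corresponding eigenvalues

A10 = V @ np.linalg.matrix_power(L, 10) @ np.linalg.inv(V)
print(A10)                                               # [[-2045 2046], [-3069 3070]]
print(np.allclose(A10, np.linalg.matrix_power(A, 10)))   # True
```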
Although an n by n matrix with n distinct eigenvalues is guaranteed to be diagonalizable, an n by n matrix that does not have n distinct eigenvalues may still be diagonalizable. If the eigenspace corresponding to each k-fold root λ of the characteristic equation is k-dimensional, then the matrix will be diagonalizable. In other words, diagonalization is guaranteed if the geometric multiplicity of each eigenvalue (that is, the dimension of its corresponding eigenspace) matches its algebraic multiplicity (that is, its multiplicity as a root of the characteristic equation).

Here's an illustration of this result. The 3 by 3 matrix

$$B = \begin{bmatrix}1 & 0 & 2\\ 0 & 3 & 0\\ 2 & 0 & 1\end{bmatrix}$$

of Example 8 has just two eigenvalues: λ₁ = −1 and λ₂ = 3. The algebraic multiplicity of the eigenvalue λ₁ = −1 is one, and its corresponding eigenspace, E₋₁(B), is one-dimensional. Furthermore, the algebraic multiplicity of the eigenvalue λ₂ = 3 is two, and its corresponding eigenspace, E₃(B), is two-dimensional. Therefore, the geometric multiplicities of the eigenvalues of B match their algebraic multiplicities. The conclusion, then, is that although the 3 by 3 matrix B does not have 3 distinct eigenvalues, it is nevertheless diagonalizable. Here's the verification: Since {(1, 0, −1)ᵀ} is a basis for the 1-dimensional eigenspace corresponding to the eigenvalue λ₁ = −1, and {(0, 1, 0)ᵀ, (1, 0, 1)ᵀ} is a basis for the 2-dimensional eigenspace corresponding to the eigenvalue λ₂ = 3, the matrix of eigenvectors reads
$$V = \begin{bmatrix}1 & 0 & 1\\ 0 & 1 & 0\\ -1 & 0 & 1\end{bmatrix}$$

Since the key to the diagonalization of the original matrix B is the invertibility of this matrix, V, evaluate det V and check that it is nonzero. Because det V = 2, the matrix V is invertible,

$$V^{-1} = \begin{bmatrix}\tfrac12 & 0 & -\tfrac12\\ 0 & 1 & 0\\ \tfrac12 & 0 & \tfrac12\end{bmatrix}$$

so B is indeed diagonalizable:

$$V^{-1}BV = \begin{bmatrix}\tfrac12 & 0 & -\tfrac12\\ 0 & 1 & 0\\ \tfrac12 & 0 & \tfrac12\end{bmatrix}\begin{bmatrix}1 & 0 & 2\\ 0 & 3 & 0\\ 2 & 0 & 1\end{bmatrix}\begin{bmatrix}1 & 0 & 1\\ 0 & 1 & 0\\ -1 & 0 & 1\end{bmatrix} = \begin{bmatrix}-1 & 0 & 0\\ 0 & 3 & 0\\ 0 & 0 & 3\end{bmatrix} = \Lambda$$
Example 10: Diagonalize the matrix

$$A = \begin{bmatrix}2 & -1\\ -3 & 4\end{bmatrix}$$

First, find the eigenvalues; since

$$\det(A - \lambda I) = \det\begin{bmatrix}2-\lambda & -1\\ -3 & 4-\lambda\end{bmatrix} = (2-\lambda)(4-\lambda) - 3 = \lambda^2 - 6\lambda + 5 = (\lambda - 1)(\lambda - 5)$$

the eigenvalues are λ = 1 and λ = 5. Because the eigenvalues are distinct, A is diagonalizable. Verify that an eigenvector corresponding to λ = 1 is v₁ = (1, 1)ᵀ, and an eigenvector corresponding to λ = 5 is v₂ = (1, −3)ᵀ. Therefore, the diagonalizing matrix is

$$V = \begin{bmatrix}\mathbf{v}_1 & \mathbf{v}_2\end{bmatrix} = \begin{bmatrix}1 & 1\\ 1 & -3\end{bmatrix}$$

and

$$\Lambda = \begin{bmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & 5\end{bmatrix}$$

Another application of diagonalization is in the construction of simple representative matrices for linear operators. Let A be the matrix defined above and consider the linear operator on R² given by T(x) = Ax. In terms of the nonstandard basis B = {v₁ = (1, 1)ᵀ, v₂ = (1, −3)ᵀ} for R², the matrix of T relative to B is Λ. Review Example 16 on pages 270-271. ■
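A final numerical check of Example 10's diagonalization, sketched with NumPy (not part of the original text):

```python
import numpy as np

A = np.array([[2.0, -1.0], [-3.0, 4.0]])
V = np.array([[1.0, 1.0], [1.0, -3.0]])   # columns: eigenvectors for 1 and 5

print(np.linalg.inv(V) @ A @ V)           # [[1. 0.], [0. 5.]] = Lambda
```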