A First Course in Module Theory

A First Course in Module Theory This page is intentionally left blank A First Course in Module Theory M E Keating ...

Author: Keating M.E.

272 downloads 1831 Views 108MB Size Report

This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!

Report copyright / DMCA form

DOWNLOAD PDF

A First Course in Module Theory

This page is intentionally left blank

A First Course in Module Theory

M E Keating

Imperial College, London

ICP

Imperial College Press

Published by Imperial College Press 203 Electrical Engineering Building Imperial College London SW7 2BT Distributed by World Scientific Publishing Co. Pte. Ltd. P O Box 128, Farrer Road, Singapore 912805 USA office: Suite IB, 1060 Main Street, River Edge, NJ 07661 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data Keating, M. E., 1941A first course in module theory / M. E. Keating. p. cm. Includes bibliographical references and index. ISBN 186094096X(alk. paper) 1. Modules (Algebra) QA247.K43 1998 512'.4--dc21

98-9963 CIP

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

Copyright © 1998 by Imperial College Press All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher. This book is printed on acid-free paper.

Printed in Singapore by UtoCPdjOyrig/lfec/

Material

To Valerie and Christopher

This page is intentionally left blank

Introduction The purpose of this book is to provide an introduction to module theory for a reader who knows something of linear algebra and elementary ring theory. There is a very natural theme for a first course in module theory, namely the structure theory of modules over Euclidean domains. This theory is very explicit, and it has interesting and surprisingly disparate interpretations. An abelian group can be regarded as a module over the ring of integers Z, while a matrix with entries in a field F defines a module over the polyno mial ring F[X]. As both Z and F[X] are examples of Euclidean domains, the general theory of modules over Euclidean domains leads to specific re sults about abelian groups and about matrices. In the former, we obtain a classification of finitely generated abelian groups, and, in the latter, a description of the rational canonical form and the Jordan normal form of a matrix. Although the structure theory for modules over Euclidean domains is the core of this text, we also consider modules over more general, even noncommutative, rings of coefficients. This extra generality allows us to discuss the limitations and some of the extensions of our main results. The contents of this text are based on a final year undergraduate course that I gave a number of times at Imperial College, London, with some additional material. In the lecture course, I assumed that everyone was familiar with the elementary properties of rings, ideals and Euclidean do mains. Here I have provided an introduction to ring theory in the first two chapters, so that the text is more self-contained than the lectures, and a greater variety of rings can be used. Chapters 3 to 7 expound the basics of module theory, including meth ods of comparing, constructing and decomposing modules. The results in these chapters are rather general and do not depend much on the ring of coefficients. Chapters 8 to 12 are the heart of this text, since it is here that we obtain the strong results that are special to Euclidean domains. Chapter 12 also contains two applications of the theory, to abelian groups vn

Vlll

Introduction

and to lattices. In Chapter 13, we use the module theory to find two standard forms for a square matrix, namely, the rational canonical form and the Jordan normal form of a matrix. In addition to the usual version of the Jordan normal form of a matrix over the complex numbers, we give two further versions that apply to a matrix whose entries are taken from a field other than the complex numbers. The second of these variations is used, without proof, in a fundamental paper on representation theory by J A Green [Green], and a proof has not been published before at an elementary level, as far as I know. I am grateful to my colleague Gordon James for drawing my attention to these versions of the Jordan normal form. In the final chapter, we go beyond the bounds of Euclidean domains to look at some basic results on projective modules over rings in general. Here, we establish some basic facts about group algebras and the relationship between module theory and the representation theory of groups. As befits a book that is intended as a first course for undergraduates, arguments are given in considerable detail, at least, in the earlier part of text. There is an index entry "proofs in full detail" to help the reader to locate these agruments. There are many explicit illustrations and exercises, and some hints and partial solutions to the exercises are provided in an appendix. Material that was not covered in the original lecture course is indicated C by a "supplementary" symbol as in the margin. This material in not essen tial for the core results on modules over Euclidean domains, and it can be omitted if the reader wishes. Each chapter has a section headed "Further developments" in which the definitions and results of the chapter are placed in a wider setting. References are provided to enable the interested reader to follow up the topics that are introduced in these sections. I hope that these sections will provide a useful source of projects for students who are in the final year of a four-year undergraduate MMath or MSci course at a UK university, or who are taking the new one-year preliminary research degree, the "MRes". The text is divided into chapters and sections, which are numbered as you would expect: 1.1, 1.2, and so on. For ease of reference, results are numbered consecutively within each section, and so they appear in the form "12.1.1 Theorem", "12.1.2 Lemma", etc. This book was written while I was also working on the more advanced texts [B & K: IRM] and [B & K: CM] with Jon Berrick of the National Uni versity of Singapore. My collaboration with Jon has had many influences, both on my teaching of the lecture course and on the composition of this text, which it is impossible to acknowledge individually.

Introduction

IX

Finally, I should thank the Mathematics Department at Imperial for al lowing me the time and space in which to write textbooks. A special thanks also to my colleagues at Imperial for their assistance in mastering IMjrX. Phillip Kent of the METRIC Project ran an introductory course which got me started, and Oliver Pretzel provided me with the LXTJTJX software, and kindly and patiently explained why it sometimes did not do what I hoped.

This page is intentionally left blank

Contents Introduction

vii

1 Rings and Ideals 1.1 Groups 1.2 Rings 1.3 Commutative domains 1.4 Units 1.5 Fields 1.6 Polynomial rings 1.7 Ideals 1.8 Principal ideals 1.9 Sum and intersection 1.10 Residue rings 1.11 Residues of integers Exercises

1 1 3 4 4 5 5 7 8 9 10 12 14

2

17 17 18 18 19 20 21 22 23 25 26 28 29 31

Euclidean D o m a i n s 2.1 The definition 2.2 The integers 2.3 Polynomial rings 2.4 The Gaussian integers 2.5 Units and ideals 2.6 Greatest common divisors 2.7 Euclid's algorithm 2.8 Factorization 2.9 Standard factorizations 2.10 Irreducible elements 2.11 Residue rings of Euclidean domains 2.12 Residue rings of polynomial rings 2.13 Splitting fields for polynomials xi

rings

xii

Contents 2.14 Further developments Exercises

31 32

3

Modules and Submodules 3.1 The definition 3.2 Additive groups 3.3 Matrix actions 3.4 Actions of scalar matrices 3.5 Submodules 3.6 Sum and intersection 3.7 fc-foldsums fc-fold sums 3.8 Generators 3.9 Matrix actions again 3.10 Eigenspaces 3.11 Example: a triangular matrix action 3.12 Example: a rotation Exercises

35 35 37 37 39 40 41 43 43 45 46 47 48 48

4

Homomorphisms 4.1 The definition 4.2 Sums and products 4.3 Multiplication homomorphisms 4.4 F[X]-modules in general 4.5 F[X]-module homomorphisms 4.6 The matrix interpretation 4.7 Example: p = 1 4.8 Example: a triangular action 4.9 Kernel and image 4.10 Rank & nullity 4.11 Some calculations 4.12 Isomorphisms 4.13 A submodule correspondence Exercises

51 51 53 54 55 56 57 57 58 58 60 61 62 64 65

5

Free 5.1 5.2 5.3 5.4 5.5 5.6 5.7

69 70 71 73 74 76 77 79

Modules The standard free modules Free modules in general A running example Bases and isomorphisms Uniqueness of rank Change of basis Coordinates

Contents

xiii

5.8 Constructing bases 5.9 Matrices and homomorphisms 5.10 Illustration: the standard case 5.11 Matrices and change of basis 5.12 Determinants and invertible matrices Exercises

80 81 83 84 85 88

6

Quotient Modules and Cyclic Modules 6.1 Quotient modules 6.2 The canonical homomorphism 6.3 Induced homomorphisms 6.4 Cyclic modules 6.5 Submodules of cyclic modules 6.6 The companion matrix 6.7 Cyclic modules over polynomial rings 6.8 Further developments Exercises

91 92 92 93 94 96 99 100 102 102

7

Direct Sums of Modules 7.1 Internal direct sums 7.2 A diagrammatic interpretation 7.3 Indecomposable modules 7.4 Many components 7.5 Block diagonal actions 7.6 External direct sums 7.7 Switching between internal Sz external 7.8 The Chinese Remainder Theorem Exercises

107 107 109 Ill 112 113 114 115 116 118

8

Torsion and the Primary Decomposition 8.1 Torsion elements and modules 8.2 Annihilators of modules 8.3 Primary modules 8.4 The p-primary component 8.5 Cyclic modules 8.6 Further developments Exercises

123 124 125 126 128 130 130 131

9

Presentations 9.1 The definition 9.2 Relations 9.3 Defining a module by relations

133 134 135 136

xiv

Contents 9.4 The fundamental problem 9.5 The presentation matrix 9.6 The presentation homomorphism 9.7 F[X]-module presentations 9.8 Further developments Exercises

136 139 140 141 142 143

10 Diagonalizing and Inverting Matrices 10.1 Elementary operations 10.2 The effect on defining relations 10.3 A matrix interpretation 10.4 Row & column operations in general 10.5 The invariant factor form 10.6 Equivalence of matrices 10.7 A computational technique 10.8 Invertible matrices 10.9 Further developments Exercises

145 145 146 149 150 152 155 156 158 160 161

11 Fitting Ideals 11.1 The definition 11.2 Elementary properties 11.3 Uniqueness of invariant factors 11.4 The characteristic polynomial 11.5 Further developments Exercises

163 163 165 166 167 168 169

12 The Decomposition of Modules 12.1 Submodules of free modules 12.2 Invariant factor presentations 12.3 The invariant factor decomposition 12.4 Some illustrations 12.5 The primary decomposition 12.6 The illustrations, again 12.7 Reconstructing the invariant factors 12.8 The uniqueness results 12.9 A summary 12.10 Abelian groups 12.11 Lattices 12.12 Further developments Exercises

171 171 174 176 178 179 181 182 183 185 186 187 190 190

Contents

xv

13 Normal Forms for Matrices 13.1 F[X]-modules and similarity 13.2 The minimum polynomial 13.3 The rational canonical form 13.4 The Jordan normal form: split case 13.5 A comparison of computations 13.6 The Jordan normal form: nonsplit case 13.7 The Jordan normal form: separable case 13.8 Nilpotent matrices 13.9 Roots of unity 13.10 Further developments Exercises

193 194 195 197 200 202 203 206 209 209 211 211

14 Projective Modules 14.1 The definition 14.2 Split homomorphisms 14.3 Semisimple rings 14.4 Representations of groups 14.5 Hereditary rings Exercises

215 215 216 220 222 224 225

Hints and Solutions

229

Bibliography

243

Index

245

Chapter 1

Rings and Ideals Each module has a ring of scalars associated with it. Therefore, we must discuss rings before we can begin to tackle modules. In this chapter, we collect together the basic definitions and properties of rings that we will need in subsequent chapters. We consider ideals, which tell us about the internal structure of a ring, and we look at some special types of ring, particularly fields and polynomial rings. We also give the construction of residue rings, which is an important method for obtaining new rings from old. A reader who has already met rings, ideals and Euclidean domains may prefer to go directly to the start of our discussion of module theory in Chapter 3, using this chapter and the next for reference. We precede the definition of a ring with the definition of a more funda mental structure, namely, a group.

1.1

Groups

We encounter two notations for groups, additive and multiplicative, which are used according to the context in which the group arises. First, we introduce the notation that is most often met in ring theory and module theory. An additive group is a (nonempty) set A together with a law of compo sition + , called addition, which behaves as you might expect. Thus for any a, b e A, there is an element a + b £ A which is called the sum of a and b, and the following axioms must be satisfied. 1

Chapter 1. Rings and Ideals

2

Al:

Associativity. (a + b) + c = a + (b + c) for all a, b and c e A.

A2:

Commutativity. a + b = b + a for all a, b € A.

A3: Zero. There is a zero element 0 in A with a + 0 = a for all a in A. A4: Negatives. Each element a oi A has a negative —a so that a + (—a) = 0. It is usual to omit the bracket around a negative and write a + (—b) = a — b and (—a) + b = —a + 6. When studying groups in their own right, the law of composition is usually written in multiplicative notation instead of additive notation. In the multiplicative notation, a group is a (nonempty) set G in which any elements g,h in G have a product g ■ h, often written simply as gh, and the axioms are as follows. Gl:

Associativity. (/ • 9) ■ h = f ■ (g ■ h) for all / , g and

heG.

G2: Identity. There is an identity element 1 in G with g ■ 1 = g = 1 • g for all g in G. G3: Inverses. Each element g of G has an inverse g~l so that S - 9 " 1 = 1 = 0 - 1 •£■ A multiplicative group is abelian or commutative if g ■ h = h- g for all g,h e G. Notice that our list of axioms for a multiplicative group is not simply a translation of the list for an additive group. The difference is found in axiom A2, which demands that an additive group must be abelian. As groups are not our main concern in this text, we will not pause here to give examples of them. We will meet additive groups in profusion in our study of rings and modules, while multiplicative groups will be invoked to construct some examples of rings in Chapter 14.

1.2.

Rings

1.2

3

Rings

Informally, a ring is a set R in which arithmetic can be performed, that is, the members of R can be added and multiplied, in much the same way as integers or real numbers. However, there are two common properties of multiplication that do not hold in a general ring. The first of these properties is that a nonzero element of a ring need not have a multiplicative inverse in that ring. For example, the integer 2 has no inverse within the ring of integers Z, although it does have an inverse in the rational numbers Q. The second property is that multiplication need not be commutative, that is, we can have rs ^ sr for two elements r and s of a ring. Now for our formal definition of a ring. A ring is a nonempty set R, on which there are two laws of composition, addition and multiplication. Addition is indicated by the symbol +, so that for each pair r and s of members of R there is a sum r + s in R. Multiplication is usually indicated by simply writing the elements next to each other: for each pair r,s € R, there is a product rs in R. When it is more convenient to have a symbol for multiplication, we use a dot "•", so the product appears as r ■ s. We use the dot while writing out the axioms. Under addition, R must be an additive group as in the preceding section. The properties of multiplication, and the interaction between addition and multiplication, are given by the following axioms. RM 1:

Associativity. (r ■ s) ■ t = r ■ (s ■ t) for all r, s and t € R.

RM 2: Identity. There is an identity element 1 in R with r ■ 1 = r = 1 • r for all r in R. RM 3:

Distributivity.

For all r, s and t in R, (r + s)-t = r-t + s-t

and r ■ (s + t) = r ■ s + r ■ t.

We allow the possibility that 0 = 1 in a ring. In that event, r = rl = rO = 0 for every r 6 R, so R must be the trivial or zero ring 0 that has only one element. Many statements about rings or modules have trivial exceptional cases when they

Chapter 1. Rings and Ideals

4

s

are interpreted for the zero ring; as a rule, we will not state these exceptions separately. Familiar examples of rings are the ring of integers Z, the ring of rational numbers Q, the ring of real numbers R and the ring of complex numbers C. We shall assume the basic properties of these rings when we need to, as any attempt to establish them in full detail would take us too far from the point of this text. The more leisurely introduction to ring theory given by Allenby [Allenby] does cover these topics. Given a ring R, the set Mn(R) of n x n matrices with entries in R is also a ring under the usual addition and multiplication of matrices. The verification of this assertion is a worthy but lengthy exercise.

1.3

Commutative domains

We now introduce two properties which will be satisfied by many of the rings that we consider in this text. C: A ring R is commutative if rs = sr for all r,s G R. D: A ring R is a domain (alternative names are an integral domain or an entire ring) if R is not the zero ring, and whenever rs = 0 for r,s € R, then either r = 0 or s = 0 already. Familiar examples of domains are the rings Z, Q and R. The standard examples of rings which fail to be commutative or to be domains arise as matrix rings. Take R to be any (nontrivial) ring, Z for instance, and let M2{R) be the set of all 2 x 2 matrices over R. Then M2(R) is neither commutative nor a domain, for if we take r — and s = (

1.4

o o

), then rs =£ 0 but sr — 0.

Units

Let R be a ring. An element u of R is said to be a unit (or invertible) in R if the following condition holds. U: There is an element w in R so that uw = 1 and wu = 1. Such an element w is unique, and it is called the inverse of u. It is usually written u _ 1 or 1/u.

1.5. Fields

5

The inverse u _ 1 of a unit is itself a unit, with inverse u, and the product of two units u, v is again a unit, with inverse v~lu~l. This means that the set U(R) of units in R is a multiplicative group as in section 1.1. Thus U(Z) = { + 1 , - 1 } , while U(Q) is the set of all nonzero rational numbers. Notice that an element that is not a unit in one ring may become a unit in a bigger ring. An important type of ring is defined by the requirement that the zero element is the only non-unit.

1.5

Fields

F. A field is a nonzero commutative ring F in which every nonzero element is a unit of F. The rational numbers Q, the real numbers R and the complex numbers C are all examples of fields. It is a fact that any commutative domain R is contained in a field Q whose elements are of the form r ■ s~x = r/s for r , s € i J , s / 0 . The field Q is called the field of fractions or quotient field of R. For instance, Q is the field of fractions of Z. As the existence of the field of fractions is extremely believable in all the concrete examples that we consider, we will take it for granted. The technical details of the construction are given in several texts - for instance, section 3.10 of [Allenby] or [B & K: IRM] (1.1.12). A discussion of the existence and construction of rings of fractions for more general types of ring can be found in [Rowen] and [B & K: CM], among other texts.

1.6

Polynomial rings

We now introduce rings of polynomials, which play an important role in this text. For the moment, let F be any ring - later F will usually be a field. A polynomial with coefficients in F, or "over F", is an expression / = f(X) = f0 + fxX + f2X2

+■■■+

fmXm

with fo,fi,...,fm € F, where X is an "indeterminate" or "variable". The polynomials / and g = go + giX + g2X2 + ■■■ + gnXn in F[X], with, say, m < n, are equal if /* = #, for i = 1 , . . . , m and 9m+i

= ■ ■ ■ = 9n = 0-

Chapter 1. Rings and Ideals

6

We prefer the notation / to f(X) unless there is a special reason to mention the variable X. If ft = 0 for all t, then / is the zero polynomial 0. If ft = 0 for i > 1, then / is a constant polynomial. When / is not the zero polynomial, the degree of / is the largest index m with fm ^ 0, and we write deg(/) = m. Then fm is the leading term of / . For convenience, the zero polynomial is allocated the degree —oo. We write F[X] for the set of polynomials over F. Addition and multi plication of polynomials are defined by the standard rules: given f = fo + fiX + f2X2

+ ■■■ +

fmXm

and 9 = go + giX + g2X2 + ■■■ + gnXn in F[X], with m < n, their sum / + g is given by f + 9 = (fo+9o) + (fi+gi)X

+ --- + (fm + gm)Xm + gm+1Xm+1

and their product fg = f(X)g(X)

+ ---+gnXn

is the polynomial

ho + hxX + ■ ■ ■ + hkXk + ■■■ +

hm+nXm+n

where hk = fo9k H

h fi9k-i H

1- fk9o for k = 0,... ,m +n.

In particular, h0 = fogo, hi = f0gi + figo, and hm+n = fmgn. The verification that F[X] is actually a ring is a matter of careful calcu lation. The zero element of F[X] is the zero polynomial, while the identity element is the constant polynomial with /o = 1. Note: our definition of a polynomial is rather informal; for instance we have not defined the "variable" X. A more thorough construction of polynomials is given in section 1.6 of [Allenby]. Our first listed result is elementary but crucial. 1.6.1 Lemma (i) If a ring F is commutative, so also is the polynomial ring F[X]. (ii) If a ring F is a domain, so also is the polynomial ring F[X]. Proof Let f(X) = fo + fiX + hX2 + ■■■ +

fmXm

1.7. Ideals

7

and 9(X) =g0 + giX + g2X2 + ■■■+ gnXn be in F[X], so that fo,fi,...,fm product fg is the polynomial

and go,gi,...,gn

h0 + hiX + --- + hkXk + ■■■ +

are in F; then their hm+nXm+n

with hk = fo9k H 1- fi9k-i H 1- AitoWhen F is commutative, the /c-th coefficient of / # is the same as the fc-th coefficient of gf for k = 0,... ,m + n, since both are the sums of all possible terms ftgk~i, i = 0 , . . . , k, but in different orders. Thus fg = gf for all polynomials / , g, which gives the first assertion. To prove the second, suppose that / and g are both nonzero. We may as well assume that the polynomials have been written so that the coefficients fm and gn are both nonzero. Then the product hm+n = fmgn is also nonzero, so fg is nonzero. □

1.7

Ideals

Ideals are crucial to the investigation of modules. At an elementary level, they provide the first examples of modules, and, at a much deeper level, a knowledge of all the ideals of a ring sometimes enables us to describe the modules over the ring. Ideals come in three types. A left ideal of a ring R is a subset I of R with the following properties. Idl: Zero. The zero element 0 of R is in I. Id2: Additive closure. If x, y € / , then x + y e I. IdL3: Multiplicative closure. If r £ R and x € I, then rx € / . A right ideal I satisfies axioms Idl and Id2, but instead of IdL3 we have IdR3: IfreR and x € I, then xr € / . If I is simultaneously both a left and right ideal of R, we say that I is a two-sided ideal. If R is a commutative ring, then rx = xr for any x and r in R, so that conditions IdL3 and IdR3 are the same. Thus every ideal of R is both left and right and therefore two-sided. In this case, we refer simply to an ideal of R. We will soon see an example in which left and right ideals differ from one another. For any ring R, R itself is a two-sided ideal of R. A (left, right or two-sided) ideal / of R is called proper if I ^ R. The subset {0} of R is a two-sided ideal, called, naturally enough, the zero ideal of R. We usually write 0 for the zero ideal. Next, we see how ideals of a ring arise from the elements of the ring.

8

1.8

Chapter 1. Rings and Ideals

Principal ideals

For any fixed element a of a ring R, let Ra = {ra | r 6 R}. Then Ra is a left ideal of R, called the principal left ideal generated by a. The element a is called a generator of Ra. The principal right ideal generated by a is aR = {ar | r S R}. When R is commutative, Ra = aR is the (two-sided) principal ideal generated by a. The concept of a principal ideal is one of the fundamental notions in this text, since we will usually impose conditions on the ring R which guarantee that all its ideals are principal. To make sure that we get off on the right foot, we give the almost trivial verification that Ra is actually a left ideal. Idl: 0 is in Ra since 0 = Oa. Id2: Suppose that ra and sa are in Ra, where r, s are in R. Then ra + sa = (r + s)a € Ra. Id3: Suppose that r £ R and so € Ra, where s £ R. Then r{sa) = (rs)a is also in Ra. Examples. 1. The principal (left or right) ideal generated by the zero element 0 of R is always the zero ideal, since rO = 0 = Or for every r in R. 2. The principal (left or right) ideal generated by the identity element 1 of R is always R itself, since r l = r = l r for every r in R. 3. In the ring of integers Z, there are ideals 2Z, 3Z, 4Z,....

g

By Lemma 1.8.1, these ideals are all distinct, and, in the next chapter, we see that they are the only proper nonzero ideals of Z (Theorem 2.5.3). Note: our usual rule is that we write a two-sided principal ideal as a left ideal Ra. However, it looks unnatural to write Z2 in place of 2Z, etc. 4. For an example in which left and right ideals differ, we take R to be the ring M 2 (R) of 2 x 2 matrices over the field R of real numbers. Let { 1 0 J. Then 611 = { 0 0

Ren

= {{Z

2) | a i i ' a 2 i G R }

and e

"

R

= { { 7

a

o)\an,a12eRy

1.9. Sum and intersection

9

In the case of greatest interest to us in this text, it is straightforward to determine whether or not two elements of a ring generate the same principal ideal. 1.8.1 Lemma Let R be a commutative domain, and let a and b be nonzero elements of R. Then Ra = Rb if and only if a = ub where u is a unit of R. In particular, Ra = R if and only if a is a unit of R. Proof Suppose that Ra = Rb. Then a = 1 • a is in Rb, so a = ub for some u in R. Similarly, b = wa for some u>. Then a = uwa, and so a(l — uw) = 0. As R is a domain and a ^ 0, we have 1 = uw, which shows that u is a unit with inverse w. Conversely, suppose that a = ub with u a unit. Then ra = (ru)b is in Rb for all r in R, and so Ra C Rb. But b = u~la, so the reverse inclusion also holds. The final assertion is obvious. □

1.9

Sum and intersection

Next we introduce some useful operations on ideals. Suppose that / and J are both left ideals, or both right ideals, or both two-sided ideals of R. Their sum I + J is defined as I + J = {x + y | x € I, y e J}, while their intersection I n J is the usual intersection of sets: 7 n J = { x | a ; € / and x 6 J } . 1.9.1 Lemma If I and J are both left ideals, right ideals or two-sided ideals of a ring R, then I + J and I tl J are also correspondingly left, right or two-sided ideals of R. Proof Suppose that I and J are both left ideals. First, we have 0 e I and 0 e J, giving 0 + 0 = 0 6 / + J . Next, let x + y and x' + y' be members of / + J, with x,x' £ / and y,y' e J. Then (x + y) + ix' + y') = (x + x') + (y + y'),

Chapter 1. Rings and Ideals

10

with x + x' e I and y + y' £ J, so I + J is closed under addition. For any r in R, we have rx £ I and ry £ J. Thus r(x + y) = rx + ry also belongs to J + J, which verifies the final condition. The verification for the intersection is even easier and we leave it to the reader, along with the remaining cases. D

1.10

Residue rings

The construction of a residue ring is a very useful method of obtaining a new ring with interesting properties. We will use this technique to construct some finite fields (Lemma 1.11.2), and to extend a field to a larger field which contains the roots of a given polynomial (Proposition 2.13.1). Let R be a ring and let / be a two-sided ideal of R. Informally, the idea behind the construction of the ring R/I of residues of R modulo I is that two elements r,s of R which differ by an element x in I should give the same element r in R/I. Thus, if r = s + x in R, then f = s in R/I. The point of such a construction can be illustrated by the special case where we take the ideal 2Z in the ring of integers Z. Then two integers r, s differ by an element 2a of 2Z precisely when 2 \r — s, that is, either r and s are both odd or r and s are both even. Thus we expect there to be two members of the residue ring Z/2Z, namely 0 and 1, corresponding in turn to the set of even integers and the set of odd integers. This illustration also explains the use of the term 'residue ring': when an integer is divided by 2, there are two possible 'residues' or 'remainders', 0 or 1. (However, some authors prefer the terms factor ring or quotient ring to residue ring.) Now we turn to the formal construction of the residue ring. Given a two-sided ideal / of a ring R, we define a relationship between the elements of Rby r = s mod / <=> r — s £ I. The above expression is read "r is congruent to s modulo J", and the relationship is called congruence modulo I. When I = Ra is a principal ideal, we usually write r = s mod a rather than r = s mod Ra. Thus for integers r, s, r = s mod 2 <=> r - s £ 2Z <=3> 2 | r - s. We record the basic property of congruence. 1.10.1 L e m m a Let I be a two-sided ideal of a ring R. Then congruence modulo I is an equivalence relation on R.

1.10. Residue rings

11

Proof We need to check the three properties which define an equivalence relation, namely, that the given relation is reflexive, symmetric and transi tive. For the first, we need to verify that r = r always, which is obvious. The requirement for symmetry is that if r = s, then s = r. But r - s £ I implies s-r e I. For transitivity, we assume that r = s and s = t for three elements r,s,t € R, and we have to show that r = t. But r - 1 = (r - s) + (a -1) e I. D Given an element r of R, we define the residue class of r mod / to be r = {s&R\s

= r mod / } .

Since congruence is an equivalence relation on R, the residue class of r is its equivalence class under this equivalence relation. Thus, by the general properties of equivalence relations, the ring R is partitioned into disjoint residue classes, that is, for r,s G R, either r = s or r P i s = 0. The residue ring R/I is defined to be the set of all residue classes r mod / of elements of R. Thus, as promised, Z/2Z = {0,1}. To make R/I into a ring, we must define addition and multiplication of residue classes. For r,s~ 6 R/I, we put r + ~s = r + s and r • s = rs. The hardest point in the verification that these laws of composition make R/I into a ring is to check that they are well-defined. This problem arises since one element of R/I can be expressed as the residue class of many different elements from R, and we need to know that the sum and product in R/I are not affected by such variations. We record the statement and proof as follows. 1.10.2 Proposition Let I be a two-sided ideal of a ring R. Then (i) addition and multiplication in R/I are well-defined; (ii) R/I is a ring with zero element 0 and identity element 1; (Hi) if R is commutative, then R/I is also a commutative ring. Remark: Exercise 1.8 gives a nontrivial example in which R/I is commu tative although R is not.

Chapter 1. Rings and Ideals

12

Proof (i) Suppose that r = ?i and s = si, so that r = n + x and s = si + y for elements x, y € /. Then r + s = (ri+si) + (x + y) with x + y € I and hence r -f s = T\ + Si, that is, the two alternative methods for computing the sum F + s give the same result. Also, rs = r i s i + (r x y + s\x + xy) with rxy + six + xy € I, so the two computations of r • s have the same outcome. (ii) This is a matter of checking that the addition and multiplication in R/I satisfy the axioms as given in section 1.2, granted that the axioms hold in R already. We give two sample checks to show how easy it is. For any r e f l , r+ 0

= =

r +0 f

which shows that 0 is indeed the zero element of R/I. For any r,s,t 6 R, (r + s) -t

= =

r + s -t (r + s)t

=

rt + st

= =

ri + li r -t + s -t

which establishes that the distributive law holds. (iii) Suppose R is commutative. Then, for any elements f,s of R/I, have r-~s=f~s = ~sr = ~s-r.

we

D

1.11

Residues of integers

Our first explicit examples of residue rings arise from the various ideals of the ring of integers Z. The calculations depend on some well-known facts about factorization and division of integers which we take as granted in this chapter. In the next chapter, these facts will be established as special cases of results that hold for Euclidean domains in general. Let m be a positive integer and let / = mZ be the principal ideal generated by m. We introduce the special notation Z m for the ring Z/mZ, since these rings appear frequently in this text.

1.11. Residues of integers

13

To describe Z m , we use the fact that, for any integer s, there are integers q and r with s = qm + r and 0 < r < m — 1. Now, for integers r, s, s = f in Z m <=*> s = r mod m <=> s — r € mZ <=> m | s — r. Thus the set of residue classes mod m is Z m = { 0 , T , . . . , m - 1}, and these are all distinct since m cannot divide r — s if 0 < r, s < m — 1 unless r = s. Note that the ring Z m need not be a domain even though Z is a domain - for example 2 • 2 = 0 in Z4, although 2 ^ 0 . However, in the important case when p is a prime number, the ring Z p is a field, which we deduce from a more general result that is our next theorem. Before we can state this theorem, we need a preliminary definition. A two-sided ideal I of a ring R is maximal if I is a proper ideal of R, and there is no two-sided ideal J of R with I C J C R. Here, we use the symbol "C" to indicate strict containment, that is, I ^ J and J / R. Maximal left ideals and maximal right ideals are defined in the obvious way. Granted the unique factorization of integers, and that every ideal in Z is principal, the maximal ideals in Z have the form pZ where p is prime. 1.11.1 T h e o r e m Let I be an ideal of a commutative ring R. ments are equivalent.

Then the following state

(i) I is a maximal ideal of R. (ii) The residue ring R/I is a field. Proof (i) => (ii): By Lemma 1.10.2, R/I is a commutative ring, so we only need to find an inverse for a typical nonzero element f € R/I. By Lemma 1.9.1, Rr+I is an ideal of R, and, as / is maximal, either Rr+I = I or Rr+I = R. But if Rr + I = I, then r = l-r + 0el and so f = 0, contradicting our assumption that r is nonzero. Thus Rr + I = R, and so there is an element s of R and an element x of I with sr + x = 1. This equality gives s • f = 1 in R/I, that is, f has an inverse. (ii) => (i): Suppose that J is an ideal of R with I C J, (so I ^ J). Then there is some element r £ J with r $ I. Since F ^ 0 in R/I, s • f = 1 for some s € R.

Chapter 1. Rings and Ideals

14

But then 1 = sr + x in R for some element x of J, and so 1 £ J, which implies that J = R. Thus / is a maximal ideal of R, as desired. □ 1.11.2 Corollary Let p be a prime number. Then Z p is a

field.

D

Exercises 1.1

1.2 1.3

1.4

1.5

1.6

g

1.7

A nonzero element r of a ring R is called a proper zero divisor if rs = 0 for some nonzero element s £ R. Show that a proper zero divisor cannot be a unit of R. Show that a commutative ring R is a field if and only if 0 is the only proper ideal of R. Let R be a commutative ring. An ideal / of R is said to be prime if the following holds. P: If r, s £ R and rs £ I, then either r £ I or s £ I (possibly both are in / ) . Show that (i) 0 is a prime ideal of R «=>• R is a domain. (ii) / is prime <=> R/I is a domain. (iii) A maximal ideal of R must be a prime ideal. Let m > 2 be an integer. Show that Z m is a domain <=>• m is a prime number. Show by direct calculation that U(ZQ) = {1,5} and that the set of proper zero divisors in I,Q is {2,3,4}. Find the corresponding results for Zg and ZioLet A be a commutative domain and let R = A[X] be the polynomial ring over A. Prove the following assertions. (a) / = f{X) = /o + fiX + ■ ■ • + fnXn is a unit of R <=> n = 0 and /o is a unit of A. (b) X divides a product fg in i? <=>■ X divides either / or g. (c) X = fg with f,g £ R «=*> * either / or g is a unit of R. Let F be a field and let fl = F[X, Y] (= F[X][Y]) be the polynomial ring in two variables over F. Show the following. (a) R is a domain. (b) RX is a prime ideal of R. (c) i?X + RY is a maximal ideal of R but .RX + RY is not principal. Let F be a field and let D be the set of all diagonal matrices ( ^

]

with r,s € F. Verify that D is a commutative ring under the usual sum and product of matrices.

Exercises

15 Show also that the only proper nonzero ideals of D are De and

Df, where e =(

J

jj J and / = ( J

J J, and that De + D / = D

and DedDf = 0. Show that there is a bijective map 9 : D/De —► F given by r

0

x 0

s

and that 9{d + d1) = 9{d) + 6(d') and 6(d ■ df) = 9(d) ■ 9(d')

1.8

for all d, d' in D. Thus we can identify the ring D/De with F. Remark in other words, 9 is an isomorphism of rings - see Exercise 4.4. Let F be a field and let T be the set of all upper triangular matrices C

/ r

t \

I n . ) with r, s,t in F. Verify that T is a ring under the usual sum and product of matrices. Show that T is not commutative. Let

H

$) | r , t 6 i r }

={{I 0 0

t s

t,seF

'-{(SO1"'} 1.9

Show that i7, J and J are all two-sided ideals in T. Using the methods of the preceding exercise, show that T/H = T/I = F, while T/J = D. (Thus, as promised earlier, a noncommutative ring can have a commutative residue ring.) Let F be a field and let R be the ring of all 2 x 2 matrices over F . Show that R has no two-sided ideals except 0 and R. (See Exercise 7.8.)

Chapter 2

Euclidean Domains We now introduce the type of ring which is of greatest interest to us in this text, namely, a Euclidean domain. Such a ring shares with the integers Z the property that long division is possible, that is, given elements a, b of the ring, then there are elements q, r of the ring with a = qb + r where the remainder r is 'smaller' than b in some sense. Consequently, many results that hold for the ring of integers can be extended to Euclidean domains in general. Apart from the integers themselves, most of Euclidean domains that we encounter in this text are polynomial rings of the form F[X] for a field F. We briefly consider the Gaussian integers Z(»], and some further examples are mentioned in the exercises. This chapter also contains a detailed analysis of the residue rings of polynomial rings. This analysis is used immediately to give an algebraic method for constructing roots of polynomials, and later, in Chapter 13, to find normal forms for matrices.

2.1

The definition

A Euclidean domain is a commutative domain R together with a function ip : R —► Z that has the following properties. ED 1: ip(a) > 0 for all r € R, and a = 0. ED 2:
Chapter 2. Euclidean Domains

18

The element q is called the quotient and r the remainder. Either of them may be 0, and they may not be uniquely determined by the pair a, b. Axiom ED 3 is called the division algorithm, or, less formally, long division. Any field F is a trivial Euclidean domain in which
2.2

The integers

The ring Z of integers, with
2.3

Polynomial rings

Next we show that for any field F, the polynomial ring F[X] is a Euclidean domain. As there are infinitely many distinct fields (Q, R, C, Z p for p a prime,...) we obtain an infinite supply of Euclidean domains. Since a field is a commutative domain, so is the polynomial ring FIX] (Lemma 1.6.1). The function ip is defined in terms of the degree of a polynomial. Let / = /o + fiX + f2X2 + ■■■ + fmXm with /o, fu.■.,/m G F be a polynomial in F\X}. If / / 0, we assume that / has been written with highest coefficient fm ^ 0, so that m is the degree deg(/) of / . Then the function ip is given by nJ>

1 o,

/ =o

2.4. The Gaussian integers

19

Axiom ED 1 holds by definition, and axiom ED 2 is clearly satisfied. The verification of axiom ED 3 requires a bit of work. Suppose that f = f0-\ 1- fmXm and g = g0 H + gnXn are nonzero polynomials in F[X], of degrees m and n respectively. We have to show that / = qg + r where q and r belong to F[X] and either deg(r) < deg(g) or r = 0. We use induction on deg(/). If m < n then we can take q = 0 and r = f. This remark also covers the initial case in the induction, when m = 0, for if m = n = 0 then both / and g are units in F[X] and the assertion is trivial. Suppose now that m > n, and put / = / - fng^n1Xm~ng. By con struction, the coefficient of Xm in / is 0, so deg(/) < m. By induction hypothesis, / = qg + r with deg(r) < deg(g) (or r = 0), and a rearrange ment gives the desired form for / . Unlike the situation for Z, the quotient and remainder are unique in polynomial rings. To see this, suppose that / = qg + r and / = q\g + n with deg(r) < deg(g) and deg(ri) < deg(^). Then (q — q\)g = T% — r. But deg(ri — r) cannot be a nonzero multiple of deg(g), so ri — r = 0 and 9 - 9i = 0.

2.4

The Gaussian integers

A Gaussian integer is a complex number of the form a + bi where a, b are ordinary integers. The set of all Gaussian integers is written 1\i]. If c + di is also a Gaussian integer, the sum and product, (a + bi) + (c + di) = (a + c) + (c + d)i and (a + bi) ■ (c + di) = (ac — bd) + {ad + bc)i respectively, are again Gaussian integers. The fact that Z[i] is a ring can now be confirmed, in one of two ways, according to the reader's inclination. The hard but secure route is to check all the axioms, one by one. A simpler method, which tends to arouse suspicion at first sight, is to observe that the set C of complex numbers is a ring (which we take for granted), and that properties such as associativity, distributivity,..., are therefore inherited by Z[»]. Similarly, there are two ways to verify that Z[i] is a commutative domain. For each Gaussian integer a + bi, define (p(a + bi) = (a + bi)(a — bi) = a2 + b2. A direct check shows that axioms ED 1 and ED 2 hold in Z[i].

Chapter 2. Euclidean Domains

20

To establish ED 3, suppose that a+bi and c + di are nonzero Gaussian integers, and put x + yi = (a + bi)(c + di)^1 G C. Choose integers u and v with \x - u\ < 1/2 and \y - v\ < 1/2, and write s = (x - u) + (y - v)i. Extending the definition of ip to C in the obvious way, we see that
2.5

Units and ideals

We now return to the investigation of the properties of a general Euclidean domain R. First, we note that the units of R are characterized easily in terms of the function tp. 2.5.1 Lemma Let R be a Euclidean domain and let u be in R. Then u is a unit of R if and only if f(u) = 1. Proof Consider the identity element 1 of R. Since l 2 = 1, axiom ED 2 shows that (
As the value of ip{u) is positive, this gives ip(u) = 1. Conversely, suppose tp(u) = 1. By axiom ED 3, 1 = qu + r for q, r G R with
2.6. Greatest common divisors

21

2.5.3 Theorem Let R be a Euclidean domain. Then every ideal of R is principal. Proof It is obvious that the zero ideal 0 is principal, its unique generator being the zero element 0. Now suppose that 7 ^ 0 , and choose an element a of I so that 0 <
2.6

Greatest common divisors

Next we show that any two elements of a Euclidean domain have a greatest common divisor. First, we must define this term, and we choose the defi nition which is most useful for our applications, although it is perhaps not the most transparent extension of the usual definition in Z. Let a and b be elements of a Euclidean domain R (one or both of o, b may be 0). Then a greatest common divisor (or GCD) of the pair a,b is any element d of R which satisfies the following requirements. GCD 1: d | a and d | b, GCD 2: d = sa + tb for some elements s,t of R. To see that this definition is reasonable, notice that if x G R divides both a and b, then, by axiom GCD 2, x divides d also and so Rd = Ra + Rb. In particular, a, b always have a greatest common divisor, which can be written d = sa + tb for some s,t € R. Proof =x By GCD 2, d € Ra+Rb and so Rd C Ra + Rb. For the reverse inclusion, suppose that r 6 Ra + Rb, with r = xa + yb for some x,y S R. By GCD 1, a — a'd and b = b'd for a',b' e R, and so r = (xa' + yb')d belongs to Rd.

Chapter 2. Euclidean Domains

22

<S=: Obviously, GCD 2 is satisfied. For GCD 1, note that a = l-a+0-6 £ Ra + Rb, hence a = a'd for some a', and likewise for b. By Theorem 2.5.3, the ideal Ra + Rb must be principal, so a, 6 have a greatest common divisor rf, which can be written as claimed. □ Choosing a G C D In a general Euclidean domain, there are many different greatest common divisors of a pair a, b, since if d is one GCD, then so is ud for any unit u of R. Thus, in Z, - 2 and 2 are both greatest common divisors of the pair 6, 8, while in a polynomial ring F[X] over a field F, a greatest common divisor can be multiplied by any nonzero constant. In practice, we will make standard choices for greatest common divisors in the integers and in polynomial rings -F't-X']. In Z we will always choose the unique positive greatest common divisor of a nonzero pair of integers. In a polynomial ring F[X], any nonzero polynomial / = /oH \-fmXm with / m ^ 0 can be written / = fmg with g a unique monic polynomial, that is, g = go + g\X + ■ ■ ■ + gm-iXm~l + Xm. We always choose the greatest common divisor of a nonzero pair of polynomials to be monic. The notation (a, b) will be used for the standard choice of the greatest common divisor of a and b in the ring of integers or a polynomial ring, and for some arbitrary choice of a greatest common divisor in other rings, such as the Gaussian integers, where there is no evident "standard" choice.

2.7

Euclid's algorithm

At this point, we should explain the origin of the term Euclidean domain. In Book VII, Propositions 1 and 2, of his Elements [Euclid], Euclid gives an algorithmic procedure for the computation of the greatest common divisor of a pair of integers. His method uses long division with successively re ducing remainders. The properties of Z that Euclid requires are just those listed in Axioms ED 1 - 3 , and so the algorithm can be implemented in any ring which satisfies these conditions. Here is the method. Suppose that we are given elements a,b of a Eu clidean domain R. If one (or both) is 0, say 6 = 0, then the greatest common divisor (a, b) is trivially a. If both are nonzero and b \ a, then the computation is again trivial: (a, b) = b. In the remaining case, we have a, b ^ 0 and a = qb + r with r ^ 0 and V?(r) < ip(b). Since a € Rb+Rr and r € Ra+Rb, we have Ra+Rb = Rb+Rr and so (o, 6) = (b,r). If r divides 6, we are done. If not, we write 6 = q\r+ri with
2.8.

Factorization

23

Then rk - (rk-i,rk)

2.8

= •■• = (a,b).

Factorization

Our aim now is to show that, in an arbitrary Euclidean domain, there is a unique factorization theorem analogous to that for the integers. We must first extend some definitions from Z to Euclidean domains. An element p of a commutative domain R is said to be irreducible if p is neither zero nor a unit, and whenever p = ab for a, b in R, then either a or b is a unit. When we are working in Z, we defer to tradition and say 'prime number' in preference to 'irreducible number'. A nonzero, nonunit element which is not irreducible is called reducible. If p is irreducible, so also is up for any unit u of R. Two elements x, y of R are said to be associates if x = uy for some unit u of R, and such elements are not regarded as being genuinely distinct when computing factorizations. Thus, when the irreducible elements p, q are said to be distinct irreducible elements, it is to be understood that they are not associates, rather than simply being unequal. In Z, prime numbers appear in pairs of associates {2, —2}, {3, — 3 } , . . . , and custom dictates that we always choose the positive prime. In a poly nomial ring F[X] over a field F, irreducibles occur in families of associates {up | u € F, t i / 0 } , since the unit polynomials are the nonzero constants. Here, the custom is to choose the unique monic polynomial in each family of associates. Two elements a, b of R are said to be coprime if their greatest common divisor (a, b) is 1 (or a unit, which can always be replaced by 1). The following sequence of results is the key to unique factorization, and also to some other important results that we will meet later. 2.8.1 P r o p o s i t i o n Let R be a Euclidean domain and suppose that a, b G R. Then a,b are coprime •$=> 1 = sa + tb for some s,t € R. Proof Suppose a, b are coprime. By Lemma 2.6.1, R = Rl = Ra + Rb, giving 1 = sa + tb for some s,t € R. Conversely, if 1 can be written in this for, any common divisor x of a, b must divide 1, and so x must be a unit.

□

24

Chapter 2. Euclidean Domains

2.8.2 Proposition Let R be a Euclidean domain and suppose that a,b S R are coprime. If a | be with c £ R, then a | c. Proof By the previous result, we can write 1 = sa + tb for some s,t £ R. Since be = xa for some x e R, c = sac + tbc = (sc + tx)a. □ 2.8.3 Corollary Let R be a Euclidean domain and suppose that a,b € R are coprime. Then RaC\Rb = Rab. □ 2.8.4 Corollary Suppose that p G R is irreducible and that p \ be where b and c are in R. Then either p\b or p\c. Proof Since (p, b) is a divisor of p, either (p, b) = up for some unit u or (p, b) = 1. In the first case, p | b, and in the second, p | c by Proposition 2.8.2 above. □ We come to the main result on factorization. 2.8.5 The Unique Factorization Theorem Suppose that a is a nonzero element of a Euclidean domain R. Then there are irreducible elements p\,..., ps of R and a unit u of R with a = upi ■ ■ -ps.

Furthermore, if a = wq\ ■ • ■ qt, with q\,..., qt irreducible and w a unit, then s = t and there is a permutation -K of {!,...,$} so that pi and q^u) are associates for i = 1,... ,s. Proof The existence of a factorization is proved by induction on 1. If a is already irreducible, it is its own one-term factorization. If a is reducible, a = be with neither b nor c a unit. Then 1 < which means that t = 1 also.

2.9. Standard factorizations

25

In the case s > 1, we have p\ | wgi • • • qt. By the above argument, pi and <7j are associates for some index j , and, since i? is a domain, we have UP2 ■ • ■ Pa = vqi ■ ■ ■ <jj_i ■ qi+i

---qt

where v is a unit of -R. By induction hypothesis, s — 1 = t — 1 and we can pair off the sets {P2,--,Ps} and {qi,..

.,<7j_i,<7j+i,... ,qt}

as required.

2.9

Q

Standard factorizations

In the irreducible factorization a = up\ ■ ■ -pk that we obtained in the pre ceding theorem, it is possible that two or more of the irreducible factors are associates of one another. In applications, it is often more convenient to ensure that a given irreducible element can appear in one form only. We do this by selecting a single member from each set {up | u a unit} of associated irreducible elements of R. The resulting irreducible elements are called the standard irreducible elements of R. By construction, no two standard ir reducible elements are associates. As in section 2.8, in the integers Z we always take the positive primes as the standard primes, and in a polynomial ring F[X] we choose the monic irreducible polynomials. We can now rewrite the irreducible factorization of an element a of R so that each irreducible term is a standard irreducible and the occurrences of each standard irreducible are grouped together. Thus the factorization takes the form n(l) n(fc) a = up1K ' ■■■Pk' ',

where u is a unit and pi, ■ ■ ■ ,Pk are distinct (that is, non-associated) irreducibles. Such a factorization is called a standard factorization of a. The unique ness part of the theorem above tells us that the set of irreducibles pi,...,pk is uniquely determined by a (apart from the order in which it is written), and that the exponents n ( l ) , . . . ,n(fc) are uniquely determined by a once an order of listing for irreducibles is fixed.

Chapter 2. Euclidean Domains

26

2.10

Irreducible elements

As the Unique Factorization Theorem will be one of our main tools in the description of modules, it will be both interesting and useful to know something about the irreducible elements in the various Euclidean domains that we encounter.

P r i m e numbers We are obliged to assume that we know a prime number when we see it. Although we know that there are infinitely many prime numbers (see Exercise 2.1), we cannot list any infinite subset of them, and the problem of determining very large primes is an active area of research.

Irreducible Gaussian integers As noted in Corollary 2.5.2, there are four unit Gaussian integers, ± l , ± i , which means that each irreducible Gaussian integer comes in four associated disguises. From the definition of

Irreducible complex polynomials The Fundamental Theorem of Algebra assures us that any polynomial with complex coefficients has a complex root. This means that the only irre ducible members of the polynomial ring C[X] are the linear polynomials X - A for A € C - see Exercise 2.2. The Fundamental Theorem cannot be established by purely algebraic methods since analytic properties of func tions must be used at some point. A proof can be found in §7.4 of [Cohn 2].

2.10. Irreducible elements

27

Irreducible real polynomials There are two types of irreducible polynomial over the field of real num bers. These are the linear polynomials X — A, A £ K, and the quadratic polynomials X2 + bX + c with b, c € R and b2 - Ac < 0. Such a quadratic polynomial has a pair of complex conjugate roots A, A in C with A = (-b + Vb2 - 4c)/2. To see this, take an irreducible polynomial / in R[X], of degree at least 2. Over C, / = (X - Ax) • • • (X - Xn). Since the coefficients of / are real, they are unchanged by complex conju gation, and so / must also have the factorization f =

(X-X1)---(X-Xn).

By the uniqueness of the factorization, Ai = A^ for some h. We can't have h = 1 since this would be contrary to the irreducibility of / , so k > 1 and we may as well renumber the indices so that h = 2. But then (X — Xi){X — A2) has real coefficients and it is a factor of / . Thus / = (X - Xx)(X - A2). Writing b = —(Ai + A2) and c = A1A2 gives the required form for / .

Irreducible rational polynomials There is no general description of the irreducible polynomials over Q[X]. (If there were, algebraic number theory would be an easier subject!) We quote two useful results for handling rational polynomials; the first is proved in several of our references: [Allenby], [Cohn 1] and [Marcus], and the second is a not-too-difficult consequence - see Exercise 2.3. Gauss' Lemma. Suppose that / is a monic polynomial with integer coefficients and that / = gh with / , g £ Q[X] both monic. Then g and h also have integer coefficients. Eisenstein's Criterion. Let / = Xn + / n _ i X n _ 1 + • • • + f\X + f0 be a monic polynomial with integer coefficients, and suppose that there is a prime number p such that p \ fi for i = 0 , . . . , n — 1 but p2 does not divide /o. Then / is irreducible over Q. These results remain true if Q is replaced by the field of fractions Q of a Euclidean domain R and p is taken to be an irreducible element of R.

28

Chapter 2. Euclidean Domains

Irreducible polynomials over finite fields Although there are irreducible polynomials of every degree over a finite field ([Allenby] p. 163), the argument that predicts their existence is indirect and does not tell us what they look like. For small fields, it is possible to list the irreducible polynomials of a given small degree by enumerating those that are reducible. For example, take F = 1>2 = {0,1} - here, it is convenient to omit the "bars" that indicate we are working with residue classes. There are two linear monic polynomials in Z2[X], namely X and X + 1, and four quadratic monic polynomials, X2, X2 + 1, X2 + X and X2 + X + 1. The first two are squares (note that 2 = 0 in Z2) and the third is a product. Thus X2 + X + \ is, by elimination, the only possible irreducible polynomial of degree 2 over Zg. It actually is irreducible, since neither element of Z2 is a root.

2.11

Residue rings of Euclidean domains

The fact that every ideal of a Euclidean domain is principal, together with the Unique Factorization Theorem, leads to a good description of the residue rings of a Euclidean domain. We will postpone the general treatment of this topic until we discuss cyclic modules in Chapter 6, since the discussion can be simplified when we have some machinery from mod ule theory at our disposal. In the remainder of this chapter, we will show how fields arise as residue rings of Euclidean domains and we will give an explicit description of the residue rings of polynomial rings. Combining these, we obtain an algebraic construction for roots of polynomials. First, we show how fields arise. 2.11.1 Proposition Let R be a Euclidean domain, I an ideal of R. Then the residue ring R/I is a field
2.12. Residue rings of polynomial rings

2.12

29

Residue rings of polynomial rings

Let F[X] be the polynomial ring over a field F, and let / = /o + fxX + ■ ■ ■ + fn-iXn~l

+Xn,n

= deg(/) > 1

be a monic polynomial in F[X], Our aim is to give an explicit description of the residue ring F[X\/F[X]f that we need in subsequent applications. We show that each element of F[X]/F[X]f can be written as a polynomial 90 + 9ie + 92£2 H

1- 3 n - i « n _ 1

where e is a root of / . The addition and multiplication of such polynomials follows the expected rules, with the relation /(e) = 0 being used to eliminate the powers en,en+1, ... from products. Before we get into the technicalities, here is a familiar example. Take F to be the field R of real numbers, and let / = X2 + 1. Then the elements of RfXj/Rpf]/ have the form go + gie, where e2 = —1. Apart from nota tion, this is the classical definition of the complex numbers as pairs of real numbers. Now we start our formal analysis of the residue ring F[X]/F[X]f. First we note that we have not lost any generality by imposing the requirement that / is monic, since the ideal F[X]f is unchanged if we replace / by uf for any nonzero constant u (see Lemma 1.8.1 and Corollary 2.5.2). There is no loss either in requiring that deg(/) > 1, since when / is a nonzero constant polynomial, F[X]f = F[X] and so F[X]/F[X]f is the trivial ring. Now let g € F[X) be arbitrary. As we saw in section 2.3, we can write g = qf + r with the remainder r having degree less than n. Since g = f in F[X\/F[X]f, the elements of F[X]/F[X]f can be taken to have the form g with g £ F[X], deg(g)
g0,...,gn-i

€ F,

where any (or indeed all) of the coefficients gi can be 0 - the notation is not to be interpreted to imply that deg(
=

Ifi + giX + g2X2 +

hffn-iX™"1

Chapter 2. Euclidean Domains

30

=

m+&X~

+ teX2 + --+g^VT~1.

(2.1)

To make this expression tidier, we notice that for scalars k, k' € F, we have k = F in F[X]/F[X]f <=> k = k' in F and furthermore k + k' = k + F and k ■ k' = k~ ■ k'. We can therefore regard F as being contained in F [ X ] / F [ X ] / by agreeing to identify k with fc for each element k € F. For k € F and 7; € F [ X ] / F [ X ] / , we interpret k ■ ~g as fcg, so that F[X]/F[.Y]/ becomes a vector space over F. Put e = X. We can now rewrite Equation 2.1 in the friendlier form ffn-ieB~l.

9 = 90 + gie + 92
(2-2)

The scalars go,gi, ■ ■ ■, gn-i are uniquely determined by the element ~g of F[X]/F[X]f, since otherwise we could find another polynomial h of degree at most n — 1 with h = ~g, contrary to the uniqueness of g. It follows that the set { l , « , c a , . . . , c n - l } > n = deg(/), is a basis of F [ X ] / F [ X ] / as a vector space over F . This basis is called the canonical basis of F[X]/F[X]f. It is clear that the addition of elements of F[X]/F[X]f follows the rule for addition of polynomials. By construction,

m = / = o, so that the multiplication in F[X]/F[X]f plication, together with the rule en = - / o - fit -fie2

follows from polynomial multi /n-ie"-1

(2.3)

where f = fo + fiX + --- + / n - i X " - 1 +Xn,n

= deg(/).

Note: the "root" e that we have constructed for the polynomial / is an algebraic entity. Even if / has real coefficients, it may not be possible to identify e as a real or complex number. For example, if / = X2, then e2 = 0 but e -fi 0. This observation illustrates the fact that the residue ring F[X]/F[X]f will not be a field if / is not irreducible.

2.13. Splitting fields for polynomials

2.13

31

Splitting fields for polynomials

Given a polynomial / with coefficients in a field F, it is possible to construct g a bigger field E in which f(X) has "all its roots". More precisely, / is said to split in E if, in E[X], every irreducible factor p of / is linear, that is, p = X - A for some A in E. The field E is then called a splitting field for / . Each linear factor of / corresponds to a root A of / - see Exercise 2.2 so a splitting field of / is indeed a field in which / has all its roots. 2.13.1 Proposition Let F be a field and let f be a polynomial with coefficients in F. Then there is a field E containing F in which f is splitProof Write the irreducible factorization of f(X) f(X)

= (X-\1)-.-(X~

in F[X] as

Afc)pi(X) ■ • -p,(X)

where we have gathered all the linear factors of f(X) at the start. We allow the possibilities k = 0 or s = 0, and the Aj need not be distinct. Let n = deg(/) and put d = d(f,F) = n — k. We induce on d. The initial case is that d = 0. Then k = n, so we must have s = 0, that is, F is already a splitting field for / . Suppose that d > 0, so that s > 1. Let F' = F[X]/F[X]pi be the ring constructed in section 2.12. Then F' contains F. Further, by Proposition 2.11.1, F' is a field, and Equation 2.2 shows that F' contains a root of p\. Thus the factorization of / over F' has more than k linear terms, so that

d(f,F')
2.14

Further developments

• Some authors replace axiom ED 2 by a weaker statement: tp(ab) > ip(a) for all a , i e i J .

Chapter 2. Euclidean Domains

32

This weaker axiom has the advantage that it allows more rings to be considered to be Euclidean domains than does ED 2, while causing only a small increase in the complexity of proofs. However, all the Euclidean domains that we meet in this text satisfy the stronger axiom. • The technique used to show that the Gaussian integers are a Euclidean domain can be extended to some similar types of ring - see Exercise 2.8 below and [Allenby] §3.7 for some of the easier cases. Comprehensive dis cussion of the limits of the technique are given in Chapter 14 of [H & W] and in [E, L & S]. • A commutative domain in which every ideal is principal is called a prin cipal ideal domain. By Theorem 2.5.3, a Euclidean domain is a principal ideal domain. The Unique Factorization Theorem holds over a principal ideal domain, as do most of the results that we subsequently obtain for modules over Euclidean domains, but the arguments are more technical. As genuine (that is, non-Euclidean) principal ideal domains are rarely en countered in undergraduate mathematics, and not too often at any level, this text keeps to Euclidean domains. The proof of unique factorization in a principal ideal domain can be found in section 2.15 of [Jacobson] or section 10.5 of [Cohn 1], among others. The most accessible non-Euclidean principal ideal domain is the ring Z[(l + \/—19)/2]. The fact that this ring is not Euclidean is proved in the sources mentioned in the previous paragraph, while some algebraic number theory is needed to show that it is nevertheless a principal ideal domain. [Marcus] gives a nice account of the calculations needed; Exer cise 9 of Chapter 5 is particularly relevant. • The results of this chapter (and book) can be extended to noncommutative versions of Euclidean domains ([B k. K: IRM], Chapter 3) and to noncommutative principal ideal domains ([Cohn: FRTR], Chapter 8; [Rowen]).

Exercises 2.1

2.2

Suppose that pi,- ■ ■ ,Pk are distinct prime numbers. Show that the product Pi•■•Pk + 1 has a prime factor q with q ^ Pi for any i. Deduce that there are infinitely many prime numbers. Let f(X) and X - A be polynomials over a field F. Show that f(X) = q{X)(X - A) + r where r G F. Deduce that A is a root of f(X) if and only if X - A | f{X). Prove further that f(X) can have at most deg(/) distinct roots in F

Exercises 2.3

2.4

2.5

2.6 2.7

2.8

33

Let p e Z b e prime. Arguing directly from Gauss's Lemma, show that the polynomials Xn — p are all irreducible for n > 2. Generalize your argument to a proof of Eisenstein's Criterion. Let R be a Euclidean domain and let a in R be neither a unit nor irreducible. Show that the ring R/Ra contains a nontrivial divisor of 0. Let f(Y) = f(Y + 1) be the polynomial obtained by the change of variable X = Y +1 from f(X) € F[X], F a field. Show that fg = fg. Deduce that f{X) is irreducible if and only if f(Y) is irreducible. Let p € E be a prime number. Prove that the polynomial Xp~ * + • • ■ + X + 1 = (X" - 1)/(X - 1) is irreducible in Q[X]. Show that the polynomial ring F[X, Y] in two variables over a field is not a Euclidean domain. Let R = Z[%/—5]. Show that 3, 2+\J—5 and 2 — yj— 5 are all irreducible elements of R and that no two of them are associates. Verify that 3 2 = (2 + v / ^ 5 ) ( 2 - V ^ 5 ) . This means that unique factorization does not hold in R and hence that R cannot be a Euclidean domain. Confirm this by showing that the ideal 3R + (2 + y/—5)R is not principal. Some more Euclidean domains. Here are some rings which can be shown to be Euclidean domains using mild variations of the technique for the Gaussian integers. i: Z[yf^\, = (— 1 + -
2.9

hi: Z[\/2], ip(a + by/2) = \a? - 2b2\, the absolute value. In Z[i/2]i verify that 1 + v7^ is a unit of infinite order. Using the method of (2.10), draw up a list of "standard" irreducibles in the Gaussian integers Z[i] which contains all the distinct irreducible factors of the (ordinary) primes 2,3,5,7. Hence give a standard fac torization of 4200 in Z[i]. Remark It is not too hard to compute the irreducible factorizations of small prime integers p = 2,3, 5 , . . . in the rings Z[>/—2] and Z[w]. The same computations in "L[y/2\ are already quite tough with the elementary methods at our disposal, as is the proof that U (z[v/2]) = {±(1 + \/2) 1 | t € Z } . Chapter XV of [H & W] contains a good account of these topics.

Chapter 3

Modules and Submodules Now that we have finished our introductory survey of rings, we can intro duce the main objects of our enquiries in this text, namely, modules. This chapter is concerned with the definitions and general properties of mod ules and their submodules, together with concrete examples arising from additive groups and from matrices acting on vector spaces. The latter type of module will prove to be crucial to our investigation of normal forms for matrices in Chapter 13. Although we are mainly interested in modules over commutative do mains, we allow the ring of scalars to be arbitrary in our basic definitions.

3.1

The definition

Let R be a ring. A left R-module M is given by two sets of data. First, M is to be an additive group. Thus for m, n £ M, there is a sum m + n £ M, and the addition satisfies the requirements listed in section 1.1. Second, the elements of the ring R must act by left multiplication on the members of M, so that for m £ M and r £ R, there is an element rm £ M. This action is called scalar multiplication, and it must satisfy the following axioms. SML 1: (rs)m = r(sm) for all m € M and r,s £ R. SML 2: r(m + n) = rm + rn and (r + s)m = rm + sm for all m,n £ M and all r,s £ R. SML 3: l m = m for all m in M, where 1 is the identity element in R. 35

36

Chapter 3. Modules and Submodules

The last axiom can be stated as "M is a unital module". Sometimes it is convenient to allow non-unital modules, but we shall not do so in this text. The ring R is called the ring of scalars for M. If the elements of R act on the right of M, then we obtain a right module, and the rules for scalar multiplication are changed accordingly: SMR 1: m(rs) = (mr)s for all m e M and r,s 6 R. SMR 2: (m + n)r = mr + nr and m(r + s) = mr + ms for all m,n e M and all r,s € R. SMR 3: m l = m for all m in M, where 1 is the identity element in R. Extremely important examples of R-modules arise from the ring R itself, which can be regarded either as a left module or as a right module. By the definition of a ring, R is an additive group. To obtain a left scalar multiplication, we simply view the multiplication in R in a new way: a product rs, r,s G R, is interpreted as the result of r acting on s. The axioms for scalar multiplication hold since they are just re-intepretations of the axioms for ring multiplication listed in section 1.2. On the other hand, we can interpret the product rs as the result of s acting on r and so turn R into a right R-module. When R is viewed as a left or right R-module in this way, it is often called the (left or right) regular .R-module. Equally ubiquitous are the zero modules. For any ring R, the set {0} is both a left and a right R-module, with rO = 0 = Or always. We denote any zero module by 0 - thus we use the same notation for a zero module as we do for a zero element, but this should not cause any confusion in practice. Suppose that the ring of scalars R is commutative. Then we can convert any right module into a left module by the rule rm = mr for all r € R,m € M, and likewise, any left module is equally a right module. Thus, in the case of main interest in this text, we need not distinguish between left and right modules. We will use the left-handed notation for a module over a commutative ring. In general, a careful distinction must be made between left and right modules, as we can see from Exercise 3.10 below, and from Chapter 14. The modules over one special kind of ring are familiar from elementary linear algebra. When F is a field, an F-module is the same thing as a vector space over F . Since a field can be regarded as a trivial Euclidean domain (2.1), the theory of modules over Euclidean domains includes the theory of vector spaces. However, we need to assume a prior knowledge of vector space theory so that we can introduce some of our basic examples of modules.

3.2. Additive groups

3.2

37

Additive groups

Next, we show how an additive group can be viewed as a Z-module. Since a multiplicative abelian group can be re-written as an additive group (section 12.10), this observation will allow us to obtain results about abelian groups from our general theory of modules over Euclidean domains. Let A be an additive group (1.1). Intuitively, the action of a positive integer n on an element a in A is given by na = a + ■■■ + a, where there are n a's in the sum. A more formal inductive definition runs as follows. To start, define 0a = 0, where the first "0" is the zero in Z and the second "0" is the zero in A. Then, for n > 0, put na = (n - l)a + a, and for n < 0, put na = ~{—n)a. Using the natural notation m, n for integers and a, b for members of the abelian group A, the scalar multiplication axioms now read (mn)a = m(na), (m + n)a = ma + na, m(a + b) = ma + mb and la

=

a.

These are the expected rules for computing multiples and we will take them for granted. Formal proofs by induction are not hard, and so they are left to the reader. Aside on notation. From time to time, we will use the symbol m in two ways. Sometimes it will denote a typical element of a module M and at others it will indicate an integer. This should not cause any confusion in practice, since the context will make it clear which is meant.

3.3

Matrix actions

Perhaps the most important examples of modules, at least as far as this text is concerned, are modules that arise through the action of a matrix on a vector space. The relationship between matrices with entries in a field F and modules over the polynomial ring F[X] is the key to the derivation of normal forms of matrices in Chapter 13. Let F be a field and let Fn be the vector space over F consisting of all / vi

\

V2

with each v< € F.

column vectors v

\vn

)

Chapter 3. Modules and Submodules

38

Addition and scalar multiplication in Fn are given by the expected rules:

/ n \

(

(

\

Wi

Vi + VJi >

W2

V2 + W2

\ wn /

\ vn + wn )

V2

+ )

\Vn

and /

vi

\

( kVl

\

kv2

V2

V vn j

fc

^n

\

/

where k is in F. Now suppose we have a n n x n matrix / a n c-12

A=

O-il

ain

\

ann

j

0-i2

\ o.„i a„2

...

with entries O;J belonging to F. Then for each v € Fn, we can form the vector / auvi + ai2V2-\ Av =

\-ainvn

\

auvi + ai2V2 + ■ ■ ■ + ainvn \ a-niV\ + an2V2 H

V annvn

J

which again belongs to Fn; this is what we mean when we say that "the matrix A acts on the space Fn ". Careful calculation shows that for any such matrix A, any vectors v, w in Fn and any scalar k in F, we have A(v + w) = Av + Aw and A(kv) = k(Av), that is, A acts as an F-linear transformation on Fn.

39

3.4. Actions of scalar matrices

Since the powers A2, A3,..., A1,... of A are also nxn matrices over F, n they all act on F , giving vectors A2v, A3v,..., Aiv,... for v e Fn. This allows us to define a scalar multiplication in which polynomials / in F[X] act on vectors in Fn. Given / = /o + fiX + ■ ■ ■ + fiX1 + ■ ■ ■ + fnXn

e F[X],

put fv = f0v + fxAv

+ ■■■ + fiA{v + ■■■ + fnAnv

With the convention that ^4° = / , the nxn fv as the sum

for v e Fn.

identity matrix, we can write

fv = YdfiAiv. i=0

A great deal of checking (which is left to the reader as a very good exercise) confirms that Fn becomes an F[X]-module M with this scalar multiplication. We use the expression "M is given by X acting as A" to indicate that the F[X]-module M is Fn made into an F[X]-module by this construction. The value of n, the choice of F and the fact that A is an n x n matrix over F will usually be clear from the context. Note that each choice of an n x n matrix gives a different module structure on Fn. For this reason, we sometimes use the notation M(A), M(B),... to indicate the modules given by X acting as the matrices A, B,.... Here are some uncomplicated examples.

3.4

Actions of scalar matrices

A scalar matrix is an n x n matrix A = XI, where / is the nxn identity matrix and A is in F. (We use A rather than A; for a scalar in this context since in most applications A will be an eigenvalue of some matrix.) Because P = I always, A1 = X'l for any i > 0. Also, Iv = v for all v in Fn, which means that, with X acting as A, the result of scalar multiplication on Fn by / = /o + fiX + ■ ■ • + ftXl is

+ ■ ■ ■ + fnXn

fv = f0v + fiXv + ■■■ + fiX{v + ■■■ + fnXnv,

e F[X] for v e Fn.

An important special case occurs when A = 0, that is, A is the zero nxn matrix. Then Xv — 0 always, and fv = f0v for any polynomial / . In this

Chapter 3. Modules and Submodules

40

case, the corresponding F[X]-module is called the trivial F[X]-module, on which X acts as 0. Be careful not to confuse the trivial module with the zero module; the latter has only one element 0. When n = 1, the vector space F1 is simply the field F regarded as a vector space over itself. (This is a special case of the fact that any ring R can be thought of as an .R-module.) A 1 x 1 matrix is, for all practical purposes, the same thing as an element A in F, so we find that for each choice of A, F can be made into an F[Af]-module with X acting as A. The action of a polynomial / is given explicitly by fk = f0k + fx\k + ■■■ + fiXk + ■■■ + fnXnk,

keF.

For example, if A = —1, then fk = f0k -fik

3.5

+ f2k + --- + {-l)lfik

+ ■■■ + (-l) n /„fc.

Submodules

Before we discuss some further explicit examples of modules, we take a first look at the internal structure of a general module. Let M be a left module over an arbitrary ring R. An R-submodule of M is a subset L of M which satisfies the following requirements. SubM 1: 0 6 L. SubM 2: If I, /' € L, then I + I' e L also. SubM 3: HI € L and r £ R, then rl € L also. A submodule L of a left .R-module is itself a left .R-module, since the axioms for addition and scalar multiplication already hold in the larger module M. If the ring R of scalars can be taken for granted, we omit it from the ter minology and say that L is a submodule of M rather than an i?-submodule. An .R-submodule of a right module M is defined by making the obvious modification to axiom SubM 3: SubMR 1: HI € L and r £ R, then Ir € L also. Clearly, a submodule of a right /^-module is again a right module. General statements and definitions about submodules of left modules have obvious counterparts for submodules of right modules. As our main interest is with modules over commutative rings, for which we use the lefthanded notation, we will as a rule give only the left-handed versions of such statements and definitions. The reader should have no problem in providing the right-handed versions where desired.

3.6. Sum and intersection

41

When the ring of scalars R is commutative, then any left module M can be regarded as a right module, and any submodule of M as a left module is equally a submodule of M as a right module. A left module M is always a submodule of itself. A submodule L of M with L / M is called a proper submodule of M. At the other extreme, any module has a zero submodule {0}, which we usually write simply as 0. The zero submodule is a proper submodule of M unless M is itself the zero module. A left module S is said to be a simple module if S is nonzero and it has no submodules except 0 and itself. The description of the simple modules over a ring R is one of the fundamental tasks in ring theory. Next, we look at some situations where submodules are already known to us under different names. • Z-modules. As we noted in section 3.2 above, a Z-module A is the same thing as an abelian group, written additively. A subgroup B of A, is, by definition, a subset of A which satisfies conditions SubM 1 and 2. However, if B does satisfy these conditions, then it also satisfies SubM 3, since scalar multiplication by an integer is essentially repeated addition or subtraction. Thus, a Z-submodule of A is the same thing as a subgroup of A • Vector spaces. Let F be a field. Then an F-module is a vector space V over F, and an F-submodule W of V is more familiarly called a subspace of V. • Ideals. As we noted when we first defined modules in section 3.1, a ring R can be considered to be both a left i?-module and a right R-module. If we compare the definition of a submodule with that of an ideal (1.7), we find that a left ideal of R is the same thing as an R-submodule of the left regular R-module R, while a right ideal of R is an R-submodule of R when it is considered to be a right .R-module. This distinction is illustrated in Exercise 3.10.

3.6

Sum and intersection

A fundamental problem in module theory is the description of a given module in terms of a collection of submodules, each of these submodules being in some sense "simpler" than the original module. As a first step toward this goal, we give two basic methods of constructing new submodules from old. We assume throughout that our modules are left modules; the modifications for right modules are straightforward.

Chapter 3. Modules and Submodules

42

Suppose that L and N are both submodules of a module M. Their sum is

L + N = {l + n\leL,

n e N}

and their intersection is LC\N = {x\x£L

and x £ iV},

which is the intersection of L and N in the usual sense. The elementary properties of the sum and intersection are given in the following lemma, which we prove in great detail as it is our first use of the definitions. 3.6.1 Lemma (i) Both L + N and LC\ N are submodules of M. (ii) L + N = L N C L. (Hi) LnN = L <=> LCN. Proof (i)

We check the submodule conditions one by one.

SubM 1: 0 is in L + N since 0 = 0 + 0 where the first 0 belongs to L and the second 0 belongs to N - both zeroes are, of course, the zero element of M. SubM 2: Suppose m,m' e L + N. Then m = l + n and m' = /' + n' where 1,1' € L and n, n' € N, so that m + m' £ L + N since l + l' € L and n + n' € N. SubM 3: Carrying on the same notation, if m = I + n is in L + N and r £ R, then rm = rl + rn is in L + N since rl £ L and rn £ N. The argument for L n N is even easier, so it is left to the reader. (ii) Suppose L + N = L. Given n £ N, n = 0 + n i s also in L + N, so that n £ L, that is, N C L. Conversely, if N C L, then we must have / + n £ L for any I £ L and n £ N. (iii) This assertion is a result in set theory rather than module theory.

□

3.7. k-fold sums

3.7

43

/c-fold sums

It will be necessary to extend the definition of a sum to allow for an arbitrary number of submodules, rather than just two. Let L\,..., Lk be any set of submodules of a left module M, where k > 0 is an integer. Their sum is defined to be L1 + --- + Lk = {li + --- +

lk\l1eLu...,lk€Lk}.

When k = 0, this sum is to be interpreted as the zero submodule 0 of M, and for k = 1 the "sum" is simply the submodule L\ itself. For k = 2 we regain the previous definition in different notation. The proof that L\ + ■ ■ ■ + Lk is a submodule of M is similar to that given above and is left to the reader.

3.8

Generators

A convenient way to specify a module or one of its submodules is in terms of generators. In fact, our investigation of the structure of modules over Euclidean domains will be based on an analysis of the generators of the modules. Again, we work only with left modules, and leave right-handed definitions to the reader. First, we consider a single generator. Let M be a left .R-module and let x be an element of M. The cyclic submodule of M generated by x is defined as Rx = {rx | r e R}. The left module M itself is said to be cyclic if M = Rx for some x e M, and the element x is called a generator of M. The confirmation that Rx is actually a submodule of M is a trivial extension of the argument given for principal ideals in section 1.8; also, in a moment we will give a more general calculation which includes the cyclic case. Notice that a principal left ideal Rx is a special type of cyclic submodule, where we take M = R and x in R. We next consider finite sets of generators. Let X = {x\,... ,xt} be a finite subset of M, and put L{X) = { n i l H

\-rtxt | ri,...,rt

which is the same as saying that L(X) =Rxi-\

\-Rxt,

6 R},

Chapter 3. Modules and Submodules

44

the sum of the cyclic submodules Rx\,..., RxtThen L(X) is called the submodule of M generated by the set X. The fact that L(X) is a submodule follows from the following equations, which are easy consequences of the axioms for a module. SubM 1: 0 = Qxx + • ■ ■ + 0x t £ L{X). SubM 2: (rxxi H

h r t x t ) + (siZi H

for all r\,..., SubM 3:

h stxt) = (n + s x ) xx H

(- (r t + s t ) x t

rt and S i , . . . , st £ R-

r ■ (rii! H

h rtxt) = (r • r i ) X\ +

h (r • r t ) i t

for all r, r i , . . . , rt £ i?. If M = L(X), then we say that X is a set of generators for X, or that "X generates M". A finitely generated module is one that does have a finite set of generators - these are the modules which interest us in this text. The most familiar examples of generating sets occur in linear algebra, as bases of vector spaces. We review the definitions briefly. Let V be a vector space V over a field F. In elementary linear algebra, a generating set for V as an F-space is more often called a spanning set for V; it has the property that for any v £ V, there are scalars k\,..., kt £ F so that v = kiXi + 1- ktxt. A basis of V is a spanning set X which is linearly independent, that is, if k\X\ + ■ ■ ■ + ktxt = 0 with k\,..., fct £ F, then fcx = • ■ ■ = kt = 0. If V has a finite generating set X, then a finite basis of V can be obtained from the generating set by successively omitting elements. Moreover, any linearly independent subset Y of V can be extended to a basis by adding suitable members of X, and any two bases of V have the same number of members, this number being the dimension of V. Naturally enough, we refer to a finitely generated vector space as definite dimensional vector space. The problem of extending these definitions and results to modules over Euclidean domains in general will occupy us in later chapters. It suffices for the moment to warn the reader that we will encounter phenomena that do not occur in vector spaces.

3.9. Matrix actions again

45

For example, the sets X = {1} and Y = {2,3} are both generating sets of Z, considered as a module over itself. The set X is linearly independent in an obvious sense, but Y is not. Further, neither element of Y can be omitted to give a generating set with one member.

3.9

Matrix actions again

Let A be an n x n matrix over a field F and let M be the F[X]-module obtained from the vector space Fn with X acting as A (3.3). We will see that the F[X]-submodules of M are determined by the action of A on the subspaces of Fn. Suppose first that L is an F[X]-submodule of M. Then L is a subset of Fn, and, by axioms SubM 1 and 2, L must contain the zero vector and it must be closed under addition. Since the elements of the field F can be regarded as constant polynomials and L is closed under scalar multiplica tion by polynomials (SubM 3), L is closed under scalar multiplication by elements of the coefficient field F. These remarks show that L must be a subspace of the space Fn. Appealling to axiom SubM 3 again, we have XI € L for any I £ L. Since XI = Al, we have A- L C L, which means that the subspace L is invariant under A. Conversely, if U is a subspace of Fn which is invariant under A, then for any u € U, we have Au e U and hence A2u € U, A3u 6 U, and Alu e U for all i. Thus, for any polynomial / = /o + hX + ■ ■ • + fiX* + ■ ■ • + fnXn

£ F[X],

and any vector u £U, f • u - f0u + fiu H

+ fiA*u H

+ fnAnu

is in U,

which shows that £/ is closed under scalar multiplication by polynomials and so defines a submodule of M. The correspondence between submodules and subspaces will be used frequently, so we state it formally as a theorem. 3.9.1 T h e o r e m Let F be a field, let A be an n x n matrix over F, and let M be the F[X)-module obtained from the vector space Fn with X acting as A. Then there is a bijective correspondence between (i) F[X)-submodules L of M, and

46

Chapter 3. Modules and Submodules

(ii)

F-subspaces U of Fn which are invariant under A, that is, AU C U.

□ Now we have given the general description of submodules of modules defined by matrix actions, we look at some increasingly specific calculations.

3.10

Eigenspaces

Given an n x n matrix A over a field F, an eigenspace for A is a nonzero subspace U of Fn with the property that there is a scalar A g F s o that Au = Aw for all u € U. The scalar A is the eigenvalue of A corresponding to U, and a nonzero vector u £ U is an eigenvector. Notice that we allow the possibility that A = 0. The remarks in section 3.4 show that any eigenspace gives an F[X]submodule of the F[X]-module that arises from Fn with X acting as A. The converse is far from true - usually, there are invariant subspaces which are not eigenspaces (see Exercise 3.6). However, a reasonable first approach to the problem of determining invariant subspaces is to compute the eigen values and then the eigenspaces of A. We recall from elementary linear algebra how this is done. Let I = In be the n x n identity matrix. We have Au = Au for some u ^ 0

■<=►

(XI — A) u = 0

<=*

det(AJ

-A)=Q,

where det(5) denotes the determinant of a matrix B. It is well known that the expression det(XI — A) is a polynomial in the variable X, of degree n. It is in fact the characteristic polynomial of A, which will play an important role later in these notes. If we can find a root A of det(XI — A) in F (there is no guarantee that this can be done), then we can find the eigenvectors u and the eigenspace U by solving the system of linear equations (XI — A)u = 0. In one elementary but important special case, a submodule must be given by an eigenvector. 3.10.1 Lemma Let F be a field, let A be annxn matrix over F, and let M be the F[X]module obtained from the vector space Fn with X acting as A. Further, suppose that the subspace U of Fn is one-dimensional over F.

3.11. Example: a triangular matrix action

47

Then U gives an F[X]-submodule of M if and only if U = Fu is an eigenspace of A, where u is an eigenvector for some eigenvalue A of A. Proof Suppose that U does give a submodule, so that U is invariant under A. Since U has dimension 1, U — Fu for some vector u / 0 . But Au e U, so we have Au = A • u for some A e F. The converse is clear from the preceding discussion. □ Here are some concrete examples to illustrate the theory.

3.11

Example: a triangular matrix action

l i " Let F be any field and put A = I o l 1. Let M be F2 regarded as an F[X]-module with X acting as A, so that for

"'-- \ l I ^ A/> x+y

Xm-i

y

A proper subspace of F2 must have dimension 1, and hence a proper submodule L of M must be given by an eigenvector of A. The eigenvalues of A are the roots of

det( X~l

X11)=(X-1)\

so the only eigenvalue is 1. The eigenvectors are found by solving the equations

. 0 0 ){y

)

=

{0

and so have the form

(o for x / 0. It follows that there is exactly one proper submodule of M, given by the subspace U = F ( _ j of F2. Noice that this result is not influenced by any specific properties of the field F - it holds if we take F to be K, C, Z p where p is a prime, or any other field.

Chapter 3. Modules and Submodules

48

3.12

Example: a rotation

In this example, the nature of the field of coefficients F is important. Let A = (

~

) and let M be F2 regarded as an .F[X]-module with X

acting as A. Thus for

m =

Xm

(

X

}

e

M,

=

Again, any proper submodule L of M must be given by an eigenvector of A. The eigenvalues of A are the roots of

det (

£ J =X2 + 1.

X x

Now suppose that F = R, the field of real numbers. Then A has no eigenvalues, so M cannot have any proper submodules, that is, it is a simple R[X]-module. This result is also intuitively true geometrically, since A corresponds to a rotation of the plane through it/2. However, if we take instead F = C, the complex numbers, then there are two eigenvalues, +i, —t, with respective eigenvectors v+ = (

\

I and V- = ( l

J

V -»

We obtain two one-dimensional C-submodules of M, namely L + = Cv+ and L_ = Cv_.

Exercises 3.1

3.2

Show that the set {6,10,15} generates Z as a Z-module, but that no subset of {6,10,15} generates Z. For each k > 4, find a set A). = {ai,.,.,afe} of integers which generates Z, so that no subset of Ak generates Z. Let p e Z be a prime number. Show that if £ is a nonzero element of the additive group Z p , then x is a generator of Z p as a Z-module. Deduce that Z p is a simple Z-module. Hint: Corollary 1.11.2. Remark this result is more frequently given in the form "a cyclic group of prime order has no nontrivial subgroups".

Exercises 3.3

49

Let A be an abelian group and let r 6 Z. Show that rA = {ra | a G >!} is a subgroup of A. Take A - Z m - Prove that rA = A <=> (r, m) = 1 and that rA = 0 <=> m divides r. More generally, let R be a Euclidean domain and let M be an .R-module. Given an element r e R, show that rM = {rm | m £ M }

3.4 3.5

3.6

3.7

is a submodule of M. Suppose that M = R/Rx for some x e R. Determine conditions on r so that (i) rM = M and (ii) rM = 0. Repeat Example 3.12 for the finite fields Z p , p prime. (Note that y/—l € Z p <=> p = 1 mod 4 for odd p.) Let M be the C[X]-module given by an n x n matrix A acting on C n . Show that M is simple if and only if n = 1. /0 1 1\ Let M be the C[X]-module given by the matrix A = I 0 0 1 I \ 0 0 0 / acting on C 3 . For each vector v € C 3 , let L(v) be the cyclic C[X]submodule of M generated by v, and write L0 for L{e\) where e\ = 1 0 0 Show that Cei is the only eigenspace of C 3 . Deduce that LQ C L(v) for any v ^ 0. Find all v with (a) dimL(v) = 2, (b) dimL(v) = 3. Do your answers change if the field C is replaced by an arbitrary field F? / 0 1 0 Let N be the C[X]-module given by the matrix B = j 0 0 1 \ 1 0 0 acting on C 3 . (a) Prove that there are exactly 3 submodules of iV that are onedimensional as vector spaces over C. (b) Show that TV is a cyclic C-module. (c) Show that the submodule generated by I — 1 I is two-dimensional.

Chapter 3. Modules and Submodules

50

3.8

(d) Investigate what happens if the field of complex is replaced by the real numbers E, or by the finite fields Zj, Z3, or Z7. Let R be an arbitrary ring and let M be an R-module. Suppose that J is a two-sided ideal of R with the property that IM = 0, that is, xm = 0 for all x £ / and m € M. Show that the rule r m = rm for all r € R / i and m G M

gives a well-defined action of R/I on M and that M is an R/J-module with this action as scalar multiplication. 3.9 Let R be a ring and let M be a simple left .R-module. Show that any nonzero element of M is a generator of M. C 3.10 Let R be the ring of 2 x 2 matrices over a field F. Let / be the set of matrices of the form I

n

12

I and let J be the set of matrices

n of the form ( 021 „0 |. Show that, under the usual rules for matrix multiplication, / is a right ideal of R (and hence a right R-module) but that I is not a left ideal. Show also J is a left ideal but not a right ideal in R. Prove that I is simple as a right module and that J is simple as a left module. Generalize these results to the ring o f n x n matrices over F. (See also Exercise 7.8.)

Chapter 4

Homomorphisms We next introduce a fundamental concept in module theory, that of a homomorphism, which is a map from one module to another that respects the addition and scalar multiplication. A knowledge of the homomorphisms be tween two modules allows us to compare their internal structures. In later chapters, we use an analysis of homomorphisms to obtain the fundamental results on the structure of a module over a Euclidean domain. Just as a vector space over a field F is a special type of module, a linear transformation between vector spaces is another name for a homomorphism between them. We will show how to describe the homomorphisms between modules over a polynomial ring F[X] in terms of linear transformations between the vector spaces that underlie the modules. We also show that a general .F[X]-module arises through "X acting as a linear transformation" on an underlying vector space, which need not be a standard column space Fn.

4.1

The definition

Let R be a ring and let M and N be left .R-modules. An R-module homo morphism from M to N is a map 9 : M -> TV which respects the addition and scalar multiplication for these modules. More formally, 6 must satisfy the following axioms. HOM 1: 6(m + n) = 6(m) + 6{n) for all m,n e M. HOM 2: e\rm) = rO{m) for all m G M and all r £ R. If M and N are both right modules, the second condition is replaced by 51

52

Chapter 4.

Homomorphisms

HOMR 1: 6{mr) = (6(m))r for all m e M and all r £ R. We will work only with left modules and their homomorphisms in this chapter. Alternative terms are module homomorphism, when there is no doubt about the choice of coefficient ring R, or simply homomorphism if it is obvious that we are dealing with modules rather than groups, rings, or some other mathematical structure. When the ring of scalars is a field F, so that M and N are then vector spaces over F, an F-module homomorphism is more familiarly known as an F-linear transformation or an F-linear map, or simply a linear transfor mation or linear map. Some authors extend the vector space terminology to modules in gen eral, and speak of ".R-linear transformations" or ".R-linear maps". However, as we shall be concerned with the relationship between F-linear maps and F[X]-module homomorphisms when we analyse the structure of modules over polynomial rings, it will be convenient to limit the use of the term "linear" to vector spaces. When M and N are Z-modules, that is, additive groups, the second axiom HOM 2 follows automatically from the first. Thus a Z-module ho momorphism is another name for a group homomorphism from M to N. (The reader who has not studied group theory can take this as a definition.) Here are three homomorphisms that are always present. • Given any module M over any ring R, the identity

homomorphism

idM : M ->■ M

is defined by idM (TTI) = m for all m € M. • Given a submodule L of M, there is an inclusion

homomorphism

inc : L ->■ M, defined by inc(l) = I for all / e L. At first sight, it may seem pointless to give names to these "do nothing" maps, but there are circumstances where it is very useful to be able to distinguish between an element of L regarded as an element of L and the same element regarded as an element of M.

4.2. Sums and products

53

• If TV is also an R-module (the possibility TV = M is allowed), the zero homomorphism 0 : M -+ TV is defined by 0(m) = 0 e TV for all m 6 M. Notice that the symbol "0" is used in two ways in this expression: once for the zero map that we are denning, and again for the zero element of TV. An attempt to introduce a separate label for every zero that we encounter would lead to overcomplicated notation.

4.2

Sums and products

Suppose that 9 : M -> TV and V : M —> TV are both 7?-module homomorphisms. Their sum e + ip : M -+ TV is defined by (9 + ip){m) = 9(m) + ip(m) for all m e M. That the sum is again an /^-module homomorphism is confirmed by routine checking: for m,m' 6 M and r € R, we have (9 + ip)(m + m')

= = =

9{m + m')+4>(m + m') 9{m) + 9{m') + tp(m) + ip(m') {9 + V)(m) + (9 + iP)(m')

and (9 + ip)(rm)

= =

9{rm) + ip(rm) r ■ 9(m) + r ■ ip(m)

=

r ■((
If we are given i?-module homomorphisms 9 : M -> TV and : TV —> P, their product 4>9:M-+P is defined by {<j>9){m) = 9 is also an i?-module homomor phism. Further properties of sums and products are developed in Exercises 4.1 and 4.2 below.

54

4.3

Chapter 4.

Homomorphisms

Multiplication homomorphisms

The scalar multiplication between a module and its ring of scalars can be used to define some homomorphisms, the multiplication homomorphisms, which turn out to be surprisingly useful, despite their elementary nature. Let R be any ring and let M be an i?-rnodule. Choose an element x in M, and define a map T{X) : R —> M by r(x)r = rx for all r £ R. It is easy to confirm that T{X) satisfies HOM 1 by using the distributive property of scalar multiplication (3.1), and HOM 2 follows from the iden tities r(x)(rs) = (rs)x = r(sx) = r(T(x)s), r,s € R, i £ M. For the next definition, we must suppose that R is commutative. Fix an element a of R and define a map a(a) : M —)• M by
4.4. F[X]-modules in general

55

Thus different integers can give the same homomorphism of M. By direct observation, cr(l) = id,M = o'(7), and it is fairly obvious that a ( l + 6k) = id,M for any k € Z. We can also see that cr(0) =
4.4

F[X]-modules in general

So far, our examples of modules over a polynomial ring F[X] have been constructed from the action of a matrix A on a vector space Fn. This is not quite the full story of how F[X]-modules can arise. Suppose that M is an F[X]-module, where F is a field. Since the elements of F can be regarded as constant polynomials, there is a scalar multiplication of F on M, and clearly M is then a vector space over F. We call this space the underlying space of M, and denote it by a new symbol, V. The multiplication homomorphism <J{X) : M —► M has the property that a(X){km) = k ■ a(X) for all me M and k £ F, and so a(X) : V -> V is an F-linear transformation. Conversely, suppose that V is any vector space over F and that a:V

-> V

is an F-linear transformation. Then a* : V -> V is an F-linear transformation for any i > 1, and we can define a scalar multiplication of polynomials f(X)

= fo + fiX + --- + fsX°

in F[X]

on vectors v in V by f(X)

-v = f0v + fiav + ■■■ + ftcSv + ■■■ +

fsasv.

A great deal of checking, which is left to the reader, confirms that V has become an F[X]-module. We give this a new name, M, and say that M is defined by X acting as a on V.

Chapter 4.

56

Homomorphisms

Notice that V is the underlying space of M and that u(X) = a. Suppose now that V = Fn and that A is an n x n matrix. Then the map a : Fn —»• Fn defined by a ■ v = Av is an F-linear transformation, and so the .FfXJ-modules given by matrix actions as in section 3.3 are special cases of the general construction. In the next chapter, we will see that a linear transformation of a finite dimensional space V can be represented by a square matrix A once we have chosen a basis of V. Thus any F[X]-module with a finite dimensional un derlying space can be regarded as arising through the action of a matrix. However, it is essential to use the more general description of F[X]-modules in terms of actions of linear transformations, since a given linear transfor mation can be represented by many different matrices, depending on the bases we choose for V. One of our prime objectives in these notes is to find normal forms for matrices by discovering the bases that are best adapted to a given transformation. Notice that the underlying space V of an F[X]-module need not be finite dimensional; for example, F[X] itself must be infinite dimensional since the set {1, X, X2,...} is linearly independent over F - in other words, a polynomial is 0 only if all its coefficients are 0.

4.5

F[X]-module homomorphisms

We now wish to describe the F[X]-module homomorphisms 9 : M —t N between two F[X]-modules, M and N. We write V for the underlying space of M and a : V —¥ V for the F-linear transformation that defines M, and W and /? for the corresponding data for N. Suppose that such a homomorphism 9 is given. Since the elements of F are constant polynomials, axiom HOM 2 gives 9(kv) = k6(v) for all v € V, so that 0 is also an F-linear transformation from V to W. Using axiom HOM 2 again, we must have 6{X ■ v) = X ■ 6(v) for all v, that is, 9a(v) = /38(v) always. Thus we have obtained the fundamental equality 9a = 130. Conversely, suppose that 9 : V -4 W is an F-linear transformation which satisfies this equality. Retracing our steps, we have 9(X ■m) = X ■ 9{m) for all m e M

4.6. The matrix

interpretation

57

and so 6{Xi ■m)=Xi

■ 6(m) for all m and all i > 1.

Thus, for any polynomial f(X) d(f(X),m)

= f0 + fiX +

h fsX3

and any m € M,

■ m) + ■ ■ ■ + 0(fsXs

=

8(f0m) + 0{hX

= = =

f06(m) + he{Xm) f0e(m) + hXB(m) f(X)-6(m),

+ ■■■ + + ■■■ +

■ m)

fse(X"m) fsXse{m)

which shows that 6 is an F[X]-module homomorphism from M to TV. In view of the importance of this discussion, we summarize it as a formal theorem. 4.5.1 T h e o r e m Let F be a field, and suppose that the F[X]-module M is given by the action of the F-linear transformation a on the F-space V and that N is given by the action of P on W. Then there is a bijective correspondence between (i) F[X]-module homomorphisms 9 : M —> N and (ii) F-linear transformations 6 : V —»■ W such that 6a = /39. □

4.6

The matrix interpretation

Suppose that we are in the special case where M is Fp made into an F[X]module by the action of a p x p matrix A over F and N is Fn made into an F[X]-module using an n x n matrix B. Any F-linear map 6 from Fp to Fn is given by an n x p matrix T such that 6{v) = Tv for all v £ Fp (this is another fact from elementary linear algebra that we will re-establish in the next chapter). Thus the F[X]module homomorphisms from M to N are given by those matrices T which satisfy the equality TA = BT.

4.7

Example: p = 1

As an illustration, we look at the elementary (but sometimes confusing) case in which p = 1 but n is arbitrary. The space F = F1 has dimension 1, and we can take the single element 1 G F to be a basis. The action of the

Chapter 4.

58

Homomorphisms

variable X on F is given by the constant A in F such that X ■ (1) = A • 1 (see 3.4). Thus F becomes an F[-X"]-module in a different way for every choice of A in F. An F-linear transformation 9 from F to F " is given by an n x 1 matrix, that is, a vector w in F n so that 0(k) = kw for all "vectors" k € F . Choose some A, let M be the corresponding F[X]-module, and suppose that F n is an F[X]-module TV through a n n x n matrix B. Then 9 defines an F[X]-homomorphism from M to TV if and only if w\ = Bw. Thus a nonzero vector w in F " gives a homomorphism precisely when it is an eigenvector of B. If we take w to be the zero vector, 9 is evidently the zero homomorphism.

4.8

Example: a triangular action

Let F be any field and let M be the F[AT]-module on F 2 given by the matrix ' 0 1 0 0 We describe all the 2 x 2 matrices T that give F[X]-homomorphisms from M to itself. Write T = [

n

12

V 121

*22

I . An easy computation shows that /

0 0

TA

in i2i

and AT =

*21

*22

0

0

so T gives a homomorphism precisely when i n = i22 and i2i = 0, that is, T = I

.

I with i n and ii2 arbitrary.

4.9

Kernel and image

Let R be an arbitrary ring and let 9 : M -> TV be a homomorphism of left Rmodules. We define two submodules, one of M and one of TV, that measure the failure of 9 to be injective or surjective. The kernel Ker(0) of 9 is Ker(0) = {m e M | 9{m) = 0}

4.9. Kernel and image

59

and the image lm(9) of 6 is lm(0) = {n G TV | n = 0(m) for some m 6 M } . The corresponding definitions for right modules are left to the reader. 4.9.1 Lemma (i) (ii) (Hi) (iv)

Ker(0) is a submodule of M. 0 is injective •<=> Ker(0) = 0. \m(6) is a submodule of N. 6 is surjective <=> Im(6) = N.

Remark. Some texts use the terms one-to-one for an injective map and onto for a surjective map. Proof (i) We prove this claim in full detail to provide a model for arguments of this type. We have to check the axioms listed in (3.5). First, we need 0 G Ker(0). In M, 0 + 0 = 0. Thus, in N, 6(0) = 6(0) + 6(0). But 6(0) has a negative in TV, so we get the equations 0

=

6>(0) - 6(0)

= =

(0(0)+ 0(0)) -0(0) 6(0)+ (6(0) -6(0))

=

6(0) +0

=

0(0).

Next, suppose that m,n € Ker(6'). Then 6(m + n)

= =

6(m) + 6(n) 0+ 0

=

o,

which shows that m + n € Ker(6). Finally, let m G Ker(0) and let r G R be arbitrary. We have 6(rm)

= = =

r ■ 9(m) r-0 0,

so that rm G Kei(6). (ii) ■$=: Suppose that 6(m) = 6(n) for elements m, n of M. Then 6(m - n) — 0, so m - n G Kev(6), giving m - n — 0 and m = n a s required.

Chapter 4.

60

Homomorphisms

= > : If m € Ker(0), then 0(m) = 0 = 0(0), which forces m = 0. (iii) This follows from the identities 0 = 0(0), 6(m) + 6(n) = 0{m + n), r ■ 0(m) = 6(rm). (iv) There is nothing to prove as the claim is simply a restatement of the definition of a surjective map. P

4.10

Rank & nullity

When 0 is a linear transformation from Fp to Fn given by an n x p matrix T, the kernel and image have a more familiar interpretation. The kernel of 0 is Ker(0) = {v € Fp \ Tv = 0}, which is the set of solutions of a system of n linear equations in p unknowns. We will write Ker(T) for this space; it is sometimes called the null space of T. The dimension dim(Ker(T)) of Ker(T) is the nullity null(T) of T or null(0) of 0. Suppose w e lm(0). Then w = Tv for some v £ FP, and w = v{Te\ + ■ ■ ■ + vpTep, where / 1 \ 0

1 ,e2 =

ei =

\° /

0

, . . . , ep —

\° /

\l )

is the standard basis of Fp. A direct calculation, to be spelt out in a wider context in the next chapter, shows that the vectors { T e i , . . . ,Te p } are simply the columns of T, so that lm(0) is the subspace of Fn spanned by the columns of T. This space is familiarly known as the column space of T. The dimension of lm(0) is called the rank of 0 or of T, and written rank0 or rank(T). The rank and nullity are connected by the following well known result, which is called variously the Rank and Nullity Theorem or Kernel and Image Theorem: rank(0) + null(0) = p.

4.11. Some calculations

4.11

61

Some calculations

Here are some computations of kernels and images. First, we look at the multiplication homomorphisms (4.3). (i) Let R be commutative. Given an .R-module M and a fixed element a G R, the homomorphism a(a) : M -» M was defined by a(a)m = am for all m e M. Then Ker(cr(a)) = {m | am = 0} and Im(a(a)) = {ne M \n = am for some m € M } = aM. When R = 7L and M = Z 6 , we have for instance Ker(<7(3))= {0,2,4} and Im(<7(3)) = {0,3}. (ii) Let R be arbitrary. For a fixed a; in M, r ( i ) : fi —\ M is given by T(X)T = rx. The image Im(r(a;)) is simply the cyclic submodule Rx of M generated by x (as in section 3.8), so that T(X) is surjective precisely when M is cyclic with x as a generator. The kernel of T(X) is called the annihilator Ann(:r) of x: Ann(x) = {r G R \ rx = 0}. Ann (a;) is a left ideal of R that will play an important part in our discussions in later chapters. (iii) Finally, we take M to be the module of section 4.8, which is given by the matrix A = I

n

„ I over an arbitrary field F, and we compute

the kernel and image of each F[X]-homomorphism

from M to itself. Recall ' t t that such a homomorphism is given by a 2 x 2 matrix T - ' n 0 in x

with tn and £12 arbitrary, so that, for v = f

\

- "2

) G F , we have Tv =

tnx + ti2y tny There are two approaches, one by direct assault and one that uses a little subtlety. We give both for illustration. The direct calculation of the ' 0 kernel, those v with Tv = I 1, falls into three cases. If i n / 0, then tnV = 0 gives y = 0 and then x = 0, so that Ker(T) = 0.

Chapter 4.

62

Homomorphisms

On the other hand, if t n = 0, we require only ti2y = 0. If t i 2 ^ 0, then x is arbitrary and y = 0, so that Ker(T) = (

1 . Finally, if t i 2 = 0 also,

then T gives the zero map and Ker(T) = M. The calculation of the image falls into three cases as well. We have to find all w = I ,

1 £ F2 for which we can solve the equations

tnx + t12y \ _ / a tuy ) \ b If t n 7^ 0, we can solve for y and x in succession, so T is then surjective. If tn = 0, we can only obtain those w with 6 = 0. If t 12 ^ 0, we can always solve the equation ti2y = a, which shows that Im(T) = I _

I . Finally,

T = 0 implies Im(T) = 0. Now we take a more intellectual line. To start, we notice that any proper, nonzero submodule of M must have dimension 1 as an F-space and so it must be given by an eigenspace of A - see Theorems 3.9.1 and 3.10.1. But the unique eigenspace of A is I _ , so M has exactly one proper 0 nonzero submodule. Thus the only possibilities for Ker(T) and Im(T) are 0, L and M, whatever T. By the Rank and Nullity Theorem (4.10), these possibilities are not independent; we must have Ker(T) = M and Im(T) = 0, or

Ker(T) = L and Im(T) = L,

or Ker(T) = 0 and Im(T) = M. The first combination corresponds only to T = 0. For the second, we must have t n = 0, by direct observation, and ti 2 ^ 0. Thus if both tu and t i 2 are nonzero, we must be in the third case.

4.12

Isomorphisms

In our study of the structure of modules, it will be important to know when two superficially different modules are in essence the same. This is the case when there is an isomorphism between them.

4.12.

Isomorphisms

63

The definition is as follows. An R-module homomorphism 0 : M -> TV is said to be an isomorphism if 9 is bijective, that is, it is both injective and surjective. If there is an isomorphism from the module M to the module TV, then M and TV are said to be isomorphic; the notation is M^N. It is often convenient to have an alternative description of isomorphisms in terms of invertibility. An .R-module homomorphism 9 : M -> TV is invertible if there is an .R-module homomorphism : TV -> M such that 4>9 = id,M, the identity map on M, and 9<j> = id,N, the identity map on TV. The equivalence of invertibility and isomorphism, and a little more, is given in the next result. As with Lemma 4.9.1, the proof is given in full detail to ensure that the reader has a model for arguments with modules and their homomorphisms. 4.12.1 Proposition Let R be a ring, let M and TV be R-modules, and let 9 : M —> TV be an R-module homomorphism. Then the following statements are equivalent, (i) 9 is an isomorphism, (ii) 9 is invertible. (Hi) Ker(0) = 0 and lm(0) = TV. Proof (i) => (ii): We have to construct an inverse map <j> : TV —► M. Take an element n e TV. Since 9 is surjective, there is some m € M with 9(m) = n, and, since 9 is injective, m is uniquely determined by n. We can therefore define a map <j> : TV —>■ M by <j>{n) = m, and the method of construction guarantees that, as a map, 4> is an inverse of 9. However, we also need to check that (p is an R-module homomorphism. Suppose that 4>(n) = m and that 4>(n') = m'. Then n + n' = 9{m) + 6(ri) = 9(m + m'), which shows that 4>{n + n') = m + m', as desired. Similarly, (j>(rn) = r<j>(n) for all r € R. (ii) => (hi): If m e Ker(0), then m = idM(m)

= {9{m)) = 0(0) = 0,

so Ker(0) = 0. If n £ TV, then n = idN(n) = 9{4>{n)) e lm(0), so that lm(0) = TV. (iii) => (i): Immediate from Lemma 4.9.1.

D

Chapter 4.

64

4.13

Homomorphisms

A submodule correspondence

Given a homomorphism 9 : M -+ N of left P-modules, we can find a relationship between the submodules of M and the submodules of N. In applications, the submodules of one module will be known to us, so we can then describe some of the submodules of the other. For a submodule L C M of M, we put fl.(L) = {0(1) 11 6 L}, the image of L, and for a submodule P C JV of iV, we define 0*(P) = { m e M | 0 ( m ) g P } , the inverse image of P . A routine verification confirms that 9„(L) is a submodule of TV and that 9*(P) is a submodule of M. 4.13.1 P r o p o s i t i o n Let R be a ring, let M and N be left R-modules, and let 9 : M —>■ N be a surjective R-module homomorphism. Then the following assertions hold. (i) Let L be a submodule of M with Ker(0) C L. Then 9*9*{L) = L. (ii) Let P be a submodule of N.

Then 9,9*(P) = P.

(Hi) The maps 9, and 6* are mutually inverse bisections between the set of submodules L of M that contain Ker(0) and the set of submodules P of N. Explicitly, L^6,(L) and P^9*(P). Proof (i) It is obvious that L C 9*9*(L). To prove equality, take any m € 6*9,(L). Then 9(m) = 9(1) for some I € L, so that m - I = k G Ker(0) C L. Thus m € L.

Exercises

65

(ii) It is clear from the definition that 0*6* (P) C P. But if p £ P, then p = 9{m) for some m £ M, as 0 is surjective. Thus m € 0*{P), again by the definition, so we have equality. The final assertion is now obvious.

D

Exercises 4.1

Let R be a commutative ring and let M and N be R-modules. Let Hom(M, N) be the set of all .R-module homomorphisms 9 : M -> N'. Show that the addition defined in section 4.2 makes Hom(M, N) into an additive group, with zero element the zero map, and with — 9 defined by -9(m) = -(0(m)) for all m £ M. For any r £ R, define r9 by (rO){m)

4.2

=r(0{m))

for all m. Verify that r0 £ Hom(M, TV) and hence that Hom(M, N) is also an .R-module. Let R be a ring, let M, N and P be .R-modules and suppose that 0,i> e Hom(M, N) and <j>,p £ Hom(iV, P). Verify that (f,(9 + i>) = 00 + 4>ip

and that (cf) + p)9 = 09 + P9.

Let Q be another .R-module and let w £ Hom(P, Q). Show that u{9) = (w)9.

4.3

4.4

Combine the results from the above exercises, show that Hom(M, M) is a ring. Aside: this ring is called the endomorphism ring of M. Usually, the endomorphism ring of a module is noncommutative (see Exercise 5.9). Thus endomorphism rings do not play an explicit role in this text, in marked contrast to their fundamental importance in ring theory in general. Let R be a commutative ring, let M be an R-module and write S = Hom(M,M). For a in R, let cr(a) : R —> £ , (x(a)m = am,

Chapter 4.

66

Homomorphisms

be the multiplication map defined in section 4.3. Verify that for any elements a,b 6 R, a(a + b) = a (a) + a(b) and a(ab) = cr(a)a(b) and that o-(lfi) = I s = idM-

4.5

4.6

4.7

4.8

Aside: in general, a map a from a ring fitoa ring S that satisfies the above equalities is, by definition, a ring homomorphism. A bijective ring homomorphism is a called a ring isomorphism. Ring homomor phisms play only a minor role in these notes, although we have used them implicitly in our discussion of residue rings of polynomials in section 2.12 - when we regard a scalar k G F as the same as its image k G F[X]/F[X]f, we are using the fact that the map k —> k is an injective ring homomorphism. Take M = R in the preceding exercise. Show that a(a) is an injective homomorphism on R for all nonzero a € R «=^> R is a domain. Show also that Hom(i?, M), r(a;)r = rx, be as in section 4.3. Show that r is an isomorphism of .R-modules. Let F be a field, let M be F p made into an F[X]-module through a p x p matrix A, and let N be F made into an F[X]-module through a constant A (so we are in the reverse situation to the example in section 4.7). Show that the F[X]-module homomorphisms from M to TV are given by the row vectors w of length p with wA = Aw. Note: a square matrix A has two sets of eigenvectors, those satisfying Aw = Xw and those satisfying vA = Av. These are the right and left eigenvectors of A, respectively. As we nearly always regard a matrix as a left operator, we use the term "eigenvector" to mean a right eigenvector. Let F be a field and let M and N be F[X]-modules given by the F-linear transformations a and /? respectively. Let M(h) denote the F[X]-module given by ah for h > 1 and define N(h) similarly. Show that if 6 : M —> N is an F[X]-module homomorphism, then 9 : M{h) —► N(h) is also a F[X]-module homomorphism for all h > 1.

Exercises 4.9

67

Here are 5 modules over R[X], listed with the 2 x 2 matrices which define them. 0 1 1 0

L :A =

0 M :B = I " 1

-1 ■■ " 1 (see the example in 3.12) 0

N :C =

I

1 0

1 ) 1

(see the example in 3.11)

P :D =

(

0 0

n

1 j 0

(see the example in 4.8)

Q :E =

0 -1

1 0

Using the fact that an isomorphism R2 —> R2 must be given by an invertible 2 x 2 matrix, determine which of these modules are isomorphic to one another. Hint: The preceding exercise helps! 4.10 Let L be the R[X]-module given by I

I, as in Exercise 4.9, and

let Z be the R[X]-module given by the 3 x 3 matrix

/0 1 0\ 0 0 0 1.

\0

0

l)

Show that Hom(L, Z) has dimension 1 as an R-space, and that there are no injective or surjective homomorphisms from L to Z. Discuss Hom(Z, L). 4.11 Let 9 : M —> TV be a homomorphism, and define 9* and 9* as in section 4.13. Describe 9*9,(L) when the submodule L of M need not contain Ker(0). For a submodule P of TV, describe 9*9* (P) when 9 is not neces sarily surjective. Deduce that 6* and 9* need not be inverse bijections when the conditions of Proposition 4.13.1 are relaxed.

Chapter 5

Free Modules A fundamental result about vector spaces is that any finite dimensional vector space has a basis. In contrast, a module over an arbitrary ring of scalars need not have basis, and so we must give a special place to those modules that do have a basis, namely, the free modules. As we shall see, the theory of free modules and their bases is a generalization of the familiar theory of vector spaces. In this chapter we give the definition of a free module in terms of bases, and we show how the alternative bases of a free module are related by matrices. We also give the matrix description of the homomorphisms between free modules. The fact that results about free modules can be re-interpreted in terms of matrices is crucial to our analysis of modules over Euclidean domains in subsequent chapters. This chapter also contains a brief survey of the properties of determi nants, up to the computation of the inverse of a matrix through its adjoint. The "supplementary topic" sign that adorns the margin refers not to g the individual topics that are covered in this chapter, but to the treatment of them. In the lecture course on which these notes are based, time did not permit me to spell out the details of the derivation of the properties of change of basis matrices, or of matrices of transformations. Instead, the students were assured that everything was essentially the same as it was in a previous course on vector spaces. However, I have provided the extra details in this text so that it is more self-contained, and hopefully easier for the reader to follow.

69

Chapter 5. Free Modules

70

5.1

The standard free modules

Let R be a ring. The standard free left R-module of rank k is the set R

In \

of all fc-tuples m =

.

, where n , r 2 , . . . , rk are arbitrary elements of

\n ) the ring R. The set .R-module structure on Rk is given by the expected rules for addition and scalar multiplication: if / si

\

Si

1"2

are in R ,

and n

m:

\rk J

\sk

)

then ( ri + si \ r 2 + s2 m +n

and for r m. R, ( rri rr2

\

V rrk

)

rm =

A routine verification confirms that Rk is a left R-module. When the ring of coefficients is a field F, Fh is familiar to us as the standard column space of dimension k. The standard basis of Rk is the set

/o\

/ 1 \ , ej —

ei =

\0J

( 0 \ ■i efc

\0 J

=

\ i

/

where, for j = 1 , . . . , k, ej has the entry 1 in the j-th place and zeroes elsewhere.

5.2. Free modules in general

71

For any element m of Rk as above, we can write m = rid H

h TjCj H

h rkek.

Note that the coefficients r\,..., rk are uniquely determined by m since members m,n oi Rk are the same precisely when tj = s3 for all j . When k = 0, the convention is that R° = 0, the zero module, whose standard basis is taken to be the empty set 0. For k = 1, we have R1 = R and ei = 1, the identity element of R.

5.2

Free modules in general

We extend the definition of a basis from vector spaces to modules in a straightforward way. Let R be a ring and let M be an R-module. Recall from (3.8) that a subset B = {b\,..., bk} of M generates M as an .R-module if for each m e M there is a set of coefficients {ri,... ,Tk} C R with m = 7-i&! + . . . + rkbkWe say that the subset B is linearly independent over R if the equality r-ybi H

\-rkbk

=0

holds only if

n = • • ■ = r;. = o. Then B is a basis of M if it is linearly independent over R and it generates M as an .R-module. A free R-module is denned to be an R-module M that has a basis. The number of elements in the basis is called the rank of M, and written rank(M). It is clear that the standard basis of Rk, as defined in the preceding section, is actually a basis of the standard free module Rk, and consequently Rfc is indeed free, of rank k. By our conventions, the zero module 0 = R° is free of rank 0 since its basis is the empty set. Before we commence a detailed discussion of bases, here are some points to bear in mind. • When the ring of scalars is a field F, a finitely generated F-module is the same thing as a finite dimensional vector space over F. By elementary linear algebra, such a vector space V always has a basis, and the rank of V is usually referred to as the dimension of V.

72

Chapter 5. Free Modules

• Two basic results in elementary linear algebra are that any linearly inde pendent subset of a finite dimensional vector space V over a field F can be extended to a basis of V, and that any generating set of V contains a basis of V. These results do not hold for modules over more general rings. For example, the subset {2} of Z is linearly independent but Z has no basis of the form {2, o , . . . } , whatever the choice of a , . . . in Z. The set {2,3} generates Z as a Z-module, since 1 = 2 - 2 - 3 , but no subset of {2,3} is a basis. • A consequence of the results quoted above is that every finite dimensional vector space over a field F is a free F-module. When the coefficient ring R is not a field, we expect to find .R-modules that are not free (Exercise 5.1). For example, the Z-modules Z m , m > 0 (1.11) contain no linearly independent subsets, since mx = 0 for any x e Z m . • Warning! Some authors define bases in a different way which allows the possibility that Z m has a basis as a Z-module; however, the definition of a free module must then be altered. • Whether or not a module is free depends on the ring of scalars, as do the concepts of linear independence and generation. Consider the residue ring Z p where p is a prime. This is a field (1.11.2) and so free of rank 1 as a Zp-module, but it is not free as a Z-module. • Another illustration of the same type is provided by the standard vector space Fk over a field F, made into an F[X]-module M with X acting as 0. Then M is not free as an F[X]-module as it has no linearly independent subsets over F[X]. • According to our definition, the rank of a free module depends on the basis B. We shall see soon (Theorem 5.5.2) that the rank is in fact independent of the choice of basis (at least, for the rings of most interset to us in these notes). However, there are rings for which the rank of a free module can vary with the choice of basis - see section 2.3 of [B & K: IRM]. • Our definition of a free module requires that the rank is finite. Since free modules of infinite rank play only a minor role in this text, we have relegated their definition to Exercise 7.10. The extension of the definitions and results of this chapter to modules with infinite ranks is discussed in [B & K: IRM], (2.2.13). The following restatement of the definition of a basis is very useful. 5.2.1 Lemma Let R be a ring, let M be a left R-module, and let B = {&i,..., 6fe} be a subset of M.

5.3. A running example

73

Then the following assertions are equivalent, (i) B is a basis of M as an R-module. (ii) Given m e M, there is a unique set of coefficients {r\,..., with m = ribi H \-rkbk-

rj,} in R

Proof (i) => (ii): Suppose that B is a basis. Since B generates M, we can write m = rxb\ + ■ ■ • + rkbk for some r\,..., r^ in R. If also m = s\b\ + ■ ■ ■ + Skbk with s i , . . . ,Sjt in R, then 0 = ( n - &i)bi +

h (rfc - sfc)6fe

and so rt = Si for all i by linear independence, which shows that the coefficients are unique. (ii) <= (i): Our assumption obviously implies that B generates M. To prove linear independence, suppose that 0 = nbi H for some coefficients r\,...,

\-rkbk

7>. Since 0 = Oh +

1- 0bk

also, the uniqueness of the coefficients guarantees that r; = 0 for all i.

5.3

□

A running example

Our general discussion of bases is necessarily rather formal, so we will anal yse an example to provide concrete illustrations of the various concepts that we encounter. We let &i = I

) and 62 = I

I be elements of Z 2 , where, for the

moment, a is any integer, and we ask if set B = {61, 62} is a basis of 1?. To see if B is linearly independent, we must try to solve the equation 0 = ri^i + r2bi with r i , r 2 e Z . This gives the pair of equations n + 2r 2 - r i + ar2

= =

and hence the equation ( 2 + a ) r 2 = 0.

0 0

Chapter 5. Free Modules

74

Thus the set B is linearly independent provided that a ^ - 2 . To see if B generates Z 2 , we try to write ei = r\b\ + r2b2 for some r i , r 2 € Z. The pair of equations is now n + 2r 2 —r\ + ar 2

= =

1 0

which give the equation (2 + a)r 2 = 1. This can be solved in Z only if 2 + o = ± 1 , that is, a = - 1 , - 3 . In the case a = —1, we have e\ e2

= =

—b\ + 62 — 2£>i + 62

and so, for an arbitrary element

*=(sH we have x = ( - i i - 2x2)h + (xi + x2)b2. This confirms that, for a = —1, B is a basis of Z 2 ; the reader is recom mended to make the corresponding calculation for a = —3.

5.4

Bases and isomorphisms

Now we give a useful characterization of the bases of a free module in terms of the isomorphisms between the given module and a standard free module. 5.4.1 T h e o r e m Let R be a ring and let M be a left R-module. Then there is a bijective correspondence between (i) the set of all R-module isomorphisms 6 : Rk —> M and (ii) the set of all bases B = {b\ ..., bk} of M.

5.4. Bases and isomorphisms

75

Under this correspondence, an isomorphism 9 corresponds to the basis 0 ( e i ) , . . . , 9(ek), where e i , . . . , ek is the standard basis of Rk. In particular, an R-module M is free if and only if M = Rk for some k. Proof Suppose that 9 : Rk -> M is given, put 8(ej) = bj for j = 1 , . . . , k and define B = {bi,... ,bk}. We show that B is a basis by verifying in turn that B generates M and that B is linearly independent. Given m € M, we have m = 6(x) for some x € Rh, since # is surjective. But x = X\e\ + ■ ■ - + xkek for some x\,..., Xk £ R, and therefore m

=

^(x)

=

6>(a;iei) + ••■

= =

xi9(ei)-\ xih-\

+6{xkek)

\-xk9(ek) \-xkbk,

which shows B generates M. If 0 = x\bi + ■ ■ - + xkbk, then 0 = 0(x\ei + ■ ■ - + xkek). But 9 is injective, so 0 = x^e\ + ■ ■ ■ + xkek in Rk, from which x\ = ■ ■ ■ = xk = 0. Conversely, suppose a basis B is given. By Lemma 5.2.1, each element m in M can be written m = r\b\ + ■ ■ ■ + rkbk with unique coefficients. We can therefore define maps 9 : Rk -> M by #(nei H

h rkek) = nh

-\

h rkbk

and %l) : M ->■ Rk by iKn&i H

h rkbk) = rxei H

h rkek.

A direct verification confirms that 9 and V a r e homomorphisms oi .Rmodules, and they are obviously inverses of one another, which shows that 9 is an isomorphism - see Proposition 4.12.1. The correspondence must be a bijection since 9 and B determine one another uniquely. The final assertion is now clear. □ For an illustration, take the basis

*-(_! u= .; of Z 2 which we constructed in section 5.3. The theorem predicts that there is a corresponding isomorphism 9 : Z 2 -> Z 2 with 9{e\) = b\ and #(e 2 ) = b2-

Chapter 5. Free Modules

76

I of Z 2 , we have

For a general element x = I

e{x) = Xlbi + x2b2 = ( Since x = (-xi-2x2)b\

+ (xi + x2)b2, the inverse of 6 is the homomorphism - x i - 2x2

tp(x) X

5.5

11**11

Xi+

X2

Uniqueness of rank

Our aim now is to show that the rank of a free module is unambiguously defined as the number of elements of any basis. To simplify the discussion, we assume the fact that a commutative domain -R has a field of fractions Q whose elements can be written in the form r/s with r , s e i i , s / 0 see §3.10 of [Allenby]. The benefits of our assumption are contained in the following technical lemma. 5.5.1 Lemma Let R be a commutative domain with field of fractions Q. following statements hold.

Then the

(i) If qi = Ti/Si, i = 1 , . . . ,k is any finite set of elements in Q, there is an element s € R and elements ai € R with qi = Oj/s for all i. Note: s is called a common denominator of q\,... ,qk, and this rewriting process is known as "placing over a common denomina tor". (ii) Rk C Qk for anyk>\. (Hi) If v £ Qk, then v = (l/s)m for some s 6 R and m € Rk. (iv) If B is an R-basis of Rk, then B is also a Q-basis of Qk. Proof (i) We have qi = n/si e Q, ri,Si £ R for each i = l,...,k, with all Si ^ 0. Let s = s\.. .Sk and put ai = (s\... Si-iSi+i... Sk)ri for i = 1,... ,k. (ii) This is obvious since R C Q. (iii) Let v 6 Qk. Then (91

\ = ((V»)

V =

\9k

)

n\ i

V ak )

= (l/s)m

5.6. Change of basis

77

where qx,... ,qk are in Q, s is a common denominator as in part (i), and / a!

eR

m

k

\ ak (iv) Write B = {bi,...,bh}. (We do not assume k = hi) First, we show that B remains linearly independent in Qk. Suppose that qxbi + ■ ■ ■ + qhbh = 0 with g* 6 Q. Keeping the above style of notation, we have (l/s)(aibi H h ahbh) = 0 and hence a%bi H h ahbh = 0 in Rk. Thus all a, = 0 and hence all qi = 0. To see that B generates Qk, take v G Qk, write w = ( l / s ) m with m £ M, and note that m = ri6 x + ■ • ■ + r^fy, for some elements rt e R. □ The uniqueness of rank follows easily. 5.5.2 T h e o r e m Let R be a domain and let M be a free R-module. Then any two bases of M have the same number of elements. Proof Suppose that M has two bases, one with h elements and one with k elements. By the previous result, there are isomorphisms 6 : Rh —5- M and <)> : Rk —» M, and hence an isomorphism 6~14> : Rk —> Rh. Thus the standard free module Rh itself has a basis B with k elements, by Theorem 5.4.1. But the dimension of a vector space over the field of fractions Q is unique, and, as B is also a basis of Q , we have h = k. □ Remark: the rank of a free .R-module is unique for any commutative ring R (Exercise 5.2).

5.6

Change of basis

We next explore the relationships between the various possible bases of a free module over a commutative domain R. To set the scene, we consider the trivial but instructive case of the standard free module of rank 1, that is, R itself. The standard basis of R is the set {1}, and any other basis must consist of one element, say b. The submodule of R generated by b is the principal ideal Rb, so that {b} generates R precisely when Rb = R, that is, b is a unit (see Lemma 1.8.1). If b is a unit, then the equation rb — 0 holds only if r = 0, so that {b} is linearly independent as well. Thus the bases of R are the sets {6} with b a unit. If {c} is another basis, the two bases are

Chapter 5. Free

78

Modules

related by t h e innocuous equations b = (bc~1)c and c = ( c 6 - 1 ) 6 in which both "change of basis coefficients" 6 c _ 1 and c 6 _ 1 are themselves units. Now consider a general free i?-module, with bases B = {b\,... ,bk} and C = {ci,...,Cfc} - by Theorem 5.5.2, t h e bases must contain t h e same number of elements. To relate t h e bases, we use t h e fact t h a t a member of a free module can be written uniquely as a linear combination of t h e elements of a given basis (Lemma 5.2.1). We can therefore write b\ = pnci + P21C2 H 62 = Pl2Cl + P22C2 H

+ PkiCk 1" Pk2Ck

bk = PlfeCi + p2kC2 H

1" PkkCk

c\ = quh + q2ib2 H C2 = quh + 92262 H

+ qkih 1- qk2bk

(5.1)

and (5.2)

Cfc = qikbi + ?2fc&2 H 1- 9fcfc&fc in which the coefficients p y and qij,i,j = \,...,k, are uniquely determined elements of .R. T h e order of t h e suffices may be unexpected, but it is best suited to computations, which fact will become apparent as we progress (see Theorem 12.11.1 in particular). T h e underlying reason for this choice of ordering of suffices is t h a t we write both scalars and transformations on the left-hand side of module elements. Substituting, we obtain a n expression for t h e basis B in terms of itself: for each h = 1 , . . . , k, we have k b

h = ^PihCi

k

( k

= ^2pih

^qjibj

\

k

= J2[

/ k

^QHPih

\

I bJ-

(5-3)

However, t h e unique way of expressing B in terms of itself must be 61 = I61 + 06 2 + • ■ • + Obfc 62 = O61 + 16a + • • • + Ofcfc (5.4)

bk = O61 + 06 2 + • • ■ + l i t which gives t h e identities k

f 0 if h ^ \ lifh =

(5.5)

5.7.

Coordinates

79

The last equation can be summarized conveniently in matrix form. Put Pc,B = (pih) and PB,c = (qji), so that both PC,B and PB,C are k x k matrices over R, the change of basis matrices for the pair B and C. Then Eq. (5.5) reads PB,CPC,B

= I,

(5.6)

where / is the k x k identity matrix. Reversing the roles of B and C, we find that PC,BPB,C

= I

(5.7)

also, so that PC,B and PB,C are mutually inverse matrices over R. We illustrate these calculations with the basis B = {61,62} of Z 2 ,

♦.-(-DM.;)which we considered in sections 5.3 and 5.4, taking C = E, the standard basis. It is clear that

M-l -0-

that is, the columns of Pc,s are simply the vectors 61,62 themselves. To compute PB,C, recall from section 5.3 that ei e2 thus

= =

-61 + 62 -26j + 62;

Ml "0-

If we are given a basis B of a free module and an invertible matrix Q of the correct size, we can construct a new basis C so that Q = PB,C but, before we do so, it is convenient to discuss coordinates.

5.7

Coordinates

Given a free .R-module M with basis B = {&i,...,6fc}, we know from Lemma 5.2.1 that for m e M, there is a unique set of coefficients { r i , . . . , r^} in i? with m = ri&iH +r fc 6 fc .

Chapter 5. Free Modules

80

The element

MB

= I i I e Bk

is called the coordinate vector of m with respect to B. (Strictly speaking, (m)s isn't a vector unless R is a field, but it is convenient to extend the use of the word "vector" to the more general situation.) Now suppose that C is another basis of M and that B and C are related as in Eqs. (5.1) and (5.2). Substituting, we obtain m

= =

ri(pnci H hPfciCfc) + V rk(j>ikci H 1- PfcfcCfc) {pun H hpifcrfc)ci +

r {pkin +

r PkkTk)ck

which can be summarized as {m)c = Pc,B{m)B.

(5.8)

Notice that when E is the standard basis of the standard free module Rk, we have k (X)E = x for all x € R . In our running example, the calculations in section 5.3 show that \

Xi xi + X x22

J

for our illustrative basis B, while (with C = E),

«■* - - 1 1 )) by the computation in section 5.6. An easy verification confirms Formula 5.8.

5.8

Constructing bases

Suppose that we have a basis B of the free module M of rank k, and that we have an invertible kxk matrix Q = (g^) over R, with inverse P = (py)We now interpret Eq. (5.2) as the definition of a set C = { c i , . . . , cjt} of elements of M, and we claim that C is a basis of M.

5.9. Matrices and homomorphisms

81

First, notice that the identities in Eq. (5.1) still hold - this is confirmed by substituting for Ci,...,Cfc in the right-hand side of the equation and using the fact that P is given to be the inverse of Q. It follows that C generates M, for any m e M has the form \-rkbk, m T-I&H and so can be expressed in terms of c\,..., ck by substitution. Suppose next that 0 = siCi + ■ ■ ■ + SkCk for some scalars 8\,..., s& in R. Substituting for each c, in terms of the bi's and calculating coefficients as in section 5.7 above, we find that

0 ='-

(0)B

= Qs

where s\ sk Then s = PQs = 0, so that C is linearly independent and hence a basis for M. Notice that Q = PB,c and P = PC,B-

5.9

Matrices and homomorphisms

Suppose that M and TV are free modules over a commutative domain R. Our aim in this section is to show that an .R-module homomorphism 0 : M —> N can be represented by a matrix T, and conversely, that a matrix of the correct size defines a homomorphism from M to N. Let B = { 6 i , . . . , bk} and C — {c\,..., C/} be bases of M and N respec tively, so that M has rank k and N has rank I. Suppose that 9 : M —> N is an R-module homomorphism. The image 0(bi) of a member of B is an element of N, and so, by Lemma 5.2.1, can be written as a linear combina tion 6(bi) = t\iCX H Ytuci with unique coefficients tu, ■ ■ ■ ,tu in the ring of scalars R. We can thus associate with 6 an I x k matrix

T =

{6)C,B

/ tn

t 12

hk

thl

th2

thk

V tn

ti2

Uk )

\ (5.9)

Chapter 5. Free Modules

82

with entries in R. The matrix T is called the matrix of the homomorphism 6 with respect to the pair of bases B, C. Notice that the i-th column of T is the coordinate vector {0(bi))c of 9(bi) with respect to C. A computation very similar to that used to derive Eq. (5.8) shows that for any element m in M, the coordinate vector of 6(m) is related to that of m by the formula (0(m))c = (8)c,B(m)jj. (5.10) (In fact, (5.8) can be obtained as a special case of (5.10); see Exercise 5.4.) Conversely, if an I x k matrix T is given, then Eq. (5.10) can be used to define an .R-module homomorphism 6 from M to N, since the image 9(m) of an element m G M is uniquely determined by specifying its coordinate vector (6(m))cThe fact that the entries thi of (6)c,B are uniquely determined by 9 confirms that, for fixed bases B and C, the correspondence 0 <->

(5.11)

(0)C,B

defines a bijection between .R-module homomorphisms from M to N and I x k matrices over R. The following results are often useful. 5.9.1 Proposition Let R be a commutative domain and let M, N and P be free R-modules, with bases B, C and D respectively. Then the following hold. (i) {idM)s,B = I, where id^ is the identity map on M and I is a kx k identity matrix, k = rank(M). (ii) If 0 : M —¥ N and 4>: N —»■ P are R-module homomorphisms, then {
(4>0)D,B-

Proof Let B = {blt..., bk}, C = {ct,..., a) and D = {dly..., dm}. (i) This is obvious, since idM(h) = bi for all i. (ii) Write (6)C,B =T = (Uj), an / x k matrix, and {(J))D,C = S = an m x / matrix. Then

4>6(bj) =

4>(J2UjCi\ i

=

y2uj4>{ci) i=\

(shi),

5.10. Illustration: the standard case

83

fc

i=\ m

1

IIL

'\J2Sh \h=l / I

idh I \

1 dh, h=\ \i=\

J

which shows that, for all h = 1 , . . . , m and j = 1 , . . . , k, the (h, j)-entry of the m x p matrix (4>9)D,B is the same as that of the product matrix ST.

D 5.9.2 Corollary Let M and N be free R-modules, with bases B and C respectively, and let 9 : M —> N be an R-module homomorphism. Then 6 is an isomorphism if and only if (0)c,B is an invertible matrix, in which case {{0)C,B)~ = ( # - 1 ) B CProof =>: Suppose that 6 is an isomorphism. By Proposition 4.12.1, 6 has an inverse, and by the preceding result (6)c,B(0-1)B,c

=

{6-1)B,G{6)C,B

= (idM)B,B

(idN)c,c=I

and = I.

4=: If (9)c,B has an inverse S = (shi), take <j>: N —> M to be the homomor phism with matrix S. Then both products 9(/> and <j>6 have as matrix the identity matrix, so they are both identity maps, on N and M respectively. Thus (j> = 9~1.

5.10

U

Illustration: the standard case

Suppose that M and N are the standard free modules Rk and tively, and that we take as bases the standard free bases, which E{k) and E(l). An element m of M can then be identified with nate vector {m)Eiky Likewise, if 9 is a homomorphism from Rk have 9(m) = {9{m))E(iy Thus, Eq. (5.10) appears as 9(m) = {9)E(l)Mk)m so that the homomorphism 9 is given by the matrix T

=

(9)E(i),E(k)

Rl respec we denote its coordi to Rl, we

(5.12)

Chapter 5. Free Modules

84

which is an / x k matrix whose i-th column is simply the image 0(e;) of the i-th standard basis vector e^ in E(k). Thus for many purposes the homomorphism 9 can be viewed as being effectively the same as the matrix T it defines. However, this identification of a homomorphism and a matrix is dependent on the fact that we use the standard bases, and so it can be misleading when we use nonstandard bases of standard free modules. When the ring of scalars is a field F and the free modules are the standard vector spaces Fk and Fl with their standard bases, we recover the correspondence between F-linear transformations and matrices that is familiar from elementary linear algebra, as promised in sections 4.6 and 4.10.

5.11

Matrices and change of basis

Next, we show how the correspondence between homomorphisms and ma trices depends on the choice of bases for the free modules M and TV. This relationship will be important for our subsequent analysis of modules in general. Let B = {b\,..., bk} and B' - {b[,..., b'k} be bases of M, and let C = { c i , . . . , c;} and C = {c[,..., c[} be bases of N. Then an .R-module homomorphism 6 : M —>• N has two associated I x k matrices, TC,B

= (thi)

and TC,B>

= (t'gj)-

We claim that these matrices are related by the formula Tc,B'

= PC,CTC,BPB,B',

(5.13)

where Pc,c aud PB,B> a r e change of basis matrices. There are two ap proaches to the verification of this formula. One is simply to expand the matrix on the right and check that we have the desired equality. Although this method is elementary, it does require a lot of detailed calculation. A more sophisticated approach is to use the calculations that we have already performed, together with a nice observation. If we compare the formulas given in Eqs. (5.1) and (5.9), we see that the change of basis matrix PB,B' is just the basis of the identity transformation on M with respect to the pair of bases B',B, so that PB,B'

= B{idM)B'-

(5.14)

5.12. Determinants and invertible matrices

85

Likewise

Pc,c = c(idN)cSince 6 = idjv • 0 ■ id,M,

we obtain the relation 5.13 immediately from the product formula for ma trices of homomorphisms that we obtained in Proposition 5.13.

5.12

Determinants and invertible matrices

Invertible matrices play an important role in the theory of modules, since they represent both change of bases within free modules and also isomor phisms between free modules. It will therefore be useful to develop some criteria that allow us to decide if a given matrix is invertible. In this section we give one such criterion in terms of the determinant - later we will give a more constructive approach based on row and column operations (section 10.8). We adopt an inductive definition of the determinant. Let A = (a^) be a k x k matrix with entries in a commutative ring R. If k = 1, we have det(A) = a n , and for k = 2, det(A) = ana22 — ai2«2iFor general k, we assume we know how to calculate the determinant of a k — 1 x k — 1 matrix. For each pair i,j of indices, let m^ be the matrix formed from A by eliminating row i and column j of A, and define the (i,j)-cofactor of A to be >4ii = ( - l ) < + i d e t ( m y ) . We can then take as our working definition the formula det (A) = aii-An + ai2An

H

h aut-Aut,

that is, expansion from the first row. For example, when k = 2, An = a22 and A\i = —021, so we recover the familiar formula. We assume the basic properties of the determinant, which we list below for future reference. These results hold for matrices with entries in any commutative ring. Proofs can be found in Chapter 7 of [Cohn 1]. Det 1: Let A be a k x k matrix over a commutative ring R. Then det (A) E R. Det 2: If B is also k x k matrix over R, then

det(AB) = det (A) det (£).

Chapter 5. Free Modules

86

Det 3: det(AT) = det(^4), where AT is the transpose of A. Det 4: If A has two identical rows or columns, det(A) = 0. Det 5: If B is formed from A by adding a scalar multiple of one row (or column) to a different row (or column), then det(B) = det{A). Det 6: The determinant is an additive function of the rows of a matrix. More precisely, let a, be the i-th row of A, which is a row vector of length k and suppose that a,i = bi + Ci

for row vectors T>i and Cj. Write B for the matrix which is formed from A by replacing row a^ by b*, and define C similarly. Then det(A) = det(B) + det(C). There is a corresp onding result for columns. Det 7: If /on 0 ... 0 021

^22

• ••

flfc-1,1

a/c-i,2 afc2

• ■• ■ ■

0 0

0

\

A= \ Ofel

a-k-i,k-i afc,fc-i

0 afcfc /

is a lower triangular matrix, then det(yl) = ana22 ■

■a-kk-

In particular, det(7) = 1. Det 8: The following relations hold:

-{

ahiAn

+ • ■ • +dhkAik

Auaij

+ ■ ■ ■ + Akidkj

and

-{

det(A) 0

for for

h=i

det (A) 0

for for

i = j

In the case h = i, the first relation tells us that the determinant can be expanded from row h for any h = 1 , . . . , k. Similarly, for i = j , the second tells us that we can expand from the j - t h column. Det 9: The adjoint of A is adj(A) = (Aij)T, the transpose of the matrix of cofactors of A. Then the formulas of Det 8 can be re-interpreted as the matrix product formulas A ■ adj(A) = det(A)I = adj(A) • A where the middle term is simply the diagonal matrix with all diagonal terms equal to det(A).

5.12. Determinants and invertible matrices

87

This last property leads to the invertibility criterion that we have been seeking. 5.12.1 Theorem Let A be a k x k matrix over a commutative ring R. Then A has an inverse with entries in R if and only i/det(,4) is a unit in R. Furthermore, if A is invertible, then A-1

= (det(A))- 1 adj(yl).

Proof Suppose A has inverse A~l with entries in R. Then det( J 4 _ 1 ) G R also. Since det{A) det(A" 1 ) = det(A • A~l) = det(/) = 1, we see that det(j4) is a unit in R. Conversely, suppose det(^4) is a unit of R. Each cofactor Aij is the determinant of a matrix with entries in R and so belongs to R. Thus adj(^4) has entries in R, as does (det(A))'1 &d)(A), which is therefore the inverse of A from the relations in Det 9. □ Remark the strength of the above result is that it tells us when a matrix has an inverse with entries in the given ring, rather than some larger ring. For example, let A be a square matrix over the ring of integers Z. The condition that A have an inverse with entries in Z is the stringent requirement that det (^4) = ± 1 , since the only units in Z are ± 1 . On the other hand, A will have an inverse in Q provide only that det(j4) ^ 0. For a final illustration, we return to the problem considered first in section 5.3, that of determining the values of a in Z for which the elements b\ = I basis, then

) and b2 = I

) form a basis of Z 2 . If B = {61,62} is a

'-*.-(-1.1)

must be invertible, by the results of section 5.6. On the other hand, if P is invertible, then B is a basis by section 5.8. Now det(P) = a + 2, which is a unit in Z when a = - 1 or a = - 3 , thus confirming the calculation made in section 5.3.

88

Chapter 5. Free Modules

Exercises 5.1

Let R be a commutative ring and suppose that the nonzero element r € R is not a unit of R. Show that the i?-module R/Rr is not a free i?-module. (Hence a ring which is not a field has non-free modules.) 5.2 (a) Let P be an invertible matrix over a commutative domain R. Using Theorem 5.5.2, show that P must be a square matrix. (b) Let R now be any commutative ring. Show that if & : R —> R is an isomorphism, then there is an invertible I x k matrix T with Tv = 9(v) for all v e Rk. (c) If T is not square, add zero rows or columns to obtain a square matrix, and so obtain a contradiction, using Det 8. Deduce that Theorem 5.5.2 holds for any commutative ring. 5.3 Let P be k x k matrix over a commutative domain R. Show that P is invertible if and only if the columns of P form a basis B of Rk, in which case P = PE,B5.4 Let B and C be bases of a free i?-module M. Use the fact that {idM)c,B = Pc,B to obtain Formula 5.8: (m)c = Pc,B{m)B from Formula 5.10: (9(m))c 5.5

(6)c,B(m)B.

Let B, C and D be bases of a free i?-module M. Show that PD,C

5.6

=

Let

• PC,B =

PD,B-

- - U h - -0)}

and

D 2

5.7

- * - -i -*=U))}

be bases of Z (see section 5.3). Compute PD,BFind all values of a in the Gaussian integers Z[i] for which the elements b\ = [

I and 62 = I

j form a basis of Z[i] 2 .

Exercises 5.8

89

Let g(X) be a polynomial over 11 the elements b\ = (

Find all f(X)

I and hi " ( fun i > /""""-

in R[X] for which

formabasisofR x 2

[ ] -

Let R be a commutative ring and let E be the standard basis of Rn. For each 0 in Eom(Rn,Rn), put a(0) = E{0)E- Show that a is a ring isomorphism from Hom(/? n , Rn) to the ring Mn(R) of n x n matrices over .R. Let M be any free R-module with rank(M) > 1. Deduce that the ring Hom(M, M) is not commutative. 5.10 Let 0 . 0 0 \ . 0 0 a22 021

5.9

A = fl

\

fc-i,i ajti

afc- 1,2 afc2

■

•

ajt-i,fe-i

0

•

afc,fc-i

«fcfc

/

be a lower triangular matrix over a commutative ring R. Show that A is invertible if and only if all the diagonal terms 011,022, • • ■ ,afcfc a r e units in R. Hint: Det 7. Show further that if A is invertible, the inverse of A is also lower triangular.

s

Chapter 6

Quotient Modules and Cyclic Modules Let R be a ring and let 6 be an .R-module homomorphism from an R-module M to an .R-module N. In Chapter 4 we associated to 8 a submodule Ker(#) of M, the kernel of 6, which measures the failure of 6 to be injective - 0 is injective precisely when Ker(#) = 0. Our first construction in this chapter is in a way a reverse procedure. Given a submodule L of M, we find a module M/L, the quotient of M by L, and a homomorphism IT from M to M/L which has kernel L. We then use this construction to manufacture an injective homomorphism from a hon-injective homomorphism 6; the new homomorphism, 6, is the mapping from M/Ker(#) to N that is induced by 6. This construction leads us to a crucial result, the First Isomorphism Theorem, which is a very useful tool for the production of isomorphisms and hence, ultimately, for the description of modules. We illustrate this approach by examining the cyclic modules over a ring R, that is, the modules of the form M = Rx for an element x in M. We prove that a cyclic module is isomorphic to a quotient module R/I for a left ideal / of R. Thus the structure of the cyclic R-modules is determined by the nature of the ideals in R. In particular, when R is Euclidean we are able to give a complete description of the submodules of a cyclic module in terms of the factorization of the elements of R. Finally, we make a detailed analysis of the action of a polynomial vari able X on a cyclic module F[X]/F[X]f over the polynomial ring F[X], which leads to a first result on normal forms of matrices. 91

92

6.1

Chapter 6. Quotient Modules and Cyclic Modules

Quotient modules

Let R be an arbitrary ring and let L be a submodule of a left i?-module M. We construct the quotient module (sometimes called the factor module) M/L of M by L in much the same way as we constructed the residue ring R/I from a ring R and ideal / in section 1.10. Define a relation on M by the rule that m = n <=$• m — n € L. The verification that = is an equivalence relation is a matter of routine checking, similar to that performed in detail in the proof of Lemma 1.10.1. The equivalence class of an element m of M is the set m = m + L = {m + I | I G L}; we usually prefer the notation m. The quotient module M/L is defined to be the set of all such classes, with addition given by the rule m + n = m + n for m, n G M/L, and scalar multiplication given by r -m = rm for r G R and m G M/L. More routine verifications, very similar to those made in the proof of Propo sition 1.10.2, show that these operations are well-defined and make M/L into a left .R-module, with zero element 0.

6.2

The canonical homomorphism

The map 7r : M —y M/L defined by wijiz) = ui is called the canonical homo morphism from M to M/L. The fact that 7r is an .R-module homomorphism is immediate from the definition, and 7r is surjective since every element of M/L has the form m for some m in M. In Lemma 4.9.1, we showed that the kernel Ker(#) of an R-module homomorphism 6 : M —> N is a submodule of M. Now we obtain a converse. 6.2.1 L e m m a Let L be a submodule of M and let w : M —> M/L homomorphism. Then Ker(7r) = L

be the canonical

6.3. Induced

hoznomorphisms

93

Proof m e Ker(7r)

<=>

m = 0

«=> <=>

m - 0€ L m e L. D

6.3 6.3

Induced homomorphisms Induced homomorphisms

An important use of the quotient module construction is that it enables us to construct new homomorphisms from old. The new homomorphism is often an isomorphism, which observation is a key contribution to the task of describing an arbitrary module in terms of a collection of standard modules. Suppose that we are given an R-submodule L of a left R-module M and an .R-module homomorphism from M to a left .R-module N. Suppose also that L C Ker(#). We can then define a mapping 9 : M/L ->■ N by 0(m) = 9(m) for all m e M/L; 6 is called the homomorphism induced by 6, or sometimes, the induced mapping. First, we must check that 9 is actually well-defined. Suppose that m,n are elements of M with m = n in M/L. Then m = n + / for some / € L, so that e(m) = 9{m) = 9(n) + eil) = 9{n) = 9{n). Next, we note that 6 is an R-module homomorphism because, for any m, n e M/L and any r € R, we have 6~(m + n)

= 6 ■ (m + n) = 9{m + n) = 9{m) + 9{n) = 9{m) + 9(n)

and 9{r • m)

= =

9 ■ (rrn) 9{rm)

= =

r ■ 9(m) r-9(rn).

Chapter 6. Quotient Modules and Cyclic Modules

94

We summarize the basic properties of induced mappings in the following theorem. 6.3.1 The Induced Mapping Theorem Let R be an arbitrary ring and let M and N be left R-modules. Suppose that L is a suhmodule of M, that 9 : M -» N is an R-module homomorphism and that Ker(0) C L. Then the induced homomorphism 9 : M/L -> N has the following prop erties. (i) (ii) (Hi) (iv)

Ker(0) = {m \ m £ Ker(6>)} C M/L. IfKer(9) = L, then 9 is injective. If 9 is surjective, so also is 9. If Ker(0) = L and 0 is surjective, then 6 is an isomorphism.

Proof (i)

This follows from the implications m e Ker(0)

«=> <=>

9m = 0 m£ Ker(0).

(ii) If Ker(0) = L, then Kei(9) = 0 by the first part, so 9 is injective by part (ii) of Lemma 4.9.1. (hi) If 9 is surjective, each element n € N has the form n = 9{m) for some m € M, and so n = 9(m) also, showing 9 to be surjective. (iv) This is clear from the preceding results. □ The last part of this result is important enough to be stated separately as a theorem in its own right. 6.3.2 The First Isomorphism Theorem Let 9 : M —)• N be a surjective R-module homomorphism. induced homomorphism 9 : M{ Ker(#) —>• N is an isomorphism.

6.4

Then the □

Cyclic m o d u l e s

As an application of the First Isomorphism Theorem, we show how to describe cyclic modules and their submodules. Recall from section 3.8 that an .R-module M is cyclic if M = Rx for some element x in M. This is equivalent to saying that the multiplication homomorphism of section 4.3, T

= T(X) : R—> M, r(r) = rx for all r S R,

6.4. Cyclic modules

95

is a surjective .R-module homomorphism. From the definitions, the kernel of r is the left ideal Ker(r) = {r £ R | rx = 0}, which is the annihilator ideal Ann(x) of x (sections 3.5 and 4.11). Write I = Ker(r). Then, by the First Isomorphism Theorem, f is an isomorphism from R/I to M. In the other direction, suppose we are given a left ideal / of R. Then the quotient module R/I is cyclic with generator 1, since r = r ■ 1 for any reR. The left ideal I that we have associated with a cyclic module M depends on the generator x of M that we have chosen. To complete the classification of cyclic modules, we need to show that I depends only on M. In fact, we prove an apparently stronger result, that the ideal I is uniquely determined by the isomorphism class of the cyclic module. Suppose that M and N are cyclic left .R-modules and that there is an .R-module isomorphism a : M -> N. Let I and J be left ideals of R so that there are isomorphisms a : R/I -> M and (3 : R/J -> N. Then there is a composite isomorphism 7

Now, if x G J , then, in

= (j3)-laa

: R/I -► R/J.

R/J, 0 = x-j(T)

= j(x),

so x — 0 in R/I, from which x £ I. Thus J C J, and by symmetry I = J. We record our findings as a theorem. 6.4.1 Theorem Let R be a ring and let M be a cyclic left R-module. Then there is a left ideal I of R such that M = R/I as an R-module. If N is also a cyclic left R-module with N = R/J for some left ideal J, then M = N as an R-module if and only if I = J. In particular, the left ideal I is uniquely determined by the cyclic module

M.

□

96

Chapter 6. Quotient Modules and Cyclic Modules

Remarks (i) These results include the two extreme cases of cyclic modules, which we have not mentioned explicitly until now. If / = 0, the zero ideal, then R/I = R. U I = R, then R/I = 0, the zero module. (ii) Suppose that ideal / is two-sided (as is always the case when R is com mutative). Then the quotient module R/I is, in essence, the same as the residue ring R/I of section 1.10 (which is the reason that we can use the same notation for these two constructions). Both have the same rule of addition, and the scalar multiplication is related to the residue ring multiplication by the formula r -x = Tx. (iii) As we have seen in our discussion of matrix actions on vector spaces, a single additive group can usually be viewed as a module in many different ways. Any module whose underlying group A is cyclic is itself necessarily cyclic. Then A must be isomorphic to Z or to Z p for some prime p. The converse is far from true. For example, the ring of Gaussian integers Z[t] is cyclic as a module over itself, but not as a module over Z since Z[i] = Z 2 . Other examples are provided by the C[X]-modules of Exercises 3.6 and 3.7; both these modules are cyclic, but the underlying vector space C 3 is not cyclic over C, nor does it give a cyclic C[X]-module when X acts as 0.

6.5

S u b m o d u l e s of cyclic m o d u l e s

We now combine the description of cyclic modules given above with the submodule correspondence that we obtained in section 4.13 to find the submodules of a cyclic module. Let R be an arbitrary ring. We make repeated use of the observation that the left ideals of R are precisely the i?-submodules of the left regular i?-module R. Choose a left ideal I of R and write n : R —» R/I for the canonical homomorphism of left .R-modules. If P C R/I is an i?-submodule of R/I, then the inverse image of P is H = ir*{P) = {r£R\Tr(r)

£ P},

which is a left ideal of R that contains / . Conversely, if we have a left ideal H of R, the image jr»(JEf) of H is a submodule of R/I. We can therefore restate Proposition 4.13.1 as follows. 6.5.1 P r o p o s i t i o n Let I be a left ideal I of R and let M = R/I be the cyclic left R-module defined by I. Then there is a bijective correspondence between

6.5. Submodules of cyclic modules

97

(i) left ideals H of R with I Q H and (ii) submodules P of M, in which H <+ jr.(Jff) and P^iv*(P).

D For a general ring of scalars R, there is no reason why a submodule of a cyclic module should itself be cyclic. The assertion that every submodule of the left i?-module R itself is cyclic is the same as saying that every left ideal of R is principal, which is a very strong condition, although it happens to hold for Euclidean domains. (An example of a non-principal ideal domain was given in Exercise 1.6.) When R is Euclidean, we obtain a complete result. 6.5.2 Theorem Let R be a Euclidean domain and let R/I be a cyclic R-module, where the ideal I is neither 0 nor R. Then the following statements are true. (i) I = Ra, where a has a standard factorization of the form n(l)

n(Jfe)

for distinct irreducible elements Pi, ■ ■ ■ ,Pk of R and unique positive exponents n ( l ) , . . . ,n(fc). (ii) The submodules of R/Ra are cyclic modules of the form Rd/Ra where d = p™{l) ...p™ik\ (Hi) Rd/Ra = R/Rd' Proof (i)

0<m(z)
l,...,k.

where dd' = a.

By Theorem 2.5.3, the ideal I is principal, say / = Ra' where l

nil)

n(k)

a = upj v ' .. .pk

.,

, u a unit,

is a standard factorization of a' (section 2.9), and a = u _ 1 a ' is also a generator of / (Lemma 2.5.3), with the desired standard form, (ii) By the preceding result, the submodules of R/Ra are given by the ideals H of R with Ra C H C R.

Chapter 6. Quotient Modules and Cyclic Modules

98

But then H = Rd for some d, and since a € Rd, d is a divisor of a. Discarding unit factors as before, we can take d to have a factorization as claimed. (iii) Define 0 : R -> Rd/Ra by 0(r) = rd for all r € R. It is easy checked that 6 is an /^-module homomorphism, and 6 is evidently surjective. We have r £ Ker(0)

rd = 0 rd € -Ra r € fid'

so that Ker(0) = Rd'. The First Isomorphism Theorem (6.3.2) now shows

that R/Rd' 3!

fid/ito.

□

Diagrams. If the element a has a straightforward factorization, the submodules of R/Ra can be described by drawing a diagram. Suppose that a = pq, where p, q are distinct irreducible elements of the Euclidean domain R. Then the diagram is as follows. M

•

pM

qM

If a — pn, it is more convenient to turn the diagram on its side: • • • • • p""1M

,n-2 pn~2!M

p2M

pM

M

6.6. The companion

6.6

99

matrix

The companion matrix

Let F[X] be the polynomial ring over a field F, and let / = f(X) be a polynomial in F[X|. As we remarked in section 6.4, the quotient mod ule F[X]/F[X]f is the residue ring of F[X] modulo F[X]f, but with a different interpretation of the multiplication. In section 2.12, we found a canonical basis for F[X]/F[X]f as a vector space over the field F, which we can now use to find a nice matrix representatation for the action of X on F[X]/F\X]f. This is a first step towards finding normal forms of matrices, a problem we consider in depth in Chapter 13. To avoid trivialities, we suppose that / is not a constant polynomial. Since the nonzero constants are the unit polynomials (Lemma 1.4) and the quotient module is unchanged if we multiply / by any unit, we can take f = fo + fiX + f2X2

+ ■■■ + / n - i X " " 1 + Xn

to be monic, with n = deg(/) the degree of / . By (2.12), the canonical F-basis of F[X]/F[X]f 1 e

ei

is

en~l

where 1 = 1, e = X and in general el = X . The multiplication in F[X]/F[X]f is derived from the relation en = -fo-he

fn-ie71'1.

The F[X]-module structure of F[X]/F[X]f is completely determined by the action of the variable X on F[X]/F[X]f, which we now describe. By definition, X ■ g = ~X~g for any g e F[X]/F[X]f. Thus X acts as a linear transformation on the space F[X]/F[X]f, and we wish to find the matrix of this linear transformation relative to the canonical basis. It is easy to see that we have equations X X

1 e

X X

£ n --2

=e = e2 (6.1) -n —1

e™"-1 = - / o - 1 --he- _ / 2 £ 2

/n-ie n - 1

100

Chapter 6. Quotient Modules and Cyclic Modules

Thus, by the definition in section 5.9, the matrix of the linear transfor mation corresponding to X is

(o 1 0

0 • • 0 ■ • 1 ■ •

0 0 0

0 0 0

0 ■ ■ 1 0 • • 0

0 1

C(f) = 0

-/o

\

-h -h

(6.2)

— fn-2 -/n-1 j

The matrix C(f) is called the companion matrix of the polynomial / or the rational canonical block matrix associated to / , since it is a typical building block for the rational canonical matrices that we encounter later. Here are some special cases. A linear polynomial / = X — a, has C ( / ) = ( a )i a 1 x 1 matrix. Since a e F is arbitrary, we see that any method of turning F into an F[X]-module must result in a cyclic module, which fact is obvious anyway. For a quadratic polynomial / = X2 + aX + 6, we have

C(f) It is far from the case that an action of X on F2 necessarily gives a cyclic F[X]-module; a trivial example is given by allowing X to act as 0.

6.7

Cyclic modules over polynomial rings

We can reverse the above construction. Given a rational canonical block matrix C, that is, a matrix which has the form exhibited in Eq. (6.2), we work backwards to obtain a cyclic F[X]-module. First, note that it is evident that there is a unique monic polynomial / such that C = C(f). Next, let M be the F[X]-module given by X acting as C on the standard F-space F n in the usual way (section 3.3). Then the action of X on the elements of the standard basis { d , . . . , e n } is given by the equations Xei = 62, Xe2 = e 3 , . . . , Xen-\ = en Xen = -foei - / i e 2 fn-l^n

(6.3)

which mimic those in Eq. (6.1). Let 6 : M -> F[X]/F[X]f by the F-linear transformation defined by 6(ei) = l,d(e2)=e,...,

6(en) = e n - 1

6.7. Cyclic modules over polynomial rings

101

Comparing Eqs. (6.1) and (6.3), we see that 9 is an isomorphism of F[X}modules. We summarize our discussion, and a little more, as a theorem. 6.7.1 T h e o r e m Let B be an n x n matrix over a field F, and let M be Fn made into an F[X]-module with X acting as B. Then the following assertions are equivalent. (i) M is isomorphic to a cyclic F[X}-module F[X}/F\X]f for some monic polynomial f € F[X]. (ii) There is an invertible n x n matrix T so that TBT~l is a rational canonical block matrix C. When these assertions hold, C = C(f) for a unique monic polynomial f in

F[X]. Proof Before we can get started, we need to set up some notation. Given B, the action of X as B on Fn defines an F-linear transformation (3:Fn^

Fn, P(v) =Xv = Bv for v € Fn,

and the matrix of (5 is {P)E,E — B relative to the standard basis E of Fn, by the results of (5.10). On the other hand, given a monic polynomial / , the preceding dis cussion shows that the action of X on F[X]/F[X]f defines an F-linear transformation 7 : F[X]/F[X]f -► F[X]/F[X]f which has matrix (7)2 % = C ( / ) w i t r i respect to the canonical basis Z of F[X]/F[X]f. (i) => (ii): Suppose that 9 : M -> F[X]/F[X]f is an F[X]-module isomor phism. By Theorem 4.5.1, OP = 16, so that (0)Z,E(P)E,E

by Proposition 5.9.1. Put T = and

{0)Z,E\

=

(l)z,z(8)z,E

then T is invertible (Corollary 5.9.2),

TB = C(f)T as desired.

Chapter 6. Quotient Modules and Cyclic Modules

102

(ii) => (i): Given T and C, define 6 by the relation (6)Z,E = T, and let / be the monic polynomial with C = C(f). Reversing the above argument, we see that 6 is an F[X]-module homomorphism from M to F[X]/F[X]f. Finally, we show that the monic polynomial / is uniquely determined by M. If M £ F[X]/F[X]f and also M S F[X]/F[X]fc, then F [ X ] / F [ X ] / a F[X]/F[X]h, so that F [ X ] / = F[X]h by Theorem 6.4.1 and therefore

f = h.

6.8

□

Further developments

Theorem 6.5.2 holds if R is a principal ideal domain, since it depends only on the fact of unique factorization ([Cohn 1], §10.5). It can even be extended to noncommutative principal ideal domains - see Chapter 8 of [Cohn: FRTR]. The description of the action of X on a cyclic i^X]-module in terms of the companion matrix does not really depend on the fact that the multi plication in a field F is commutative. Thus it can be extended to the case that F is a division ring - that is, F is a field except that multiplication may not be commutative. (Thus every field is also a division ring; a noncommutative example of a division ring is given in Exercise 6.9.) Details of the extended result are to be found in [Cohn: FRTR], §8.4. A polynomial ring F[X] over a division ring is an example of a noncommutative Euclidean domain (see [B & K: IRM], §3.2). When the ring R is not a field or division ring, the classification of cyclic ,R[X]-modules, that is, the description of the ideals of i2[X], can be a difficult problem.

Exercises 6.1

Let R be a commutative ring and let M be an i?-module. Recall from Exercise 4.6 that every i?-module homomorphism p : R —> M is determined uniquely by x = p(l) 6 M; then p(r) = rx for all r G R. Let a be a fixed element of R. Show that M(a) = {x £ M | ax = 0} is a submodule of M. Show that if £ : R/Ra -> M is an .R-module homomorphism, then C, = p where the element x corresponding to p belongs to M(a). Show conversely that if x 6 M(a), then x gives rise to a homomorphism from R/Ra to M.

Exercises

103

Deduce that there is a bijective correspondence between the set Hom(R/Ra,M) of all R-module homomorphisms from R/Ra to M and the set M(a). Show further that this bijection is itself an isomorphism of Rmodules (see Exercise 4.1). Now take M = R/Rb for some b £ R. Verify that (R/Rb)(a)

6.2

= {r€ R/Rb \ ar e Rb}.

Hence (or otherwise) show that Hom(Z p , Z ? ) = 0 if p, q are distinct prime numbers. Compute Hom(Z p ,Z p ), Hom(Z p 2,Z p ), and Hom(Z p ,Z p 2). The Second Isomorphism Theorem. This and the following exercise give two important consequences of the Induced Mapping Theorem (6.3.1). They are given only as exercises since they are not used explicitly in these notes. Let R be any ring and let K and L be submodules of a left Rmodule M. Recall from section 3.6 that K + L = {k + l | ke K,

leL}.

Define p : K —> (K+L)/L by p(k) = k. Verify that p is a surjective homomorphism of R-modules and that Ker(p) = K C\L. Deduce that K/(Kr\L) 6.3

S {K + L)/L.

The Third Isomorphism Theorem. (In this exercise, we use notation "x" to denote the image of x in any quotient module.) Let R be any ring and let M be a left .R-module. Suppose that K C L C M is a chain of submodules of M. Show that the canonical map L : L/K —y M/K, i(l) = 1, is an injective i?-module homomorphism. Regard i as the inclusion map (that is, think of L/K as a submodule of M/K), and define a : M/K -*■ M/L by a(m) = m. Prove that a is a surjective .R-module homomorphism with Ker(a) = L/K, and deduce that there is an isomorphism of R-modules (M/K)/(L/K)

6.4

£

M/L.

This result is also known as the Idiot's Cancellation Lemma, for obvious reasons. Let a and b be nonzero elements of a Euclidean domain R. Com bining the results of Proposition 2.8.2 with the Second and Third Isomorphism Theorems, prove

104

Chapter 6. Quotient Modules and Cyclic Modules

(a) Ra/Rab S R/Rb (see Theorem 6.5.2); (b) {R/Rab)/{Ra/Rab) £ fl/ito. 6.5 Let 0 0 / 0 0 0 0 1 0 0 0 0 1

-/o

C =

0

V0

0 0

1

— fn-2

0

be a rational canonical block matrix, and put / fn-iXn~l + Xn, so that C = C(f). Let I be an n x n identity matrix. Show that det{XI

s

\

-h -h

/o + / i X + ■ • • +

-A)=f.

Remark. Thus / is the characteristic polynomial of C, which we meet again in section 9.7. Hint: use row operations - see section 5.12. 6.6 Let R be a Euclidean domain and let l,p, q be distinct irreducible elements of R. Draw up diagrams that illustrate the submodules of R/Ra (as in section 6.5) when (a) a = p; (b) a = p2; (c) a = Ipq; (d) a = p2q; (e) a = p2q2. 6.7 Let R be any ring. A composition series for a left .R-module M is a finite ascending chain 0 = M 0 C Mi c • • • C Mfc_x c Mk = M in which each quotient Mi/Mi_i, i = l,...,k, is a simple left Rmodule. Suppose that R is a Euclidean domain and that M = R/Ra is cyclic, a / 0. Find a composition series for M , and show that the set {Mi/Mi-i | i = l,...,k} corresponds bijectively to the set of irreducible factors of a (counting multiplicities). Does R itself have a composition series? Remark. If a module has a composition series, then the set of simple quotient modules associated to the series is essentially unique. This classical result is the Jordan-Holder Theorem, a proof of which can be found in [B & K: IRM] (4.1.10).

Exercises 6.8

105

Let F be a field and let R be t h e ring o f n x n matrices over F. For k = 1 , . . . , n, let Ik be t h e set of all matrices of the form / an fl2i

••• flu 0 ■•• 0 \ • ■ • a2fc 0 ■ • • 0

\ a„i

■■• a nfc

0

• •■

s

0 /

Show t h a t each 7fc is a left ideal of R, that / = I\ is a simple left .R-module and t h a t Ik+i/h = / for fc = 1 , . . . , n — 1. Deduce t h a t 0 C J i C • - • C Ife C 4 + 1 C--- C R

6.9

is a composition series of R (as a left module). T h e quaternions. Let Q be a four-dimensional vector space over t h e real numbers with basis l,i,j,k. Introduce a multiplication on Q by t h e rules t h a t 1 is t h e identity element and t h a t i2 = j

2

= k2 = — 1 and ij = k,

the multiplication being extended t o arbitrary elements of Q by distributivity a n d associativity. Much checking confirms that Q is a ring. Verify t h a t ij = —ji, so t h a t Q is not commutative. For a n element v = a ■ 1 + ai + bj + ck of Q, put T(v) = 2a and N(v) = a2 + a2 + b2 + c2. Show t h a t N(vw) = N(v)N(w) for any two elements v, w of Q and t h a t v satisfies t h e polynomial equation X2 - T{v)X

+ N{v) = 0.

Deduce t h a t Q is a division ring, t h e quaternion

algebra

s

Chapter 7

Direct Sums of Modules In this chapter, we introduce the direct sum construction, which is a very useful tool for analysing the structure of a module, and for making new modules out of old. It comes in two varieties - internal and external. Internal direct sums arise when we wish to express a given module in terms of its submodules, this decomposition being "internal" to the mod ule. The ultimate aim of this approach to module structure is to describe the modules that cannot be expressed as a direct sum - these are the "in decomposable" modules - and then to show how a general module can be assembled from indecomposable component submodules. The other version of the direct sum construction arises when we wish to find a module that contains a given set of modules as components. There is no reason why two modules should both appear naturally as submodules of a third module, so we need an "external" construction for the larger module. At the end of the chapter, we show that the two types of direct sum are, to an extent, interchangeable, and we give an interpretation of a classical result from number theory, the Chinese Remainder Theorem, in terms of direct sums. The definitions and the formal properties of direct sums are valid for modules over any ring, but our illustrations and applications require that the ring of scalars is a Euclidean domain.

7.1

Internal direct sums

To begin, we consider the simplest examples of direct sums, in which there are only two components. 107

108

Chapter 7. Direct Sums of Modules

Let M be a left ^-module, where R is an arbitrary ring, and let L and TV be submodules of M. Then M is the internal direct sum of L and TV if the following conditions hold. IDSM 1: L + N = M; IDSM 2: L n TV = 0. The notation M = L © TV indicates that M is the internal direct sum of its submodules L, TV, which are then called the components or summands of M. We also say that TV is the complement of L in M, and vice versa. When we have expressed a module M as a direct sum of two (or more) components, we sometimes say that we have decomposed M or that we have found a decomposition of M. Before giving any examples, we reformulate the definition in a useful way. 7.1.1 Proposition Let L and TV be submodules of a left R-module M. Then the following assertions are equivalent. (i) M = L®N. (ii) Let m g M. Then there are unique elements I € L and n € TV with m = I + n. Proof (i) =>• (ii): Suppose that m € M is given. Since M = L + N, we have m = I + n for some I G L, n € N. If also m = V + n! with /' € L, n' 6 N, then l — l' = n' — n belongs to LC\N, which is 0. Thus I and n are uniquely determined by m. (ii) => (i): The fact that there are elements / and n with m = I + n for each m in M shows that M = L + N. Suppose that x 6 LflJV. Then £ = x + 0 with i e i and 0 s TV, and also a: = 0 + x with 0 € L and x € TV. By uniqueness, we must have x = 0. □ Comments & examples. (i) The order of the components is not important; if M = L © TV, then equally M = TV @L. (ii) We allow trivial direct sums in which one component L or TV is the zero module 0; the other component must then be equal to M. (iii) Let F be a field and let V = F2, the two-dimensional vector space over F. The standard basis {ei,e2} leads to internal direct sum decomposition V = U © W of V in which U = Fe\ and W = Fe^. More generally, any basis {fi, f2} of V gives V = Ff\ © Ff2 - see Lemma 5.2.1.

7.2. A diagrammatic

interpretation

109

The above comments hold when the field F is replaced by any ring R. (iv) The previous example illustrates that a module can be expressed as a direct sum in many ways, since a vector space has many bases. It also shows that the choice of one component of a direct sum need not determine the other. To see this, consider the bases of the form { e i , / ( a ) } with / ( a ) = I °. ), where a <E F is arbitrary. Then Ff(a)

^

Ff(b) if a ^ b, and V = Fex © Ff(a) for every choice of a. (v) Let_M = Z 6 , considered as a Z-module. Then 2M = {0,2,4} and 3M = {0, 3} are submodules of M with 2M n 3M = 0. The fact that M = 2M + 3M follows either by direct calculation or, more intellectually, from the identity 1 = 2 - 2 - 3 . In contrast to the previous example, 2M and 3M are the only nontrivial summands of M. This example foreshadows a general technique for constructing de compositions of modules over a Euclidean domain. (vi) Let F be a field, let A — f

I be a 2 x 2 diagonal matrix over F,

and let M be the F[X]-module defined by X acting as A on F2. Then the subspaces L = Fei and N = Fe^ are F[X]-submodules of M, the action of X on L being given by multiplication by b and on N by multiplication by c. We have M = L © N. Notice that we already know that M is the direct sum of L and N as a vector space over the field F; the real meat of this example lies in the fact that L and N are invariant under the action of X. (vii) More generally, suppose that ^ 4 = 1

n

^

l i s a f c x f c block matrix

over F, with diagonal blocks of sizes r x r and s x s respectively. (Hence r + s = k.) Let M be the .F[X]-module defined by X acting as A on Fk, and let L = Fe\ + - ■ -+Fer and TV = Fe r + iH hFek, where, as usual, e\,..., efc is the standard basis of Fk. Both L and N are .F[X]-submodules of M, the action of X on L being given by the matrix B and on N by C. Then M = L © N. (A more general version of this construction is given in section 7.5 below.)

7.2

A diagrammatic interpretation

If L and N are arbitrary submodules of a right .R-module M, there is no reason why we should h@e^^-)^|fffeg MaNeriaM or I fl JV = 0. The config-

Chapter 7. Direct Sums of Modules

110

uration of the four submodules LnN,L,N by the diagram M •

and L + N can be represented

L+N

N

LDN

0

On the other hand, when M is the direct sum of L and N, the equations Lf) N = 0 and L + N = M lead to the simpler diagram M

L

•

•

N

0

•

7.3. Indecomposable

modules

111

(If one of the submodules L, N is zero, the diagrams become even simpler.)

7.3

Indecomposable modules

A module is called indecomposable if it is nonzero, and if it cannot be expressed as an internal direct sum except trivially. Thus, when M is indecomposable, any expression M = L ® N must have either L — 0 or N = 0. Recall from section 3.5 that a simple module is a nonzero module that has no proper submodules. Clearly, a simple module must be indecom posable. The reverse is far from true. For instance, let R be a Euclidean domain considered as the regular .R-module. Since a submodule of R is an ideal (section 3.5) and any ideal is principal (Theorem 2.5.3), the nonzero submodules of R are the principal ideals Ra with a ^ 0. An intersection Ra (~l Rb of nonzero ideals contains the product Rab, which is again nonzero if a and b are nonzero. Thus R ^ Ra © Rb for any nontrivial choice of a, b. An example of a different kind occurs in Example 3.11. There, we con structed an F[A"]-module M that has only one proper nonzero submodule, which means that M must be indecomposable. We record a more general version of the above argument for future reference. 7.3.1 Theorem Let p be an irreducible element in a Euclidean domain R. Then the cyclic R-module R/Rpn is indecomposable for any integer n > 1. Proof By Theorem 6.5.2, we know that the nontrivial submodules of R/Rpn have the form Rpl/Rpn for i = 1 , . . . , n — 1. Thus every proper submodule of R/Rpn is contained in Rp/Rpn, from which we see that no two proper nonzero submodules L and N can satisfy the requirement that L + N = R/Rpn. (A diagram of the submodules of R/Rpn is given in section 6.5.)

□ Remark. It turns out that the only indecomposable modules over a Eu clidean domain are those of the form R or R/Rpn for an irreducible element p of R and positive integer n. In Corollary 7.8.2 below, we confirm this fact for cyclic modules, but we are some distance from being able to prove it without the prior information that the module is cyclic. Our next task is to consider internal direct sums with more than two components.

Chapter 7. Direct Sums of Modules

112

7.4

Many components

Let R be any ring and suppose that Li,...,Lk are /?-submodules of a left .R-module M. Then M is the internal direct sum of L\s... , Lfc if the following hold. IDSMkl: Li + --- + Lh = M; IDSMk 2: Li n (Li + ■ • • + U-i + Li+1 +

h Lk) = 0 for i = 1 , . . . , k.

The notation for such an internal direct sum is M = L\ © • • • © Lfc. The submodules Li, i = 1,... ,k are the components or summands of M, and the complement of L* is the submodule Lj = Li + • ■ • + Li_i + Li+i +

h Lfc.

The order of the terms is unimportant, and we allow the possibility that some components are the zero module. It is convenient to allow the trivial cases k = 0, where M = 0, and k = I, where M = L\. For k = 2 we regain the definition of the direct sum of two submodules. Notice also that if M = L\ © • • • © Lfc, then M = Li® Li for each i. Internal direct sums with many components will occur frequently later in these notes. An immediate example is provided by the standard left free module Rk. For any basis {bi,..., bt} of Rk, Rk = Rh © ■ • • © Rbk by Lemma 5.2.1. The proof of the following useful but straightforward extension of Propo sition 7.1.1 is left to the reader. 7.4.1 Proposition Let L i , . . . , Lfc be submodules of a left R-module M. Then the following assertions are equivalent. (i) M = Li © • • • © Lfc; (ii) Let m € M. Then there are unique elements h £ Li, . . . ,/fc € Lfc with m = li + ■ ■■ +/fc.

□

7.5. Block diagonal actions

7.5

113

Block diagonal actions

Let F be a field and let / D1 0 D = \

0

0 D2 0

0

Dk J

be a block diagonal matrix over F, with k blocks on the diagonal, and suppose that D is an s x s matrix. The action of D on F" defines an F[X\module M which can be expressed very naturally as an internal direct sum. Since this type of decomposition is very important in future applications, particularly in Chapter 13, we give the full details, although they appear rather gruesome at first sight. As a first illustration, we consider the case that 0 d2

\

0 0

0

ds J

/ di

D = \

is an s x s diagonal matrix over F (so that k = s). Put Li = Fei for i = 1 , . . . , s. As mentioned in the previous section, we already know that M = L\ © • • • © Ls as a vector space over F. However, each Li is also an F[JV]-submodule of M, with X acting as di, and so we have a decomposition of M as an internal direct sum of F[X]-submodules. For a general block diagonal matrix D as above, we have to overcome some notational complications to describe F-bases of the components Li of M. Suppose that the i-th block Di is an n(i) x n(i) matrix, where n(i) > 1 is an integer. Since D is an s x s matrix, we have s = n(l) H

\-n(k).

Let { e i , . . . , es} be the standard basis of Fs, and put L\ =Fei H hFe„(i), L>2 = JFe n (i) + i + 1- Fe n (i) + n ( 2 ), Li = F e n ( 1 ) + . . . + n ( j _ i ) + i + ■ • ■ + i ? e n ( 1 ) + . . . + n ( i _ i ) + n ( i ) ,

Lk = ■f1en(l) + --+n(fc-l) + l +

•" Fen(i)

+

...+n(k-l)+n(k)-

Chapter 7. Direct Sums of Modules

114

Because of the block form of D, each Ll is an F[X]-submodule of M on which X acts as LV To see this explicitly, write out the submatrix Di as Di = (dui) where 1 < u,v < n(i). The (u,v)-entry of Di is the n(l) + h n(i — 1) + u, n(l) H + n(i - 1) + v-entry of the matrix D, and the remaining entries in row n(l) + • ■ ■ + n(i — 1) + u of D must all be zero, since they lie outside £>,. Thus (i) ^" e n(l)+ •+n(i-l)+t) = ^ l , v e n ( l ) + - + n ( i - l ) + lH

(i) l-^ n ( i )^e n ( 1 ) + ... + n ( i _i) + n (i)

for v = 1 , . . . ,n{i). It follows that M = Li © ■ • ■ © Lfc as an F[X]-module, as desired.

7.6

External direct sums

When we express a module as an internal direct sum, we break it down into component submodules. The construction of an external direct sum solves the converse problem: given a collection of modules, find a larger module that has the given modules as its direct summands. There is a complication in building up a module from specified com ponents, since, given an arbitrary pair of i?-modules L and ./V, there is no reason why there should be any module M that contains both L and TV as submodules. For instance, there is no obvious candidate for a Z-module that contains both Z and Z2. To escape from this difficulty, we must be content with a construction that causes the given modules to be replaced by isomorphic modules. Suppose that P\,... ,Pk are left .R-modules over some ring R. Recall from elementary set theory that the Cartesian product of P\,... ,Pk is the set

Px x ••• xP fc = {(pi,...,pk)

I PI e Pi,...,p f c e Ffc},

where (Pi, •• -,Pk) = (pi,- • • ,Pfc) «=> Pi = Pi, ■ • • ,Pfc = p'fc. Then the external direct product of P i , . . . , Pfc is the set Pi x • • ■ x Pfc made into a module by the rules (pi,...,Pfe) + (p' 1 ,... 1 p' fc ) = (pi + p ' i , . . . , Pfc + p'k) and r- (pi,.-.,Pit) = (rpi,...,rp f c ), where Pi,pi G P i , . . . ,Pfc,p'fc £ Pfc and r G R. An easy but long-winded verification of the axioms (3.1) confirms that Pi x • • ■ x Pfc is an R-module.

7.7. Switching between internal & external

115

In the case that P\ = ■ ■ ■ = Pk = R, the external direct product is simply the standard free left module Rk as defined in section 5.1, except that its members appear as rows rather than columns. On ordering. The ordering of the factors is important in an external direct sum, as opposed to the situation with an internal direct sum, where it does not concern us. The reason for our concern is that the same module may appear more than once as a component of an external direct sum, and so we must rely on the order of terms to distiguish elements that have the same unordered set of entries. For example, we must not confuse (0,1) with (1,0) in R2. (See also Exercise 7.5 below.) The axioms for an internal direct sum rule out any repetition of summands, save for 0 terms, since the different components can only have zero intersection.

7.7

Switching between internal & external

Next we see how an external direct sum can be rewritten as an internal direct sum, and vice versa. Suppose that M = Pi x ■ • ■ x Pfc, and define Li = {(pi,0,

,0,0) | px 6 Pi)

U = {(o,...,o, P l ,o,...,o)

\PlePt}

£fc = {(o,o,

\pkePk}.

,o,pfc)

Then each Li is a submodule of M. Since (pi, ■ • • ,Pk) = (Pi, 0, ■ • •, 0) + • ■ • + ( 0 , . . . , O.pjt) for any element of M, we see that M = Lx + --- + Lk. The other requirement for an internal direct sum, that Li n (Lj H

h Li_! + Li+i + ■ ■ ■ + Lk) = 0 for i = 1 , . . . , k,

is satisfied since the entries of a member of the Cartesian product are uniquely determined. Thus the axioms in section 7.4 hold, and we can write M = Li © ■ ■ ■ ® Lk.

Chapter 7. Direct Sums of Modules

116

Although the original modules Pi are not themselves contained in the external direct sum, each is isomorphic to its corresponding submodule Li by the i?-module isomorphism 0i : Pt -> Lh

6{Pi) = ( 0 , . . . , 0 , P i , 0 , . . . ,0).

In the reverse direction, any internal direct sum is isomorphic to an external direct sum. To see this, suppose that M = L\ © ■ • ■ © Lk is an internal direct sum, and put P — L\ X • • • x Lk, the external direct sum of the submodules Li of M. Since each element m of M can be written in the form m = l\ H— ■ + h where the elements l\ e L\,...,lk £ Lk are uniquely determined by m (Proposition 7.4.1), there is a well-defined map ip : M -> P given by i/;(m) = (ii,...,/fc). An easy verification shows that V is a n isomorphism of Rmodules. We summarize the preceding discussion as a proposition. 7.7.1 Proposition The following assertions hold. (i) If a left R-module M can be expressed as an internal direct sum M = L\ © • ■ ■ ©Lfc, then M is isomorphic to the external direct sum Li x ■■• x Lk.

(ii) If P = P\ x ■ • • x Pk is an external direct sum of left R-modules, then P is an internal direct sum P = L\ © • ■ • © Lk of submodules L\,..., Lk with Li = Pi for i = 1 , . . . ,k.

□ 7.8

The Chinese Remainder Theorem

Our aim now is to show how a classical result from number theory form, namely, the Chinese Remainder Theorem, leads to direct sum decomposi tions of cyclic modules over Euclidean domains. In its most familiar form, the theorem reads as follows. Suppose we are given a pair of coprime positive integers m and n, and an arbitrary pair of integers y, z. Then there is an integer x which satisfies both the congruences x = y mod m and x = z mod n. We reformulate this assertion in the language of rings and modules. First, notice that the congruence x = y mod z is equivalent to the equality x = y in the cyclic Z-module Z m (see 1.11), and similarly x = z mod n means that x = z in ^© 0 n(^feft^l?^f^^-/| n e a n i n S °f t n e notation "x"

7.8. The Chinese Remainder

Theorem

117

to vary according to context, which is more convenient than introducing several notations for residue classes.) Next, observe that there is a canonical Z-module homomorphism Q : Z —► Z m x Z n , a(x) = (x,x). Thus the Chinese Remainder Theorem asserts that a is a surjection. This algebraic formulation prompts us to ask for the kernel of a, which, as we shall see, is the ideal mnZ. By Theorem 6.3.2, we then have an isomorphism which identifies the direct sum Z m cyclic module. In classical language, the interpretation of the fact that a is an isomor phism is that the integer x is unique modulo mn. With this preamble, we now give the proof of the algebraic form of the Chinese Remainder Theorem, working over an arbitrary Euclidean domain rather than the integers. 7.8.1 The Chinese Remainder Theorem Let R be a Euclidean domain and let b and c be coprime elements of R. Then the canonical R-module homomorphism a: R-> R/Rb x R/Rc is a surjection, with Ker(a) = Rbc. Furthermore, there is an induced isomorphism a : R/Rbc -s- R/Rb x R/Rc of R-modules. Proof Denoting residue classes in either R/Rb or R/Rc by x, the map a is given by a(x) = (x,x), which is a homomorphism since it is composed of two canonical homomorphisms. Now suppose we are given an element (y, z) in R/Rb x R/Rc. Since b and c are coprime, we can write 1 = sb + tc for some elements s,t of R (see Lemma 2.8.1). Put x = zsb + ytc. Then x = ytc = y mod b, so that x = y € R/Rb, and similarly x = z £ R/Rc. Thus a is surjective. Clearly, x e Ker(a) if and only if x is divisible by both b and c, which means (Lemma 2.8.1 again) that x is divisible by be, that is, x £ Rbc. The final assertion follows from the First Isomorphism Theorem 6.3.2.

□

118

Chapter 7. Direct Sums of Modules

Remark. At this point, the reader might expect to find a description of the corresponding internal direct sum decomposition of R/Rbc. However, this description requires some machinery that we are going to develop in a more general setting in the next chapter, so we postpone a statement until more tools are at our disposal - see Corollary 8.3.2. 7.8.2 Corollary Let M = R/Ra be a cyclic module over a Euclidean domain R. Then M is indecomposable if and only if a = upn, where p is an irreducible element of R, u is a unit of R and n > 1. Proof Suppose that M is indecomposable. By the preceding result, a can not have two nontrivial coprime factors, and so the irreducible factorization can involve only one irreducible element of R. Thus a = upn as claimed. Conversely, if a = upn, then M = R/Rpn is indecomposable by Theorem 7.3.1.

□ Exercises 7.1

Let I, p and q be distinct irreducible elements of a Euclidean domain R. Using the diagrams you found in Exercise 6.6, suggest internal direct sum decompositions of the modules (a) R/Rlpq; (b) R/p2qR: (c) R/p2q2R. (The next chapter contains a systematic method for obtaining such decompositions.) / 0 1 0 7.2 Let N be the C[X]-module given by the matrix S = 0 0 1 \ 1 0 0 ) acting on C 3 , which we considered in Exercise 3.7. Find a direct sum decomposition of N into three one-dimensional components, and show that this decomposition is unique. Investigate what happens if the field of complex numbers is re placed by the real numbers K, or by the finite fields Z 2 , Z 3 , or Z 7 . 7.3 Let M i , . . . , Mfc be a set of i?-modules, and let Li be a submodule of Mi for i = l,...,fc. Show that there is an .R-module isomorphism (Mi x • • • x Mfc)/(.Li x ■ • • x Lfe) = M i / L j x ■ • • x

Mk/Lk.

Exercises 7.4

119

Let D be the ring of 2 x 2 diagonal matrices over a field F, and let e = e n and / = e 22 . Using Exercise 1.7, show that D = De © £>/ as a left P-module. Prove that the only D-module homomorphism from De to Df is the zero homomorphism and hence that the /^-modules De and Df are not isomorphic. (This contrasts to part (e) of Exercise 7.8.) Generalize these results to the ring o f n x n diagonal matrices over F. Let Pi x Pi be an external direct sum of P-modules. Define LJ : Pi x P2 —» P 2 x Pi by u){pi,p2) = (p2,Pi)- Show that UJ is an isomorphism of P-modules. Given a set of modules { P i , . . . , Pk), k > 2 and any permutation a of the integers 1 , . . . , k, prove that

s

7.5

Pi x ■•• x P f c ^ P Q ( i ) x ••• xP Q ( f c ) . 7.6

This exercise and the next anticipate some results that are developed further in section 14.2. Let R be a ring and let M be a left P-module with M = L © TV. (a) Show that the canonical homomorphism IT : M —> M/L induces an isomorphism r : N —> M/L. (b) Let L = inc : N -> M be the inclusion map and let a=

{T)-1TT:

M -» M / L - >

TV.

Show that at = idw, the identity map on TV. (c) Conversely, suppose that there is a left P-module Q, and homomorphisms a : M —> Q and e : Q —> M with ae = id,M- Verify that a is surjective. Show that for any m G M, we have m — etr(m) e Ker(
Let P = Pi x P 2 be an external direct sum of P-modules. Write idi for the identity map on Pj, i = 1,2 and define maps as follows: 7ri : P->- Pi, 7ri(pi,p 2 ) =Pi 7r 2 :P->-P2, 7r 2 (pi,p 2 ) =P2 ei.Pi^P,

ei(pi) = (pi,0)

£2 : P2 -> P

«2(P2) = (0,p 2 )-

s

Chapter 7. Direct Sums of Modules

120

Verify that these are all homomorphisms, and that the relations TTiCi = idi,

7r2£2 = id-2, 7T2£l = 0, 7TXE2 = 0

and £l7Ti + C27T2 =

s 7-8

idp

hold. Conversely, suppose we are given a collection of modules P, Pi, P2 and homomorphisms as above. Show that P = P\ x P%. Generalize this exercise from 2 to k terms. Let R be the ring of all n x n matrices over a field F. For each pair of integers i, j = 1 , . . . ,n, let Cy be the matrix with entry 1 in the (i,j)-th place and all other entries 0. The set of all such matrices is sometimes called a set of standard matrix units for R (despite the fact that the matrices e^ are not units of R). (a) Show that the set of standard matrix units is a basis for R as a vector space over F. (b) Prove that _ j ehk if i = j ehi.eik - j 0 ii i^j ■ (c) For each i, j , let Ij = Reij. Deduce that Ij is the set of all matrices A = ( 0 , . . . , 0, a,-, 0 , . . . , 0), where the j - t h column Oj is an arbitrary vector in the column space Fn and all other columns of A are zero vectors. (Thus Ij does not depend on the value of i). (d) Show that R = I a left i?-module. (e) For each pair of suffices j , k, define 6jk : Ij -> Ik by 9jk(x) = xejkVerify that 6jk is an isomorphism of left i?-modules. (f) Show that R has no two-sided ideals apart from 0 and itself. (This result generalizes Exercise 1.9.)

7.9

s

Direct products of rings. Exercise 7.8 can be generalized by introducing the direct product of a set of rings. This is the construction for rings that corresponds to the direct sum for modules. Let Ri,..., R^ be a set of rings, and let R = Ri x • ■ ■ x Rk be the external direct sum of Ri,..., Rk as additive groups. Define the product by ( r i , . . . , r f c ) ( s 1 , . . . , s f c ) = (risi,...,r f c s f c ) and confirm that R is a ring, with identity element IJI = ( l i , . . . , lfc) where lj is the identity element of Rt.

Exercises

121

Show that R is commutative if and only if each Ri is commutative, but that R is not a domain (save in trivial cases). For each i, let U = {(0,... , 0 , r , , 0 , . . . ,0) | r* e Rt}. Show that each h is a two-sided ideal of R and that R = I\ © ■ ■ ■ © Ik as an additive group. Conversely, suppose that we have a set I\,..., Ik of two-sided ide als of R and that R — Ix ffi ■ ■ • © Ifc as an additive group. Write 1 = ej H 1- efc with e* in / j . Show that e\ = e» and that e ^ = 0 if i ^ j . (So e i , . . . , efc is a set of orthogonal idempotents for i?.) Verify that /j = Rti for each i, that Ii is a ring with identity ei, and that i? = I\ x • • • x Ik as a ring. Show also that the only .R-module homomorphism from Ii to Ij is 0 if i / j . 7.10 Infinite direct s u m s . Infinite direct sums of modules are a useful source of counterex- O amples. We also need them for our treatment of projective modules in Chapter 14. Here, we sketch the definition. First, we must define an ordered set. This is a set / so that, for any two distinct elements i, j of I, either i < j or j < i, but not both. We require also that ii i < j and j < k, then i < k. The finite sets { 1 , . . . , n} of integers are ordered in the obvious way, which is why we can avoid any explicit mention of ordered sets in the main part of this text. Let R be any ring and let {Mi \ i € / } be an infinite set of left .R-modules indexed by an ordered set I. The external direct sum M = 0 7 Mi of these modules is defined to be the set of all infinite sequences m = (m,i), rrii £ Mi for all i which satisfy the restriction that only a finite number of terms m^ of m can be nonzero. Thus, there is an index s(m), which depends on m, so that mi = 0 for all i > s(m). (The fact that the order is important in an external direct sum explains why J must be ordered.) Define addition and scalar multiplication in M by the obvious analogy with the finite case. Verify that M is an i?-module and that Mi is isomorphic to a submodule of M. If we take Mi = R, we obtain the free module R1. For each i E I, let ei be the element of R1 that has entry i in the i-th place and zeroes elsewhere. Verify that {e^ \ i £ 1} is a basis of R1, the standard basis - you will need to formulate the generalization of "basis" to infinite sets.

122

Chapter 7. Direct Sums of Modules

C 7.11 Let R be the set of all infinite matrices A = (aij) over a field F, with rows and columns indexed by the positive integers, subject to the condition that each row and column of A has only a finite number of nonzero entries. Verify that R is a ring under the expected rules of addition and multiplication. Find a set of left ideals I\s... so that R is an internal direct sum 0 Ii as a left .R-module. Let rnR be the set of all matrices A in R that have only a finite number of nonzero entries. Show that mR is a two-sided ideal of R. (This provides a contrast to Exercise 7.8 above.)

Chapter 8

Torsion and the Primary Decomposition Now that we have the language of direct sums at our disposal, we can start the task of expressing a general module M over a Euclidean domain R as a direct sum of simpler submodules. The first step is to isolate a submodule T{M) of M, the torsion submodule of M, which is in a sense the "non-free" part of M. In a later chapter, we shall see that M = T(M) © P with P a free module. If a module is equal to its torsion submodule, that is, it has no free component, then the module is called a torsion module. We will show that a finitely generated torsion module can be annihilated by a nonzero element a of the ring of scalars R. If a can be chosen to be a power p " of an irreducible element of R, then the torsion module is called a p-primary module. The main result in this chapter is that any finitely generated torsion module can be decomposed into a direct sum of p-primary submodules, one for each irreducible divisor of o. An immediate consequence of this result is that a cyclic module can be decomposed into a direct sum of p-primary indecomposable cyclic submodules. It will take us several more chapters before we can obtain the corresponding result for non-cyclic modules in section 12.5. In this chapter we take the ring of scalars R to be a Euclidean do main, apart from some preliminary definitions that need R to be only a commutative domain.

123

124

8.1

Chapter 8. Torsion and the Primary

Decomposition

Torsion elements and modules

Let R be a commutative domain and let M be an .R-module. By definition (section 4.11), the annihilator of an element m € M is the ideal Ann(m) = {r £ R | rm = 0} C R. An element m £ M is said to be a torsion element of M if Ann(m) ^ 0, that is, there is some nonzero element a e R with am = 0. The zero element of any module is always a torsion element, since Ann(0) = R (remember that we insist that a domain is a nonzero ring). Thus we say that a module is torsion-free if the only torsion element in M is 0. At the other extreme, a module is a torsion module if all its elements are torsion. An example of a torsion-free module is provided by the ring R itself since R is a domain, the equation ar = 0 has no solution apart from the trivial ones with either a = 0 or r = 0. More generally, the standard free modules Rk are all torsion-free. On the other hand, any cyclic module of the form R/I with J / 0 is torsion, since / ■ {R/I) = 0. The zero module 0 is allowed to be both torsion and torsion-free. Some of our results have trivial exceptional cases caused by the presence of su perfluous zero submodules or summands; we shall ignore these. Notice that the definition of torsion depends on the coefficient ring R. For example, a field F is always torsion-free when considered to be an Fmodule, that is, a one-dimensional space over itself. On the other hand, if F is viewed as a module over the polynomial ring F[X] with X acting as A for some A in F, then every element of F is annihilated by X — A and so F is a torsion F[X]-module. To handle modules which may contain both torsion and non-torsion elements, we introduce the torsion submodule T(M) of M: T(M) = {m € M | m is torsion}, which consists of all the torsion elements in M. Thus M is torsion-free if and only if T{M) = 0, while M is a torsion module precisely when M = T(M). The next result justifies the use of the word "submodule", and gives an important property of T(M).

8.2. Annihilators of modules

125

8.1.1 Proposition (i) T(M) is a submodule of M. (ii) M/T(M) is torsion-free. Proof (i) Suppose that m,n e T(M). Then am = 0 and bn = 0 for nonzero elements a, b of R. Since R is a domain, ab ^ 0, and clearly ab(m + n) = 0. Thus m + ne T(M). If r e R, then n(rm) = r(am) = 0, so rm € T(M), confirming that T(M) is a submodule. (ii) Let x € T{M/T(M)), so that there is a nonzero element b of R with 6a; = 0. By definition of the quotient module (section 6.1), x = m for some m in M. Since the zero element of M/T{M) is 0, we have bra = bx = 0, and hence bra 6 T{M). But then a{bm) — 0 for some a ^ 0 in .R; since ab ^ 0 and a6m = 0, we have m € T(M) and so a; = 0 in M/T(M). U An example. Here is an example of a module which is neither torsion nor torsion-free. Let n be a nonzero positive integer and put M = Z x Z n , the external direct sum of Z-modules. To describe M as an internal direct sum, take L = {(y, 0) | y e Z} and N = {(0, z) \ z £ Zn}. As in section 7.7, we have M = L®N. Let x = (y, z) G Af, and suppose that a ^ 0 in Z and that ax = 0. Then aj/ = 0 and az = 0, which means that y = 0 and that z = w for some w 6 Z with Sw = 0, that is, aw; C nZ. Thus T(M) C TV; but it is clear that we have the equality T(M) = TV since nz = 0 for all z in Z n . The quotient module M/T{M) can be identified with L = Z by using the First Isomorphism Theorem as in Exercise 7.6.

8.2

Annihilators of modules

Next we extend the definition of annihilators from elements to modules. Let M be an i?-module. The annihilator of M is Ann(M) = {r e R\rm

= 0 for all m € M } .

An easy verification shows that Ann(M) is an ideal of R. When the module M is cyclic, say M = Rx, then Ann(M) = Ann(x), so an annihilator of an element is a special case of an annihilator of a module. If the module M is not torsion, then Ann(M) = 0, since there is some element m of M which is not in T(M) and so Ann(M) C Ann(m) = 0. In the other direction, it is possible for the annihilator of a torsion module

126

Chapter 8. Torsion and the Primary

Decomposition

to be 0 - see Exercise 8.6 below. However, the next result shows that this does not happen in the cases of most interest to us. 8.2.1 Proposition Suppose that M is a finitely generated R-module. Then M is a torsion R-module if and only if Ann(M) ^ 0. Proof Suppose first that M is torsion. Since M is finitely generated, we have M = Rx\ + ■ ■ ■ + Rx3 for a finite set of generators x\,.. .,xs of M. Since M is torsion, each generator Xi has a nonzero annihilator ideal, so we can choose a nonzero element a* of Ann(xj) for each i. Put a = ai ... as. Then a ^ 0, and axi = 0 for all i, which implies that a{r\X\ + ■ ■ ■ + rsxs) = 0 for any element of M. Thus a G Ann(M). The converse argument is obvious. □ 8.2.2 Corollary Let F be a field, let A be an n x n matrix over F and suppose that M is the F[X]-module obtained from Fn with X acting as A. Then M is a torsion F[X)-module. Proof Since the vector space of all n x n matrices over F has dimension n 2 , the n 2 + 1 powers I, A,..., An must be linearly dependent. Thus there is a nonzero polynomial g(X) of degree at most n 2 with g(A) = 0. It follows that g(X) is a nonzero element of Ann(M). □ Remark: the annihilator of M actually contains a polynomial of degree n, namely, the characteristic polynomial of A - see Exercise 9.2.

8.3

Primary modules

For the remainder of this chapter, we shall assume that the coefficient ring R is a Euclidean domain. We know from section 2.9 that a nonzero element a of R has a standard factorization n(l)

a = upi

n(fc)

.. .pkK '

where u is a unit of R, Px,-.-,Pk are distinct (that is, nonassociated) ir reducible elements of R, and n ( l ) , . . . ,n(k) are positive integers. The set Pi,...,Pk is uniquely determined by the element o, apart from its order ing, and the integers n ( l ) , . . . ,n(fc) are unique once an ordering of the irreducible factors has been chosen.

8.3. Primary modules

127

Let M be a finitely generated torsion fi-module and write Ann(M) = Ra. Then M is said to be a p-primary module if there is a single irreducible element p of R with a = upn for some unit u of R\ in this case, -Ra = Rpn, so we can omit the unit u. A cyclic module of the form R/Rpn is evidently p-primary, with annihilator Rpn, and we will eventually prove that any p-primary module is a direct sum of such cyclic modules with varying exponents n. For the moment, we will show that any torsion module can be expressed as a direct sum of primary components. The following lemma provides the key tool. 8.3.1 L e m m a Suppose that R is a Euclidean domain and that M is an R-module with Ann(M) = Ra, a ^ O . Suppose also that a = be for coprime elements b,c in R. Then the following hold. (i) M = bM ® cM, where bM = {bm | m e M) and cM = {cm | m £ M}. (it) Ann(WW) = Re and Ann(cM) = Rb. Proof (i) Since b and c are coprime, we have 1 = xb + yc for some elements x, y of R. Then for any m in M, m = 1 ■ m = b{xm) + c(ym), and so M = bM + cM. If m e bM D cM, we have m = bl = en for some l,n € M and hence cm = cbl = al = 0 and similarly bm = 0. Expanding 1 • m again, it follows that m = 0, which gives bM D cM = 0 and hence M = bM © cM. (ii) Suppose that x G Ann(6M). Then xbm = 0 for all m in M, and so xb £ Ann(M) = Rbc. Since R is a domain, this means that x € i?c, which shows that Ann(WW) C Re. But clearly Rc(bM) = 0, so that Ann(fcM) = Re. The other equality follows by symmetry. □ Combining the above result with Theorem 6.5.2, we obtain the "inter nal" version of the Chinese Remainder Theorem (7.8.1) that was promised in the preceding chapter.

128

Chapter 8. Torsion and the Primary

Decomposition

8.3.2 Corollary Let M = R/Rbc be a cyclic R-module where b and c are coprime ele ments in R. Then M = bM®cM with bM * R/Rc

and cM £* R/Rb.

□ 8.4

The p-primary component

Let M be a finitely generated torsion .R-module, where R is a Euclidean domain, and let p be an irreducible element in R. The p-primary submodule or component of M is TP(M) = {m e M | p r m = 0 for some r > 1}. An argument similar to that in the proof of Proposition 8.1.1 shows that TP(M) is a submodule of M. It is not quite obvious that a module is p-primary if and only if M = TP(M) - there is a slight problem with the question of whether or not a submodule of a finitely generated module need itself be finitely generated. This is in fact always the case when R is Euclidean, but we have to wait until Theorem 9.4.2 is available before we can use this fact. For a similar reason, we have not defined the p-primary submodule of an arbitrary finitely generated .R-module - we do not know yet that the torsion submodule is again finitely generated. The next result shows how the nontrivial p-primary components of a module are determined by its annihilator. 8.4.1 Theorem Suppose that R is a Euclidean domain and that M is a finitely generated torsion R-module with Ann(M) = Ra, a ^ 0. Let p be an irreducible element of R and write a = ppn with p coprime to p. (i) If p is not a factor of a (so that a = p), then TP(M) = 0. (ii) In general, the p-primary component Tp(M) of M is pM. (Hi) There is a direct sum decomposition M =

pM®pnM.

8.4. The p-primary

component

129

Proof (i) Since p does not divide a, the elements a and pn are coprime in R for any n > 1. Thus 1 = xa + yp n for x,y e R, and m = m • 1 = 0 for m in T„(M). (ii) By Lemma 8.3.1, we have M = pM ® pnM, and Ann(pM) = Rpn. If m i , . . . , mt is a finite set of generators for M, then pm\,... ,pmt is a finite set of generators for pM, which is thus a p-primary module according to our definition. This shows that pM C TP(M). To prove the reverse inclusion, suppose that m is an element of TP(M). Then m = x + y with x £ pM and y e pnM. But y must be annihilated by p n , since m is, and by p, since pnM is. Thus y = 0 and so m is in pM, which gives the desired equality, (iii) The decomposition is now immediate from Lemma 8.3.1. □ We can now give the complete primary decomposition of a torsion mod ule. 8.4.2 Theorem Suppose that R is a Euclidean domain and that M is a finitely generated torsion R-module with Ann(M) = Ra, aj^O. Let a = up™ .. .p2 be a standard factorization of a where u is a unit of R and Pi, ■ ■ ■ ,Pk are distinct irreducible elements of R. Choose pt so that a = p™ Pi for i = 1 , . . . , k. Then, for each i = 1 , . . . ,k, the pi-primary component TPi(M) of M is PiM, and M has the direct sum decomposition M =

TPl{M)®---®TPk{M).

Proof We induce on i If k = 1, then M is pi-primary by definition, and the direct sum "decomposition" has only one term. Assume now that k > 2. By the previous result, Theorem 8.4.1, we have TPl (M) =p1M and M = TPl{M)®pnl(l)M.

(8.1)

By Lemma 8.3.1, p™(1)'M has annihilator Re with c = p£ p£ ■ Choose pi so that c = p™ pi for i = 2 , . . . , k. Then, for each i, the in duction hypothesis tells us that the pi-primary submodule of p™( 'M is PiP^M, which is simply pzM. We also know, by the induction hypothesis again, that p^M

= =

TP2(p?(1)M)©-..eTPfc(p?(1)M) p2M®---®pkM.

(8.2)

130

Chapter 8. Torsion and the Primary

Decomposition

Combining Eqs. (8.1) and (8.2), we see that M = pxM © ■■■ ®pkM.

Since Theorem 8.4.1 gives the equalities ptM = TPi(M) for all i, the result follows. □

8.5

Cyclic modules

Our results enable us to find the primary decomposition of a cyclic module over a Euclidean domain R. By Theorem 6.4.1, such a module has the form R/I for a unique ideal I oi R. If / = 0, the cyclic module is R itself, which is not torsion. We therefore assume that / is not zero, so that / = Ra for some nonzero element a of R. The element a can be multiplied by a unit of R (Lemma 1.8.1) without changing the ideal / , so we can assume that a has a standard factorization a = p" .. .pk . By our main theorem above, TPi(R/Ra)

= p^R/Ra)

Since a = p™ Pi, we have RpJRa have isomorphisms

=

= R/Rp^'

TPt (R/Ra) S R/Rp"{i)

RpJRa. (Theorem 6.5.2). Thus we

for i = 1 , . . . , k,

and so, using Proposition 7.7.1, we can express R/Ra as an external direct sum of primary modules R/Ra = R/Rp"{1)

x • • •x

R/Rpnk{k).

Note that, by Theorem 7.3.1, the modules R/Rp*^ are all indecom posable, so we cannot split R/Ra into smaller components, at least, not by this technique. A uniqueness theorem to be proved later shows that any other method of decomposing R/Ra into indecomposable modules must give essentially the same result.

8.6

Further developments

Our treatment of torsion depends heavily on the hypotheses that R is com mutative and that it is ^^J^fifetTAlfe/igfftlJrit decompositions given in

Exercises

131

Theorems 8.4.1 and 8.4.2, for which we assume R to be Euclidean, hold over principal ideal domains in general. The theory of torsion can be extended to modules over arbitrary com mutative rings, where it is usually treated in tandem with the theory of localization, which does not make an appearance in these notes. There is a good theory of primary decomposition for modules over commutative rings, which can be found in Chapter 4 of [A & McD] or of [Sharp]. There is also a useful definition of torsion over noncommutative domains provided a fur ther requirement, the "Ore condition", is satisfied - see §2.1 of [McC & Ft]. The construction of satisfactory primary decomposition theories for classes of noncommutative rings is a difficult problem; again the reader should look at [McC & R] and the references therein.

Exercises 8.1

8.2 8.3 8.4

(In these exercises, R is a Euclidean domain and all modules are finitely generated R-modules, unless otherwise stated.) Suppose that L is a submodule of M. Show that T(L) is a submodule of T(M) and that TP(L) is a submodule of TP(M) for any irreducible element p of R. More generally, if 9 : L —y M is an R-module homomorphism, show that 9 induces a homomorphism T(9) from T(L) to T(M), and likewise for Tp. If 9 is a surjection, does it follow that T(9) is a surjection? Suppose there is an isomorphism 9 : L —y M. Show that T(9) : T(L) ->■ T(M) and Tp(9) : TP(L) -> TP{M) are also isomorphisms. Show that every free .R-module is torsion-free. Suppose that M = L® N. Prove that (a) T{M) = T{L)®T(N)(b) TP(M) = TP(L) 0 TP(N) for any irreducible element p of R.

8.5

Generalize these results to the case that M has k > 2 components. Suppose that P = L x N. Prove that (a) T(P) 3 T(L) x T(N); (b) Tp(P) =* TP(L) x TP(N) for any irreducible element p of R.

8.6

Generalize these results to external direct sums with k > 2 compo nents. Let the Z-module M = © i > 2 ^» be the infinite direct sum of the finite cyclic modules Zj (Exercise 7.10). Verify that every element of M is torsion, but that Ann(M) = 0.

132 8.7

Chapter 8. Torsion and the Primary

Decomposition

Suppose that b and c are nonzero elements of R which are not coprime. In this example, we outline an "elementary" argument which shows that the direct product R/Rb x R/Rc is not cyclic. This should be contrasted with the Chinese Remainder Theorem 7.8.1. The fact that R/Rb x R/Rc is not cyclic can also be deduced from the general uniqueness results which we shall obtain in Theorem 12.8.1 and section 12.9. The argument is by contradiction, using the fact that a module M is cyclic if and only if there is a surjection from R to M (section 4.11). Verify the following statements, in which M = R/Rb x R/Rc. (a) There is an irreducible element p of R which divides both b and c. (b) There are positive integers m and n with TP(R/Rb) = R/Rpm, TP{R/Rc) =* R/Rpn, and TP{M) S R/Rpm

x

R/Rpn.

(c) If M is cyclic, so is TP(M) (see Exercises 7.6 and 7.7). (d) Tp{M)/pTp{M) S R/Rp x R/Rp. (e) If M is cyclic, there is an .R-module surjection /3 : R —> R/Rp x R/Rp. (f) Rp C Ker(/?), and so there is an induced surjection j3 : R/Rp —> fi/ify x H/iZp. (g) /? is also an R/Rp-module homomorphism. (h) This cannot happen, since R/Rp is a field (Proposition 2.11.1) and so there can be no surjective R/Rp-linear transformation from R/Rp to {R/Rp)2.

Chapter 9

Presentations Our results to date have given us a reasonable hold on the theory of cyclic modules over a Euclidean domain R. Combining Theorem 6.4.1 with the factorization theorems for elements of R, we know that a cyclic .R-module has the form R/Ra for an element a of R which is unique up to multipli cation by a unit, and we can calculate the primary decomposition of R/Ra as in section 8.5. The problem we now face is to extend these results to arbitrary finitely generated i?-modules. To provide some motivation for what follows, we take another look at how we recognize a cyclic module as being isomorphic to one of the form R/Ra. The statement that a module M is cyclic tells us that it has a single generator, say m; then M = Rm. The generator will satisfy some "rela tions" , that is equations of the form rm = 0 for r £ R. If the only such equation is the trivial one, Om = 0, then {771} is a basis for M, which means that M is free and isomorphic to R. If there are nontrivial relations, then there is a "fundamental" relation am = 0 with the property that all other relations are consequences of this fundamental relation, that is, they take the form (xa)m = 0 for x 6 R. This assertion is simply a reinterpretation of the fact that the set of coefficients r with rm = 0 forms the annihilator ideal Ann(m) of m, which is a principal ideal Ra. The First Isomorphism Theorem now assures us that M = R/Ra. It is the analysis of generators and relations that provides the key to our description of /^-modules in general. Suppose that M has a finite set { m i , . . . , mt} of generators, which means that each element m in M can be written m = rivfix ~\ (- rtmt for some r j , . . . , rt e R. Then there are usually some relations between the generators, that is, iden133

Chapter 9.

134

Presentations

tities of the form J"ITTII

H

4- rtm t = 0.

If there is no relation except the trivial one with all coefficients r; = 0, then the generating set is, by definition, a basis of M, and M = i?* is a free module as in Chapter 5. When there are nontrivial relations, we have two tasks. The first is to isolate a fundamental set of such relations, that is, a set of relations from which all others can be derived. The next is to reshape the fundamental relations so that the structure of M becomes transparent. Our general definitions in this chapter require only that the coefficient ring R is a commutative domain, but we need to take R to be a Euclidean domain to obtain some results.

9.1

The definition

Let M be an /^-module. A presentation of M is a surjective i?-module homomorphism 6: Rl ->• M, from a standard free module i?* to M. The free module JR* has the standard basis { e i , . . . , et}, so that each element x £ R* can be written x = riei H

\-rtet

for some unique members r\,..., rt of R (section 5.1). Thus, given a pre sentation 0 of a module M, each element m of M has the form m =

6{x)

= ri0(ei) + --. + r t 0(e t ) which shows that {0(e!),... ,6(et)} is a finite set of generators for M. Conversely, given a finite set of generators { m i , . . . , mt} for M, we can define a presentation 9 by setting 6{x) = r1mi H

(- rtmt

for all x 6 R\

and then 0(ei) = m i , . . . , % )

=mt.

Notice that a module has many different presentations, since each choice of a set of generators gives rise to a presentation.

9.2.

Relations

135

Examples. (i) A cyclic module R/Ra has a presentation TC : R -> /?/i?a, 7r(r) = f = r - I and also a presentation —?r: r t-> r • (—1). (ii) A module will have generating sets of different sizes, and so it will have presentations involving free modules of different ranks. For example, take M = R/Ra x R/Rb, the external direct product of two cyclic modules, and put mi = (T, 0) and m 2 = (0,T). Then M = Rmi ® Rm2, and there is a presentation p : R2 -» M given by P(ri,r2) = (ri,ri)

= r a m i + r2m2.

If a, b are coprime, then M is cyclic by the Chinese Remainder Theo rem 7.8.1 and so M has a presentation as in (i) above; on the other hand, if a, b are not coprime, then M is not cyclic (Exercise 8.7).

9.2

Relations

Suppose that 9 : Rl —> M is a presentation of M, and write m-i = 0(ei),...,mt

= 9(et)

for the corresponding generators. Informally, a relation between the gener ators of M is an expression of the form rimi H

h rtmt = 0, r*i,..., rt € R.

However, this informal definition does not lend itself to calculation, the problem being that it is not clear when one relation is to be regarded as a consequence of others. To overcome this difficulty, we notice that rxm\ -\

h rtmt = 0 «=> r\e\ H

1- rtet 6 Ker(0),

and so we make the formal definition that a relation on M is to be an element of the kernel Ker(#) of 9. The module Ker(0) is called the relation module for M with respect to 9. Properly speaking, we should talk of relations on a particular set of generators of M rather than on M itself, but it will be clear from the context which set of generators is being used at any particular time. Suppose that we can find a finite set of generators {pi,... ,ps} for Ker(#), say Pi = 7 n e i + • • • + 7ti e * (9.1) 7i* e i + • • • +

ltset.

Chapter 9.

136

Presentations

Then any element of Ker(#) is a linear combination of {pi,..., ps}, which fact can be interpreted as meaning that any relation among the generators { m i , . . . , mt} of M is a consequence of the "basic" relations 0 = 7nmi + :

h 7«mt \ \ .

0 = 7i s mi H

h 7tsmt J

(9.2)

We therefore say that the set of relations {pi,..., ps} is a set of defining relations for M. For example, the cyclic module R/Ra has one defining relation, p\ = aei, while the direct sum R/Ra x R/Rb has two defining relations, pi = ae\ and p2 = be%-

9.3

Defining a module by relations

So far, we have discussed presentations of a given module. Next, we look at the reverse procedure, that is, the construction of a module with a given set of defining relations. Suppose that we are given a set of elements {p\,..., ps} in R* as in Eq. (9.1). We form the submodule K = Rpi + ■ ■ ■ + Rps of Rl generated by the given elements and let M = Rt/K. There is a canonical surjection 7r : .ft* —> M which is a presentation for M, with relation module K = Ker(7r). Thus M has defining relations {p\,... ,ps}, and so, naturally enough, M is called the module defined by the given relations. Less formally, we say that M is defined by a set of generators and relations as in Eq. (9.2) if M is defined by the corresponding relations {pi,-..,Ps}

9.4

in .ft'.

The fundamental problem

As we have noted, a particular module will have many presentations and so many sets of defining relations. Conversely, two sets of defining relations may or may not lead to the same module. Our fundamental problem then is to give a procedure that allows to determine whether or not two sets of defining relations do give (SbpydightetMM^rial

9.4. The fundamental

problem

137

To illustrate the problem, here is a collection of sets of defining relations for various Z-modules, together with a calculation of the modules so defined. (i) Defining relations 3mi = 0, 5m2 = 0. There is a presentation 6 : Z 2 ->• M with kernel K generated by 3ej and 5e2- It's clear that K = 3Zx5ZcZxZ so that M ^ Z 3 x Z 5 (see Exercise 7.3). By the Chinese Remainder Theorem 7.8.1, M = Z15 also, (ii) Defining relation 3mi — bm.2 = 0. Here, the relation module K has one generator p\ = 3ei — 5e2- Since K has rank 1 and 1? has rank 2, it is reasonable to suspect that M = 1? jK contains a module of rank 1, that is, a module isomorphic to Z. Now notice that 1 = 2 - 3 — 1-5 in Z, and put n\ = 5 and n.2 = 3. Then ri\, ri2 generate Z, and there is a presentation u : 1? —> Z, ui(y, z) = yni + zn.2, with kernel K. Hence M = Z. Observe that the relation in this example is a linear combination of the relations in the first example, (hi) Defining relations 3mi — 5rri2 = 0, 5m.2 = 0. These relations are equivalent to those in the first example, since 3mi = 0 obviously. So we get the same module. (iv) Defining relations 3mi — 5m2 = 0, 67711 = 0. Clearly, 10rri2 = 0. Put n\ = 2m\ and 77,2 = 2m2, so that ni,ri2 generate the submodule 2M of M. But 7ii,ri2 satisfy the same relations as the generators in the first example, which gives 2M = Z15. We need to calculate M/2M. This module has generators p\,P2, which are the images of mi, 777,2 respectively. These generators must satisfy the original relations together with the relations 2pi = 0 = 2p2. Thus p\ = p2, so that M / 2 M ^ Z 2 . It is now not hard to show that M =■ Z 3 0 = Z 2 x Z 3 x Z 5 by using the Chinese Remainder Theorem together with the results in sec tion 8.5. The next result tell us that a set of defining relations can always be taken to be finite. 9.4.1 Theorem Let R be a Euclidean domain and let K be an R-submodule of a standard free R-module Rf of rank3e>[jphejtot8cliWatm#tee, of rank s with s
Chapter 9.

138

Presentations

Proof We argue by induction on t. If t = 0, then Rl = 0 = K trivially. If t = I, then K is an ideal of R and so K = Ra is principal; the rank of K is 0 or 1, depending on whether a is zero or nonzero. Now suppose that t > 1, and let R be "projection to the last term", that is, cr(riei H \-rtet) = rt. Put / = Im(cr). Since a is an R-module homomorphism, / is an ideal of R. If I = 0, then K C fl*-1, so we are finished by our induction hypothesis. So suppose that I = Ra is not 0. By definition of I, there is an element w in K with a(w) = a, that is, w = wiei +

1- wt-\et-i

+ aet,

where e i , . . . , e t is the standard basis of Rl. For any other element x = x\e\ +

1- xt-\et-\

+ xtet € K,

we have Xt = ba for some b in R. Then x - bw € Ker(tr) n K. Write L = Ker(
K = L © to. Now L is a submodule of Ri~1, so the induction hypothesis tells us that L is free of rank h say, with h < t — 1. Let { c j , . . . , c^} be a basis of L. Since .Rw has basis the single element w (and has rank 1) , K is free with basis { c i , . . . ,Ch,w} and rank s = h + 1 < i. D Remarks. (i) This theorem assures us that a finitely generated R-module with t gen erators has a presentation that requires s < t defining relations, (ii) In vector space theory, a generating set can be reduced to a basis by omitting elements. Such easy arguments cannot be expected to work for modules over Euclidean (&wpfyfl}§h1<mfcM$eftk& generating set {2,3} of Z.

9.5. The presentation

matrix

139

(iii) The argument in the proof of the theorem is, in principal, constructive. Given an explicit set of generators {Pl,---,Pu}

for K, we can compute a generator a of Im(
by Euclid's algorithm. Thus we can find a linear combination w of the given generators so that a(w) = a, and hence a set of generators of K n R}~1 as in the theorem. However, this method is not so easy to implement if we are faced with a large number of generators and relations, and the technique that we will give in the next chapter is better for computations. We digress from the main topic of the chapter to give an important result that follows directly from the preceding theorem. 9.4.2 T h e o r e m Let R be a Euclidean domain and let M be a finitely generated R-module. Then every R-submodule of M is also finitely generated as an R-module. Proof Let L be a submodule of M, and let IT : Rl —> M be a presentation of M. Then the inverse image 7r*(L) of L in Rl is a submodule of Rl (Proposition 4.13). By the theorem above, 7r*(L) is finitely generated, and so also is L = 7r»(7r*(L)). □

9.5

The presentation matrix

Suppose that the module M has defining relations Pi = 7 n e i H

H 7ue, +

Pi = 7 i j e i + • • ■ + jijei

Ps = 1Q9PW&tafl'M0p4al-

H

h 7ne*, +

jtjet,

■ + itset

Chapter 9.

140

Presentations

as in Eq. (9.1). The corresponding presentation matrix for M is defined to be the t x s matrix / 7n

\

• • m

7«

•

lij

7ti

• ■

Itj

■ ■■

l u \

lis

■ ■■

Its

1

notice that the coordinate vector of pj gives the j'-th column of the matrix. Despite Theorem 9.4.1, we allow the possibility that s > t. The reason for this is that a module may be presented to us with surplus relations, but it may be far from obvious which relations can be discarded. We can reverse our point of view, obtaining a module from a matrix. Given any t x s matrix T over R, we define the submodule K of Rf to be that with generators p\,... ,ps as above, and then define M to be the quotient module M = Rt/K. By construction, T is a presentation matrix for M.

9.6

The presentation homomorphism

It is sometimes useful to interpret the presentation matrix as the matrix of a homomorphism 7 : R" —> Rl between free modules. Define the presentation homomorphism 7 by 7(2) = Tx for each column vector x e Rs. If we write E = {ex,..., et} for the standard basis of Rl and B = {bi,..., bs} for the standard basis of Rs, then 7(^>i) = Pi = 7 n e i H

h 7«et, (9.3)

l{bs) = Ps = 71*ei H

+ ItsGt-

Thus T = ("I)E,B as in section 5.9. Notice that the image Im(7) of 7 is the relation module Ker(#) of the corresponding presentation 9 : R* -> M of M. In general, 7 will not be an injective homomorphism. We record the circumstances in which it is. 9.6.1 T h e o r e m The presentation homomorphism 7 is injective if and only if the set {pi,..., ps} of defining relations forms a basis of the relation module K = Ker(0).

9.7. F[X]-module

presentations

141

Proof We know that the defining relations generate K, so they form a basis precisely when they are linearly independent. But pi = -y(bi) for all i by definition of 7, and since b\,..., bs is the standard basis of Rs, the defining relations are linearly independent if and only if 7 is injective. □

9.7

F[X]-module presentations

Suppose that A is an n x n matrix with coefficients in a field F, and that M is the F[X]-module obtained from Fn with X acting as A, in the familiar way. Then there is a canonical presentation matrix for M as an F[X]module, which we now describe. Let { e i , . . . ,e„} be the standard F[X]-basis of F[X]n, let { e i , . . . ,en} be the standard F-basis of Fn, and define 7T : F[X]n

-> Fn

to be the F[X]-module homomorphism with 7r(ei) =eu ... ,7r(en) = e n . The action of X on the standard basis of Fn is given by the products Ae\,..., Aen, which on expansion read Xe\ = anei + ■ ■ ■ + aj\e~j + • • • +

anien,

Xej = aijEi + • ■ • + a,jje~j + • ■ • +

anjen,

Xen = a\ne\ + ■ ■ ■ + aj„e~j + • ■ • +

annen,

from which we see that the defining relations for M are Pi = (X - an)ei

-

■••

—a^\&j

Pi =

-axiei

-

■■■

+{X-aJJ)ej

pn =

-&i n ei

- CopyrightetPSASterial

-

■■■

-an\en, Q-njCni

+ (X — ann)en,

Chapter 9.

142

Presentations

and that the presentation matrix for M is / X — on

r=

-Q.il \

_ a

••

\j

X

~0.nl

"■■

-a j]

—a

in

\ (9.4)

—a;n /

L

nj

which is the characteristic matrix of the matrix A. Let XI be the product of X with the n x n identity matrix 7, that is, the n x n diagonal matrix with diagonal entries all equal to X. Then we can write the characteristic matrix more succinctly as XI-

A.

(9.5)

The determinant det(r) = det{XI - A) is the characteristic polynomial of A, which plays an important role both in elementary linear algebra and in our analysis of the structure of F[.X']modules. A straightforward expansion shows that the characteristic poly nomial is monic, with det(XI - A) = Xn - (on + ■ • • + ann)Xn-1

9.8

+ ■■■ + ( - 1 ) " det(,4).

F u r t h e r developments

The attempt to describe modules in terms of presentations leads to some of the most fruitful areas of mathematics, in that it provides a method of characterizing various types of rings and modules. The method can be illustrated by the key result of the chapter, Theorem 9.4.1, which assures us that any finitely generated module M over a Eu clidean domain R has a free relation module K, and that the rank of K is at most the number of generators of M. This result holds for commutative principal ideal domains ([Cohn 1], §10.5) and, in fact, it provides a charac terization of such rings. Suppose that the theorem holds for a commutative ring R. Then a nonzero ideal I oi R must be a free .R-module with one generator and so I — Ra is principal. Moreover a cannot be a zero divisor in R, since the obvious surjection from R to 7 cannot have a nonzero free kernel. Thus R must be a principal ideal domain. If the ring R is not a principal ideal domain, then the .R-modules that do have a presentation with a free relation module form an interesting type of .R-module.

Exercises

143

Another variation on this theme is to look in turn at a presentation tp : Rs -> K of the relation module K of M if K is not free. Perhaps the relation module K' of this new presentation is free - if not, we can repeat the operation. Proceeding in this way, it is possible to build what is called a free resolution of M. An illustration is given in Exercise 9.3 below. The study of such resolutions, and the characterizations of rings and modules that arise through them, is the subject matter of homological alge bra. Introductions to this subject can be found in many texts, for example, [Rotman] and [Mac Lane]. A ring in which every submodule of a finitely generated left module is again finitely generated is called a left Noetherian ring. Theorem 9.4.2 therefore assures us that a Euclidean domain is a Noetherian ring. The class of Noetherian rings is very large; it includes most of the types of ring of current interest to algebraists, which may be seen by consulting [McC & R]. An example of a non-Noetherian ring is given in Exercise 9.4.

Exercises 9.1

Let R be a commutative domain and suppose that the R-module M has a presentation 7r : R* —> M with square presentation matrix T and that det(r) ^ 0. Using the formulas given in Det 8 of section 5.12, show that the relation module Ker(-7r) for M contains det(r)ei,...,det(r)et.

9.2

Deduce that M is a torsion module with det(r) G Ann(M). The Cayley-Hamilton Theorem. Let A be a square matrix over a field F and let M be Fn made into an F[A"]-module with X acting as A. Show that if h(X) is a polynomial in Ann(M), then h{A) = h0I + h1A + --- + hkAk = 0.

9.3

In particular, if h(X) = det{XI - A), then h(A) = 0. This result is the Cayley-Hamilton Theorem: "a square matrix satisfies its characteristic polynomial". Let R = F[X,Y] be the polynomial ring in two variables over a field F, and let M be F regarded as an i?-module with both X and Y acting as 0. The obvious presentation 7r : R —> M has Ker(7r) = XR + YR, which is not a principal ideal of R (Exercise 1.6), and so Theorem 9.4.1 does not hold OcqsyFL

144

Chapter 9.

Presentations

Define a : R2 -> XR + YR by a(r, s) = Xr + Ys. Verify that T-.R^-R2,

s s 9-4

9.5

T(W) =

(YW,-XW),

induces an isomorphism R = Ker(a). Thus we have constructed a free resolution of M. Generalize this result to a polynomial ring in three variables. (Warning: the computations rapidly become rather complicated the corresponding resolution for a polynomial ring in k variables has k terms.) Let R = F[X\,X2,---\ be the polynomial ring in an infinite set of variables X\, X2, ■ ■ ■■ (This means that each element / of R is a poly nomial in a finite set of variables X\,..., Xk, but there is no bound on the number k of variables allowed. Addition and multiplication in R are as expected.) Show that the ideal I = RXi + RX2 + ■ ■ ■, generated by all the variables, cannot have a finite set of generators. Let M be F regarded as an i?-module with each variable acting as 0, and let 7r : R —> M be the evident presentation. Deduce that any set of relations for M which arises from n must be infinite. Remark: it can be shown that any presentation of M must have an infinite set of relations, which is a stronger result than the above. Let pi,... ,Pk be distinct irreducible elements of a Euclidean domain R, and put Qi = Pi ■ ■ -Pi-iPi+i ■ ■ -Pk, i — 1,.. -,k. Show that 1 = wiqx + ■ ■ • + WkQk for some elements u>i,..,, Wk of R, and deduce that Qi,- ■ ■ ,qk generate R as an ^-module, but that no subset oi qi,... ,qk generates R. Define 7r : Rk —> R by 7r(ej) = qi for each i. Show that this presentation leads to a set of (k — l)k/2 relations for R. Find an element w e Rh so that Rk = Ker(7r) © Rw, and show that Ker(7r) has generators z\=e.\

— qiw,...

,zk = ek - qkW.

Show that the corresponding presentation matrix is T = I - w.qT where w is the column vector (vji,...,Wk)T vector. Prove also that W\Z\ H + WkZk = 0.

and q is the obvious row

Chapter 10

Diagonalizing and Inverting Matrices Let R be a Euclidean domain. The discussion in the previous chapter shows that a finitely generated fl-module is specified by a presentation, which in turn is determined by a presentation matrix T with entries in R. We now show that any matrix T over R can be reduced to a standard diagonal matrix A, the invariant factor form of T, by elementary row and column operations. This reduction will allow us to find very nice presentations for modules. The diagonalization technique is an algorithm which provides us with explicit invertible matrices P and Q so that PTQ = A. The algorithm also allows us to determine whether or not a matrix is invertible, and gives a method of computing the inverse. A further application is to the equivalence problem for matrices: given a pair of matrices T and F' over R, are there invertible matrices P and Q with T' = PTQ? Throughout this chapter we take the coefficient ring R to be a Euclidean domain.

10.1

Elementary operations

Our diagonalization technique is to perform a sequence of elementary row and column operations on a given matrix until we have reduced it to the desired form. These operations are very nearly the same as those encoun tered in elementary linear algebra, but we must be a little more careful as we are working over a Euclidean domain rather than a field. 145

146

Chapter 10. Diagonalizing and Inverting Matrices

Let r be a t x s matrix over R. The elementary row operations which can be performed on T are as follows. EROP 1 Interchange two rows of T. EROP 2 Add to any one row of T a multiple of a different row. EROP 3 Multiply a row of T by a unit of R. The elementary column operations are defined analogously. ECOP 1 Interchange two columns of E\ E C O P 2 Add to any one column of T a multiple of a different column. ECOP 3 Multiply a column of T by a unit of R. Notice that each type of elementary operation has, as a special case, the identity operation, which do not change any matrix. For example, the multiplication of a row or column by the identity element 1 of R is a type 3 manifestation of the identity operation. On the other hand, a non-identity operation may well have no effect on a particular matrix; for example, the zero matrix is not changed by any elementary operation. Some explanation is needed for the emphasis in the definitions of type 3 operations. The point here is that we want our elementary operations to be reversible, that is, we wish to be able to regain the original matrix by using another elementary operation that is also defined over R. If we multiply a row by a unit u, we can undo the effect by multiplying by its inverse u _ 1 , but if we multiply by a non-unit a, we cannot reverse the operation inside R since a has no inverse in R. It is obvious that operations of types 1 and 2 can always be reversed. If we are working over a field F, as is the case in elementary linear algebra, then a unit of F is the same as a nonzero member of F, so the definitions usually speak of "multiplication by a nonzero element". The effect of multiplying a relation by a non-unit is easily illustrated. The single relation 2m\ = 0 on one generator mi defines the Z-module Z2, but the relation 4m 1 = 0 defines instead Z4.

10.2

The effect on defining relations

Next we consider the correspondence between elementary operations on the of d e n n m presentation matrix and ^ftftW{\fyffij^$jgiset g relations.

10.2. The effect on defining relations

147

Recall that the entries of T are given by the coefficients 7^ in the defining relations of Eq. (9.2): 0 = 7 n m i + ••• + 7 u m t , (10.1) 0 = 7i s mi + ■•■ + 7 t s m t . The effect of column operations is transparent. Each column of T is made up of the coefficients from an individual relation, so an elementary column operation on T corresponds to the "same" operation on the rela tions. Here is a detailed list. exchanging columns i and j : exchanging relations i and j adding r times column i to column j for j jt i : adding r times relation i to relation j multiplying column i by a unit u € R : multiplying relation i by u. The result of row operations is less obvious, since it involves a change in the generators that occur in the relations. Write V = (7? •) for the matrix obtained by performing an elementary row operation on T, so that we obtain a new set of relations on some generators TJIJ, . . . , m't as follows: 0 = -y'nm'1 + ■■■ +

j'nm't, (10.2)

0 = 7iXi + ■ ■ ■ +

ll<-

These relations must be equivalent to the original set in Eq. (10.1) above, which requirement determines the new generators. We consider cases. (1) Suppose that V is obtained from T by exchanging row h and row i. Then j ' h - = j t j and 7^ = fhj for all j , while -y'gj = j g j for all g ^ h,i. In this case, the change in generators is clear: m'h = rrii, rrii = rrih, and m^ = mg for g ^ h,i. (2) This is the trickiest case. Suppose that T' is obtained from T by adding r times row h to row i, h =/= i. Manipulating the new j - t h relation, we obtain

o = 7iXi+"-+ =-fljm'1+■

■■+

TMK lh3m'h

+•••+

^3<

+---+it3m-

+ ■ ■ ■+(-yij + nhj)m'i+■

= lum1! +■■■ +lhjkotipfri&frf£&Material Hjm'i

■ ■+jtjm

+"• +ltjm

Chapter 10. Diagonalizing and Inverting Matrices

148

Thus, to recover the original j - t h relation, we must have m! = mg for all g ^ h and m'h = m^ — rrrii. (3) Suppose that we multiply row i by the unit u of R. The new j-th relation is 0 = 7 y m i + • • • +l'hjm'h+ ■■■ + ii3< +■■■ +l'tjm't = -yijm[+ ■ ■ ■ +fhjm'h+ ■ ■ ■ +u~jijm'i+ ■ ■ ■ +Jtjm't so it is clear that we must have m' = n%g for all g ^ % and m!i = u~ m;. A n example. Here is a numerical example. Consider the following defining relations for a Z-module: 0 = 2mi + 4rri2 0 = 3mi + 7m2-

}

(10.3)

The associated matrix is

Using column operations (and omitting some steps), we make the transfor mations

r
(i!) and relations 0 = m'1 0=

2m'2

with m1 = m,\ — 3m2,

m'2 = ra2-

Thus the Z-module defined by the relations (10.3) is Z 2 .

10.3. A matrix

10.3

interpretation

149

A matrix interpretation

Before we give the promised diagonalization method, it will be useful to have an interpretation of elementary operations in terms of multiplication by invertible matrices. Suppose that p is one of the elementary row operations, acting on a t x s matrix T, and denote the resulting matrix by pT. Then we can perform the same row operation on a t x t identity matrix /, obtaining pi, and then take the product (pI)T. Let x be one of the elementary column operations. We denote the result of applying x to T by Fx- In this case x operates on an s x s identity matrix, also called / (there should be no genuine chance of getting confused!), and we can compute the product T(Ix)The operations and matrices are related by the following basic lemma. 10.3.1 Lemma (i) PT = (pI)T. (li) TX = Y{Ix). (Hi) Let p~l and x _ 1 be the elementary operations which reverse the effects of p and x respectively. Then the matrices pi and Ix are both invertible, with inverses p~xI and Ix-1 respectively. (iv) (pT)x = p ( r x ) . Proof (i) The proof is by direct calculation. Since only two rows of I and T are alterd by the row operation, it is enough to make the verification when r is 2 x s and / is 2 x 2. (The doubtful reader can fill in the unchanging rows for reassurance.) Suppose, for example, that p is of type 2, "add r times row i to j , i ^ j " . Ignoring the unaffected rows (and assuming i < j), we have 1 0 pi ( r 1 ) and

(pi)r

7»i

(

7i2

•••

liS

) ■

which is obviously pT. The calculations for the other types of elementary row operation are easier. (ii) The argument for column operations is similar, working with t x 2 matrices.

Chapter 10. Diagonalizing and Inverting Matrices

150 (iii)

We have {p-ll){pl)

=

p~\pl)

= =

{P~1P)I I

and similarly (pl)(p~1l) = I, confirming that (pl)~l = p~lIThe calculation for x is much the same, working on the right rather than the left. (iv) In matrix terms, we require (pI)(T ■ Ix) = (pi ■ r ) ( J x ) , which is true since matrix multiplication is associative. □

10.4

Row &; column operations in general

We will need to use many elementary row and column operations to diagonalize a matrix, and so we must expand our definitions to encompass sequences of operations. A row operation p on a t x s matrix T is defined to be the product of a sequence p\,..., ph of elementary row operations performed on T in the given order; thus p is a product of operators P = Ph ■ ■ -Pi,

and pr =

ph(...(Plr)...).

Notice that the elementary operations are listed from the right to the left in the product since row operations operate on the left of matrices. Likewise, a column operation x on T is the product of a sequence Xii • • • tXh of elementary column operations, again performed in the given order. We can write x as X =

Xi---Xk

where the terms are ordered from left to right, since column operations act on the right of matrices. The ordering of the elementary row operations is very important, since the same operations in a different order will give a different row operation. For example, let p\ be "exchange row 1 and row 2", and let pi be "add row 1 to row 2". Then

P2PJ= ( J while

J)

10.4. Row & column operations in general

151

Similarly, the ordering of the elementary column operations is crucial in defining their product column operation. On the other hand, if we are performing both column and row operations on a matrix, then the identity (pT)x = p{^x) OI Lemma 10.3.1 shows that it does not matter if we perform all the row operations first, and then the column operations, or vice versa; we can go back and forth between row and column operations as we wish. Here is an extension of Lemma 10.3.1. 10.4.1 Lemma (i) Let p = Ph • ■ ■ P\ be a row operation, where each pi is an elementary row operation. Then p is invertible, with inverse p^=p-^...p-h\ (ii) There is a matrix equation pI =

(phI)...(PlI).

(Hi) The matrix pi is invertible over R, with inverse (p^ I)... (p~^ I), (iv) Let x — Xi ■ ■ ■ Xk be a column operation, where each \i *s an e^e~ mentary column operation. Then x *s invertible, with inverse X" 1

=Xk1---Xil-

(v) There is a matrix equation IX =

(IXi)---(IXk).

(vi) The matrix Ix is invertible over R, with inverse (IXk ) ■ • ■ C^Xf )• (vii) (pT)x = p(Tx)Proof (i) Multiplying pn ■ ■ ■ P\ by p\l ... p^1, on either side, we see that all the terms cancel, so we are left with the identity row operation, which does nothing to any matrix that it operates on. So the product of the inverses is indeed p~l. (ii) This follows by induction on h from part (i) of Lemma 10.3.1. We have pi

= =

(ph{Ph-i ■ ■ ■ Pi))I {PhI){{Ph-\ ■ ■ ■ P\)I)

GQp0gmfrM3terial. (PlI)).

Chapter 10. Diagonalizing and Inverting Matrices

152

(iii) Immediate from the above, using the fact that, for invertible ma trices A,B, we have (AB) _ 1 = B _ M _ 1 . The assertions about column operations have similar proofs, and the final claim follows from the associativity of matrix multiplication. □

10.5

The invariant factor form

We now come to the main computational result of this text, which shows how a matrix over a Euclidean domain can be reduced to a convenient standard form. Before we give the technique, we must describe this desired form. A matrix A over R is in invariant factor form or Smith normal form if it is t X s diagonal matrix

1 A =

5l

0 52

■ ■ ■

0 0

0

■• ■

0

0

■■

•

0

0 0

0 0

• 5T • ■ 0

0

■• •

0

■■

•

0 0

• •

0

••

• o/

VV oo o0

0

° \

whose nonzero entries 5i,... ,5r satisfy the relations Si | 52, S2 | S3,...,

6r-i

| 5T.

Notation: we will write a diagonal matrix as A = diag(<5i,... A , 0 , . . . , 0 ) when convenient. Now let T be any s x t matrix over a Euclidean domain R. An in variant factor form or Smith normal form for T is any diagonal matrix A = diag(<Ji ,...,Sr,0,...,0) that is itself in invariant factor form and which is related to T by an equation PTQ = A in which P and Q are invertible ^-matrices. The nonzero entries 5\,..., 5r are called the invariant factors of T, and the integer r is the rank qf-X'nvririhterl Material

10.5. The invariant factor form

153

The invariant factors are not unique, since any of them can be multiplied by any unit of R. However, this is the only type of change permitted. We will prove this assertion in the next chapter, Theorem 11.3.1, together the uniqueness of the rank. Our immediate task is to show that any matrix does have an invariant factor form. 10.5.1 The Diagonalization Theorem Let r be at x s matrix with entries in a Euclidean domain R. Then we can perform a sequence of elementary row and column operations of types 1 and 2, all of which are defined over R, on T, with the result that T is transformed into a t x s matrix A that is in invariant factor form. Furthermore, there are matrices P and Q, both of which are invertible over R, so that PTQ = A, and hence A is an invariant factor form for T. Note: we require not only that the matrices P, Q have entries in R but also that their inverses have entries in R. Proof We start by observing that the second assertion is a consequence of the first. Suppose that pT\ = A where p is a product of elementary row operations and x is a product of elementary column operations, and put P = pi and Q = I\ for suitable identity matrices. Then, by Lemma 10.4.1, both P and Q are invertible over R and PTQ = A. The proof of the first assertion is by double induction. We induce on both the size of the matrix and the size of its minimum nonzero entry. We present the argument in a series of steps; to avoid over-elaborate notation, we behave as programmers and allow the meaning of T and some other matrices to vary during the argument. The fact that we use operations only of types 1 and 2 will be clear from the method. There are two starting cases in which we need not do anything. If T = 0, then it is in the desired form with r = 0, and if T is 1 x 1 it is obviously diagonal. We define the "size" $(T) of the minimum entry by using the function if : R -> Z (section 2.1). Suppose that T = (7^) is nonzero. Then we put * ( r ) = m i n t e d ) I 7 i ^ 0}. Now for the procedure. We assume T ^ 0 and that T is not l x l . Step 1. Choose an entry 7 ^ with $ ( r ) = fefg/i
Chapter 10. Diagonalizing and Inverting Matrices

154

no need to exchange rows, that is, we perform the identity row operation, and likewise if k = 1. Step 2. We can now assume that 3>(r) = <£>(7n). If s = 1, that is, T has only one column, go straight to step 3. If not, proceed as follows. For j = 2 , . . . , s, use the division algorithm to write 7ij = Qijln + 7y with qijs 7 y € R, (r') < 3>(r). By induction hypothesis, we can reduce T' and hence T to the desired diagonal form. If j[- = 0 for all j , we turn to row operations. Step 3. If there is only one row, the preceding arguments will diagonalize T, so we may assume that t > 2. For i = 2 , . . . , t, write 7a = 9u7n + 7a, with q'ilt -y'n £ R, ip{iiX) <

f(ln)-

We then subtract q'n times row 1 from row i for i — 2 , . . . , t. If some 7^ =£ 0, we have reduced 3>(r) and we are home by the induction hypothesis. If 7^ = 0 for all i, then we have transformed T into a block form, which we again call T to save notation: /

711

0

•■■

0

\

0 r

=

: V 0

r )

Step 4If by chance T' = 0, we are finished. If not, we perform long di visions to see whether or not 711 divides the entries 7^ of the submatrix V. It will be convenient to index the entries of T' according to their positions in the larger matrix T rather than their positions in V itself. For i,j > 2, write Hij = lijln

+lij

with

qij,

Ht e R, ip(j'/j) <
If some 72" / 0, we first add row i to row 1 in V and then subtract q^ times column 1 from column j to obtain a new matrix, say T", with 1, j - t h term 7,". We now have $ ( r " ) < $ ( r ) , so we can replace F by T" and start again.

10.6. Equivalence of matrices

155

After a finite number of trips around this loop, reducing $ ( r ) each trip, we must arrive at the stage where / 711 0

r=

0

... r

•

0

\

'

and 7 n divides 7^ for all i,j > 2. Step 5. We can now finish the argument by induction. If V = 0 or V is 1 x 1, there is nothing to do. Otherwise, I?' is t — 1 x s — 1, and so we can reduce it to invariant factor form A' = d i a g ^ , . . . , <5r, 0 , . . . , 0), with 82 I S3,..., <5r_i I 5r, by a series of elementary row and column operations. Since none of these operations can have any effect on the first row and column of T (other than adding, subtracting and permuting 0's), we have, on putting 711 = Si, transformed T to A = diag((5i,<52)... A , 0 , . . . , 0 ) . Finally, we have to show that <5i divides 82- Since A' is obtained from V by applying elementary row and column operations, the entries of A' are .R-linear combinations of the entries of V. But we have arranged matters so that 5\ divides all entries of V, and so <^i must divided 82□

10.6

Equivalence of matrices

Let R be any ring. Two matrices V and I" over R are said to be equivalent or associated if there are invertible matrices P and Q with V = PTQ. The equivalence problem for matrices requires that we determine precisely when two matrices of the same size are equivalent. The preceding result gives a partial solution to this problem for matrices over a Euclidean domain, since we now know that any matrix is equivalent to its invariant factor form, and in the next chapter we will obtain a complete solution (Corollary 11.3.2). A special case is well known from elementary linear algebra. A field F can be regarded as a degenerate Euclidean domain in which
0 0

where Ir is the rxr identity matrix. Since the rank of a matrix is unchanged by row and column operations, r is the rank of T. Thus the equivalence class of a matrix over a fXMpfri&ttfaffaMiafletigl its rank.

156

10.7

Chapter 10. Diagonalizing and Inverting Matrices

A computational technique

The procedure for computing the invariant factor form is an algorithm, that is, it can be done by a computer provided that the computer knows how to do arithmetic in the Euclidean domain R. Here is a way of setting out the calculations so that we record the matrices P and Q as well as finding the invariant factor matrix A. Given a t x s matrix T over a Euclidean domain R, form the augmented (t + s) x (s + t) array

r / / in which the top right matrix is a t x t identity matrix and the bottom left matrix is an s x s identity matrix. If we perform a row operation p (elementary or not) on T, we can record its effect on the appropriate identity matrix / by performing it on the whole array: T I pT pi I ~* I Similarly, the array will record the effect of a column operation on the s x s identity matrix: T I TX I I IX Continuing this way, we can record the effect of any sequence of row and column operations, so that when pFx = A, the array will have become A IX

pi

from which we can read off P = pi and Q = IxHere is a numerical example. Not all the steps of the algorithm are needed - a numerical computation in which they all appeared would be too lengthy - but enough steps are used to illustrate the ideas, I hope. This calculation is also long-winded in that the method is followed slavishly; in "real life" computation, it's usually easy to spot alternative operations that shorten the calculation. It should be remembered that any sequence of row and column operations is legitimate provided that the result is in invariant factor form. We work over the ring of integers Z. Let T=

/ 12 3

9 2

12 \ 4 .

CopyrtgHtid MaffiieJ

10.7. A computational

technique

157

The augmented array is a 6 x 6 array, which we manipulate as follows.

r/ /

12 3 12 1 0 0

9 12 1 0 0 2 4 0 1 0 8 22 0 0 1 0 0 1 0 0 1

3 12 12 1 0 0

2 4 0 1 0 9 12 1 0 0 8 22 0 0 1 0 0 1 0 0 1

2 3 4 0 1 0 9 12 12 1 0 0 8 12 22 0 0 1 0 1 0 1 0 0 0 0 1 2 0 0 1 0 1 9 3 -6 1 0 0 6 0 0 1 8 4 0 1 0 1 -1 -2 1 0 0 1 3 4 1 -1 0

0 0 1 0 2 9 -6 1 0 0 6 0 0 1 8 0 0 1 -2 1 0

1 0 0 0 1 0 3 3 - 6 1 0 0 4 0 6 0 0 1 "*■ 1 - 2 0 -1 3 -2 Copygightqfi Material

Chapter 10. Diagonalizing and Inverting Matrices

158

1 0 0 1 -1 0

0 0 -2 3 0

0 0 1 0 3 - 6 1 - 3 0 6 0 - 4 1 0 -2 1

1 0

0 3

0 0 10 0 1 - 3 0

0 0 1 -2 - 1 3 0 0

6 -4 4 1

0 - 4 1

Thus the invariant factor form of T is A =

°\ ° '

1 0 0 3 \ 0 0 6/

the row operations are recorded as 0

P = pl =

1

1 -3

0 0 ,

Vo -4

li

and the column operations as /'

Q = ix = \

10.8

l -l

,°

-2 3 0

-4\ 4

1/

Invertible matrices

The diagonahzation method also provides a technique for constructing invertible matrices over a Euclidean domain. The key observation is the following very easy lemma. 10.8.1 Lemma Let A = diag(<5i,..., 6r, 0 , . . . ,0) be atxs diagonal matrix with nonzero entries 6\,... ,8r contained in a Euclidean domain R. Then the following statements are equivalent.

(i) A is invertible ovec^yrighted Material

10.8. Invertible matrices

159

(ii) A is a square matrix with r = t = s, and each of the diagonal terms 5\,..., 8r is a unit in R. Proof Suppose that A is invertible. If, say, t > r, then A has a row of zeroes, and so does A • A - 1 = /, a contradiction. Thus t = r and likewise s = r. The fact that all the terms Si are units now follows by direct calculation, or alternatively, from the fact that det(A) = <5X . ■ -Sr is a unit in R (Theorem 5.12.1). The converse is obvious. □ This lemma gives several characterizations of invertible matrices which we list in the next theorem. 10.8.2 Theorem Let r be a t x s matrix with entries in a Euclidean domain R. Then the following statements are equivalent. (i) r is invertible over R. (ii) T is a square matrix and the invariant factor form ofT can be taken to be the identity matrix. (Hi) T — pi for some row operation p. (iv) V = I\ for some column operation xProof (i) => (ii): By Theorem 10.5.1, there are invertible matrices P, Q so that the invariant factor form of T is A = PTQ. Since T is invertible, so is A. Thus A and hence T are square matrices. Also, 6i,...,fir are units in R, and so, by multiplying column i by S~ for each i, we can take the invariant factor form to be the identity matrix /. (ii) => (iii): We have PTQ = I, where P and Q are invertible square matrices, say r x r. Thus T = P~1Q~l. By construction, P is obtained from the identity matrix / by applying a sequence of elementary row operations, and P _ 1 is obtained by applying their inverses - see Lemma 10.4.1. Thus P~l = p'l for some row operation P'By the same lemma, Q l = (Ixi) ■ ■ ■ (IXk) f° r some sequence of elemen tary column operations X\i■ ■ ■ iXk- But for each such elementary column operation, there is an elementary row operation pi so that pil = Ixi - this is easily seen by evaluating Ixi for each of the three types of elementary column operation. Thus, writing p" = pi... pk, we have T = p'p"I. (iii) => (iv): Arguing as in the previous paragraph, we can replace each elementary row operation in p by an elementary column operation, and so find a column operation x with pi = I\(iv) => (i): Already prove^cipj^Jgfttedl AfefeA'a/ □

160

Chapter 10. Diagonalizing and Inverting Matrices

The computational form of the diagonalization method set out in sec tion 10.7 also enables us to determine whether or not a given matrix T is invertible, and to compute the inverse when T is invertible. If T is not square, then it has no inverse anyway. If it is square, we reduce it to invariant factor form. If the invariant factor form has a nonunit diagonal term, again T has no inverse over R. If all the diagonal terms in the invariant factor form are units, then we can convert the invariant factor form to the identity matrix. At the end of the calculation, we have found invertible matrices P, Q so that PTQ = I from which

T" 1 = QP.

If we have used only one kind of operation, say column operations, then P = I and Q = V~l. However, usually both row and column operations will be used, although, in principle, only one kind need be. If the inverse of P and/or Q is required, then at each step in the calcu lation we should record the elementary row or column operations used, as appropriate. We obtain a sequence pi,..., ph of elementary row operations and a sequence Xi> ■ ■ ■ iXk of elementary column operations so that P = pk...pil

and Q =

Ixi--Xk,

from which P~l = pf 1 • • ■ Pk'l

10.9

and Q-1 = I%1l ■ ■.Xi1 •

Further developments

Elementary row and column operations can be performed on matrices with entries in an arbitrary ring. However, when the ring is noncommutative, scalars must act as left multipliers on rows but as right multipliers on columns. The Diagonalization Theorem holds for matrices over (commutative) principal ideal domains, but the proof is different - in essence, elementary row and column operations no longer suffice to achieve the diagonalization, another type of operation being needed. Details can be found in §3.7 of [Jacobson] or §10.5 of [Cohn 1]. A consequence is that the "constructive" description of invertible ma trices given in Theorem 10.8.2 does not extend to principal ideal domains

Exercises

161

- for some principal ideal domains there are invertible matrices which can not be reduced to an identity matrix by any sequence of row and column operations. However, examples of this phenomenon are surprisingly hard to find; some are given in [Grayson] and [Ischebeck]. The diagonalization results also extend to noncommutative Euclidean domains ([B & K: IRM], §3.3) and to noncommutative principal ideal do mains ([Cohn: FRTR], Chapter 8). When the coefficient ring is not a principal ideal domain, we cannot expect that a matrix will have a standard form that is as simple as the invariant factor form. Even if the ring of scalars is close to being a Euclidean domain, the analogue of the invariant factor form can be rather complicated, as can be seen from [L & S].

Exercises 10.1 Let M be a Z-module with generators mi,m2 and relations m\ + 2m,2 = 0, 2mi + 3m 2 = 0. Show directly that M = 0. Prove also that the presentation matrix for M has invariant factor form / . 10.2 Let the Z-module M have generators mi, 77-12,7713 and relations 2mi + 2m 2 + 3m 3 = 0, 4mi + 4m 2 + Im-j, = 0, 10mi + llm2 = 0. Find the invariant factor form of the presentation matrix of M. Hence make an educated guess for the structure of M. (A formal technique for determining module structures will be given in Chapter 12.) 10.3 Let M = L © N, and suppose that L = R/Rb and N = R/Rc are cyclic, with generators l,n respectively. Given that b and c are coprime (and that R is a Euclidean domain), find a generator for M in terms of I and n. Write down the 2 x 2 relation matrix for M, and show that it can be transformed into diag(l,6c). 10.4 Let M be a Z-module with generators mi, 7712 and relations mfiP&R&&&

4fefcfl#i + m 2 = 0.

162

Chapter 10. Diagonalizing and Inverting Matrices

Write down the presentation matrix T of M. Find the invariant factor form A of T, giving explicitly the invertible matrices P, Q with PTQ = A. Find the bases of Z 2 corresponding to P _ 1 and Q. Deduce that M is a cyclic module, and find a single generator of M in terms of the original generators. 10.5 (This is three problems in one - keep a arbitrary for as long as you can.) A Z-module M is defined by generators 7711,77x2,7713 and relations 3 m i + 3 m 2 + 27713 = 0 = 2m\

+ 67712 + 07713

where a € Z is a parameter. Write down the presentation matrix for M. For a = 0,1,2, find the invariant factor form A of T, together with the matrices P, P~x and Q.

Chapter 11

Fitting Ideals We have shown that a matrix over a Euclidean domain has an invariant factor form, but we have not yet shown that this form is unique. To prove the uniqueness, we introduce an alternative method for finding the invariant factors of a matrix, through the computation of its Fitting ideals. These ideals are defined in terms of the determinants of the square submatrices of the given matrix, and so they can be calculated directly from the matrix, without recourse to row or column operations. However, the Fitting ideals are unchanged by row and column operations, and so the Fitting ideals of a matrix are the same as those of its invariant factor form, which leads to the desired uniqueness result. Another consequence is that we can complete the solution of the equivalence problem for matrices. Throughout this chapter, the ring of scalars R is taken to be a Euclidean domain, although the basic definitions require only that R is commutative.

11.1

The definition

Let T = (jij) be at x s matrix with entries in a Euclidean domain R. For any integer h < s,t, a minor of T of order h is the determinant of an h x h submatrix of T. The Fitting ideal of level or order h for T is the ideal Fit^(r) of R generated by all the minors of T of order h. For example, take the matrix /

T=

3 10 \ -20

2 1 5 - 5 -10 10

0 \ 15 I , -30 /

CopyrightdSft/laterial

Chapter 11. Fitting Ideals

164 with entries in the ring of integers The order 1 minors are 3 2

1

0

. . . 10

-30

the order 2 minors are - 5 --25 50 10 0 0

-45 -90 0

-15 30 0

30 -60 0

and all order 3 minors are 0. Thus

Fit!(r) = z ] Fit2(r) = 5Z } Fit3(r) =

(11.1)

o1

and T has no other Fitting ideals. When a matrix is in invariant factor form, its Fitting ideals are easy to find. 11.1.1 L e m m a Let A = diag(<5i,..., 5r, 0 , . . . , 0) be a matrix in invariant factor form. Then the Fitting ideals of A are

Fit,(A) = M 5 1 ;0- - 4

lf

h
if h > r

Proof First, take r = 1. The nonzero l x l subdeterminants are simply the nonzero entries 5i,..., 5T of A, so Fiti(A) is generated by these elements. But, as the matrix is in invariant factor form, we have 51\62\...\Sr and so F i t ^ A ) = RSi. For arbitrary h < r, a n / i x f t submatrix of A will contain a row or column of zeroes unless it is a diagonal submatrix of the nonzero r x r diagonal submatrix of A. Thus the nonzero generators of Fit/t(A) are all the products of the form where 1 < i(l) < . . . < i(h) r, then any hxh submatrix contains a row of zeroes and

so has zero determinant. Copyrighted Material

□

11.2. Elementary

11.2

properties

165

Elementary properties

Next we show that the Fitting ideals are unchanged if we perform an ele mentary row or column operation on a matrix. 11.2.1 L e m m a Let r be atx s matrix, let p be an elementary row operation on V and let X be an elementary column operation on T. Then for h = 1 , . . . , min{i, s}, Fith(/oT)=Fitfc(rx)=Fith(r). Proof We give the argument only for a row operation p. Consider the effect of p on the determinant of an h x h submatrix E of T. If p does not involve any rows of E, nothing happens to the determinant. If p involves two rows of E, then p is effectively a row operation on E and, by standard properties of determinants (section 5.12), it changes the determinant as follows. • If p exchanges two rows, det(pE) = - d e t ( E ) . • If p adds a multiple of one row to another, det(pE) = det(E). • If p multiplies a row by a unit u of R, det(pE) = udet(E). Finally, we have the situation when p involves two rows, one of which, say row i, contributes to E and the other, row j , does not. There are now two cases in which the determinant might be changed. • Suppose p exchanges rows i and j . Then pE is an h x h matrix, formed from the same columns as E, in which the entries of E from row i of T are replaced by the corresponding entries from row j of T. Thus there is an h x h submatrix E' of T so that pE and E' have the same rows, but perhaps in a different order, which gives det(pE) = ±det(E'). • Suppose that p adds r times row j to row i. Then det(pE) = det(E) ± r • det(E') h t where E' is the snhmamx Su9hWm MM §&Me.

Chapter 11. Fitting Ideals

166

Since the Fitting ideal Fit h(pT) of order h is generated by the determi nants of all the h x h submatrices of T, we can now see that Fith{pT) C Fith(r). But p is invertible, so we also have Fit/»(r) C Fith{p£)D 11.2.2 Corollary Let r be at x s matrix. (i) Given a product p of elementary row operations on T and a product X of elementary column operations onT, we have F i t h ( p r x ) = Fith(r) for h= 1 , . . . , min{t, s}. (ii) Given invertible matrices P and Q, t x t and s x s respectively, we have Fith(PTQ) = F i U ( r ) for h= 1,... ,min{t,s}. Proof (i) This is immediate from the Lemma together with part (iv) of Lemma 10.3.1. (ii) By Theorem 10.8.2, we can write P = pi and Q = I\, so the claim follows. □

11.3

Uniqueness of invariant factors

We can now show that the invariant factors of a matrix are essentially unique. 11.3.1 Theorem Let r be a matrix over a Euclidean domain R and suppose that A = diag(<5i,... A , 0 , . . . , 0 ) and

A' = diag(*i,...,£.,0,... > 0) are both invariant factor forms of T. Then r = r', and there are units U\,..., ur of R with 5i — « i 5 i , . . . , S'r = ur5r.

11.4. The characteristic polynomial

167

Proof By Theorem 10.5.1, we have A = PTQ and A' = P'TQ' for invertible matrices P,P',Q and Q', so that A' = P'P^AQ^Q'. Thus A and A' have the same Fitting ideals by the preceding corollary, which means that r = r' and that R51...5h

= R6[ ...5'h

for h = 1 , . . . ,r. Taking h = 1 we get 5[ = ui<5i for some unit wi of the domain R, and the rest follows easily. □ Standard choices. In the ring Z of integers, we take an invariant factor to be positive, and in a polynomial ring F[X], we take an invariant factor to be monic. With these choices, the invariant factor form of a matrix is unique. The preceding results yield the solution to the equivalence problem that we posed in section 10.6. 11.3.2 Corollary Let r and V be t x s matrices over a Euclidean domain R. Then the following statements are equivalent. (i) r and V are equivalent, (ii) Fith(r) = Fit f t (r') forh = l,.. .,min(t,s). (Hi) T and V have the same invariant factor forms (up to multiplication of their entries by units).

□ 11.4

The characteristic polynomial

Let A be an n x n matrix over a field F, and let M be the module over the polynomial ring F[X] which is defined by A acting on the space Fn. By section 9.7, the presentation matrix of M has the form V = XI — A, where I is an n x n identity matrix. For matrices of this type, it is often more convenient to compute the invariant factors through the Fitting ideals. It is clear that the n-th Fitting ideal of XI — A is Fit n (X7 - A) = det(XI

-

A)F[X),

where det(XI — A) is, by definition, the characteristic polynomial of A. It follows that the invariant factors 8%,... ,§n of XI — A satisfy the equation

ftpyfflte&fM&r^A).

(n.2)

Chapter 11. Fitting Ideals

168

Since the invariant factors are themselves polynomials, we can see, just by counting their degrees, that many of them are likely to be constant polynomials. Indeed, suppose that no invariant factor is a constant. Then all must have degree 1, and by the divisibility condition S\ | • • • | <5„, they are all equal to X — a for a fixed constant o. Thus A = al. At the other extreme, consider the rational canonical block matrix

(° c

• • • • • •

0 0 0

0 0 0

1 0

0 1

1 0

0 0 1

0

0

• •

0

■ •

-/o

\

-/l

-/a

— fn-2 -fn-1 )

1 associated to the polynomial f = fo + fiX + ■ ■ ■ + fn-\Xn + Xn. Then det(XI - C) = f (Exercise 6.5), and XI-A has an ( n - 1 ) x ( n - 1 ) identity submatrix, which shows that 5i.. . <5n-i = 1 and hence

51 = ■ ■ ■ = $n-i = 1 and 6n = f. To see a A= I c

some more possibilities, b d, I to be an arbitrary X -a We have XI — A = ( -c nomial is z Xr2 - (a + d)X

we analyse what happens when we take 2 x 2 matrix over F. -b ,, , ), so that the characteristic polyX-d + (ad - be) = 5XS2.

If either b / 0 or c ^ 0, then Fit!(XI - A) = F[X] and Si - 1. If 6 = c = 0, then Fiti (XI — A) is generated by X — a and X — d. There are now two cases. If a ^ d, then a — d = (X — d) — (X — a) is in Fiti(AT - A), so 6i = 1 again. On the other hand, if o = d, then S\ = X — a. Since 5\S2 = (X — a) 2 , we find that 52 = X — a as well.

11.5

Further developments

It is easy to see that Fitting ideals can be defined for a matrix T over any commutative ring R, and that Lemma 11.2.1 still holds. A direct calculation with determinants confirms that the Fitting ideals of T are the same as those of PTQ fojQaQyipgtlterfl Moi&Me matrices P, Q over R.

Exercises

169

Thus the existence of an invariant factor form for a matrix T is equivalent to the assertion that all Fitting ideals of T are principal. Further, if every i?-matrix has an invariant factor form, then every finitely generated ideal of R must be principal, since we can arrange the generators as a 1 x t matrix for some t. Thus, if we are given that a commutative domain is Noetherian and that every .R-matrix has an invariant factor form, then R is a principal ideal domain. (Exercise 11.7 shows that we must assume that R is Noetherian.) In the reverse direction, if R is a principal ideal domain, then any Rmatrix has a (unique) invariant factor form ([Cohn 1], §10.5). Whether or not this assertion holds for principal ideal rings that are not domains is a point on which the literature seems to maintain a silence. In the noncommutative case, there is no analogue of the determinant which would make it possible to define Fitting ideals with any useful prop erties, at least, as far as I know.

Exercises 11.1 Use the Fitting ideals to compute the invariant factors of the integer matrix discussed in section 10.7: 12

-

3

V 12

9 2 8

12 \ 4

22 /

11.2 Compute the Fitting ideals of the following matrices, and hence con firm your calculations in the exercises to Chapter 9. (a)

1

I2 /

(b)

2 2

\ 3

(c) ( 1

I2

(d)

/

3 3

2 3 4 4 7 2 1 2 6 a

\ ) 10 \ 11

'\

oj

J \ ■

\2 / 11.3 In the last part of the above exercise, show that the three cases a = 0,1, 2 cover all the rS@flB5i9fiel<3foVf^a^ariant factors of the matrix.

170

Chapter 11. Fitting Ideals

/ 1 0 o \ 11.4 Let A = I 0 1 a ) be a matrix over a field F. Discuss the pos\ 0 0 b ) sibilities for the invariant factors of XI — A as a and b vary. 11.5 For those who relish a challenge! Let M be a Z[i]-module with generators mi, m 2 , rnz and relations (1 + i)m 2 = 6m! + 4m 2 + 2(1 + i)m3 = 6mi + (5 - i)m 2 + 2im 3 = 0.

s s

Find the invariant factor form of the presentation matrix of M. Hint: note that 1 + i, 3 and 2 — 3i are irreducible in Z[i] and that 5 - i = (l+i)(2-3i). 11.6 Prove the assertions made in section 11.5. Now take R = F[X, Y] to be the polynomial ring in two variX Y ables over a field F, and let T = I v I. Compute Fiti(r) and Y Fit 2 (r), and conclude that T has no invariant factor form (see Exercise 1.6). 11.7 Let R = F[Xi, X 2 ,...] be a "polynomial ring" in an infinite set X i , X 2 , . . . of variables over a field F, but with the variables sub ject to the relations Xi = Xf+1 for all i. Show that any finite set of members of R belongs to a polynomial ring -F^fc] for some k, and deduce that any finitely generated ideal of R is principal. Show also that the ideal generated by all the variables is not finitely generated (and hence not principal).

Chapter 12

T h e Decomposition of Modules Now that we have the invariant factor form of a matrix at our disposal, we can exploit it to obtain a very nice presentation for a module M over a Euclidean domain. This presentation leads to a decomposition of the module as a direct sum of cyclic modules, which in turn allows us to find the torsion submodule T{M) and the p-primary components TP(M) of M, and to show that the quotient module M/T(M) is in fact a free module. We also show that the descriptions which we obtain are unique, subject to some conditions on the way in which we arrange the cyclic summands of M. As applications of our results, we obtain the structure theory of finitely generated abelian groups and some properties of lattices in R n . Throughout this chapter, R is a Euclidean domain and all .R-modules are taken to be finitely generated. We start with a description of the submodules of a free module.

12.1

Submodules of free modules

Let K be an .R-submodule of the standard free i?-module Rl of rank t. By Theorem 9.4.1, K is also free, of rank r with r < t, but the proof of that theorem did not reveal a basis for K. We now exploit our calculations with matrices to find a basis of K from a given finite set of generators p\,..., ps oiK. 171

Chapter

172

12.

The Decomposition

of

Modules

Write Pi = 7 n e i +

1" ltiet (12.1)

Ps = 7i*ei + ■ ■ ■ + 7t»Ct as in Eq. (9.1), where { e j , . . . , e t } is the s t a n d a r d basis of Rl, and p u t / 7n

••

7ia

:

•..

:

\ 7u

■•■

Its

T=

\ )

As shown in section 9.6, T can be viewed as the matrix of a homomorphism 7 : Rs -» Rl, where 7 is defined by 7(2) = Ta; for each column vector x € Rs. Denote the s t a n d a r d basis of R* by E and the s t a n d a r d basis of Rs by B = { & i , . . . , bs}. Then l{b\)

= p i , . . . , 7 ( 6 s ) = ps

and T = ( 7 ) E , B (section 5.9). Furthermore, the image Im(7) of 7 is the module K. Suppose now that there is a t x s matrix A with A = PTQ for invertible matrices P and Q, of sizes t x t and s x s respectively. Exploiting the calculations in Chapter 5, we can find new bases E' of R* and B' of Rs so t h a t A = ("()E',B'- T O accomplish this, we note t h a t the relation 5.13 of section 5.11 tells us t h a t A = {J)E>,B>

= PE',E

■ {l)c,B

■ PB,B>

provided we choose the new bases so t h a t PE',E = P and PB,B' = QUsing the formula given in Eq. (5.2), we see t h a t the basis B' is given explicitly in terms of B and t h e matrix Q by the equations b'x = qnh b'2 = quh

+ 921^2 H + <722&2 H

h qsibs h qS2bs

(12.2)

K =
* ^Wti^m^Mitenaf

P"e'f

12.1. Submodules of free modules

173

To find E' in terms of E, we must calculate the inverse P . P - 1 = P E , E ' by the results of section 5.6, and

l

= (p^A Then

e

l = P l l e l +P21 e 2H \-Ptlet, e e 4 = P l 2 l + P~22 2 + ••• +Pt2 e *>

(12-4)

: e

t = Puei +P~2te2 H

+P«et.

We can now obtain a very useful description of a submodule of a free module. 12.1.1 Theorem Let K be a submodule of the standard free R-module i?*. Then K is free, and we can find a basis E' = { e ^ , . . . , e't} of Rl and elements 5\,..., Sr of R so that 8\e\,... ,8Te'r is a basis of K, with 5\\ ■ ■ ■ \ 5T. Proof We keep the above notation. By the Diagonalization Theorem 10.5.1, there are invertible matrices P and Q with PTQ=

A = diag(<Si,...A,0 ) ...,0) >

a matrix in invariant factor form, and A is the matrix of 7 with respect to bases E' of R* and B' of Rs. Since K is the image Im(7) of 7, we see that 1(b'1)=S1e[,...,1(b'r)=5re'r

is a set of generators for K. But this set must also be linearly independent, since the set e\,..., e'r is already linearly independent. □ A computation. The techniques given in Chapter 10 enable us to carry out numerical calculations. For example, suppose that K is the submodule of 1? with three generators pi = 12ei + 3e 2 + 12e3, ] p2 = 9ei + 2e 2 + 8e 3 , > pz = 12ei + 4e 2 + 22e 3 . J The corresponding matrix is / 12 T= 3

9 2

12 \ 4

Copyr'fahlSd l&latffiall

(12.5)

Chapter 12. The Decomposition of Modules

174

and the calculations in section 10.7 show that T has invariant factor form 1 0 0 PTQ = A = | o 3 o 0 0 6 with P-

0 1 0

1 0 -3 0 -4 1

1 -1 0

and Q =

2

-4

3

4 1

0

We find that 3 1 0 \ 1 0 0

>-i

V4 o Thus the basis

i;

= 3ei +e2 + 4e 3 (12.6)

= ei = e2

of 1? gives the basis e

i> 3e 2 , 6e 3

of K.

12.2

Invariant factor presentations

Let us see how the preceding discussion leads to a nice presentation of a module. Let M be an .R-module which is defined by generators m i , . . . ,mt and relations 0 = 7 n m i + ■•■ + 7tim t , (12.7) 0 = 7is"ii H

+ 7tsmt,

as in section 9.3. The corresponding presentation is the surjective .R-module homomorphism 6 : R* —» M, given on the standard basis of i?' by 0(a) = m1,...,6(et)

+mt,

and the relation module for M is the kernel Ker(#), which has generators Pi = 7 n e i H

+7tiet, (12.8)

P^ bopfaffited

e

±3te tMaterial

12.2. Invariant factor presentations

175

Thus the presentation matrix of M is the t x s matrix T over R. By Theorem 12.1.1, there is a basis E' for Rl so that Ker(#) has basis Pi = he-'i,- ■■,p'r =

5

re'r.

Put Since 6 is surjective, the set {m\,..., m't} is a set of generators of M with the particularly convenient relations Sim'i = 0 , . . . , STm'r = 0, where Si \ ■ ■ ■ \ Sr. Such a presentation of M will be called an invariant factor presentation of M, although the less accurate term diagonal presentation is sometimes used. (Strictly speaking, a diagonal presentation is one in which the Si's need not satisfy any divisibility condition.) If r < t, then some of the new generators are not constrained by any re lations, a phenomenon which will be interpreted soon - see (iii) of Theorem 12.3.1 below. As we have remarked, the method given in the Diagonalization Theorem 10.5.1 is a computational technique which enables us to find an explicit invariant factor presentation for M. However, if we are not seeking an explicit set of generators of M, then it may be more convenient to calculate the invariant factors through the Fitting ideals of T. The computation again. Let M be the Z-module which has generators 7711,7722,7713 and relations 12mi + 3 m 2 + 12m 3 = 0, ] 9mi + 2 m 2 + 8 m 3 = 0, i

(12.9)

12mi + 4m2 + 22m3 = 0. J The calculations in the preceding section show that we can find new gen erators for M, given by m'j = 3mi + m 2 + 47713, \ m 2 = mi, > m3 = m2, J

(12.10)

so that the relations have become 7771=0, 3m 2 = 0 and 6m 3 = 0.

(12.11)

Next, we see how the structure of a module is determined by its invariant factor presentation.

Chapter 12. The Decomposition of Modules

176

12.3

The invariant factor decomposition

Suppose now that the .R-module M is given by an invariant factor presen tation with generators { m i , . . . , mt} and relations Simi = 0 , . . . , Srmr = 0. It may happen that the presentation starts with some terms whose co efficients Si, i = 1 , . . . , h, are units in R, since a presentation matrix for M may well have unit invariant factors. As a unit can always be replaced by the identity element of R, the presentation begins m i = • • ■ = nth = 0.

But these generators obviously contribute nothing to the module M, so we can omit them and renumber the generators and relations so that each coefficient Si is a non-unit in R. We state the main result formally as a theorem. 12.3.1 Theorem Let R be a Euclidean domain and let M be a finitely generated R-module. Then the following statements are true. (i) M has a set of generators m\,... ,mt so that M is the internal direct sum M = Rmi © ■ • ■ © Rmr © Rmr+i © • • • © Rmt with Run & R/RSi

for i = l , . . . , r

and Rmj = R for j = r+

l,...,t,

where Si,..., Sr are non-units in R with Si \ • • ■ \ Sr. (ii) The torsion submodule of M is T(M) = Rmi © • • • © Rmr, which has annihilator Ann(T(M)) = RSr. (Hi) Put F(M) = Rmr+1 © • • • © Rmt. Then F(M) S Rw is free, and M = T(M) © F{M)

with F{M) S

(iv) The integer u) is uniquely determined by

Copyrighted Materiar

M/T{M).

12.3. The invariant factor

decomposition

177

Proof (i) By the previous remarks, we can assume that M is given by an invariant factor presentation with generators { m i , . . . , m t } and relations Simi = 0 , . . . , 5rmr = 0, and that the coefficients satisfy the conditions stated above. Thus there is a presentation 0 : Rl —> M of M and a basis {b\,... ,bt} of -R* with 6{bi) = rrii for each i and Ker(0) = RSih © • • • © RSrbT. Define a map

x • • • x R/R5T x Rw,

w =

t-r

by {o-\b\ + ■ ■ ■ + arbr + a r + i 6 r + i -I

1- atbt) = ( a i , . . . ,ar,ar+i,...

,at)

where a\,..., at are in R and 5^ is the image of ai in R/RSi for i = 1 , . . . , r. Then 0 is a surjective i?-module homomorphism with Ker() = Ker(0). By the First Isomomorphism Theorem 6.3.2,

from M = Rt/Kex(9) to iV, and it is clear that the restriction of to the component Rrrii gives an isomorphism Rrrii — R/RSi if 1 < i < r or Rrrii = R if r < i < t. (ii) By Proposition 8.2.1, a finitely generated i?-module P is torsion provided that aP = 0 for some nonzero element a of R. If an element y = a i m i + ■ • ■ + armr + ar+imr+\

+ ■ ■ ■ + atmt G M

has ay = 0 for some a ^ O , we must have ar+i

= ■ ■ ■ = at = 0

and aai £ RSi for i = 1,...

,r.

Thus T(M) C Ami ffi - • •ffii?m r . However, the divisibility conditions on 5\,..., Sr show that ST{Rmx © • • • © RmT) = 0, which gives the result. (iii) This is now clear. (iv) The integer w is the rank of the free module M/T(M), 5 5 2 independent of any choic^gfjp^tecfWeS^T6111 - - -

which is n

Chapter 12. The Decomposition of Modules

178

The elements 5\,..., Sr are called the invariant factors of M and the integer w is called the rank of M. The direct sum decomposition of M into cyclic summands is known as the invariant factor decomposition of M. The uniqueness of the invariant factors of M will be established in Corollary 12.8.2. Notice that the uniqueness theorem for the invariant factors of a matrix does not lead directly to the corresponding result for modules, the point being that a given module can have many unrelated presentation matrices. The invariant factor decomposition may also be less precisely referred to as the cyclic decomposition, although it contains more information than simply telling us that there is a cyclic decomposition of a module.

12.4

Some illustrations

Here are some computations to illustrate the theory. (i) Suppose that M = R/Rp2q x R/Rpq2 where p, q are distinct irreducible elements of R. The presentation matrix of M is T = dia.g(p2 q,pq2), which, although diagonal, is not in invariant factor form. The Fitting ideals of T are easily seen to be Rpq and Rp3q3, so the invariant factor form of T is A = dmg(pq,p2q2) and M =* R/Rpq x

R/Rp2q2.

I 12 9 12 (ii) Let M be the Z-module with presentation matrix I 3 2 4 ). By \ 12 8 22 section 12.2, M has the invariant factor presentation 777,

0, 3m'2 = 0 and 67713 = 0.

Omitting the trivial term with coefficient 1, we see that r = 2, w = o and that M = Z3 x Z6. (iii) Next, we consider the integer matrix

r=

3 10 \ -20

1

2 5 -10

1 -5 10

°\

15 -30 /

that we introduced in section 11.1. The Fitting ideals of this matrix are Fiti(r) = z ) Fit 2 (r) = 5Z V

(12.12)

12.5. The primary

decomposition

179

and the invariant factor form of T is A = diag(l,5,0). Thus the module with presentation matrix T is isomorphic to Z 5 x Z, so we have r = 1 and w = 1. The next result, although also an illustration, is important enough to be recorded separately. 12.4.1 Proposition Let F be a field, let C be the rational canonical block matrix associated to the polynomial f = f0 + f1X H h / n _ i X n _ 1 + Xn, and let M be the module over the polynomial ring F[X] which is defined by the action of C on Fn. Then M = F[X]/F[X]f(X). Proof The presentation matrix of M is T = XI — C with

(°

0 • • 0 0 0 • • 0 0 1 0 1 ■ • 0 0

C

0

0 •

■

0

• 0

■

1

0 1

-/o

\

-/l

-h — fn-2 -fn-1 /

As in section 11.4, the invariant factors of T are S1 = ■■■ = 5n-i = 1 and 5n = / , which gives the result.

12.5

□

The primary decomposition

In Theorem 8.4.2, we proved that a finitely generated torsion module over a Euclidean domain has a primary decomposition, that is, it can be ex pressed as a direct sum of its p-primary components for certain irreducible elements p. We also found the structure of these primary components when the module is cyclic (secfJQfti^/^^e^j^H^^^put these results together to

Chapter 12. The Decomposition of Modules

180

obtain the structure of the primary components of a general torsion mod ule, which we can do since the invariant factor decomposition expresses a torsion module as a direct sum of cyclic modules. Let M be a finitely generated module over a Euclidean domain R. Recall from section 8.4 that, for an irreducible element p of R, the p-primary component of M is the submodule Tp(M) of M consisting of those elements m of M which have pkm = 0 for some k > 0. Evidently, the primary components of M are the same as those of its torsion submodule T(M), so we may assume that M is a torsion module. First, we review the results for the case that M = R/R5 is a cyclic module. Let <5 = up™ ■■■Pk be a standard factorization of S. The discussion in section 8.5 shows that R/R5 has the primary decomposition R/RS = TPl (R/RS) © • - - © TPk (R/RS) with TPi (R/RS) « R/Rpfj)

for j = 1 , . . . , k.

Now we return to the general case. Suppose that M = R/RSi © • ■ ■ © R/R5r is the invariant factor decomposition of M, and let 5T = u r p j

---Pk

be a standard factorization of 5r. Then for i < r we can write r n(i,l) n(i,k) Si =Uip^ ■■■Pk ■

These factorizations need no longer be "standard" since some exponents n(i,j) may be 0, but, for fixed j , the exponents of the irreducible element Pj form a nondecreasing sequence 0
x ■ ■ • x R/Rpfr'j)

for j = 1 , . . . , Jfc.

(12.13)

Notice that some of the summands may be zero modules, corresponding to the possibility that n(i,j) = 0. The collection of nontrivial powers {p"{hj)

I i =l.....r1L.Jj=ila^.^k,

n(i,j)

^ 0}

12.6. The illustrations, again

181

that occur in the primary decomposition of M is called the set of elementary divisors of M. We also note that we can find an explicit set of generators 31,1, •■ ■ ,gr,k of the torsion module M which gives the primary decomposition of M as an internal direct sum; that is, M = Rg1%x © • ■ • © Rgr:k

with

RgtJ^R/Rpf-j)

for all i,j.

Suppose that we have already found a set of generators m\,..., give the invariant factor decomposition of M: M = Rmi © • • • © RmT with Rrrn S! R/RSi

mr which

for all i.

For each irreducible factor pj of Sr, let €i{pj) be the complement of pj in Si'. e.(n.\

- „"(«.!) ri(ij-l) ^UO/ — Pi ■■■Pj_l

n(i,j+l) Pj+1

n(t,fc) •••?*:

(If Pj does not genuinely occur in Si, take eiipj) = Si.) Then the results of section 8.5 show that the required generators, some of which may be zero, are given by 9%,j = £i(Pj)m» for i = 1 , . . . ,r, j = 1 , . . . , k.

12.6

T h e illustrations, again

Here are the elementary divisors of the modules whose invariant factor forms were computed in section 12.4 above. (i) Let M = R/Rp2q x R/Rpq2 where p, q are distinct irreducible elements of R as in (i) of 12.4. Then M = R/Rpq x R/Rp2q2 and so the primary components of M are TP(M) « R/Rp x R/Rp2

and Tq(M) S fl/ity x R/Rq2.

The list of elementary divisors of M is {p,p 2 , q,q2}(ii) The Z-module of 12.4(ii) is M = Z 3 x Z 6 . The primary decomposition of M is therefore M = T2(M)®T3(M) with

T2(mpyfight&tflM&yM®l= Z 3 x z 3 .

Chapter

182

12.

The Decomposition

of

Modules

T h e set of elementary divisors of M is { 2 , 3 , 3 } . We also showed t h a t M has generators m i , m,2 with relations 3 m i = 6m2 = 0, where m i , m 2 occur in t h e original generating set for M. To find generators of M adapted to its primary decomposition as in section 12.5 above, we take p\ = 2, P2 = 3, and write (5i = 3 = 2°3 1 and 52 = 6 = 2 1 3 1 , so t h a t n ( l , l ) = 0, n ( l , 2 ) = l, n ( 2 , l ) = l , n ( 2 , 2 ) = l . T h e generators are then £?i,i =

3m

i > 9i,2 = " i i , 92,1 = 3 m 2 , 92,2 = 2 m 2

with relations 9i,i = 0, 301,3 = 0, 202,1 = 0, 3
12.7

Reconstructing the invariant factors

T h e invariant factors of a module can be reassembled from its elementary divisors. We give the general argument first, and then an example. Suppose t h a t pi,... ,Pk are distinct irreducible elements of R and t h a t we are told that the elementary divisors of M are 2(1.1) Pi

*(l,y(l)) . ,■■■,Pl

. „z(fe,l)

>■■■1 Pk

z(k,y(k)) >■••>Pk

'

where the exponents are all nonzero and are listed in nondecreasing order for each pj, t h a t is, z(j,l)<---
z kk nh^lMriM^^) <( >!) >*) l±^Htihlamtt&S= z=(k>y( ^v())))

12.8. The uniqueness results

183

and °r — Pi

• ■ • Pk

Next, write n(r - 1,1) = 2(1,2/(1) - l)...n(r-

l,fc) = -z(fc,y(fc) - 1)

and Or-1 — Pi

Pfc

,

where the exponent z(j, y(j) - 1) is to be interpreted as 0 if it happens that y(j) = 1 for any j . Continuing in this way, or, more properly, arguing by induction on r. we obtain the set <5i,...,<5r of invariant factors of M. An example. Here is a concrete illustration of the reconstruction argu ment. Suppose that the Z-module M has elementary divisors 2,2 2 , 2 4 , 2 6 ; 5 2 ; 7, 7, 7 3 ; 11, l l 2 , l l 4 . In this case, r = 4, and collecting the highest powers gives 5i = 2 6 - 5 2 - 7 2 - l l 4 . We then find <53 = 2 4 ■ 7- l l 2 , 62 = 2 2 - 7 - l l , <5i=2.

12.8

The uniqueness results

We next turn to the question of the uniqueness of the invariant factors and elementary divisors of a module. We first consider the case of a p-primary module, that is, M = TP(M) for some irreducible element p of R. 12.8.1 Theorem Let M be a p-primary R-module. Then the following hold. M ^ R/Rpn{1)

x • ■ •x

R/Rpn^

where the nondecreasing sequence of positive integers n(l) < ... < n{y) is uniquely deterrr@KB$/liglitkd Material

Chapter 12. The Decomposition of Modules

184

(ii) The set {pn^\ ■ ■ ■ ,pn^} is both the set of elementary divisors of M and the set of invariant factors of M, and it is uniquely determined by M. Proof Since M is annihilated by some power of p, the primary decompo sition of M cannot have any terms of the form R/Rq* if q ^ p. It follows that the primary decomposition of M must be as stated, and that this must be the same as the invariant factor form of M. Clearly, the unique ness assertion in part (ii) will follow immediately from the uniqueness of the exponents n ( l ) , . . . ,n(y), which we establish by induction on the exponent n(y). Notice that p n ( y ) is the smallest power of p such that pn^M = 0. First, suppose that n(y) = 1, in which case we must have n(j) = 1 for j = 1 , , , . , y and pR = 0. Thus M can be regarded as a vector space over the field R/Rp - see Theorem 2.11.1 and Exercise 3.8 - and y is simply the dimension of M over R/Rp, which we know to be unique by elementary linear algebra. Now suppose that n(y) > 1, and define z to be the number of terms which are isomorphic to R/Rp; that is, z is the largest integer so that n

( l ) = ■■■ = n(z)

= 1.

If all the exponents n(i) are greater than 1, we take z = 0, and, in any case, z 1, we have Rp/Rpn = R/Rp71-1 (Exercise 6.4), while p{R/Rp) = 0. Hence pM £=! R/Rpn{z+l)-x

R/Rp"^-1.

x • • •x

By our induction hypothesis, the sequence of integers n(z + l ) - l , . . . , n ( y ) - l is uniquely determined by pM, which shows that the sequence n(z + l ) , . . . , n(y) is uniquely determined by M; in particular, the number y — z of terms in this sequence is unique. Next, consider the quotient module M/pM. If n > 1, then (R/Rpn)/p(R/Rpn)

a

R/Rp,

and so M/pM is a vector space over R/Rp, of unique dimension y. Thus R Rp is the number of cyclic com&^}§^^i\/j£t&fafOTm /

12.9. A summary

185

z = y-

(y-

z),

which is again unique as it is the difference of two numbers themselves uniquely determined by M. □ Since the invariant factors of a module can be reconstructed from its elementary divisors as in section 12.7, the next corollary needs no further proof. 12.8.2 Corollary Let M be an R-module. Then the invariant factors of M are uniquely determined by M, except that each of them can be multiplied by a unit of

R.

12.9

'

'

"

□

A summary

As our description of the structure of a module has emerged piecemeal, here is a brief summary of the main results. Let M be a finitely generated module over a Euclidean domain. We start by decomposing M as a direct sum T(M)®F{M), where T(M) is the torsion submodule of M and F{M) is a free complement, as in Theorem 12.3.1. The submodule T(M) is absolutely unique, as we can see from its definition as a subset of M. Exercise 12.1 shows that the free complement F(M) is not usually unique as a subset of M, but it does have a unique rank. The next step is to express T(M) in invariant factor form T(M) £* R/RSi x ■ • ■ x RSr; as we have just seen, the invariant factors are unique apart from changes of the form 5't = UiSi for units u, of R. The corresponding internal direct sum decomposition of T(M) is usually far from unique, as can be seen by considering all possible bases of the vector space R/Rp x R/Rp - any such basis gives an internal direct sum decomposition of R/Rp x R/Rp as an .R-module. The torsion submodule can further be decomposed into p-primary com ponents TP(M) for irreducible elements p of R. These components are absolutely unique since they are specific subsets of M. There is a unique finite set pi,...,pk of irreducibles for which TPi(M) ^ 0, namely those occurring as factors of the annihilator Ann(T(M)) of T(M). Copyrighted Material

186

Chapter 12. The Decomposition of Modules

Finally, each TP(M) has a decomposition R/Rpn^ x •■■ x R/Rpn^\ n n(y where the set of invariant factors {p ^\.. . ,p ^} of Tp(M) is unique pro vided that we take the exponents in non-decreasing order. Again, the corre sponding internal direct decomposition of TP{M) is not absolutely unique. The set of elementary divisors of M is the set of powers {p"' 1 ', ■ ■ • ,pn<-y'} where p ranges over all the irreducible factors of the annihilator of T{M); this set is again unique apart from the order in which it is listed.

12.10

Abelian groups

As we saw in section 3.2, a Z-module is an additive abelian group under another name. This observation means that our results on modules can be interpreted in the language of group theory. Here is a brief sketch of this interpretation. A group theorist will usually prefer to write a group multiplicatively, as in section 1.1. A cyclic group C will then appear as the set of powers {xi\i£Z}

C =

of a generator x instead of the set of multiples that is expected when we use additive notation. There is an infinite cyclic group Coo in which all the powers xl are distinct. The finite cyclic group of order n is Cn = { l , ! , . . . , ! 7 1 - 1 } with i n = l , i V l

for 1 < i < n.

To see that these groups are really Z and Z n in disguise, note that there are bijective maps Qoo

:

^ —t ^ o o j

given by aoo(i) = x\ and given by

an(i) = x\

12.11.

Lattices

187

which are isomorphisms of groups since the equations aoo(i+ j) = aoo(i) + aoo(j) and

an(i+j)

= an(i) + an(j)

hold for all i,j and n. Given a prime p of the ring of integers Z, the p-primary component of a finitely generated multiplicative abelian group A takes the form TP(A) = Cp„
12.11

Lattices

Our results on additive groups have a geometric interpretation when we consider finitely generated additive groups that are subgroups of a real vector space R™. Such an additive group L is called a lattice in R n . It is clear that L must be torsion-free as a Z-module. Thus, by Theorem 12.3.1, L is isomorphic to Z7" for some integer r, and hence L has a basis { a i , . . . , ar}. We say that L is a full lattice in R" if r = n, in which case { a i , . . . , an} is also a basis of R n . The reason for the use of the term lattice can be seen from the following diagram in R 2 :

g

Chapter 12. The Decomposition of Modules

188 •

o

o

o

c)

O

•

o

0

0

0

0

o

o

o

o

1»

o

o

o

o

0

•

0

o

0

•

o

<)

O

0

o

•

0

0

0

•

o

0

o

()

O

•

0

o

0

0

0

o

0

o

o

(»

o

o

0

0

0

•

0

o

o

6i^

o

c)

O

o

o

•

0

0

0

•

o

0

c)

o

o

0

0

0

0

G—

©—

©

w

Q

—0

0

o

o

•

o

c)

O

o

o

^•62

0

0

0

•

o

o

o

c)

O

•

0

0

0

0

0

o

o

o

o

<»

o

o

o

0

0

•

0

o

o

•

o

()

O

o

o

•

0

0

0

»-*

0

h-i^™

V

Q

Q

6ei

Here, we take L to be the lattice given by the standard basis {ei, 62}, and M is a lattice with basis {61, 62}, where 61 = I

I and 62 = I

The points belonging to M are indicated by solid circles • and those in L by open circles o, except where they are hidden under a point of M. A basis { a i , . . . , a n } of a lattice L defines a fundamental parallelepiped II(L) whose vertices are the origin, the vectors a%,... ,an, and all the sums it! + ■ •• + Otfei with i\ < • •■ < ik and 1 < k < n. We then associate with L the volume vol(L) = I det(ai . . . a n ) | of II(L), which is (by definition) the absolute value of the determinant of the matrix whose columns are the (column) vectors a\,... ,an. For n = 2, we prefer to speak of parallelograms and area. In our illus tration, a fundamental parallelogram II(L) of L is indicated by dotted lines and a fundamental parallelogram II(M) of M by solid lines. It is easy to see that L has area vol(L) = 1, while vol(M) = 6. Our main result in thi$^§^$gyj^p j^p^f^he volume of a lattice relates

12.11.

Lattices

189

to the volume of a sublattice. We write \G\ for the order of a finite group G. 12.11.1 Theorem Suppose that L and M are both full lattices in the real space W1, and that M is contained in L. Then the following statements hold. (i) vol(L) does not depend on the choice of a basis for the lattice L. (ii) The quotient group L/M is finite, and vol(M) = vol(L)-|L/M|. Proof (i) Let { a i , . . . ,an} and {6i,... ,bn} be bases of L as a Z-module, write b3 = aipij H

h anpin

for j - 1 , . . . , n,

and put P = (pij), an n x n matrix with entries in Z. Then P is a change of basis matrix, as in section 5.6, and so it is invertible as an integer matrix (Eq. (5.7)). By Theorem 5.12.1, we see that det(P) = ± 1 . Now let A = (ai ... an) and B = {b\ . . . bn) be the matrices whose columns are the vectors in the respective bases. A careful check shows that B = AP, which explains why our scalars have suddenly appeared on the right, and also why we had an unexpected transposition of suffices in our original definition of the change of basis matrix. Thus det(B) = ±det(A), which proves the assertion. (ii) By Theorem 12.1.1, there are bases of L and M of the forms {bx,..., bn} and {S\bi,..., 5rbr} respectively, where S\,... ,5r are the invariant factors of L/M. But a basis of M is also a basis of R n , so we must have r = n. By Theorem 12.3.1, we have L/M^Z6l

x.-xZ,n,

so that L/M is finite, with order Co$ffi8t\tedWlatwfal

Chapter 12. The Decomposition of Modules

190 Clearly, vol(M)

= = =

|det(5i&i . . . § <$i...(5n|det(&i . . . bn)\ \L/M\vo\(L). D

Thus, in the example above, we have L/M = 1,6. It is easy to see that the basis {62, ^1} is a basis of L and that {62,6ei} is a basis of M.

12.12

Further developments

Our arguments in this chapter depend on the fact that the presentation matrix of a module can be put into invariant factor form. Thus the results of this chapter can be extended to modules over principal ideal domains. They also hold for noncommutative Euclidean and principal ideal domains, but with less satisfactory versions of the uniqueness results. Details can be found in Chapter 8 of [Cohn: FRTR] and in [G, L & O]. Results on modules over some rings that are "nearly" Euclidean can be found in [A & L], and an investigation of presentations of modules over rings which are not principal ideal domains is given in [G & L].

Exercises 12.1 Let M = Z a x Z with a / 0 . Write mx = (T, 0) and m 2 = (0,1). Show that M = Zmi © Zm 2 and that Z a S TLrax and Z £ Zm 2 . Find all elements x = x\m\ + X2W2 € M so that Z = Zx. For which of these x is there an element y G M so that M = Zx © Zyl Hint: consider intersections first. 12.2 Describe the invariant factor and elementary divisor forms of the Zmodules with the following presentation matrices. 4 10 \ 2 4 11 (a) 2 \ 3 7 0/ 2 '\ (b) (I I 32 12 ') \ (c) I 33 6 I.■ (See (s Exercises 11.2 and 11.3.) \2 a /

f

Exercises

191

/ 1 0 a \ 12.3 Let A =

0 1 a I be a matrix over a field F, and let M be the \ 0 0 b ) F[X]-module defined by the action of A on F 3 . Discuss the possibilities for the invariant factor and elementary divisor forms of M as a and b vary. (See Exercise 11.4.) 12.4 The challange again! Using Exercise 11.5, find the invariant fac tor and elementary divisor forms of the Z[i]-module with generators mi,m2,m3 and relations (1 + i)m,2 = 6mi + 4m2 + 2(1 + 1)7713 = 6m 1 + (5 — i)m,2 + 2imz = 0. (Note that 1 + i and 3 are irreducible.) 12.5 Let L be a lattice in W1, and let b\,..., bn be any set of members of L. Show that 61, ...,&„ is a basis of L as a Z-module if and only if vol(L) = I det(6i . . . bn)\. 12.6 Let R be a Euclidean domain and let ip : R —> Z be the function of section 2.1. We define (very unofficially!) the order \M\ of a finitely generated torsion i?-module M by first setting \R/R5\ — tp(6) for a cyclic module, and then using the invariant factor decomposition (Theorem 12.3.1) to extend the definition to general modules. Verify that this definition coincides with the usual one when R is the ring of integers Z. Show that Lagrange's Theorem still holds, that is if N is a submodule of M, then \N\ divides \M\, with quotient \M/N\. Let K be the field of fractions of R. Extend ip to a function

Q (hint: Axiom ED 2). Define an iMattice L in Kn and a "volume" vol(L), and show that, after obvious modifications, Theorem 12.11.1 is still true.

Chapter 13

Normal Forms for Matrices Our aim in this chapter is to describe some normal forms for a square matrix A over a field F. Before we can begin, we must describe what we are seeking. A normal form for A is a matrix C whose entries conform to some standard pattern and which is similar to A, that is, C = PAP~l for an invertible matrix P over F. The first normal form that we find is the rational canonical form. This form can be computed from the invariant factors of the characteristic matrix XI — A, and so it can be found by purely algebraic calculations, by which we mean computations that involve only addition, multiplication and long division of polynomials. The calculation of rational canonical forms enables us to solve the similarity problem, that is, we can determine precisely when two matrices are similar. The second form is the Jordan normal form. This form is more elegant when it can be obtained, but it can be found only when all the irreducible factors of the characteristic polynomial det(XI — ^4) are linear polynomials X — A. Thus, if we wish to ensure every matrix over F has a Jordan normal form, we must impose the condition that every irreducible polynomial over F is a linear polynomial, or, in other words, that F is algebraically closed. Even when this requirement is satisfied, there is in general no purely alge braic method for determining the Jordan normal form, since the roots of a polynomial cannot be determined algebraically. We also discuss the versions of the Jordan normal form that can be found when the irreducible factors of the characteristic polynomial need not be linear. 193

194

Chapter 13. Normal Forms for Matrices

The Jordan normal form is useful for solving polynomial equations in matrices, which we illustrate in a couple of examples, and we hint at how such calculations occur in the representation theory of groups. The margins of this chapter are liberally scattered with the "supple mentary" material indicator. In the original lecture course, I was able to treat only the rational canonical form and the Jordan normal form over the complex numbers. The extensions of the Jordan normal form and the applications are supplementary material.

13.1

F[X]-modules and similarity

Our results on normal forms are derived from the structure theory of mod ules over the polynomial ring F[X], using the correspondence between F[A"]-modules and matrix actions that we first discussed in section 4.4. Let A be an n x n matrix over the field F and let M(A) be the space Fn made into an F[X]-module with X acting as A, in the usual manner. Then there are two types of change we can make to M(A) that result in the replacement of A by a similar matrix, that is, a matrix of the form A' = PAP-1. (Following the practice in group theory, we will sometimes say that A' is a conjugate of A.) The first type of change corresponds to isomorphism between F[X]modules. If we are given that A' = PAP-1, then the linear map 7r : x —> Px on Fn is an F[X]-module homomorphism from M(A) to M(A') (section 4.6) and moreover it is an isomorphism since P is invertible. Conversely, if we are given an F[X]-module isomorphism TX from M(A) to M(A'), then TV must be given by an invertible matrix P and A' and A will be similar through P. The second type of change is to change the basis of Fn. Let B be the standard basis of Fn, let B' be another basis for Fn, and write A' for the matrix of the linear transformation v —> Xv with respect to B'. Then A' =

PAP'1

where P = PB\B is the change of basis matrix (Eq. (5.13) of section 5.11). Conversely, if we are given that A' is similar to A through P , then we choose B' so that P = PB',B- Thus A' is again the matrix of the action of X with respect to the basis B'. Of course, these two types of change are really two interpretations of one phenomenon. If we are given a basis B = {b\,..., bn} of Fn and an isomorphism 7r : Fn -)■ Fn, then B' = {7r(6i),... ,7r(6„)} will be another basis of Fn. Vice versa, if B' = {b'j,..., b'n} is another basis, we can define an isomorphism TX by ■K<^@pyr^m^l4terial

13.2. The minimum

polynomial

195

These remarks suggest our strategy for finding a normal form for A: we analyse the structure of M(A) as an F[X]-module and hence choose a basis of Fn so that the action of X is represented by a matrix in some desirable form. We start by reviewing the results that we found in previous chapters. As we saw in section 9.7, the characteristic matrix

T=

XI-A,

is a presentation matrix for M(A). The n-th Fitting ideal of XI - A is generated by the characteristic polynomial det(XI — A) of A, which is a monic polynomial of degree n (section 11.4). The invariant factors of XI-A are polynomials Si(X),..., 6n(X) that satisfy the relations 61(X)...

5n(X) = det(XI

- A), S^X)

| • • • | Sr(X),

and, by Corollary 11.2.2, 51(X)...5h(X)=Fith(XI-A)

for

h=l,...,n.

Since the nonzero constants are units in F[X], we can take the invariant factors to be monic polynomials, and then, by Theorem 11.3.1, they are uniquely determined by the matrix XI—A and hence by the matrix A itself. We also know from Corollary 12.8.2 that the invariant factors are uniquely determined by the module M(A), that is, no alternative presentation of M(A) can give different invariant factors. Thus, by Theorem 12.3.1, M(A) has an invariant factor decomposition M(A) £ F[X]/F[X]S!{X)

x • • • x F[X]/F[X]5n(X)

(13.1)

as an F[X]-module. Note that the rank w of M{A) must be 0, that is, M(A) cannot have a nontrivial free component F[X]W. The simplest way to see this is to note that M{A) is finite dimensional as a vector space over F and F[X] is not. A more sophisticated approach is to recall that Ann(M(A)) ^ 0 since it contains the characteristic polynomial det(XI — A) of A (see Exercises 9.1 and 9.2). In general, some of the invariant factors of XI — A will be the constant polynomial 1. As these invariant factors give a zero component of M(A), we say that they are trivial invariants factors. We sometimes omit such terms from expressions such as that in Eq. (13.1) above.

13.2

The minimum polynomial

We now have a better understanding of the annihilator Ann(M(A)) MIA).

of

Chapter 13. Normal Forms for Matrices

196

13.2.1 T h e o r e m (i) Ann{M(A)) = F[X]6n(X). (ii) Sn(A) = 0. (Hi) If h{X) is any other polynomial with h{A) h(X).

0, then 5n(X)

divides

Proof The divisibility conditions S\ \ • ■ ■ \ 5n show that Sn(X) annihilates M(A). But Ann(F[X]/F[X]5n(X)) = F[X]Sn(X), which gives (i). Thus the matrix 8n{A) acts as the zero linear transformation on the underlying space Fn, and so must be the zero matrix. Finally, note that if h(A) = 0, then h(X) £ Ana(M(A)). D The polynomial 5n(X) is called the minimum (or sometimes minimal) polynomial of the matrix A, as it is the (unique) monic polynomial of small est degree that is satisfied by A. The Cayley-Hamilton Theorem (Exercise 9.2) shows that A also satisfies its characteristic polynomial det(XI — A). Examples. Here are some illustrations based on the calculations in sec tion 11.4. Let 0 • • 0 0 -/o \ (0 1 0 • ■ 0 0 -h 0 1 • • 0 0 -h C 0

V0

0 0

• • • •

1 0

0 1

— fn-2 — fn-1

)

be the companion matrix of the polynomial / = fo + fiX-\ Xn. Then, as in Proposition 12.4.1, S^X) = ■■■ = 5n-!(X)

= 1 and Sn{X) =

h/n-i^n_1 +

f(X),

so that the minimum polynomial of C is the same as the characteristic polynomial of C; the F[X]-module corresponding to C is F[X]/F[X]f(X). Let ^ 4 = 1

,

be a 2 x 2 matrix, which has characteristic polyno

mial f = X2-(a

+ d)X +

(ad-bc).

The calculations in section 11.4 show that there are two possibilities for M(A). (i) If any of the inequalities C6p$rl&)te<¥Matarial^ d

13.3. The rational canonical form

197

holds, then 8\(X) = 1 and 8^{X) = f, so that M(A) is isomorphic to F[X]/F[X]f(X) and the characteristic and mininum polynomials of A coincide, (ii) If b = c = 0 and a = d, then 6X(X) = X - a = S2(X), so that M(A) = N x N, where N = F[X]/F[X](X - a) is the field F regarded as an F ^ - m o d u l e with X acting as the scalar a. The minimum polynomial of A is X — a and the characteristic polynomial is (X — a)2.

13.3

The rational canonical form

We obtain our first canonical form for a matrix. Let A be an n x n matrix over a field F, and let M = M(A) be the F[X]-module defined by the action of A on Fn in the usual way. For any F[X]-submodule of M, the action of X on N defines the F-linear transformation cr(X) : n —> Xn. We sometimes refer to a(X) informally as "the linear transformation X". We write the invariant factor form of M as an internal direct sum (omit ting trivial components) M = Mi 0 • • • © M r with Mi*F[X]/F[X)Si(X)

for i =

l,...,r.

Choose a basis Bi of each summand M, as a vector space over F, and let Di be the matrix representing the action of X on M% for each i. Then the union B = B\ U . . . U Br is a basis of M, and, as in section 7.5, the matrix of the linear transformation X with respect to this basis is a block diagonal matrix / £>i 0 . . . 0 \ 0 D2 ... 0 dia.g(Di,...,Dr). D \

0

0

...

Dr /

Fix an index i. We exploit the fact that we have an isomorphism Bi-.Mit*

F[X]/F[X]Si{X)

to choose Bi to correspond to the canonical basis of F[X]/F[X]Si(X) we constructed in section 2.12: explicitly, Bi = {&i,i, ■ ■ •

where

,bi
0(&i,0tw>^ft?«f(M$$at x

»(i)-i

that

Chapter 13. Normal Forms for Matrices

198

and n(i) is the degree of St. Then the corresponding matrix A for X is the companion matrix C(6i) of Si. Put Sl

= *W

+ 5

«

X +

... + 5 «

X^"

1

+ Xn»,

so that .. .. ..

0 0 0 0 0 0

-4° \

0

.

1 0

°n(t)-2

0

..

0

°n(t)-l

(°

0 0 1 0 1

C(*i) 0

1

-4° /

We summarize these remarks, and a little more, as a theorem. 13.3.1 The Rational Canonical Form. Let A be an n x n matrix over a field F. matrix P such that PAP-1

Then there is an invertible

= C(A) = diag(C(<5i),..., C{5r))

is a block diagonal matrix over F, in which the diagonal terms are the companion matrices C(Si) of the nontrivial invariant factors Si of the char acteristic matrix XI — A of A, and <5il...|<5r. Furthermore, the nontrivial invariant factors of XI — C(A) are also Si,... ,Sr. Definition: the matrix C(A) is called the rational canonical form of A. Proof The points not covered by the preceding discussion are the existence of the invertible matrix P and the claim about the invariant factors. In the notation of section 5.11, A = (X)E,E is the matrix of the linear transformation X of Fn with respect to the standard basis E of F n , while C(A) = (X)B,B is the matrix of X with respect to B. Let P = PB,E be the corresponding change of basis matrix, which is invertible, with inverse P~l = PB,B- Then Eq. (5.13) gives C(A) = PAP'1 as required. Since A is similar to C(A), we see that XI — A is similar to XI — C(A), so both matrices have the^sa^JnyMianl.faGtprs by Theorem 11.3.1. □

13.3. The rational canonical form

199

13.3.2 Corollary The rational canonical form of A is unique. Proof Suppose that a matrix A has two rational canonical forms, say PAP-1

=C =

QAQ-1

= C = d i a g ( C ( ^ ) , . . . , C{5'r,))

diag{C(51),...,C{5r))

as above and also

with 5'x\...\5'r,. Since the matrices C and C" are similar, their characteristic matrices XI — C and XI — C" must have the same invariant factors, again using Theorem 11.3.1. We know that the nontrivial invariant factors of XI — C are the same as those of XI — A, namely 5\,..., Sr, the remainder all being the identity 1. However, a direct computation of Fitting ideals shows that the nontrivial invariant factors of XI — C must be 5[,..., 6'r,. It follows that r = r' and, since all these polynomials are monic, that 5, = 5[ for i = l,...,r. □ We can now determine when two nxn F are similar.

matrices A and A' over the field

13.3.3 Theorem Two nxn matrices A and A' over a field F are similar if and only if their characteristic matrices XI — A and XI — A' have the same invariant factors. Proof If A and A' are similar, so are their rational canonical forms. By Corol lary 13.3.2, these rational canonical forms must be the same, so that the invariant factors of the matrices XI — A and XI — A' must also be the same. Conversely, if XI — A and XI — A' have the same invariant factors, A and B are both similar to the same matrix in rational canonical form and so are themselves similar.

□

Chapter 13. Normal Forms for Matrices

200

13.4

The Jordan normal form: split case

The second canonical form for a matrix that we exhibit is the Jordan normal norm. First, we give it in its most familiar version, which arises when the characteristic polynomial of A splits into linear factors. The standard factorizations of the invariant factors of XI — A will then be products of powers (X — A)' for various scalars A and exponents t, and each such power will give rise to a component matrix of the Jordan normal form, in the following way. Let M = F[X]/F[X](X - A)' and regard X and X - A as linear trans formations of M, viewed as a vector space over F. Define a set of elements W = {u>\,..., Wt) in M by the equations w1 = 1, VJI = (X - X)w\,

=

u>3 = (X - A)w2 Wt

(X-A)V,

(13.2)

= (X-A)*"1^.

(X-X)wt-!

Then W must be a basis of M as an F-space (or otherwise (X — A)' would not be the minimum polynomial of the linear transformation X of M), and the action of X on this basis is given by Xwi = Xwi + u>2, Xu>2 = \w2 + U>3, (13.3) Xwt-i = Awt_x +wt, Xwt = Xwt. It follows that the matrix representing X with respect to the basis W is / A 0 0 1 A 0 0 1 A

0 0 0

0 0 \ 0

A 0 0 1 A 0 0 1 A )

0 0 0

0 \ 0 0

J(X,t)

(13.4) 0 0 0

0 0 0

Such a matrix is called an elementary Jordan matrix.

13.4.1 The Jordan No^t9^r^mm

Material

13.4. The Jordan normal form: split case

201

Let A be an n x n matrix over a field F. Suppose that the irreducible factors of the characteristic polynomial det(XI — A) of A are all linear, so that det(XI -A) = (X- Ai) 2(1) {X - Xs)z{s) for some scalars Xi, ...,XS in F and exponents z ( l ) , . . . ,z(s). Then there is an invertible matrix P such that PAP'1

= J(A) = diag( J(Ai, t(l, 1 ) ) , . . . , J{Xut(iJ)),...,

J(Xs,t(s,

r)))

is a block diagonal matrix over F, in which the diagonal matrices are ele mentary Jordan matrices. The size t(i,j) of the entry J{Xi,t{i,j)) is given by the exponent of X — Xi in the factorization SjiX) = {X-

Xi)t{l'j)

(X -

Xs)t{s-j)

of the j-th nontrivial invariant factor Sj(X) of XI — A. NB: It may happen that some of the exponents t(i,j) are zero when j < r. In this case, the term J(Xi,0) is to be interpreted as a phantom 0 x 0 submatrix of J. Definition. The matrix J(A) is called the Jordan normal form of A. Proof By the results in section 12.5, the primary decomposition of the F[X]-module corresponding to A is M = Mi,! © • • • © Mi,j © ■ • • © M s , r in which

AfijeiF[X]/F[X\{X->n^, the polynomials

{X-XiY^

for i = l,...,s,

j = l,...,r,

being the elementary divisors of M. We take an F-basis of each summand Mit]- so that X is represented by the elementary Jordan matrix J(Xi,t(i,j)) - if some t(i,j) = 0, the corresponding summand is 0, which has the empty set as its basis. The union of all these bases gives a basis of M and the action of X on this basis is represented by the matrix J as above. The matrix P is then the change of basis matrix from the standard basis of M to the new basis, asC3up>togf^te«6Mi/l'4ia6rem 13.3.1 above. Q

Chapter 13. Normal Forms for Matrices

202

13.4.2 Corollary Let A be annxn matrix over a field F and suppose that A has a Jordan normal form. Then the Jordan normal form of A is unique, apart from the order in which the elementary blocks are written. Proof It is clear that we can always permute the order of the diagonal blocks of the Jordan normal form, since this amounts to a renumbering of the roots of the characteristic polynomial of A. The argument to show that no other change is possible parallels that given for the rational canonical form in Corollary 13.3.2 above. If J = J(A) is a Jordan normal form of A as constructed in the theorem, then the nontrivial invariant factors and hence elementary divisors of XI — J must be the same as those of XI — A. However, direct calculation shows that the invariant factors correspond ing to an elementary Jordan matrix J(A, t) are 1 , . . . , 1, (X — A)*, which are also the elementary divisors of XI — J(\,t). Thus, if J ' = diagCJCA;, t'(l, 1 ) ) , . . . , J(Aj,t'(», j)),...,

J(\'s,,t'(s',r')))

is an alternative Jordan normal form for A, we find that the nontrivial elementary divisors of XI - A are (X - A'^)' W for i = l , . . . , s ' and j = l,...,r'. This forces the equalities s = s', r = r', and Xi = AJ for all i and t(hJ) = t'(i,j) for all i,j. □

13.5

A comparison of computations

The methods for computing the two normal forms of a matrix, the rational canonical form and the Jordan normal form, are rather different, in that the calculation of the rational canonical form is purely algebraic, while the calculation of the Jordan normal form is not. Our theory shows that to find the rational canonical form of a matrix A, we must compute the invariant factors of the characteristic matrix XI — A. We can do this by finding the Fitting ideals of XI — A and then using Lemma 11.1.1, or, alternatively, we can use elementary row and column operations as in Theorem 10.5.1. In either case, it will be arduous to perform the calculations by hand unless the matrix A is small, or has some special form, since we must work in the polynomial ring F[X], However, the calculations are purely algebraic in the sense that we need only use the operations of addition, multiplication and long division in the ring F[X]. They are also algorithmic in that a computer can be pro grammed to perform theTQopyrjghted Material

13.6. The Jordan normal form: nonsplit case

203

In contrast, it may happen that a matrix with entries in a given field F does not have a Jordan normal form over that field, since the irreducible factors in F[X] of the characteristic polynomial XI - A may not be linear. For example, the real matrix

H-ll) has characteristic polynomial X2 + 1, which is irreducible over the real numbers. This failing can be remedied in two ways. Given a polynomial f(X) over a field F, it is always possible to embed F in a splitting field E in which all the irreducible factors of f[X) are linear (Proposition 2.13.1). We can therefore construct a field over which A does have a Jordan normal form by adjoining roots of the characteristic polynomial of A to F. Thus if we adjoin i = y/—l to R, obtaining the complex numbers C, then the matrix A above has the Jordan normal form

o - "

The second, more drastic, method is to embed F in an algebraically closed field E. By definition, E has the property that the only irreducible polynomials over E are the linear polynomials X — A. This property can be restated as "every nonconstant polynomial over E has a root in E". It is a fact that any field can be embedded in an algebraically closed field - see §6.1 of [Cohn 2]. The field C of complex numbers is the most familiar example of an algebraically closed field, but some analytic tools are required to establish this fact. A proof is given in §7.4 of [Cohn 2]. Moreover, the evaluation in C of the roots of a polynomial over Q (for example) cannot be always carried out algebraically, as is evidenced by the existence of quintic polynomials that cannot be solved by radicals. Thus the Jordan normal form cannot be computed purely by algebraic computations in the polynomial ring F[X] save in special circumstances. Notice that the rational canonical form of a matrix A is computed through the invariant factors of the module M(A), but that the Jordan normal form requires instead the elementary divisors of M{A).

13.6

The Jordan normal form: nonsplit case

We next derive the version of the Jordan normal form which can be found O even when the characteristic polynomial of A does not split into linear factors over the coefficientfi&)0J*ft?/WfeCBMi#eHH/thisform the nonsplit JOT-

Chapter 13. Normal Forms for Matrices

204

dan normal form to distinguish it from the standard version of the Jordan normal form. As usual, let M(A) be the F[.X"]-module associated to a square ma trix A. The primary decomposition of M{A) as F[X]-module expresses M(A) as a direct sum of components which are (isomorphic to) cyclic mod ules F[X]/F[X]p(X)k, where p(X) varies through the irreducible factors

of XI

-A.

On general principles, A is similar to a block diagonal matrix J+{A) whose diagonal terms correspond to the action of X on the various cyclic components of M{A). We shall give a description of a typical diagonal term, which we designate J+(p, k). It will be convenient to use a double-suffix notation to describe the basis that we construct. Write p(X) = p0 + p\X -\ 1- Xh, so that deg(p) = h, and put u>i,i =T,wi, 2 = X w i , i , . . . , w u = X h _ 1 w;i ? i. Notice that if we reduce mod p, these elements map to the canonical F basis 1,lC, ...,Xh~l of F[X]/F[X]p(X) as constructed in section 2.12. Now for i = 2 , . . . , k we define ma = p ( * ) l - 1 w i , i . • • • > wi,h =

p{X)%~lwlM.

Since the elements in "layer" i map to the canonical F-basis of

F{x}p(xy-l/F[x}p(xy s

F[X]/F[X\P(X),

the collection W = {wij} spans M as an F-space. Since W has kh mem bers, it is therefore an F-basis of M. The action of X on the basis elements is as follows. For j < h, Xwij

= wi]J+i

whatever the value of i. For j = h and i < k we have Xwith

= X wlA = -po«>t,i - PiVJi,2

Ph-\Wi,h + Wi+i.i

and for j = h and i = k, we have XWk,h = X Wk,l = -poWk,l - PlU>fc,2

Ph-lWk,h-

To write down the corresponding matrix J+(p,k), we take the basis elements in the order in which we constructed them; that is, we give the set of suffices

^'Jbofayri&htecr haien^

•• • >h

13.6. The Jordan normal form: nonsplit case

205

the lexographical ordering (1,1), ( 1 , 2 ) , . . . , (1, h); (2,1), (2, 2 ) , . . . (2, h);...;

(k, 1), (k, 2 ) , . . . , (fc, h).

Thus the rows and the columns of J+(p,k) must also labelled with double-indices (i,j) arranged in this order. The column (i,j) of J+(p,k) gives the effect of X on Wij. For j < h, we see that the (i,j)-th column is (.. . , 0 ; 0 , 0 , . . . , 0 , 1 , 0 , . . . , 0 ; 0 , . . . ) T where we exhibit the entries in the rows labelled (i, 1 ) , . - . , (i, h) (we have transposed the column for convenience). The only nonzero entry is in the (i,j + l),(i,j)-place. For j = h and i < k, the (i, h)-th column is (..., 0; -po, - p i , . . . , -Ph-i; 1,0,..., 0; 0,.. .) T ; the entry — pi occurs in row (i, h) while the entry 1 is in row (i + 1, h). Finally, for j — h and i = k, column (k, h) is (•■•,0;-p0,-Pi,-

-Ph-i)

the entry — pi occurs in row (i, h) We can now describe the matrix J+(p, k) as a block matrix. Let

(o

0

• ■ 0 0

-Po

1

0

■ •

0

0 0

• • • •

-ph-2 -Ph-1

\

-Pl

C = C[p) 0 1

I

be the companion matrix of p and let Y be the h x h matrix / 0 0 0 0

0 1\ 0 0

Y = 0

0

•••

0 0

Cor)yifyh&d Material0 J

Chapter 13. Normal Forms for Matrices

206

Then / C O O Y C 0 0 Y C

0 0 0

0 0 0

0 \ 0 0 (13.5)

J+(p,k) 0 0

0 0

0 0

\ o o o

c o o Y 0

C Y

0 C )

The uniqueness of the nonsplit Jordan normal form is proved in the same way as in the case of the ordinary Jordan normal form. The question of computability amounts to that of the computability of the irreducible factors of the characteristic matrix, which again is not usually possible by algebraic methods.

13.7

The Jordan normal form: separable case

we give a variation on the nonsplit Jordan form which is used s EinFinally, [Green] to calculate the characters of elements of linear groups. This ii

variation does not appear to be discussed in any textbooks, at least, not at a relatively elementary level. We assume again that we are given a n n x n matrix A with coefficients in a field F and that some of the irreducible polynomials p(X) which occur in the factorization of characteristic polynomial XI — A of A are not linear. (If the irreducible factors are all linear, we simply regain the ordinary Jordan normal form.) The existence of this alternative form depends on a hypothesis about the roots of the irreducible polynomials p(X). As we proved in Proposition 2.13.1, we can extend the field of coefficients F to a bigger field E in which all the irreducible factors of the characteristic polynomial are linear, which amounts to the same thing as saying that each polynomial p(X) has all its roots in E. Our hypothesis is that each irreducible polynomial p(X) has distinct roots in E. The technical expression for this condition is that each p(X) is a separable polynomial. Notice that different irreducible factors p{X),q(X) of XI — A are per mitted to have roots in common in E. An example of a non-separable polynomial is given in Exercise 13.7. The claim is that A is similar to a matrix JS(A) whose diagonal blocks

Copyrighted Material

13.7. The Jordan normal form: separable case

207

are matrices of the form 0

■■

c I

• 0 0 •• • 0 c ■■ ■ 0

0 0 0

0 0 0

•• • c •• ■ I •• ■ 0

0

cI 0

0 0 0

0 0 0

J°(p,k)

(13.6) 0 0 \ 0

0

0 0

c

cC ) where C is the companion matrix of p and I is an h x h identity matrix, h = deg(p). Such a block has the same form as the matrix J+(p, k) of Eq. (13.5) above except that the matrix Y is replaced by an identity matrix I throughout. The matrix J s (^4) is called the separable Jordan form for A. The claim will be established by showing that Js(p,k) is similar to + J (p, k) for any p and k. Since the polynomialp has distinct roots Ax,..., A/, in the extension field E, we know that the Jordan normal form of the matrix Cis /Ax 0

0 A2

... ■••

0 0

I

\

A

SCS \

0

0

■■■

A,,

-l

/

where S is an h x h invertible matrix having entries in E. Let T = diag(5 S ■ ■ ■ S) be a block diagonal matrix with k copies of S on the diagonal. Then ( 1

J' = TJ°(p,k)T-

A

J 0

0 A I

0 • • 0 0 • ■ 0 A • ■ 0

0 0 0

0 0 0

(13.7)

= 0 0

^o

0 0 0

0 ■ ■ A 0 ■ • / 0 ■ • 0

0 A I

0 0 A /

We write the row and column indices of J' in the form (i — \)h + j where j = I,... ,h and i = 1 , . . . , k to conform with the partition of J' into h x h blocks. Thus, for fixed i, the rows and columns labelled with (i — l)h + j , j varying, give the i-th diagonal block of J', and the nonzero entries in J ' are Aj in the ^Qpyh^hfedMate^

+ J) " t h P l a c e

Chapter 13. Normal Forms for Matrices

208 and

1 in the (ih + j , (i - l)h + j ) - t h place for i = 1 , . . . ,k - 1. Next, we permute the rows and columns of J' by moving row (i — l)h+j to row (J — l)k + i and likewise moving column (i — l)h + j to (j — l)fc + i, obtaining a new matrix J". Let P be the hk x hk permutation matrix corresponding to the permuta tion of the rows, that is, P is the result of performing the row permutations on an hk x hk identity matrix. Then the matrix P ~ x corresponds to the ef fect of performing the column permutations on an hk x hk identity matrix, and

J" =

PJ'p-\

We now observe that the nonzero entries of J " are \j in the ((j: — l)k + i, (j - l)k + i)-th place, j = 1 , . . . ,h, i =

l,...,k

and 1 in the ((j - l)fc + (i+ 1), {j - \)k + i) -th place for i = 1 , . . . , k - 1. Thus J" is a diagonal block matrix, having h diagonal blocks corresponding to the range of values of j , and the j-th block is the k x k matrix (

A

J

1 0

0 \j

1

0 0 A,

• • • ■ • •

0

•

0

0

0 0 0

0

0 0

\

Ai = 0 0

Vo

0 0 0

0

A,

0

0

•

1

Xj

0 0

0

■ ■•

0

1

^ 1

which is evidently the elementary Jordan matrix associated to the poly nomial (X — \j)h- Hence J " is in Jordan normal form, associated to the polynomial (X-\1)k...(X-\h)k=p(X)k. But then J" must be the Jordan normal form of the matrix J+(p,k), + since the characteristic matrices of both J" and J (p,k) have the same invariant factors. It follows that J " and J+ (p, k) are similar, and, trac ing through the various similarities that we have used, that J" (p, k) and J+(p, k) are similar as ™%mm®
13.8. Nilpotent

matrices

209

Finally, we have to prove that Js(p,k) and J+(p,k) are in fact similar as matrices over F, that is, we can find a matrix Q with entries in F so that QJs{p,k)Q~l = J+(p,k). As matrices over E, the characteristic matrices of both J"{p,k) and J+(p,k) have the same set of invariant factors. However, these invariant factors already belong to F, since both matrices in fact have entries in F, which shows that Js(p,k) and J+(p,k) are similar over F, by Theorem 13.3.3.

13.8

Nilpotent matrices

We use the Jordan normal form to show how some elementary matrix equa tions can be solved by listing the possible Jordan normal forms of a solution. A square matrix A over a field F is nilpotent of exponent k if Ak = 0 for some integer k > 1, but Ak~l ^ 0. Suppose that A is nilpotent. By Theorem 13.2.1, we know that the minimum polynomial of A is the "highest" invariant factor 5n (X) of XI — A and that Sn(X) divides Xk. Thus the nontrivial invariant factors of XI — A take the form Xii for exponents t\,..., tr with t\ < ■ ■ ■ < tT, and so A has Jordan normal form J( J 4) = d i a g ( J ( 0 , t 1 ) , . . . , J ( 0 , t r ) ) . An easy calculation confirms that each block J(0,ti) is nilpotent of expo nent ti, so J{A) is nilpotent of exponent tr and the exponent of its conjugate A must also be tr. Conversely, if J(A) has the form above, then J{A) and hence A are nilpotent.

13.9

Roots of unity

Next, we seek matrix solutions of the equation Xk = I, I an identity matrix. In the next chapter, we shall see how such solutions can be used in determining the representations of a cyclic group (section 14.4; Exercise 14.3). For simplicity, we assume that A has a Jordan normal form J(A). Since each invariant factor of XI- A is a divisor of the minimum polynomial 5n(X) of XI - A and, in turn, 5n(X) divides Xk - 1 (Theorem 13.2.1), this hypothesis will be satisfied if Xk — 1 splits into linear factors in the coefficient field F, that is, if F contains the fc-th roots of unity. Clearly, the matrix A is a root of unity if and only if its conjugate J(A) is a root of unity, so we can reduce the problem to that of determining the elementary Jordan m a t r i © § p $ 7 Y $ ^ e j ' ^ £ f / a /

Chapter 13. Normal Forms for Matrices

210

Suppose J is such an elementary Jordan matrix, of size txt An easy calculation shows that /

Xk kXk~l

0 Xk

•■

\

:

:

'

k

J =

with t > 1.

so we must have Afc = 1 and kX^1

s

= 0.

(13.8)

The analysis now separates into two cases. These equations are incom patible if the characteristic of F is either 0 or a prime p which does not divide k, since then k ^ 0 in F. Thus, J(A) must contain only l x l blocks, that is, it is a diagonal matrix, and each "block" must be a fc-th root of unity in F. Suppose, on the other hand, that F has nonzero characteristic p which does divide k. We compute the powers of the elementary Jordan matrix J in two steps. Write k = psh with h coprime to p and put J = XI + L, where XI is a scalar matrix and L is the obvious lower triangular matrix. Since XI commutes with any matrix, and p divides all the binomial coeffi cients except the first and last, we have JP = XPI + Lp,

and inductively JP' = Xp'l + Lp'. p

3

Notice that L ' = 0 <£> p > t. Next, consider a matrix of the form p,I + W with W strictly lower triangular. Then {Hi + W)h = nhI + hp,h~lW + ■■■, from which we see that (/j,I + W)h = / o /

= l a n d l V = 0,

since h ^ 0 in F. Thus (XI + L)p'h = 1^

Xp'h = 1 and ps > t.

Finally, observe that, in field of characteristic p, Xp'h-l

= (Xh - l )

p

\

from which we conclude that an elementary Jordan matrix with Jk = I is a t x t matrix with t < g g A B / l i J J ^ ^ U ^ ^ / ^ - t n root of unity in F.

13.10. Further

13.10

developments

211

Further developments

The results of this chapter depend very much on the fact that we work with matrices over a field F. Section 8.5 of [Cohn: FRTR] gives some results when F is a division ring; beyond this, it is very difficult to find normal forms for matrices, even when the coefficient ring is commutative, as can be seen from [G & L], [G, L & O] and [L & S].

Exercises 13.1 Let -3-2 4 \ 4 1-4 0 - 1 1 / be a complex matrix. Find the rational canonical form and Jordan normal form of A, and show that they are the same as for

'■[

-1 0 0 0 0 1

B

0 -1 -2 )

Notice that B is not in rational canonical form, although it is a diagonal block matrix made up of companion matrices. Explain. 13.2 Find the rational canonical form and Jordan normal form for each of the following matrices:

1° A=| 0 Vo

1 0 0

M 1 ;

0 1

0

B = ! o o 1

0/

l l

\

C

0 o/

2 3 0 V0

0 2 0 0

0 0 2 2

0 \ -2 0 2

and

/ 1 1 1 • • 1 1 0 0

D

1 0

1 • • 1 1 • • 1

1 1

00 00 00 ■■•■■ 00 11 / / where D is an n x n matrix. 13.3 Repeat Exercises 13.1 and 13.2, but regarding the matrices as having entries in the following fields in turn: R, Z2, Z3, Z5. (So this is 6 x 4 problems in one. You should always find a rational canonical form, but there may be no Jordan normal form.) Find the nonspliC<^9/dgfi^6rfiMa/6gJS/there is no Jordan form.

V\

Chapter 13. Normal Forms for Matrices

212

13.4 Let A be a square matrix over a field F, and let AT be its transpose. Show that the invariant factors of the characteristic matrices XI - A and XI — AT axe the same. Deduce (a) A is similar to AT; (b) the F[X]-modules M(A) and M(AT)

s

are isomorphic.

13.5 Let F be any field. Find the possible Jordan normal forms of an n x n matrix A which satisfies the equation A2 = A. 13.6 Let p(X) be a separable irreducible polynomial over a field F , with p(X) =

(X-X1)...(X-Xh)

in some extension field E of F, and write ?i{X) = (X - Xx)...

s

(X - A i - i ) ( * - A i+1 )

...(X-Xh)

for i = 1 , . . . ,k. Let J' be the matrix in Eq. (13.7), and let Mi be the (hk - 1) x (hk - l)-minor of XI - J' formed by eliminating row 1 and column h+1 of J'. Show that Mi = p1. Find (hk - 1) X (hk - l)-minors Mi of XI - J' with M{ = pi for each i. Deduce that Fithk-i(J') = 1, and devise an argument to show that J' is similar to J+ (p, k) without using the ancillary matrix J " of section 13.7. 13.7 This exercise shows that the hypothesis of separability is essential for the equivalence of the separable Jordan normal form with the nonsplit version of the Jordan normal form. Let Q be any field and let Q[t] be the ring of polynomials over Q in an indeterminant t. Further, let F = Q(t) be the field of rational functions over Q, that is, the field of fractions of Q[t\. Verify that the polynomial p(X) = X2 — t is irreducible over F. Let E = F(y/t) be a splitting field for p(X). If the characteristic of Q is not 2, that is, 2 ^ 0 in Q, then p(X) = (X + Vt)(X - yfi) is separable, but if Q has characteristic 2, say Q = Z2, then p(X) = (X — v^) 2 and p(X) is not separable. / 0 t 0 0 \ 1 0 0 0 Let A which is the matrix Js(p,2) of Eq.

l o o t \ 0 1 1 0J

(13.6).

213

Exercises Show that F i t i p f l - A) = Fit2(XI

- A) = 1,

that Fit 3 (.X7 - A) has generators 2X, X2 + £ and X 2 - t, and that Fit4(X/-J4)=p(X)2. Confirm that if the characteristic of Q is not 2, then Fit 3 (XZ -A)

= l

and hence that A is similar to the nonsplit Jordan matrix J+(p, 2) of Eq. (13.5). Suppose that the characteristic of Q is 2. Show that F i t 3 ( X I - A) = p(X) and that the invariant factors of XI — A are 1, l,p(X),p(X). that A is similar to the rational canonical form matrix / 0

R-

l

t

0 0 \

° °°

0 0 V0 0

0 t ' 1 0 /

Deduce

Chapter 14

Projective Modules In this final chapter, our aim is to provide some contrast to the results that we have obtained for the structure of modules over Euclidean rings. We do this by taking a brief look at those modules which occur as a direct summand of a free module - these are the projective modules. If the coefficient ring is Euclidean, then a projective module must be free, since any submodule of a free module is free. For rings in general, a projective module need not be free, nor need a submodule of a free module be projective. We also discuss the types of ring that are defined by imposing "projectivity" conditions on modules, and we show that one such class, the Artinian semisimple rings, occurs naturally in the representation theory of groups. The material in this chapter is all supplementary to the original leeture course on which these notes are based. Besides the aim of placing the Euclidean results in a wider context, the results and references given here provide an introduction to some topics that might be included in an "enhanced" MMath or MSci version of the course.

14.1

The definition

Let R be any ring. A left .R-module P is said to be projective if it is a direct summand of a free module R1, where I is some index set that need not be finite (Exercise 7.10). Thus there is a left R-module Q so that P x Q S R1. Since an external direct sum can be rewritten as an internal provided that we replace the given modules by isomorphic modules (section 7.7), we see 215

g

Chapter 14. Projective Modules

216

that a module P is projective if (and only if) there is an internal decompo sition P' 8 Q' = R1 for some P' with P' = P as an R-module. .EaiampZes. (i) It is immediate from the definition that the ring R is itself a projective left (and right) R-module, as is any free module R1. (ii) The zero module is projective. (hi) Let R be the ring of all n x n matrices over a field F , and for j = 1 , . . . ,n, let Ij be the set of matrices whose entries are all 0 except in column j . Then (Exercise 7.8) each Ij is a left H-module and R = h ©-•■«!„. Thus each summand Ij is projective. Note that the summands are all isomorphic to one another as left .R-modules. No Ij is a free .R-module, since dim(Jj) = n as a vector space over F, while the dimension of any free R-module is a multiple of n 2 . (iv) Let D be the ring of 2 x 2 diagonal matrices over a field F, and let e = en and / = / n (see Exercises 1.7 and 7.4. Then D = De®Df', which shows that De and Df are (nonisomorphic) projective D-modules. It is easy to see that neither is a free D-module. (v) The cyclic Z-module Z a , a > 1, is not projective. To establish this fact, we argue by contradiction. Suppose that there is an isomorphism 6 : Z a x Q = Z1 for some module Q and index set I. The image of (1,0) is a torsion element in Z 7 , since it is annihilated by a, and it is nonzero since 6 is an injection. But Z 7 contains no nonzero torsion elements. (vi) Let A be an n x n matrix over a field F, and let M be the F[X]-module defined by X acting as A on Fn. Arguing as above, we see that M is not projective as an .F[X]-module. Before we give more examples, it will be useful to reformulate the defi nition in terms of the splitting of homomorphisms.

14.2

Split homomorphisms

Let M and P be left .R-modules and let 7r : M —► P be a homomorphism of left .R-modules. We say that 7r is split by the .R-module homomorphism cr : P -4 M if 7T<7 =

idp,

where idp is the identity map on P. Note that a split homomorphism 7r must be a surjection, since p = 7r(u(p) for all p in P. The existence of a splitting leads to a direct sum decomposition of M with P as a summand.

14.2. Split

homomorphisms

217

14.2.1 Lemma Let M and P be left R-modules and suppose that n : M —► P is split by a : P -> M. Then M = Ker(7r) © a(P) with a(P) S P and M 3 Ker(7r) x P. Conversely, if M = K x P for some module K, then there is a split homomorphism IT : M —> P with K = Ker(7r). Proof Let m e M. Then m = [idM — air){m) + air(m), and ■n{idM — cr7r)(m) = 7r(m) — 7r<77r(m) = 0, so that M = Ker(7r)+cr(P). If m S Ker(7r) n a(P), then m = cr(p) for some p, and then 0 = 7r(m) = 7r<7(p) = p,

which gives m = 0 and hence M = Ker(7r)© ( r(M). It is clear that the map m —> cr(m) is an isomorphism from P to tr(P) and that the map a : M -)■ Ker(7r) x M given by a(m) = ((idM — o-Tr)(m),ir(m)), is also an isomorphism of R-modules. For the converse, let a : M = K x P be the given isomorphism. By the definition of the direct sum, for each m in M we can write a(m) = (k,p) for unique elements k of K and p of P . Define 7r by 7r(m) = p and <7 by a(p)=a"1(0,p). □ We now obtain a very powerful characterization of projective modules. 14.2.2 Theorem Let P be a left R-module over a ring R. Then the following are equivalent.

statements

Chapter 14. Projective Modules

218

(i) P is a projective R-module. (ii) If 7T : M -> P is any surjective homomorphism of R-modules, then ■K is split. Proof (ii) => (i). Let {pt \ i E 1} be any set of generators of P, where the index set I may not be finite. An element x = (XJ) of the free module R1 is a sequence of members Xi of R, indexed by / , and with only a finite number of nonzero terms. Thus we can define a surjective .R-module homomorphism 9 : R1 —> P by 6{x) =

^XiPi. iel

Since 9 is split, P is projective. (i) => (ii). By definition, we have R1

^PxQ

for some index set / and module Q. Let 9 : R1 —> P be the corresponding surjective homomorphism and let ui : P —> R1 split 9. Let {ej} be the standard basis of R1 (Exercise 7.10) and put pi = 9(ei) for each i, so that {pi} is a set of generators of P. Since n : M —> P is surjective, we can choose a set of elements {rrii} of M so that 7r(m;) = pi for all i £ I. Now define A : R1 —> M by the requirement that A(ej) = rrii for all i e i , and write a = \u>. Then 7Tcr = TTXU) = OUJ =

idp,

which shows that we have split 7r.

□

14.2.3 Corollary Let P be a left R-module over a ring R. Then P is a finitely generated projective module if and only if there is a split surjection R* —>P where Rl is a free left moffefeyrfgfiflift ffffikrial

□

14.2. Split

homomorphisms

219

14.2.4 Corollary Suppose that R is a Euclidean domain. projective R-module P is free.

Then every finitely generated

Proof By the preceding result, we can view P as a submodule of a free .R-module of finite rank, so the assertion follows from Theorem 12.1.1. □ E x a m p l e : a projective nonfree ideal. Next, we look at one of the basic examples in number theory. Let R = Z[\/^5], and let / be the ideal 3R + (2 + v7—5)R- Unique factorization does not hold in R, since 2, 3, 1 + \/—5 and 1 — A/—5 are distinct irreducible elements of R with 2 • 3 = (1 + V^5) • (1 - v 7 ^ ) ; furthermore, the ideal I is not principal (Exercise 2.7). However, if / were free as an .R-module, it would have to be a principal ideal, since any Rbasis of / would also be a A'-basis of the field of fractions A = Q(v/—5) of R and so could have only one element. It follows that / cannot be a free R-module. Nevertheless, / is a projective R-module, as we will show. We define 9 : R2 -» I by

fl(j)=3x

+ (2+

^5)y,

so that 9 is a surjective .R-module homomorphism. To split 9, we define to : I -> R2 by u>(z) -z

I

1 An easy calculation confirms that the image of a? is in R?2 and that 9ui = idj.

E x a m p l e : a non-projective ideal. Let R = F[X, Y] be the polynomial ring in two variables over a field F, and let I = RX + RY be the ideal generated by the variables. It is easy to show by direct calculation that / is not principal, and we now show that / is not projective. This result provides a contrast to the fact that any ideal of a Euclidean domain is principal, and hence free. There is an evident presentation 9:R2^I with 9(f,g)Dep^figftt9^IVfate^klf,9

€ R.

Chapter 14. Projective Modules

220

By Theorem 14.2.2, it is enough to show that 6 is not split. We argue by contradiction. Suppose that 9 does have a splitting u. Then u{X) = (a, b) for two elements a, b of R, and we must have X = 8u(a,b) = aX + bY. Write a = a0(X) + ai(X)Y + ■■■ + am{X)Ym, a polynomial in Y with coefficients in F[X}. Comparing the coefficients of each term Yl in our expression for X, we find that m

m-1

a=l + J2ai(x)Yl

and

b=-Yiai+i(X)XYi.

i=l

i=0

Similarly, w(F) = (c,d) with n

n—1

~^2cj+1{X)XY:>.

c = ^ C j ( X ) y ^ and d = I - d{X)X 3=1

Computing

UJ{XY)

3=1

in two ways, we see that Yb = Xd

and hence that X 6 fiX2 + RY, which contradicts the fact that X and Y are independent variables.

14.3

Semisimple rings

An obvious question in the investigation of rings and modules is to ask what can be said about a ring if we insist that all its modules are projective. We need some definitions before we can state the results. A left ideal / of a ring R is minimal if / is nonzero and there is no left ideal J with 0 C 3 C /. Then / is a simple .R-module. A ring R is left semisimple if R is a direct sum

CopyrightedMaterial

14.3. Semisimple rings

221

where each I\ is a minimal left ideal of R, and A is some index set that need not be finite. The first result is as follows. 14.3.1 Theorem A ring R is left semisimple if and only if every left R-module is projective. Proof [Rotman], Theorem 4.13.

□

If we impose a further condition on the ring, we obtain a very concrete description of its structure. A ring R is left Artinian if any descending chain R D h D ■ ■ ■ D Ii D Ii+i D ■■■DO of left ideals in R must have only a finite number of terms. 14.3.2 The Wedderburn-Artin Theorem The following assertions are equivalent. (i) R is a (left) Artinian semisimple ring, (ii) There are division rings D%,..., D\. and integers n i , . . . ,Bfc so that R =

Mni(D1)x---xMnk(Dk),

a direct product of matrix rings. The division rings Di and the integers rii are uniquely determined by R, apart from the order in which they are listed. Proof In one direction, the argument is not too difficult. A matrix ring Mn(F) over a field F is Artinian since it is a finite dimensional vector space over F, and any ideal is a subspace. Exercises 3.10 and 7.8 combine to show that Mn(F) is semisimple. It is not hard to see that essentially the same arguments work when F is replaced by a noncommutative division ring D and that a direct product of rings is Artinian semisimple if its components are. The proof in the reverse direction is much harder. Full details can be found in [Cohn 2], §4.6 or [B & K: IRM], §4.2. □ The structure of modules over an Artinian semisimple ring is transpar ent. For each matrix ring Mi(Di) above, let Si be its "first column", that is, the set of matrices whose entries must be zero outside the first column. Then Si is a left Mi(Di)-module and we can make Si into left R-module by stipulating that the other components of R act trivially on Si. Proofs of the following result c a ^ b e ^ u ^ ^ j h ^ ^ f ^ r e n c e s given above.

222

Chapter 14. Projective Modules

14.3.3 T h e o r e m Let M be a finitely generated left module over an Artinian semisimple ring R. Then there are non-negative integers ai,...»ftfe and an R-module isomorphism M = oiSi x • • • x akSk, where a^Sj denotes the external direct sum of a^ copies of Si. Oi = 0, we take this to be the zero module.) If N is a finitely generated left R-module with TV^feiS1! x •••

(If some

xbkSk,

then M = N <=s> ^=> ai = bi for i =

\,...,k. U

14.4

Representations of groups

We take a brief look at the connection between the representation theory of groups and module theory. We show that that a representation of a group corresponds to a module over a certain type of ring, namely a group ring, and that a group ring is an Artinian semisimple ring provided that the order of the group satisfies a certain condition. First, we must make our definitions. Let G be a finite (multiplicative) group and let F be a field. A representation of G over F is a map p : G -y Mn(F) from G to the ring oin x n matrices over F, with the following properties. GRep 1: p ( l c ) = / „ , where 1Q is the identity element of the group and I„ is the identity matrix. GRep 2: p(gh) = p(g)p(h) for all g,heG. Clearly, each matrix p(g) is invertible, with inverse p(g~x). The group ring FG of G over F is defined as follows. Let k be the order of G, and list the elements of G as 1 = gi,
and y = j/i + j/ 2 02 H

V xkgk

we have

k

22 xhxxhXi)gj. y= ^2( Yl i)9ixy = J2( x

3= 1

h yk9k in FG,

24.4. Representations of groups

223

The identity element of FG is the element 1 = lp ■ (?i, where 1^ is the identity element of F . For example, suppose that C = ( c ) = {l,c,...,Cfc-1} is the cyclic group of order k, with generator c. An element of FG has the form X = X i l + X2C-\

hXfcC f c _ 1

and the multiplication is derived from the rule ck = 1. If we put r? = 1+ cH + c fc_1 , then r?2 = fcr? and (1 - c)rj = 0. Suppose we are given a representation p : G —► Mn(G). Each element
easy to check that p". that M has finite we let 9(g) be the

9(g) : m —> gm for all m e M, and we define a representation by choosing a basis E of M as an F-space, and taking P(9) =

(0(9))E,E,

the matrix of 9(g) with respect to E. Condition GRep 1 is satisfied because the identity element of FG acts as the identity operator. For condition GRep 2, we note that for any g, h in G, 9(gh) = 9(g)9(h) since (gh)m = g(hm) for all m in M, and then that (0(gh))E,E

=

(6(g))E,E(9(h))E,E

by the multiplicativity formula in Proposition 5.9.1. A representation is said to be irreducible if the corresponding module is irreducible, that is, simple. A representation of a cyclic group C = (c) of order k is completely determined by specifying a single invertible matrix A with Ak — In, since it is enough to know the matrix p(c). The action of A also makes F n into a module over the polynomial ring F[X] as in section 3.3, and so calculations such as those in section 13.9 can be exploited to reveal the representation theory of cyclic groups, ffiopyragfeted AteferJBfe considered in the exercises

Chapter 14. Projective Modules

224

to this chapter. A detailed introduction to representation theory can be found in [J & L]. The promised connection between representation theory and Artinian semisimple rings is given by the next result. 14.4.1 Maschke's Theorem Let G be a finite group of order k and let F be a field in which k ^ 0 (that is, the characteristic of F does not divide k). Then the group ring FG is Artinian semisimple. Remark: Thus the complex group ring CG is Artinian semisimple for any finite group G. Proof A left ideal of FG is also an F-subspace of FG, and, since FG has finite dimension, any descending chain of left ideals must therefore be finite. Hence FG is Artinian. Let M be any left FG-modu\e. As in the proof of Theorem 14.2.2, there is a surjective FG-module homomorphism 9 : {FG)1 -> M for some index set /. We now invoke the fact that any vector space over a field has a basis, even the space is not finite dimensional ([Conn 2], §1.4; [B & K: IRM], Theorem 1.2.20). Since 9 is, in particular, an F-linear trans formation, we can define an F-linear transformation w : M —> {FG)1 that splits 9 simply by making an appropriate choice for the values of w on the members of some basis of M. Now define <j> : M -> {FG)1 by {m) = - 2 J /i -1 (w(/im)) for each m € M. For each g in G, we have

{gm) = T X] (^rV((%) m )) = ^(m)> gh€G

so that is an FG-homomorphism, and it is easy to see that p<j> = zdjvf-

□ 14.5

Hereditary rings

The success of the structure theory of modules over a Euclidean domain stems from two facts. OQQffir$0e§ /f^gjj^g/generated module M has a

Exercises

225

presentation, that is, there is a surjective .ft-module homomorphism p from a free module R* to M, whose kernel K is, by definition, the relation module (section 9.1). The other is that a submodule of a free module is again free, with a convenient basis (Theorem 12.1.1), so that we have a tight control over the relations for M. There is no difficulty in making the definition of a presentation for a module over an arbitrary ring, but, as the examples in section 14.2 show, we cannot expect that the corresponding relation module will be a free module, or even projective. The following question then arises: "Are there interesting rings for which each submodule of a free module is projective, even if not necessarily free?" The answer is a resounding yes. A ring R is said to be left hereditary if every left ideal of R is a projective .R-module. It can then be shown that every submodule of a projective left R-module is again projective ([Rotman], Corollary 4.18). Artinian semisimple rings are hereditary, as are Euclidean domains. If a commutative domain is hereditary, then the ring is, by definition, a Dedekind domain. There are several alternative definitions of a Dedekind domain, some of which can be found in [Rotman], Chapter 4. The importance of Dedekind domains stems from the fact that they arise in algebraic number theory as rings of integers, a very special example being the ring Z[V—5]. A discussion of Dedekind domains and their module theory can be found in Chapters 5 and 6 of [B k K: IRM]. In the noncommutative case, the theory of Artinian hereditary rings is very complicated, as can be seen from Chapter 8 of [A, R & S]. There are noncommutative generalizations of Dedekind domains, from the point of view of ring theory in [McC & R], Chapter 5, and from the point of view of number theory in [Reiner]. There are many questions of a similar type, for instance: "If a relation module K is not projective, we can take a presentation of K and look at its relation module, K\ say. Is K\ projective?" "What can be said about rings for which K\ is always projective?" The answers to these questions require the tools of homological algebra, and so they are beyond the scope of this first course. There are many projects here for the enthusiastic investigator. A good introduction is given by [Rotman].

Exercises 14.1 Let R = F[X, Y] be the polynomial ring in two variables over a field F, and regard F as an .R-module with both X and Y acting as 0. Show that the obvious presentation R—>F has kernel / = RX + RY,

Chapter 14. Projective Modules

226

which is not projective by the calculation in section 14.2. Using Exercise 9.3, verify that the presentation

R2 ->L

(0

^Xf

+ Yg,

has a projective kernel. Remark: it can be shown that, for any R-module M which is not pro jective, we must reach a projective module after at most two iterations of the process of taking a presentation module of a relation module. In the language of homological algebra, R has global dimension 2. More generally, if R is a polynomial ring in 771 variables over a field F, then R has global dimension m. 14.2 Let F be a field and let F[e] be the residue ring F[T]/(T2), where e = T; F[e] is called the ring of dual numbers over F. We can view F as an F[£]-module with e acting as 0. Show that F is not projective. Prove also that the obvious pre sentation 9 : F[e] -> F has kernel K = F. Show further that if M is an F[T]-module on which T 2 acts as 0, then M is an F[e]-module, and vice versa. Hence find all finitely generated F[e]-modules. Remark: it can be shown that no matter which presentation of F we take, the relation module is not projective; nor do we ever obtain a projective module by taking a presentation of the relation module and iterating this construction. In contrast to the polynomial rings of the previous question, the ring of dual numbers has infinite global dimension. 14.3 Let C = (c) be a cyclic group of finite order k. Verify that an FCmodule M is given by a module over the polynomial ring F[X] on which the variable X acts as a fc-th root of unity, as in section 13.9. Deduce that the irreducible (that is, simple) FC-modules correspond to the irreducible factors of the polynomial Xk — 1. Here are two extreme cases. (a) Let F be the field of complex numbers C and let w be a primitive fc-th root of unity in C. For i = 0,... ,k - I, let St be C regarded as an CC-module with c acting as u>1. Show that each S{ is a simple CC-module, and that S% and Sj are not isomorphic if i jt j . Deduce that

CC^So as a CC-module.

x •■■ x Sk-i

227

Exercises

(b) Take k = p, a prime number, and put F = Z p , the field of p elements. Write e = 1 — c in FC. Show that ep = 0 and hence that FC = F[e] S F[X]/F[X]X p . Deduce that C has only one irreducible representation over F. 14.4 Maschke's Theorem is false if the order k of G is 0 in F, that is, if the characteristic p of K divides k. Let rj = 5Z{5 I 9 S G}. Verify that n2 = kr) = 0. Now consider the surjection 7r : F G -* FT] given by 7r(x) = r\x for all x in FG. Suppose that 7r can be split by an FG-module homomorphism UJ, and let e = u){rf). Show that ge = e for all g in G and hence that e = arj for some a in F . Deduce that irur = 0, a contradiction.

Hints and Solutions for the Exercises Here are the answers for the exercises that involve numerical calculations, and some hints for the less obvious theoretical problems. If an answer is not provided, then either the problem contains its own hints, or the solution is a routine matter of verifying axioms. 1.4 Zg : Units 1,3,5,7, the remaining elements being zero divisors. Zio: Units 1,3,7,9, the remaining elements being zero divisors. 1.6 (a) Note that F is a domain and use Lemma 1.6.1 twice. (b) Use the preceding exercise, with A = F[Y}. (c) Suppose / is an ideal which contains RX + RY properly. Then there is an element / = /oo + Xh + Yk in I with /oo ^ 0. Since / — Xh — Yk is in / , I contains the unit /oo, and so I = R. A generator / of RX + RY would have to be divisible by both X and Y. 1.9 Write ey for the matrix with entry 1 in place i,j and entries 0 elsewhere, i,j = 1,2. Then, for any nonzero element x of the matrix ring R = M2(F) and any pair of indices, ey = axb for some a, b in R. So every element of R is in RxR and so R has no two-sided ideals except 0, R. 2.5 Let / be the given polynomial. Then 2.5 Let / be the given polynomial. Then / ( y ) = ((y + i ) " - i ) / ( ( y + i ) - i ) , / ( y ) = ((y + i ) " - i ) / ( ( y + i ) - i ) , which has the form which has the form Y"-11 + fp-2Ypp-22 Y"- + fP-2Y -

+ ■ ■ ■ + fiY + p + ■ ■ ■ + fiY + p

with p\fi for i = 1 , . . . ,p~ 2. Thus / is irreducible (Eisenstein) and so also is/. 2.6 FfJf, F[X, Y] has a non-principal ideal. 229

Hints and Solutions

230

2.7 An element r of R has the form r = a + fei/-5 for integers a, b. Attempting to argue as with the Gaussian integers (section 2.4), we put tp(y) = a2 + 5b2, and verify that for r,s 6 R,
Let { p i , . . . ,pfc} be the first k primes, and let a% ~Pi- ■ -Pi-iPi+i ■ ■ -Pk, i = 1, • ■ -,k.

Use induction on k. The result for k — 1 tells us that o i , . . . , a/t_i generate PfcZ; since pfc and a^ are coprime, pjtZ + aj^Z = Z. If we omit a, for any i, the remaining aj's can only generate pjL. 3.4 For any field F, a proper, nonzero, submodule of M must be onedimensional as a vector space over F, and so must be given by an eigenvector of A. The eigenvalues of A are the roots of X2 + 1. Now take F = Z p , p prime. If p = 1 mod 4, Z p contains two square roots of —1. If we call them ±i again, there are two one-dimensional submodules as in the complex case. If p = 3 mod 4, Z p contains no square root of —1, so M has no submodules apart from 0 and M, as in the real case. If p = 2 (I bet you missed this!), X 2 + 1 = (X + l ) 2 = (X - l ) 2 . So there is only one eigenvalue, A = 1 with eigenvector I

), and so there is a

unique one-dimensional submodule of M. 3.5 An n x n complex matrix A has at least one eigenvector and hence at least one one-dimensional eigenspace. Thus M has at least one submodule which is one-dimensiona]Q0isyrfc}ht(£hM3MriM is simple, it must itself be

Hints and Solutions

231

one- dimensional, that is, n = 1. Conversely, if n = 1, then M can have no proper nonzero subspaces and hence no proper nonzero submodules. 3.6 Unchanged if we replace the complex numbers by an arbitrary field of coefficients. 3.7 The eigenvalues of B are 1,LO,U>2 where a; = exp(27ri/3) is a complex (a) cube root of 1. This gives three independent eigenvectors and three distinct /

I

\

one- dimensional subspaces C 1 1

V1/

/

,C

i

\

u> 2

/

i

\

1

1

and C 1 J

V- 1

\ - /

Almost any vector works; try e%, a standard unit vector. Almost any vector works; try e%, a standard unit vector. Write w = I — 1 I. By direct calculation, B2w = —Bw — w, so (c) Write w = 1 —1 1. By direct calculation, B2w = —Bw — w, so C{X 1 is spanned over C bv the vectors w, Bw, which are obviouslv linearlv independent. "] is spanned over C by the vectors w, Bw, which are obviously linearly The existence of one-dimensional subspaces depends on the exis(d) tence of a nontrivial cube root of 1 in C, and will carry over to any field which also has a nontrivial cube root of 1. Z7 has three roots of 1, namely 1,2, 4, so we get the "same" answer. Over R, Z2, or Z3, there is only one one-■dimensional subspace. 4.9 Suppose L = M. There is invertible 2 x 2 matrix T giving the isomorphism, so TA = BT. But then TA2 = B2T, hence T = -T, hence I = —J, since an invertible matrix can be cancelled. But / ^ —I, so L is not isomorphic to M. Similar calculations rule out isomorphisms between all pairs except M and Q; as B2 = E2 = —I, this approach gives no information. (b)

So we compute matrices T with TB = ET, and we find that 1 ) x 0 -1 is such a matrix, and hence M = Q. 4.10 Direct computation. A homomorphism from L to Z is given by a 3 x 2 matrix with

- (

s

0

1

l

0

/0 i 0

1 0 \ 0 0 1 T,

\o 0 1 J

0 0 \ and so T = ( 0 0 I with t e R arbitrary. Thus Hom(L, Z) has diment t ) sion 1 as an K-space. Clearly, the column space of T has dimension 1 or 0, so the nullity dim(Ker(T)) of T is 1 or 2. Hence T is jC^ttjeiQ^eg/tftfetef^urjective - see section 4.10.

Hints and Solutions

232

Hom(Z, L): similar. 4.11 0*6,(L) = L + Ker(0) and 0*0*(P) = P D lm(0). So 0*0.(0) 0*0„(Ker#), whether or not 0 is injective. 5.5 Since (idjvf)2 = id.M i (idM)D,c

■ (idM)c,B

= (idM)D,B >

which gives PD,c • Pc,B = PD,B by the result above. (Alternative proof: calculate the product explicitly!) 5.6 Prom the above, PD,B = -PD.B-PE.BJ where

PE,B

= ( _\ \ J and PE,D = ( _\ _23 J

Hence PD I E = (PS.D)- 1 = ( _3X

_ \ ) and P D , B = ( J

_\ ) .

To confirm the calculation, note that b\ = di and 62 = 4di — c?25.7 By Exercise 5.3, we have to determine when the matrix P = I

1

is invertible over the Gaussian integers Z[i\. By Theorem 5.12.1, P is invertible if and only if det(P) = 2a - (2 + i) is a unit in Z[i], that is, 2a — 2 — i = 1, — l , i o r — i. This gives a = (3 + i)/2, (1 + i)/2,1 + i or 1. But the first two of these are not allowed, since they are not Gaussian integers, so the permitted values are a = 1 + i, 1. 5.8 Similar to the preceding exercise: we determine the polynomials f(X) for which the matrix P = ( _

yiA

j is invertible over R[X]. The

determinant is f(X) — 2g(X), and the units in R[X] are the nonzero real numbers, so the permitted values of f{X) are r + 2g(X), r / 0 e R; g(X) is arbitrary. 6.1 The initial parts are routine checking. An element of Hom(Z p , Z q ) is determined by an integer r such that pr € Zq. This means that q divides pr, and so q divides r since p, q are distinct prime numbers (Proposition 2.8.2). Hence f = 0 in Z q , so the corresponding homomorphism must be 0. Since Z p (p) = Z p (p 2 ) = Z p , we have

Hints and Solutions

233

Z p2 (p) = {x | px e Zp 2 } = {py | y e Z} = pZ p 2 , thus Hom(Z p ,Z p 2) S pZ p 2. (By Exercise 6.4, p.Z p2 S Z p .) 6.5 We have ( X

0 X

0 0 0

XI-A 0

Vo

0 0

-1

0

0

/o

0

/i

\

o h ^ /n-2

-1 /n_!+X /

Adding X times row n to row n — 1, then X times row n — 1 to n — 2, etc., does not change the value of the determinant, but converts XI — A to

f ° -1 0

0 0 -1

0 0 0

0 0 0

0 0

-: 0

0 -1

f(X) \ /i + - + /„-iI"- 2 +X'- 1 f2 + ■ ■ ■ + fn-lXn~3 + Xn~2

B = 0

Io

/ n _2 + / „ - l X + X 2 fn-l+X

Expanding from the first row, det(B) = ( - l ) " " 1 / • ( - l ) n _ 1 = / . 6.6 In these pictures, "x" denotes the submodule xM of M. R/Rlpq •

(c):

Ip

0 •

Hints and Solutions

234 R/Rp2q*

(d):

0 •

R/Rp2q2 „ 2 ,

(e):

pq

pq2

p2q

0 •

•

Hints and Solutions

235

6.7 Write a = p\...pk allowed. By Exercise 6.4,

where each pi is irreducible - repetitions are

RPi...pk-i/Ra^R/Rpk, where R/Rpk is simple. So proceed inductively, taking M% = Rpi.. .pl/Ra for i = 1 , . . . , k. R itself has no composition series, since it has no simple i?-submodule - a submodule of R is a principal ideal Rb for some fr / 0, and p ■ Rb C Rb for any irreducible p of R. 7.1 (a): R/Rlpq = l(R/lpq) © pq(R/Rlpq), among others. (b): R/Rp2q = p2(R/p2q2) © q(R/Rp2q2) - unique. (c): R/Rp2q2 = p2{R/p2q2) © q2(R/Rp2q2) - also unique. 7.2 As we have seen in the solution to Exercise 3.7, TV has three onedimensional submodules, namely the eigenspaces of B corresponding to the three cube roots of 1 in C. Thus TV must be the direct sum of these submodules. Further, the decomposition is unique, since there are no other one-dimensional submodules. The answer is similar over Z7, since this field contains three distinct cube roots of 1. In the remaining cases, we cannot decompose TV into one-dimensional submodules, since B does not have enough eigenspaces. Over R and Z2, we can obtain a decomposition of TV by observing that

N^F[X}/F[X](XZ-1) for any field F, which can be seen by a slight modification of the argument in section 6.6. Since X3 — l = (X — l)(X2+X+l) with the second term irreducible, we have N = P®Q with P = (X2 + X + 1)TV one-dimensional and Q = {X - 1)TV two-dimensional. Over Z 3 , X 3 - 1 = (X - l ) 3 and so TV has no direct sum decomposition its submodules form a chain. 7.3 Define a : Mi x ■ ■ • x Mfc ->■ M1/L1 x • ■ • x

Mk/Lk

by a(mi,...,mfe) = ( m l r . . , m t ) , and check that a is surjective with kernel L\ x ■ • • x Lk, so that the claim follows from the Induced Sag^fi^epfcMsfefl^/

Hints and Solutions

236

7.4

Let 6 : De -> Df be a homomorphism, and note that 0(e) = 9(e2) = e6(e) = exf

for some x i n D. 7.10 An infinite set {&» | i € / } is a basis of a module M if (i) for any m £ M, we have m = XZie/ ri^i> where at most a finite number of scalars r* are nonzero; (ii) J2i£i ribi = 0 if and only if all r, are 0. The rest of the question is routine checking. 8.1 T(L) = L n T(M) and TP(L) = L D r p ( M ) . T(0) need not be a surjection even if 9 is. Take 6 : Z -> Z 2 to be the canonical surjection. Then T(Z) = 0 but T(Z 2 ) = Z 2 . 8.6 Let m € M. Then there is an index s with m* = 0 for all i > s. Since i-rrii = 0 for each component m* of m, (s!)m = 0. Thus every element is torsion. On the other hand, given a nonzero integer a, let ea+i be the element of M which has (a + l)-th component T € Z a + i and all other components 0. Then aea+i =£ 0, hence a e" Ann(M). Thus Ann(M) = 0. 9.1 Multiply each relation Pi = 7 i i « i + - - - + 7 t j e t by the cofactor Fjk, k a fixed index, and then sum to obtain the relation Tupi H

h TtfcPt = det(r)e fc ,

using the formulas from Det 8, section 5.12. Thus the generators m^ = ir(ek) have det(r)mjt = 0 for all k, which shows that det(T)M = 0. 9.2 The Cayley-Hamilton Theorem. If B is an n x n matrix over F with Bv = 0 for every v £ F*, then B = 0. Now h(X) 6 Ann(M) means that h(A)v — 0 for all such v, hence h(A) = 0. By the preceding exercise, if we take h(X) = det(XI — A), we have h(X) e Ann(M), so that h(A) = 0. 9.5 Any irreducible divisor of the greatest common divisor of the set q\,... ,qk would be a divisor of each of the q^s, which is impossible by construction. Hence their GCD must be 1 and so 1 = wiqi H 1- Wkqk for some elements w\,..., Wk of R. Since r = r ■ 1 for any r in R, we see that qi,..., qk generate R. The members of any proper subset oiqi,...,qk will all be divisible by some p*, and so cannot generate R. The (k - l)k/2 relations for R are the differences p^e* —pieu, h
Hints and Solutions

237 hwjtefc e Rh. Then7r(u;)

Takew = w1eiH The generators

z\ = ei -qiw,...,zk

1, giving Rk = Ker(7r)©Rw.

= ek

qkw

of Ker(7r) are those suggested by Theorem 9.4.1. The form of the presentation matrix follows simply by writing out w, and the identity W\Zi + ■ ■ ■ + w^Zk = 0 is immediate. / I 0 0\ 10.2 Invariant factor form 0 1 0 I, giving M = Z 2 . \ 0 0 2 / 10.4 There are many correct answers for P and Q since you can perform many different sequences of row and column operations. Here is one such sequence. P Q r 1 2 I I 2 1 1 2 1 0 I -2 1 0 -3 1 2 1 0 I 0 3 2 -1 1 0 1 0 1 -2 2 -1 0 3 0 1 Then P l = P (pure chance), and e\ = ei + e 2 , e'2 = e 2 and /{ = / i , / 2 = -2/i+/2. The single generator is simply m 2 , with 4m 2 = 0. 10.5

Again, many possible answers. In general, r =

/3 3

\2

reduces to r"

1

1\

0

4

Vo 2

3a/

/ with P ■■

1 0 -1 1

V-3 1

"M

°3 /

P-I

=

/3

3

V2

2^i 6 This ay 1 ■

-1 1 0 1 -1 1

)

- M S °7 )a = 0 : the matrix I" is in invariant factor form, so we can take P as above - anticipating results from the next chapter, we have M = Z4 x Z, nonzero generators m'1 = -mi - m 2 and m 2 = mi + m 2 + 7713. 1 0 / 1 0 -1 \ / 3 -1 0 a = l:A= 0 1 ,withP= 2 0 -3 ,P~1= 3 3 1 \ 0 0 / V - 9 1 1 2 / \ 2 - 1 0 ) ' 1 1 and Q 0 1

Hints and Solutions

238 Then M = Z, / a = 2:A = \

generator m!, = m 2 . 1 0 \ / 1 0 -1 \ / 3 -1 0 2 , with P = 2 0 -3 , P"1 = 3 -3 0 0 / \ 3 1 -6 / \ 2 -1

0 \ 1 , 0/

- « - ( S -.)■ Then M = Z 2 x Z, generators mj = - m i - 3m 2 - m 3 and m 2 = mall.1 Fitj = 1, Fits = 3, Fit 3 = 18, confirming that St = 1, 52 = 3, 53 = 6. 11.2 (a) Fit 2 (r) = ± d e t ( r ) = 1; <5i<52 = 1, hence 5t = 52 = 1. (b) Fiti = <5i = 1. Fit 2 has among its generators 2 x 1 1 - 2 x 1 0 = 2 and 4 x 0 - 7 x 1 1 so Fit 2 = 52 = 1 also. Expanding, say, from row 3, we get det = - 2 so 53 = 2. (c) Fiti = 1, Fit 2 = 3. (d) Fiti = 1 and Fit2 is generated by 12,3o - 4 and 3a - 12, so that Fit 2 is given by the GCD of 3a, 4. For a = 0, this is 4, for a = 1, this is 1, and for a = 2, this is 2. 11.3 Since the GCD of 3a, 4 must divide 4, the only possible values of Fit 2 are 1, 2,4, so the values a = 0,1, 2 cover all the possibilities. 11.4 Fiti = 1 always. If a ^ 0, a(X — 1) is a 2 x 2 minor, and X — 1 is a factor of any 2 x 2 minor. So Fit 2 = X — 1, therefore 52 = X — 1, $3 = (X - l)(X - b). If a = 0, the distinct l x l minors are X — 1 and X — b, and the distinct 2 x 2 minors are {X - l ) 2 and (X - 1)(X - b). If 6 ^= 1, we find that Fitj = 1, Fit 2 = X - 1 = 82 and 63 = {X - 1)(X - b) again. lib = l,61 = 52 = 53 = X-l. 11.5 The presentation matrix is / T = l \

0 + i 0

6 6 4 5-i 2(1 + *) 2£

0 \ 2i \ 0 /

which has invariant factor form A=

/1+i 0 \ 0

0

0 0\ 2 0 0 ) . 0 6 0/

This can be found by using the elementary row operations "row 1 -H- row 2", "row 2 «-> row 3" cleverly, or by finding the Fitting ideals of T.

Hints and Solutions

239

11.6 Fiti = RX + RY is not principal and so there is no invariant factor form. Fit 2 = (X 2 - Y2)R. 12.1 Let x = (xi,x2) £ M. Z = Zx precisely when Ann(x) = 0. Suppose z 6 Ann(x). Then 2x1 = 0 € Z/Za and zx2 = 0 G Z. If x2 ^ 0, then z = 0 and Ann(x) = 0, so Zx = Z. If x 2 = 0, then ax = 0, so Zx ^ Z. Internal direct sum. First, we must have 0 = Zx n Zy. Write y = (2/1,2/2)Then ay2x = (0,ax2y2) = ax2y is in the intersection, so 0x22/2 = 0 in Z. But 0x2 ^ 0, so 2/2 = 0 (which guarantees that the intersection is zero). Next we must have M = Zx + Zy. In particular, m2 = bx + cy for some integers b, c. Looking at the second components, we get 1 = CX2 in Z, so x2 = ± 1 . We also have to be able to write mi = dx + ey for some integers d, e. This gives dx2 + ey2 = 0 and d = ±ey2, so that 1 = ±ey2xi + eyi € Z/Za. But for any given values of Xi and y2 we can find a value of 2/1 with ±2/2^1 + 2/i invertible in Z/Za. Thus x = (xi, ±1) is the general form for x. 12.2 (a) M = Z2 is 2-primary. (b) M = Z 3 is 3-primary. (c) For a = 0, M = Z 4 is 2-primary; for a = 1, M = 0; and for a = 2, M is 2-primary again. 12.3 • If a / 0, the invariant factor decomposition is M S F[X]/F[X](X - 1) x F[X]/F[Jf](X - 1)(X - 6). If also 6 = 1 , this is the (X - l)-primary form of M and the elementary divisors are (X — 1), (X — l ) 2 . if M i , F[X]/F[X](X - 1)(X - 6) S F[X]/F(X]{X - 1) x F[X}/F[X}(X

- b),

so M has (X — l)-primary component F[X]/F[X](X

- 1) x F[X]/F[X](X - 1)

and (X - 6)-primary component F[X]/F[X](X - b). • If a = 0 and b ^ 1, we find that the invariant factor decomposition is as before, so we get a similar analysis. • If a = 0 and 6 = 1 , M = F[X]/F[X](X

- C ? ^ S r f | ^ / M ^ ( « / - 1) x F[X]/F[X](X

- 1).

Hints and Solutions

240 12.4 M is

By the calculations for Exercise 11.5, the invariant factor form of Z [ t ] / ( l + t ) Z [ i ] x Z[i]/2Z[t] x Z[i]/6Z[i].

We have irreducible factorizations 2 = —t(l + i)2 and 6 = - 1 ( 1 + i)2 ■ 3, so the 1 + i-primary component of M is Z[t]/(1 + t)Z[i] x Z[t]/2Z[t] x Z[i]/2Z[t] and the 3-primary component is Z[t]/3Z[i] all other primary components being 0. The elementary divisors are 1 + *, 2,2,3. In the following solutions, "InFs" means invariant factors, "RCF" rational canonical form, and "JNF" Jordan normal form. 13.1 By direct calculation, the InFs of / X +3 XI-A

=

-

2 X -1 1

4

V o

-4 4 X-l

\ I

are 1,1 and {X - l)(X + I)2 = X3 + X2 - X - I. / 0 0 1 \ / 1 0 0 \ RCF: 1 0 1 , JNF: 0 - 1 0 . \ 0 1 -1 ) \ 0 1 -1 / B turns out to have the same InFs, so the same RCF, JNF. But B itself is not in RCF: the l x l block is the companion matrix of the polynomial X — 1 and the 2 x 2 block is the companion matrix of X2 + 2X + 1, which is not a multiple of X — 1. 13.2 0 0 \ 0 A: InFs 1,1, X 3 , RCF & JNF: 0 0 I : \

\o

i

( 0 0 1\ B: InFs 1,1,Xs - 1, RCF: 1 0 0

Vo i o

o)

b

JNF:

u) = exp(27ri/3). C: I n F s l , l , X - 2 , ( X - 2 ) 3 / 2 0 0 0 \ / 2 1 0 0 \ 0 0 0 1 0 1 0 1 RCF: JNF: 0 1 0 - 4 0 2 1 0 \ 0 0 1 4 ) Copyrigntefl IVfategiali

1

0

0

w

v°

0

°\ 0

2

- y

where

Hints and Solutions

241

D: InFs 1 , . , . , 1, (X - l ) n . Put (X - l)n = Xn then the RCF is and JNF are respectively

(°

0 0. .0 1 0 0 . .0 0 1 0 . .0 0

0 0

0 . .0 0 . .. 1

/o

fn-2 fn-1

/ 1 0 0...0 1 1 0...0 0 1 1 ...0

\

h h

fn^Xn~l

/o;

0 \ 0 0

and

)

0 \ 0

0 0 ... 1 0 0 0 ... 1 1 /

13.3 The calculations of the minors involve only integers, so we get the "same" minors whatever the field of coefficients, but we must interpret integers as their residues in Z 2 , Z 3 and Z 5 in turn. The Fitting ideals may change. For the matrix A of 13.1, the answer changes only over the field Z2, since nowFit 2 (A) = X - l a n d F i t 3 ( J 4 ) = (X-l)3. The InFs are 1,-X-l, (X-l)2, / I 0 0\ the RCF is 0 0 1 , and the JNF is the "same". \0 1 l ) For B of 13.1, the answers are all the "same". Note that B has a 2 x 2 identity submatrix, ensuring that Fit2(i?) = 1 for any field of coefficients. For A,C,D of 13.2, the InFs, RCF and JNF are unchanged apart from interpretation. Note that A, C have 2 x 2 identity submatrices, guaranteeing Fit2 = 1 for any coefficient field, and D has an (n — 1) x (n — 1) submatrix with determinant 1 over Z, giving Fit„_i(Z?) = 1 always. For B of 13.2, the InFs are 1,1, X 3 - 1 whatever the field, so the RCF is unchanged. The JNF will depend on the factorization of X3 — 1 in the field of coefficients, which will vary. Over R,Z 2 and Z 5 , X3 - 1 = (X - l)(X2 + X + 1) with the second factor irreducible. So B has an RCF (left to the reader) but no JNF. Over Z 3 , X3 - 1 = {X - l ) 3 , so B has an RCF and a JNF. 13.4 Since (XI - A)T = XI - AT, the h x h submatrices of XI - AT are all transposes of h x h submatrices of XI — A, and vice versa. Since determinants are not changed by transposition, XI — AT and XI — A have the same Fitting ideals and so the same invariant factors. By (13.3.3), this means that A is similar to AT and hence that the F[X]-modules M(A) and M(AT) are isomorphic. 13.8 It is enough to a^,fMM^?&M^fM, with the desired property.

Hints and Solutions

242

Suppose J 2 = J: displaying 3 rows and columns, we have / A2 0 2A A2 1 2A

giving

0 " ^ 0 2 A •

/ A 0 1 A 0 1

0 0 A

A2 = A and 2A = 1.

These equations are incompatible unless the matrix J is 1 x 1, in which case A = 0,1 work. Ir 0 So, collecting all the l's together, the possible JNF's are for 0

identity matrices of size r = 0 , l , . . . , n . No hints for Chapter 14-'

0

Bibliography [Allenby] R. B. J. T. Allenby, Rings, Fields and Groups, 2nd edition, Ed ward Arnold, London 1991. [A k L] D. M. Arnold k R. C. Laubenbacher, Finitely generated modules over pull-back rings, J. Algebra 184, (1996) 304-332. [A k McD] M. F. Atiyah k I. G. Macdonald, Introduction to Commutative Algebra, Addison-Wesley, Reading, Mass., 1969. [A, R & S] M. Auslander, I. Reitun & S. O. Smal0, Representation Theory of Artin Algebras, Cambridge studies in advanced mathematics 36, Cambridge University Press, Cambridge, 1995. [B k K: IRM] A. J. Berrick k M. E. Keating, An Introduction to Rings and Modules, Cambridge University Press, to appear. [B k K: CM] A. J. Berrick k M. E. Keating, Categories and Modules, Cambridge University Press, to appear. [Cohn 1] P. M. Cohn, Algebra, volume 1, 2nd edition, John Wiley k Sons, Chichester, 1982. [Cohn 2] P. M. Cohn, Algebra, volume 2, John Wiley k Sons, Chichester, 1979. [Cohn: FRTR] P. M. Cohn, Free Rings and their Relations, 2nd edition, Academic Press, London, 1985. [E, L k S] R. B. Eggleton, C. B. Lacampagne k J. L. Selfridge, Euclidean Quadratic Fields, Amer. Math. Monthly 99 (1992) 829-837. [Euclid] Euclid's Elements, volume 2, Dover, New York, 1956. [Grayson] D. R. Grayson, SK\ of an interesting principal ideal domain, J. Pure Appl. Algebra 20 (1981) 157-163. [Green] J. A. Green, The characters of the general linear group, Transac tions Amer. Math. Soc. 80 (1955) 402-447. [G k L] R. M. Guralnick k L. S. Levy, Presentations of modules when ideals need not be principal, Illinois J. Math. 32 (1988) 593-653. 243

244

Bibliography

[G, L & O] R. M. Guralnick, L. S. Levy & C. Odenthal, Elementary divisor theorem for noncommutative PID's, Proc. Amer. Math Soc. 103 (1988) 1003-1012. [H & W] G. H. Hardy & E. M.Wright, An Introduction to the Theory of Numbers, 5th edition, Oxford University Press, Oxford, 1979. [Ischebeck] F. Ischebeck, Hauptidealringe mit nichttrivialer Si^-gruppe, Arch. Math. (Basel) 35 (1980), no. 1-2 138-139. [Jacobson] N. Jacobson, Basic Algebra I, 2nd edition, W. H. Freeman, New York 1985. [J & L] G. D. James & M. W. Liebeck, Representations and Characters of Groups, Cambridge University Press, Cambridge, 1993. [L k. S] R. C. Laubenbacher k. B. Sturmfels, A normal form algorithm for modules over k[x,y]/(xy), J. Algebra 184 (1996) 1001-1024. [McC &: R] J. C. McConnell & J. C. Robson, Noncommutative Noetherian Rings, Wiley-Interscience, John Wiley, Chichester, 1987. [Mac Lane] S. Mac Lane, Homology, 3rd corrected printing, SpringerVerlag, Berlin, 1975. [Marcus] D. A. Marcus, Number Fields, Universitext, Springer-Verlag, Berlin, 1977. [Reiner] I. Reiner, Maximal Orders, Academic Press, London, 1975 [Rotman] J. J. Rotman, An Introduction to Homological Algebra, Academic Press, Boston, Mass., 1979. [Rowen] L. H. Rowen, Ring Theory, volume I, Academic Press, Boston, Mass., 1988. [Sharp] R. Y. Sharpe, Steps in Commutative Algebra, London Math. Soc. Student Texts 19, Cambridge University Press, Cambridge, 1990.

Index Index of Symbols. These grouped according to the type of object to which they relate, as far as possi ble.

M(A):

module given by matrix A, 39 M/L: factor module, 92 R1: free module on J, 121 Tp(M): p-primary component, 128 Pi x • • ■ x Pfc: external direct sum, 114 rank(M): rank of module, 71

Ideals. I fl J: intersection, 9 (a, b): GCD, 22 Ann(M): annihilator, 125 Ann(i): annihilator, 61 I + J: sum of ideals, 9 Ra, aR: principal ideals, 8

Homomorphisms. =: isomorphism, 63 Hom(M, N): set of homomorphisms, 65 lm(0): image, 59 Ker(0): kernel, 58 rank(T): rank, 60 a(a): mult, homomorphism, 54 T(X): mult, homomorphism, 54 id,M'- identity homomorphism, 52 inc: inclusion homomorphism, 52 null(T): nullity, 60

Rings. R, S: rings, 3 F[A"]: polynomial ring, 6 Mn(R): n x n matrices over R, 4 C: the complex numbers, 4 Q: the rational numbers, 4 R: the real numbers, 4 Z: ring of integers , 4 Z[i]: Gaussian integers, 19 Z m : residue ring mod m, 12

Matrices. PC,B- change of basis matrix, 79 C(A): rational canonical form of A, 198 J(A): Jordan normal form of A, 201 J+(p,k): nonsplit Jordan block matrix, 205 JS(A): separable Jordan form, 207

Modules. L + N: sum of submodules, 42 L\®-- -®Lk- internal direct sum, 112 M = L® N: internal direct sum, 108 245

246

Js(p,k):

separable Jordan block matrix, 207 diag(di,..., dn): diagonal matrix, 152 Others. C: strict containment, 13 U(R): unit group, 5 \G\: order of group, 189 vol(L): volume, 188 Index of terms. abelian group, 2, 37, 186 action of a linear transformation, 55 of a matrix, 38 addition, 1 additive group, 1, 37 adjoint, 86 algebraically closed field, 203 algorithm for diagonalization, 156 annihilator, 124 of an element, 61 of module, 125 Artinian, 221 associated matrices, 155 associates, 23 basis, 71 change of, 77 infinite, 236 of vector space, 44 standard, 70 block diagonal action, 113 canonical basis, 30 canonical homomorphism, 92 Cartesian product, 114 Cay ley-Hamilton Theorem, 143 characteristic matrix, 142Q&ffi

Index characteristic polynomial, 46, 104, 142, 143, 167, 195 Chinese Remainder Theorem, 116 cofactor, 85 column operations elementary, 146 column space, 60 common denominator, 76 commutative group, 2 ring, 4 companion matrix, 100, 196 complement, 108, 112 component, 108, 112 composition series, 104 congruence, 10 conjugate matrix, 194 coordinate vector, 80 coprime, 23 cyclic decomposition, 178 cyclic group representations of, 226 cyclic module, 43, 61, 94 over Euclidean domain, 97 over polynomial ring, 100 decomposition, 108 Dedekind domain, 225 defining relations, 136 degree, 6, 18 determinant, 85 diagonal matrix, 152 diagram, 98 direct product of rings, 120 direct sum k components, 112 external, 114 infinite, 121 internal, 108 distinct irreducible elements, 23 division algorithm, 18 iferialon. ring, 102

Index domain, 4 eigenspace, 46 eigenvalue, 46 eigenvector, 46, 58 left vs. right, 66 Eisenstein's Criterion, 27 elementary divisors, 181 elementary Jordan matrix, 200 elementary operations matrix interpretation, 149 elementary row & column opera tions, 145 endomorphism ring, 65 entire ring, 4 equivalence problem, 145,155, 167 equivalence relation, 11 equivalent matrices, 155 Euclid's algorithm, 22 Euclidean domain, 17 external direct sum, 114 factor module, 92 factorization standard, 25 unique, 24 field, 5 as Euclidean domain, 18 of fractions, 5 quotient, 5 finite dimensional, 44 First Isomorphism Theorem, 94 Fitting ideal, 163 free module, 71 infinite, 121 standard, 70 free resolution, 143 fundamental parallelepiped, 188 Fundamental Theorem of Algebra, 26 Gauss' Lemma, 27

247 Gaussian integers, 19, 26, 33 GCD, 21 generator, 43 of ideal, 8 generators, 43 greatest common divisor, 21 group abelian, 2 additive, 1 commutative, 2 multiplicative, 2 representation, 222 group ring, 222 hereditary ring, 225 homological algebra, 143, 225 homomorphism canonical, 92 identity, 52 inclusion, 52 induced, 93 multiplication, 54 of F[X]-module, 56 of groups, 52 of modules, 51 of rings, 66 product, 53 product of, 53 sum, 53 ideal, 7 left, 7 maximal, 13 prime, 14 principal, 8, 21 proper, 7 right, 7 two-sided, 7 zero, 7 identity operation, 146 Idiot's Cancellation Lemma, 103 Jte«§/e, 59, 64

248

inverse, 64 indecomposable module, 111 independent vectors, 44 induced homomorphism, 93 Induced Mapping Theorem, 94 infinite matrices, 122 injective, 59 integral domain, 4 intersection of ideals, 9 of submodules, 42 invariant factor, 152 standard choice, 167 trivial, 195 invariant factor decomposition, 178 invariant factor form, 152 invariant factors of XI - A, 195 of a module, 178 uniqueness, 166 invariant subspace, 45 inverse, 4 inverse image, 64 invertible, 4, 63 invertible matrix, 158 irreducible, 23 standard, 25 irreducible elements distinct, 23 irreducible representation, 223 isomorphism of modules, 63 of rings, 15, 66 Jordan normal form, 201 nonsplit, 203 separable, 207 split case, 200 Jordan-Holder Theorem, 104 kernel, 58 Kernel and Image TheoreOtj 60

Index Lagrange's Theorem, 191 lattice, 187 full, 187 leading term, 6 left ideal, 7 left module, 35 linear independence, 71 linear map, 52 linear transformation, 38, 52 long division, 18 Maschke's Theorem, 224, 227 matrix action, 38 adjoint, 86 change of basis, 79 invertible, 85, 158 of homomorphism, 82 ring, 4 root of unity, 209 scalar, 39 minimal ideal, 220 minimum polynomial, 196 minor, 163 module p-primary, 127 cyclic, 43, 61, 94 defined by relations, 136 factor, 92 finitely generated, 44 homomorphism, 51 indecomposable, 111 isomorphism, 63 left, 35 over Z, 37 over commutative ring, 36 over polynomial ring, 39, 55 projective, 215 quotient, 92 regular, 36 relation, 135 afight, 36

Index simple, 41, 50 sum fc-fold, 43 torsion, 124 torsion-free, 124 trivial, 40 zero, 36 monic, 22 multiplicative group, 2 nilpotent matrix, 209 Noetherian ring, 143, 169 normal form, 193 null space, 60 nullity, 60 one-to-one, 59 onto, 59 order of a module, 191 ordered set, 121 orthogonal idempotents, 121 p-primary module, 127 component, submodule, 128 polynomial constant, 6 degree, 6 monic, 22 root of, 30 split, 31 zero, 6 polynomial ring, 5 presentation, 134 diagonal, 175 invariant factor, 175 presentation homomorphism, 140 presentation matrix, 140 primary decomposition, 179 prime ideal, 14 principal ideal domain, 32 product, 2

249 product of homomorphisms, 53 proofs in full detail, 8, 9, 42, 59, 63, 93 proper submodule, 41 purely algebraic, 193 quaternion algebra, 105 quotient module, 92 rank, 60, 70 of a module, 178 of free module, 71 Rank & Nullity Theorem, 60 rational canonical block matrix, 100 rational canonical form, 198 reducible, 23 reflexive, 11 regular module, 36 relation, 135 relation module, 135 representation of a group, 222 residue class, 11 residue ring, 10 residues of integers, 12 right ideal, 7 right module, 36 ring, 3 commutative, 4, 36 domain, 4 factor, 10 homomorphism, 66 isomorphism, 66 of diagonal matrices, 14, 216 of dual numbers, 226 of Gaussian integers, 19, 26, 33 of integers as Euclidean domain, 18 of matrices, 4, 8, 50 infinite, 122

Index

250

of polynomials, 5, 29 as ED, 18 complex, 26 over a finite field, 28 rational, 27 real, 27 of scalars, 36 of triangular matrices, 15 quotient, 10 residue, 10, 28, 29 trivial, 3 zero, 3 row & column operations general, 150 row operations elementary, 146 scalar matrix, 39 scalar multiplication, 35 Second Isomorphism Theorem, 103 semisimple, 220 separable Jordan normal form, 207 separable polynomial, 206 set of generators, 44 similar matrices, 194, 199 similarity problem, 193 simple module, 41 Smith normal form, 152 spanning set, 44 split homomorphism, 216 split polynomial, 31 splitting field, 31 standard basis, 60 standard factorization, 25 standard irreducible, 25 standard matrix units, 120 subgroup, 41 submodule, 40 generated by set, 44 intersection, 42 of cyclic module, 96 proper, 41

sum, 42 torsion, 124 zero, 41 sum of homomorphisms, 53 of ideals, 9 of submodules, 42 summand, 108, 112 surjective, 59 Sylow subgroup, 187 symmetric, 11 Third Isomorphism Theorem, 103 torsion, 124 module, 124 submodule, 124 torsion-free, 124 transitive, 11 transpose matrix is similar, 212 trivial invariant factor, 195 trivial module, 40 two-sided ideal, 7 underlying space, 55 Unique Factorization Theorem, 24 unit, 4 unit group, 5 vector space, 36 basis, 44 spanning set, 44 volume, 188 Wedderburn-Artin Theorem, 221 zero ideal, 7 zero module, 36 zero submodule, 41