Universitext Editorial Board (North America) :
J.H. Ewing F.W. Gehring P.R. Halmos
Universitext Editors (North America): J.H. Ewing, F.W. Gehring, and P.R. Halmos AksoylKhamsl: Nonstandard Methods in Fixed Point Theory Aupetlt: A Primer on Spectral Theory Bachumlkern: Linear Programming Duality BenedettllPetronlo: Lectures on Hyperbolic Geometry Berger: Geometry I, II (two volumes) BlledtnerlHansen: Potential Theory BoosslBleecker: Topology and Analysis CariesonlGamelln: Complex Dynamics Cecil: Lie Sphere Geometry: With Applications to Submanifolds Chandrasekharan: Classical Fourier Transforms Charlap: Bieberbach Groups and Flat Manifolds Chern: Complex Manifolds Without Potential Theory Cohn: A Classical Invitation to Algebraic Numbers and Class Fields Curtis: Abstract Linear Algebra Curtis: Matrix Groups van Dalen: Logic and Structure Das: The Special Theory of Relativity: A Mathematical Exposition DiBenedetto: Degenerate Parabolic Equations Dlmca: Singularities and Topology of Hypersurfaces Edwards: A Formal Background to Mathematics I alb Edwards: A Formal Background to Mathematics II alb Emery: Stochastic Calculus Foulds: Graph Theory Applications Frauenthal: Mathematical Modeling in Epidemiology FukhslRokhlln: Beginner's Course in Topology GaliotIHullnlLafontalne: Riemannian Geometry Gardiner: A First Course in Group Theory Glrdlngfl'ambour: Algebra for Computer Science Godblllon: Dynamical Systems on Surfaces Goldblatt: Orthogonality and Spacetime Geometry Hahn: Ouadratic Algebras, Clifford Algebras, and Arithmetic Witt Groups Hlawka/Schoissengelerrraschner: Geometric and Analytic Number Theory Howe/Tan; Non-Abelian Harmonic Analysis: Applications of SL(2,R) Huml/MllIer: Second Course in Ordinary Differential Equations Hurwltz/KrItlkos: Lectures on Number Theory Iversen: Cohomology of Sheaves JoneslMorrls/Pearson: Abstract Algebra and Famous Impossibilities Kelly/Matthews: The Non-Euclidean Hyperbolic Plane Kempf: Complex Abelian Varieties and Theta Functions Kostrlkln: Introduction to Algebra KrasnoselskillPekrovskll: Systems with Hysteresis Luecklng/Rubel: Complex Analysis: A Functional Analysis Approach MacLane/Moerdljk: Sheaves in Geometry and Logic Marcus: Number Fields McCarthy: Introduction to Arithmetical Functions Meyer: Essential Mathematics for Applied Fields
(continued after index)
Arthur Jones Sidney A. Morris Kenneth R. Pearson
Abstract Algebra and Famous Impossibilities With 27 Illustrations
Springer-Verlag New York Berlin Heidelberg London Paris Tokyo Hong Kong Barcelona Budapest
Arthur Jones Kenneth R. Pearson Department of Mathematics La Trobe University Bundoora 3083 Austra lia
Sidney A. Morris Faculty of Informatics University ofWollongong Wollongong 2500 Australia
Editorial Board (North America): J.H. Ewing Department of Mathematics Indiana University Bloomington, IN 47405, USA
F.W. Gehring Department of Mathematics University of Michigan Ann Arbor, MI 48109, USA
P.R. Halmos Department of Mathematics Santa Clara University Santa Clara , CA 95053, USA
Library of Congress Cataloging-in-Pub lication Data Jones, Arthur, 1934Abstract algebra and famous impossibilities / Arthur Jones, Sidney A. Morris , Kenneth R. Pearson p. cm. - (Universitext) Includes bibliographic references and index. ISBN 0-387-97661-2 I. Algebra , Abstract 2. Geometry - Problems, Famous . I. Morris, Sidney A., 1947- . II . Pearson, Kenneth R. III. Title. QA162.J65 1992 512'.02-dc20 91-24830 Printed on acid-free paper.
© 1991 Springer-Verlag New York, Inc. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York , Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly ana lysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden . The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Mark s Act, may accordingly be used freely by anyone . Camera -ready copy provided by the authors. Printed and bound by R.R. Donnelley and Sons, Harr isonburg, VA. Printed in the United States of America . 9 8 7 6 5 4 3 2 (Corrected second printing, 1994) ISBN 0-387-97661-2 Springer-Verlag New York Berlin Heidelberg ISBN 3-540-97661-2 Springer-Verlag Berlin Heidelberg New York
Preface
The famous problems of squaring the circle, doubling the cube and trisecting an angle captured the imagination of both professional and amateur mathematicians for over two thousand years. Despite the enormous effort and ingenious attempts by these men and women, the problems would not yield to purely geometrical methods. It was only the development. of abstract algebra in the nineteenth century which enabled mathematicians to arrive at the surprising conclusion that these constructions ar e not possible. In this book we develop enough abstract algebra to prove that these constructions are impossible. Our approach introduces all the relevant concepts about fields in a way which is more concrete than usual and which avoids the use of quotient structures (and even of the Euclidean algorithm for finding the greatest common divisor of two polynomials) . Having the geometrical questions as a specifi c goal provides motivation for the introduction of the algebraic concepts and we have found that students respond very favourably. We have used this text to teach second-year students at La Trobe University over a period of many years, each time refining the material in the light of student performance. The text is pitched at a level suitable for students who have already taken a course in linear algebra, including the ideas of a vector space over a field, linear independence, basis and dimension. The treatment, in such a course, of fields and vector spaces as algebraic objects should provide an adequate background for the study of this book. Hence the book is suitable for Junior/Senior courses in North America and second-year courses in Australia. Chapters 1 to 6, which develop the link be tween geometry and algebra, are the core of this book. These chapters contain a complete solution to the three famous problems, except for proving that 7r is a transcendental number (which is needed to complete the proof of the impossibility of squaring the circle). In Chapter 7 we give a selfcontained proof that 7r is transcendental. Chapter 8 contains material ' ab out fields which is closely related to the topics in Chapters 2-4,
vi
Famous Impossibilities
although it is not required in the proof of the impossibility of the three constructions. The short concluding Chapter 9 describes some other areas of mathematics in which algebraic machinery can be used to prove impossibilities. We expect that any course based on this book will include all of Chapters 1 - 6 and (ideally) at least passing reference to Chapter 9. We have often taught such a course which we cover in a term (about twenty hours) . VVe find it essential for the course to be paced in a way that allows time for students to do a substantial number of problems for themselves. Different semester length (or longer) courses including topics from Chapters 7 and 8 are possible. The three natural parts of these are (1) Sections 7.1 and 7.2 (transcendence of e), (2) Sections 7.3 to 7.6 (transcendence of 7l'), (3) Chapter 8. These are independent except, of course, that (2) depends on (1). Possible extensions to the basic course are to include one, two or all of these. While most treatments of the transcendence of 7l' require familiarity with the theory of functions of a complex variable and complex integrals, ours in Chapter 7 is accessible to students who have completed the usual introductory real calculus course (first-year in Australia and Freshman/Sophomore in North America). However instructors should note that the arguments in Sections 7.3 to 7.6 are more difficult and demanding than those in the rest of the book. Problems are given at the end of each section (rather than collected at the end of the chapter) . Some of these are computational and others require students to give simple proofs. Each chapter contains additional reading suitable for students and instructors. \Ve hope that the text itself will encourage students to do further reading on some of the topics covered. As in many books, exercises marked with an asterisk >I: are a good bit harder than the others. We believe it is important to identify clearly the end of each proof and we use the symbol - for this purpose. We have found that students often lack the mathematical maturity required to write or understand simple proofs. It. helps if students write down where the proof is heading, what they have to prove and how they might be able to prove it. Because this is not part of the formal proof, we indicate this exploration by separating it from the proof proper by using a box which looks like
P reface
VII
(Incl ude here what must be proved et c.)
Experience has shown th at it helps students to use this material if import ant theorems ar e given sp ecific names which suggest their conte nt . We have enclosed these names in square br ackets b efore the state me nt of th e th eorem . We encourage stude nts to use these names when justifying th eir solutions to exercises. They oft en find it convenient to abbreviate the names to just the relevant initials. (For exam ple, the name "Small Degree Irreducibility Theorem " can b e abbreviated to S.D .LT.) We ar e especially grateful to our colleague Gary Davis, who pointed the way towards a more concret e treatment of field extensions (using residue rings rather than qu oti ent ring s) and thus mad e th e course accessible to a wider class of stude nts. We are grateful to Ernie Bowen , J eff Brooks, Grant Cairns, Mike Canfell, Bri an Davey, Alistair Gray, P aul Halmos, P eter Hodge, Alwyn Horadam, Deborah Kin g, Margar et Mclntyre, Bernhard H. Neumann, Kristen Pearson , Suzanne Pearson, Alf van der Poorten , Brailey Sims, Ed Smi th and Pet er Stacey, who have given us helpful feedback, mad e suggestions and ass isted with the proof reading. We thank Dorothy Berridge, Ernie Bowen , Helen Cook , Margar et McDonald and Judy St orey for skilful Tgxing of the tex t and diagrams, and Norman Gaywood for assist ing with the ind ex. A.J. , S.A.M., K.R.P. April 1991
Contents
Preface
v
Introduction
1
0.1 0.2 0.3
Chapter 1.1 1.2 1.3 1.4
Three Famous Problems Straightedge and Compass Constructions Impossibility of the Constructions
1 Algebraic Preliminaries Fields, Rings and Vector Spaces Polynomials.. The Division Algorithm
7 8 13 17
The Rational Roots Test Appendix to Chapter 1
20 24
Chapter 2 Algebraic Numbers and Their Polynomials 2.1 Algebraic Numbers 2.2 Monic Polynomials 2.3
1 3 3
Monic Polynomials of Least Degree
Chapter 3
Extending Fields
27 28 31 32
39
3.1 3.2
An Illustration: Q( J2) Construction of IF (0:) . .
40 44
3.3 3.4
Iterating the Construction Towers of Fields ..
50 52
Chapter 4 Irreducible Polynomials 4.1 Irreducible Polynomials 4.2 Reducible Polynomials and Zeros
61 62 64
4.3
Irreducibility and irr( a, IF)
68
4.4
Finite-dimensional Extensions
71
ix
x
Fam ous Impossibilities
Chapter 5
Straightedge and Compass Constructions
75
5.1
Standard Straight edge and Compass Constructions
76
5.2
Products, Qu otients, Square Ro ots . .
85
5.3
Rules for Straigh t edge and Compass Constru ction s
89
5.4
Constructible Numbers and Fields .
93
Chapter 6
Proofs of the Impossibilities
99
6.1
Non-Constructible Numbers
100
6.2
The Three Constructions ar e Impossible
103
6.3
Proving th e "All Constructibles Come From Squ ar e Root s" Theorem ..
Chapter 7
Transcendence of e and
108 1T"
115
7.1 7.2
Preliminaries e is Transcendental
116 124
7.3
Preliminaries on Symmetri c Polyn omials
134
7.4
1r
7.5
Preliminari es on Compl ex-valu ed Integrals
7.6
1r
Chapter 8
is Tr an scendental - Part 1 is Tran scendental - P ar t 2 An Algebraic Postscript
146 149 153 163
8.1
The Ring F[X] p(x )
164
8.2 8.3
Division and Reciprocals in F[X]p(x ) Reciprocals in F (0')
165 171
Chapter 9
Other Impossibilities and Abstract Algebra 177
9.1
Con structi on of Regular Polygons
178
9.2
Solution of Quinti c Equations
179
9.3
Integration in Closed Form
181
Index
183
Introduction
0.1
Three Famous Problems
In this book we discuss three of the oldest problems in mathematics. Each of them is over 2,000 years old. The three problems are known as : [I] doubling the cub e (or duplicating the cube, or the Delian problem) ; [II] trisecting an arbitrary angle; [III] squaring the circle (or quadrature of the circl e). Problem I is to construct a cube having twice the volume of a given cube. Problem II is to describe how every angle can be trisected. Problem III is that of constructing a square whose area is equal to that of a given circle. In all cases, the constructions are to be carried out using only a ruler and compass. Reference to Problem I occurs in the following ancient document supposedly written by Eratosthenes to King Ptolemy III about the year 240 B.C. : To King Ptolemy, Etetostheues sends greetings. It is said the: one of the ancient tragic poets represented Minos as preparing a tomb for Glaucus and as declaring, when lle learnt it was a hundred feet eedi way: "Small indeed is the tomb tboii hast chosen for a royal burial. Let it be double [in volume}. And ihou slielt not miss the: fair form if thou quickly doublest eecli side of the tomb ." But iie was wrong. For when the sides are doubled, the surface [area} becomes four times as great, and tlle volume eigllt times. It became a subject of inquiry among geometers in whet. manner one miglit. double the given volume without cllanging the sliepe. And this problem was called the duplication of the cube, for given a cube tlley sought to double it .. . The origins of Problem II are obscure. The Greeks were concerned with the problem of constructing regular polygons, and it is likely 1
Fam ous Impossibilities
2
that the trisection problem arose in this context. This is so becau se the construction of a regul ar polygon with nin e sides necessitates th e tri section of an angl e. The history of Problem III is linked to that of calculating the area of a circl e. Information about this is contained in the Rhind Papyrus, perhaps th e best known an cient mathemati cal manuscript, whi ch was brought by A.H. Rhind to the British Mus eum in th e nin eteenth century. The manuscript was copi ed by th e scribe Ahmes about 1650 B.C . from an even older work. It states that the ar ea of a circle is equal to that of a square whose sid e is th e diamet er diminished by one ninth; that is, A = ( ~)2 d2 • Comparing this with the formula
A
= 7rT 2 = 1r~ gives
7r
= 4. (~r =
2:
6 1
= 3.1604 . ..
.
The Papyrus contains no explan ation of how this formula was obtained. Fift een hundred years later Archim edes showed th a t 10 10 1 3 -< 1r<3-=3-. 71 70 7 (Note that 3¥l- = 3.14084 ... , 3~ = 3.14285 .. . , 1r = 3.14159 .. .. ) Throughout the ages th ese problems were tackled by most of the best mathemati cian s. For some reason , amateur ma th ematicians were also fascinated by them . In the time of th e Gr eeks a sp ecial word was used to describe peopl e who tried to solve Problem III TETpcr ywv/( EW (t etragonidzein ) which means to occupy oneself with th e quadrature. In 1775 the Paris Academy found it necessary to protect its officials from wasting their time and energy examining purported solutions of these problems by amateur mathematicians. It passed a resolution (Histoire de l'Acad emic royal e, annee 1775, p. 61) that no more solutions were to be examined of th e problems of doubling the cube, trisecting an arbitrary angl e, and squaring th e circle and that th e same resolution should apply to machin es for exhibit ing perpetual mo tion. (See [EH, p. 3].) The problems were finally solved in the nin et eenth century. In 1837, Wantz el settled Problems I and II. In 1882, Lind emann disposed of Problem III.
Introduction
0.2
3
Straightedge and Compass Constructions
Construction problems are, and have always been, a favourite topic in geometry. Using only a rul er and compass, a great vari ety of constructions is possible. Some of these construct ions are described in detail in Section 5.1: a line segment can be bisect ed; any angle can be bisected; a line can be drawn from a given point perpendicular to a given line; et c. In all of these problems the ruler is used merely as a straight edge, an instrument for drawing a straight line but not for measuring or marking off distances. Such a rul er will be referred to as a straightedge. Problem I is that of constructing with compass and straightedge a cube having twice the volume of a given cube. If the sid e of the given cube has length 1 unit, then the volume of the given cube is 13 = l. So the volume of the larger cube should be 2, and its sid es should thus have length .y2. Hen ce the problem is reduced to that of constructing, from a segment of length 1, a segment of length ~ . Problem II is to produce a construction for trisecting any given angle. While it is easy to give examples of particular angles which can be trisected, the problem is to give a construction which will work for every angle. Problem III is that of cons truct ing with compass and straightedge a square of area equal to that of a given circle. If the radius of th e circle is taken as one unit, the ar ea of the circl e is Ti, and therefore the area of the construct ed square should b e Ti; that is, the side of the square should be ,fir. So the problem is reduced to that of constructing, from a segment of length 1, a segment of length ,fir.
0.3
Impossibility of the Constructions
Why did it take so many centuries for these problems to be solved? The reasons are (i) the required constructions are impossible , and (ii) a full understanding of these problems comes not from geometry but from abstract algebra (a sub ject not born until the nineteenth century). The purpose of this book is to introduce this algebra and show how it is used to prove the impossibility of these constructions.
Famous Impossibilities
4
A real number "I is said to be constructible if, starting from a line segment of length 1, we can construct a line segment of length 1"11 in a finite number of steps using straightedge and compass. We shall prove, in Chapters 5 and 6, that a real number is constructible if and only if it can be obtained from the number 1 by successive applications of the operations of addition, subtraction, multiplication, division, and taking square roots. Thus, for example, the number is constructible. Now ~ does not appear to have this form . Appearances can be deceiving, however. How can we be sure? The answer turns out to be that if ~ did have this form , then a certain vector space would have the wrong dimension! This settles Problem 1. As for Problem II, note that it is sufficient to give just one example of an angle which cannot be trisected. One such example is the angle of 60°. It can be shown that this angle can be trisected only if cos 20° is a constructible number. But, as we shall see in Chapter 6, the number cos 20° is a solution of the cubic equation
8x 3
-
6x - 1 = 0
which does not factorize over the rational numbers. Hence it seems likely that cube roots, rather than square roots, will be involved in its solution, so we would not expect cos 20° to be constructible. Once again this can be made into a rigorous proof by considering the possible dimensions of a certain vector space. As we shall show in Chapter 6, the solution of Problem III also hinges on the dimension of a vector space. Indeed, the impossibility of squaring the circle follows from the fact that a certain vector space (a different one from those mentioned above in connection with Problems I and II) is not finite-dimensional. This in turn is because the number 7r is "transcendental" , which we shall prove in Chapter 7.
Additional Reading for the Introduction More information about the background to the three prohlems can be found in the various books on the history of mathematics listed below. References dealing specifically with Greek mathematics include [.JG], [Til] and [FLa]. A detailed history of Problem III is given in [Ell].
Introduction
5
The origin al solutions to Problems I and II by Wan t zel a re in [MWJ a nd th e original solution to Problem III by Lind em ann is in [FL] . [E B1] E.T. Bell , Men of Math ematics, Simon a nd Shu ster, New York , 1937. [EB2] E.T . Bell , The Developm ent of Mathematics, McGraw-Hill , New York, 1945. [CB] C.B. Boyer , A History of Math ematics, Wiley, New York , 1968. [RC] R. Cou rant an d H. Robbins, What is Mathemat ics?, Ox ford University Press, New York, 1941. [AD] A. De Morgan , A Budget of Pamdoxes, Dover, New York , 1954. [UDj U. Dudley, A Budget of Trisection s, Springer-Verlag, New York, 1987 . [JG] J . Gow, A Short llistory of Greek Mathematics, Chelsea, New York, 1968. [CH] C .R . Hadlock, Field Theori; and its Classical Problems, Ca rns Mathematical Monographs, No. 19, Mathematical Association of Ame rica , 1978. [T H] T.L. Hea th , A llistoru of Greek Mat hematics Vol.! £3 ll, Clarendon Pr ess, Oxford, 1921. [Ell] E .W . Hobson , Squaring the Circle, Cambri dge Univer sity Pr ess, 1913; reprinted in Squaring the Circle, and Other Monographs , Chelsea , 1953. [HH] H.P. Hudson , Rul er and Compass, Longmans Gr een, 1916 ; reprinted in Squar ing the Circle, anti Other Monographs, C helsea, 1953. [F K] F . Klein , Famous Problem s, and other monographs, Ch elsea , New York , 1962. [MK] M. Klin e, Math emat ical Thought from An cient to Moder n Ti mes, Oxford University Pr ess, 1972 . [F La] F . Lasserre, The Birth of Math emat ics in the Age of Plato , Hut chin son , Lond on , 1964. [FL] F. L. Lind emann, " ti ber die Zahl 11"", Math emat ische Annalen, 20 ( 1882) , 213-225. [OS] D.J . Struik , A Concise llistory of Mathemat ics, Dover , New York , 1967. [VS] V. San ford , A Short /li story of Math emat ics, lIarrap , Loudon, 1958. [HT ] H.W . Turnbull , The Great Mathematicians, Methuen , Lond on 1933. [MW J M.L. Wantzel , "Recherches sur les moyen s de reconnaitrc si un Probl em e de Geometrie peut se resoud re avec la regle et Ie compas" , Journal de Math emat iques Pures et Appliqu ees, 2 (1837), 366-372.
CHAPTER 1
Algebraic Preliminaries
This chapter presen ts tlie background algebra on which the rest of this book dep ends. Mu cli of this m aterial should be familiar to you. TIle Rational R oots Tes t , however, will probably be new to you. I t provides a nice illustra tion of l lOW polynomials can be used to study cert ain properties of real numbers; for exa m ple, we use it to sho w th at tue numbers -Y2, ..j2 + J3, and sin 20° are irra tional. T he applica tion of p olynomials to the st udy of numbers wil! be an impor tan t theme th.rougho ut th e book.
7
8
Famous Impossibilities
1.1
Fields, Rings and Vector Spaces
In this section we summarize the main ideas and terminology of fields , rings, and vector spaces which we shall use throughout this book. If you are not already familiar with some of this material, we refer you to the Appendix to this chapter for more details. Fields A familiar example of a field is the set Q of all rational numbers. For this set, the usual operations of arithmetic
addition, subtraction, multiplication, and division (except by 0) can be performed without restriction. The same is true of the set IR of all real numbers, which is another example of a field. Likewise the set C of all complex numbers, with the above operations, is a field. A formal definition of a field is given in the Appendix to this chapter. Strictly speaking, a field consists of a set F together with the operations to be performed on F. When there is no ambiguity as to which operations are intended, we simply refer to the set F as the field. If F is a field and IE ~ F it may happen that IE also becomes a field when we apply the operations on F to its elements. If this happens we say that IE is a subfield of F and , reciprocally, that F is an extension field of E. Thus Q is a subfield of both C and IR while IR and Care both extension fields of Q. An important relationship between Q and C is expressed in the following proposition . 1.1.1 Proof.
Q is the smallest subfield of C.
Proposition.
We are to show that if F is any subfield of C then Q ~ F.
So we let F be any subfield of C. To prove that Q
~
F we shall show that
if
x EQ
then
x E F.
Let x E Q; that is, . _ ::I, -
S1' '
where
1',
s are integers an d
S
-Ir
O.
Algebraic Preliminari es
9
Our ai m is to prove t hat :r E F . To do th is we use the field prope rt ies of F .
As IF is a field , the number 1 must be in F. Therefore ea ch posi tive int eger is in F , as F is closed und er addition. So each negative int eger is in F , as F is closed under subtrac tion. Also 0 E F , as F is a field. Hence the integers l' and s ar e in IF. Since IF is closed under divi sion (and 8 i= 0) , the quotient 1'/ 8 E IF . _ So x E F, as required. There are , in fact , lots of int eresting fields which lie between Q and C and we shall devote much time to studying th em. Of course not all fields ar e subfields of C. To sec thi s, consider the following . For each positive integer n pu t l" = {O, 1, 2, ... , n - I} .
We define addition of elements a and b in l " by a EB" b = a + b (mod n) ;
that is, add a and b in the usual way and th en subtract mul tiples of n un til the an swer lies in th e set l ". We define mul tiplication simil arly: a 0 " b = a x b (mod n ).
For example, 3 EB6 4 = 1
2 EB6 1 = 3 , 3 064= 0 9 0 JO 7 = 3.
(subtract 6 from 7) , (subtract 12 from 12) ,
It can b e shown th at if n is a prime number then l " is a field . It is obvious that in this case L; is not a subfield of C. For example, if n = 2, 1 EB2 1 = 0 in l 2, bu t 1 + 1 = 2 in C; so the operation of addition in the field l2 is different from that in the field C. Unlike our previous examples Q , Rand C, the field I" is a fin it e fi eld; th at is, it has only a finite numb er of elements.
Famous Impossibilities
10
Rings The concept of a ring is less restrictive than that of a field in that we no longer require the possibility of division but only of addition, subtraction, and multiplication.
Thus the set l of all the integers is an example of a ring which is not a field, whereas the set N of all natural numbers is not a ring. We even allow the concept of multiplication itself to be liberalized. We do not require it to be commutative; that is, ab may be unequal to ba. We also permit zero-divisors, so that it is possible to have ab = 0 but neither a nor b is zero .
Observe that if M is the set of all 2 x 2 matrices with entries from IR, then lit! is a ring with the ring operations being matrix addition and matrix multiplication. However, 111 is not a field since if A
then
=
G~ )
AB =
and
B
= (~ ~ )
(~; ;~) # B A = G~ ~:).
Note also that this ring 111 has zero-divisors; for example, if C
= (~ ~ )
and
D
= (~ ~ )
then
CD
= (~ ~) .
A formal definition of a ring is given in the Appendix to this chapter.
Vector Spaces Recall from linear algebra that a vector space consists of a set V together with a field IF. The elements of 1/ are called vectors and the elements of F are called scalars. Any two vectors can be added to give a vector and a vector can be multiplied by a scalar to give a vector. A formal definition of a vector space is given in the Appendix to this chapter. When there is no danger of ambiguity it is customary to omit reference to both the scalars and the operations and to regard V itself as the vector space. In our study, however, the field of scalars will be constantly changing and to omit reference to it would almost certainly lead to ambiguity. Hence we shall include the field IF in our description and refer to "the vector space V over IF ".
11
Algebraic Preliminari es
Two other notions which you need to recall are those of spanning and linear independence. If VI, V2,. " ,Vn are vectors in the vector space V over the field F, then we say that the set {Vb V2,' .. , v n } of these vectors spans V over F providing that every vector V E V can be written as a linear combination of VI, V2 , • . . , V n ; that is,
for some Ab A2, " " An in F . A vector space is said to be finite-dimensional if there is a finite set of vectors { VI , V2 , ... , v n } which spans it . The set { VI , V2," ., v n } of vectors is said to be linearly independent over F if the zero vector, 0, can be written only as a trivial linear combination of VI , V2 , .. • , V n ; that is,
(Ai E iF) implies that Al = A2 = ... = An = O. The set {VI , V2 ,' .. ,vn } is said to be a basis for V over F if it is linearly independent over iF and spans V over F . While a finite-dimensional vector space can have an infinite number of distinct bases, the number of vectors in each of the bases is the same and is called the dimension of the vector space. Recall also that, in a vector space of dimension n, any set with more than n vectors must be linearly dependent. (For example, any set of four vectors in a three-dimensional space is linearly dependent.)
Pairs of Fields The way in which vector spaces enter our study is as pairs of fields , one of which is a subfield of the other. The most intuitive example, in this regard, is the vector space C over IR in which we take C for the vectors and IR for the scalars. This vector space ar ises from the pair of fields C and IR, in which IR is a subfield of C. We may think of the elements of C as points in the plane and then the vector space operations correspond exactly to the usual vector addition and scalar multiplication in the plane. More generally, if E and IF are fields with E a subfi eld of IF , we may take iF for the vectors and E for the scalars to get the vector space iF over E (Note that th e smaller field is the scalars and the larger one is the vectors.)
12
Famous Impossibilities
1.1.2 Definition. If the vector space IF over IE is finite-dimensional its dimension is denoted by [F : IE] and called the degree of IF over IE .
•
For example, it is readily seen that [C : R] = 2 and [R : R] = 1.
Exercises 1.1 1. (a) Which of (i) the (ii) the (iii) the (iv) the (v) the (vi) the (vii) the (viii) the
the following are meaningful? vector space Cover R; vector space Rover R; vector space Rover C; vector space Cover Q; vector space Cover C; vector space Q over Q; vector space Q over R; vector space Q over C. (b) Which of the above vector spaces has dimension 2? Write down a basis in each such case.
2. How may subfields of Q are there? (Justify your answer.) 3. (a) Is {(I + iV2), (V2 + 2i)} a linearly independent subset of the vector space Cover R? (b) Is {(I + iV2), (V2 + 2i)} a linearly independent subset of the vector space Cover Q? 4. (a) Let M 2(R) be the set of all 2 x 2 matrices with entries from R What are the natural vector space addition and scalar multiplication which make M 2(R) a vector space over R? (b) Find a basis for M 2(R). (c) Let M 2(C) be the set of all 2 x 2 matrices with entries from C. Show that it is a vector space over R and find a basis for it. Is it also a vector space over C? 5. LetF={a+bV2:a,bEQ} . (a) Prove that IF is a vector space over Q and write down a basis for it. (b)* Prove that IF is a field. (c) Find the value of [IF : Q].
Algebraic Preliminari es
13
6. A ring R is said to be an in tegral doma in if it s multiplication is commutative, if there is an element 1 such that l. x = x. I = x for all x E R, and if there are no zero-divisors . (a) Is I4 an integral domain ? (b) * For what n E N is I n an integral domain?
7. Let E be a subfield of a finite field F such that [F : E] = n, for some n E N. If E has m element s, for some m E N, how many element s does F have? (J ustify your an swer. )
8. (a ) If A , B , and C are finit e fields with A a subfield of Band B a subfield of C , prove that [C : A] = [C : B].[B : A]. [Hint. Use the result of Ex ercise 7.] (b) If furthermore [C: A] is a prime number , deduce that B = C or B = A. 1.2
Polynomials
The simplest and most naive way of describing a p olyn omial is to say that it is an "expression" of the form (1)
The coefficients a o, ail . . . , an are ass umed to be long to som e ring R . The above expression is then called a polyno m ial over th e ring R. As yet we have said nothing abo ut the symb ol x appearing in the ab ove expression . There are, in fact, two distinct ways of inte rpret ing this symbol. On e of these ways leads to the conce pt of a polynomial form, the other t o t hat of a polynomial fun ction. Polynomial Forms To arrive at the conce pt of a polynomial form , we say as little as possible about the symbol "x" . Vve regard x and it s var ious powers as simply performing the role of "markers" to indi cate t he position of the various coefficients in the expression . Thus, for exa mple, we kn ow that the two p olyn omials 1 + 2x
+ 3x 2
and
2x
+ 3x 2 + 1
ar e equal b ecause in each expression the coefficients of the same powers of x are equal.
14
Famous Im possibilit ies
Agai n, when we add two polynomials, as in the followin g calculation, (1 + 2x + 3x 2) + (2 + X 2) = 3 + 2x + 4x 2 , we may regard the powers of x as simply t elling us which coefficients to add together. They play a similar, although more comp licated, role when we mul tiply two polyn omials. Although x is regarded as obeying the sa me algebraic rul es as the element s of the ring R, we do not think of it as assuming values from R. For this reason it is called an indeterminate. For emphasis, we shall use upper cas e letters for indeterminates in this book and write "X" instead of "x" so our polyn omial (1) b ecom es
ao + alX
+ a2X2 + ...+ a"X" ,
and we call it a polsmomuil f orm (over the rin g R , in the indeterminate
X ). What reall y matters in a polyn omial form are the coefficients. Accordingly we say that two polyn om ial for m s are equal if and only if they have th e same coefficien ts.
1.2.1 Definitions. If in the above polyn omial form the las t coefficient a" t= 0, then we say that the polyn omial has degree n . If all the coefficients are zero , we say that t he polynomial is the zero polynom ial and leave its degree undefined . When we do not wish to state the coefficients explicit ly we shall use symbols like f(X ) ,g(X ) , h(X ) t o denot e polyn omi als in X . The degree of a nonzero pol yn omial f (X ) is denot ed by degf(X). If in the above polyn omial form a" t= 0, we call an the leading coeffi cie n t. The collection of all po lynomials over th e rin g R in the indet erminate X will be denot ed by
R[X]
(note t he square brac kets, as distinct from round ones). Since pol yn omial forms can be added , subtracted and mul tiplied (and these operations ob ey the usu al algebr aic laws) this set of polyn omials is it self a ring.
Algebraic Preliminaries
15
In particular, if the ring R happens to be a field IF, we obtain the polynomial ring IF[X]. This ring will not be a field, however, since X E F[X] but X has no reciprocal in F[X] because
-# 1,
X.f(X)
for all f(X) E IF[X] .
Polynomial Functions Back in the original expression (1), an alternative way to regard the symbol "x" is as a variable standing for a typical element of the ring R . Thus the expression (1) may be used to assign to each element x E R another element in R. In this way we get a function f : R -+ R with values assigned by the formula f(a:) = ao + alx
+ a2x2 + .. .+ ana:" .
(2)
Such a function f is called a polynomial function on the ring R. Thus if we regard polynomials as functions, emphasis shifts from the coefficients to the values of the function . In particular, equality of two polynomial functions f : R -+ Rand g: R -+ R means that f( .'/;)
= g(.'/;) ,
for all x E R,
which is just the standard definition of equality for functions. Clearly each polynomial form f(X) determines a unique polynomial function f : R -+ R (because we can read off the coefficients from f(X) and use them to generate the values of f via the formula (2)). Is the converse true? Does each polynomial function f: R -+ R determine a unique polynomial form? The answer is: not always! The following example shows why. 1.2.2 Example. Let f(X) and g(X) be the polynomial forms over the ring 1,4 given by
f(X)
= 2X
and
g(X)
= 2X 2 •
These two polynomial forms are different, yet they determine the same polynomial function . Proof. g: 1,4 -+
These polynomial forms determine functions f: 1,4, with values given by
f(x)
= 2x
and
g(x)
= 2x 2 •
1,4 -+ 1,4
and
Famous Impossibilities
16
Hence calculation in l4 shows that
f(O) = 0 = g(O) f(l) = 2 = g(l) f(2) = 0 = g(2) f(3) = 2 = g(3) . So f(x) = g(x), for all x E l4. Thus the functions f and 9 are equal.a This is something of an embarrassment! We have produced an example of a ring R and two polynomial forms f(X),g(X) E R[X] such that f(X) =1= g(X) and yet f = g . Fortunately, however, the only rings which are relevant to our geometrical construction problems are those which are subfields of the complex number field, C. For such rings it can be shown (see Exercises 1.2 #6) that the above phenomenon cannot occur and we get a one-to-one correspondence
f(X)
~
f
between polynomial forms f(X) E R[X] and polynomial functions
f:R-+R.
The result of all this is that while there is a fine conceptual distinction between polynomial forms and polynomial functions, we can ignore the distinction in the remainder of this book without running into practical difficulties. Thus
f(X)
and
f
will be more or less the same while f (x) will be quite different, being the value of f at x .
- - - - - - - - - - Exercises 1.2 1.
Give an example of an element of the polynomial ring IR[X] which is not a member of Q[X].
2.
In each of the following cases, give a pair of nonzero polynomials f(X), g(X) E Q[X] which satisfies the condition: (i) deg(J(X)
+ g(X)) < deg f(X) + degg(X);
17
Algebraic 'Preliminaries
+ g(X)) = deg f(X) + deg g(X) ; deg(f(X) + g(X)) is undefined.
(ii) deg(f(X) (iii) 3.
(a) Let IF be a field. Verify that if f(X) and g(X) are nonzero polynomials in IF[X] then f(X) g(X)
=f. 0 and
deg f(X) g(X) = deg f(X)
+ deg g(X) .
(b) Deduce that the ring IF[X] is an integral domain. (See Exercises 1.1 #6 for the definition of integral domain.) (c) Is 16[X] an integral domain? (Justify your answer.) 4.
Let f(X) denote the polynomial form over the ring 16 given by f(X) = X 3 • Find a polynomial form g(X) , with f(X) =f. g(X) , such that f(X) and g(X) determine the same polynomial function.
5.* Let R be any finite ring . Prove that there exist unequal polynomial forms f(X) and g(X) over R such that f(X) and g(X) determine the same polynomial function. 6.
Let IF be any subfield of the complex number field C and f(X) and g(X) be polynomial forms over the field IF. If the polynomial functions f and g are equal (that is, f (x) = g( x), for all x E IF), prove that the polynomial forms are equal (that is, f(X) = g(X)). [Hint. You may assume the well-known result that for any polynomial function h of degree n, there are at most n complex numbers x such that h(x) = 0.]
1.3
The Division Algorithm
When you divide one polynomial by another you get a quotient and a remainder. The procedure for doing this is called the division algorithm for polynomials. It leads to a theorem which is of fundamental importance in the study of polynomials. We begin with an example.
1.3.1 Example. We shall divide the polynomial X 3+2X2+3X +4 by the polynomial X + 1 to produce a quotient and remainder in Q[X].
18
Famous Impossibilities
I
X2 + X
+2
divisor -; X + 1 X3 + 2X2 + 3X + 4
X
3
+
+-
quotient
+-
dividend
2
X X 2+3X +4 X2+ X
2X +4 2X +2 2
+-
remainder
It is easy to check that the algorithm has done its job and given us what we want, namely,
dividend = divisor
X
quotient + remainder.
To understand why the division algorithm works it is helpful to write the results from the various stages of the process as equalities:
4
2 2 2 3 X + 2X + 3X + = (.X + 1)X + (X + 3X + X 2 + 3X + 4 = (X + 1)X + (2X + 4) 2X + 4 = (X + 1)2 + 2
4))
(1)
To help understand the significance of the equations (1), let us introduce some names for the various terms occurring in them and hence write each of the equations (1) as
current dividend = (X + 1) (monomial in X) + CU7Tent remainder. (A monomial in X means a polynomial of the type cX i . ) In the first of the equations (1), the current dividend is the actual dividend. At each step, the aim is to choose the monomial so as to balance out the highest power of X in the current dividend. This will ensure that deg current remainder
< deg current dividend.
At the next step, the old current remainder becomes the new current dividend. The process stops when eventually (at the third stage in our example) the degree of the current remainder falls below the degree of the divisor (X + 1 in this case).
19
Algebraic Preliminaries
The final step is to put the three equations (1) together to get X 3 + 2X 2 + 3X + 4 = (X + 1)(X 2 + X + 2) + 2
dividend
divisor
quotient
remainder
thereby verifying that the algorithm has achieved its goal.
-
The division algorithm can be applied more generally to a dividend f(X) and divisor g(X) to give a quotient q(X) and remainder r(X) as in the following theorem. 1.3.2 Theorem. [Division Theorem] Let F be a field. If f(X) and g(X) are in IF[X] and g(X) =f; 0, then there are polynomials q(X) and r(X) in IF[X] such that
f(X)
= g(X)q(X) + r(X)
and either r(X) = 0 or degr(X) < degg(X). Proof. The division algorithm is used to construct the polynomials q(X) and r(X). The proof that the algorithm succeeds uses the same ideas as those explained in the above example. -
Exercises 1.3 1.
Find the quotient and remainder when X 3 + 2X + 1 is divided by 2X + 1 in Q[X].
2.
(a) Repeat Exercise 1 above with 17[X] replacing Q[X] . (b) Write your answer to (a) without using any "-" signs.
3.
Use the division algorithm to find the quotient and remainder when X3 + iX2 + 3X + i is divided by X 2 - i in qX] .
4. Use the division algorithm to find the quotient and remainder when X2 + V2X - 3 is divided by X - J3 in R[X]. 5.* Write out the proof of Theorem 1.3.2 in detail. [Hint. Your proof may use mathematical induction.] 6. * Find an example which shows that Theorem 1.3.2 would be false if we assumed that IF was only a ring rather than a field. 7.* In Theorem 1.3.2 if IF, f(X) and g(X) are given, show that q(X) and 1'(X) are uniquely determined.
20
1.4
Famous Impossibilities
The Rational Roots Test
This simple test will enable you to prove very easily that certain numbers, such as
v'3 ,
~,
and
V2 + v'3 ,
are irrational. The test illustrates how polynomials can be applied to the study of numbers. A zero of a polynomial f(X) is a number 13 such that f(j3) = O. The test works by narrowing down to a short list the possible zeros in Q of a polynomial in Q[X]. A preliminary observation is that every polynomial in Q[X] can be written as a rational multiple of a polynomial in 1[X]. This is achieved by multiplying the given polynomial in Q[X] by a suitable integer. For example
1 1 + -3 X
+ ~X2 + !X 3 = 1 (42 + 14X + 12X 2 + 21X 3 ) .
7 2 42 In looking for zeros in Q of polynomials in Q[X] we may as well, therefore, look for zeros in Q of polynomials in 1[X]. The Rational Roots Test uses some terminology which we now record. By saying that an integer p is a factor of an integer q, we mean that there is an integer m such that q = mp. By saying that a rational number 13 is expressed in lowest terms by 13 = ~ we mean that r and s are in 1 with s 1= 0 and r and shave no common factors except 1 and -1.
1.4.1 Theorem. [Rational Roots Test] polynomial of degree n so that
Let f(X) E 1[X] be a
= ao + alX + ... + anX n for some ao, all . . . ,an E 1, witli an 1= O. If 13 is a rational number, written in its lowest terms as 13 = ~ , and 13 is a zero of f(X) then f(X)
(i) r is a factor of 00, and (ii) s is a factor of 0" .
Proof.
Let f(;3) 00
= 0 so that
') + °I (1 -; + a2
(1') 2+ . . . + -;
all
(1') n -; =0
Algebraic Preliminaries
21
and hence
This gives aos n =
(n-l + a21'S n-2 + .. . + a n1' n-l) als
-1'
so that r
is a factor of
n
aos •
But since r and s have no common factors except 1 and -1, there can be no common prime numbers in the prime factorizations of r and s. This implies that I' is a factor of ao. (This follows from the Fundamental Theorem of Arithmetic; see Exercises 1.4 #9.) Similarly on writing
we see that s
is a factor of
a n1'n
and therefore
s
is a factor of
an'
•
The following example is a typical application of the Rational Roots Test .
1.4.2
Example.
The real number .y2 is not rational.
Proof. Note first that .y2 is a zero of the polynomial 2 - X 5 in Q[X] . This polynomial also lies in Z[X], and hence we can use the Rational Roots Test to see if it has a zero in Q. Suppose therefore that ~ is a zero of 2 - X5, where 1', s E Z with s i- 0 and ~ is expressed in lowest terms. By the Rational Roots Test, r is a factor of 2
and
s is a factor of -1 .
This means that the only possible values of r are 1,-1 ,2, and -2, while the only possible values of s are 1 and -1. Hence ~ must be 1,-1,2, or -2. This means that ±1 and ±2 are the only possible zeros in Q. Substitution shows, however, that none of ±1 and ±2 is a zero of 2 - X5, and so 2 - X5 has no zeros in Q. Hence .y2, being a zero , is not in Q. •
22
Famous Impossibilities
Exercises 1.4 1. Let p(X) be the element of Q[X] given by
p(X)
= 2X 3 + 3X 2 + 2X + 3.
(a) Use the Rational Roots Test to find all possible rational zeros of p(X) . (b) Is -1 a zero of p(X)? (c) Is -~ a zero of p(X)? 2. Use the Rational Roots Test to prove that 3. Find a polynomial in Q[X] which has show that V2 + v'3 is irrational.
J5 is irrational.
V2 + v'3 as
[Hint. To obtain the polynomial, first let square both sides.]
0'
=
a zero. Hence
V2 + v'3 and
then
4. (a) Use the Rational Roots Test to prove that for each mEN, is rational if and only if m is a perfect square.
,;m
(b) Let m and n be any positive integers. Prove that rational number if and only if it is an integer. 5. If n is any positive integer, prove that ..;nn are irrational.
..;n -
6. Prove that
n.
\fiii
is a
..;n + ..;nn and
J1l+l- .;rt=l is irrational, for every positive integer
7. (a)* Prove that (b)* Prove that
v'3 + J5 + .j7 is irrational. V2 + v'3 + J5 is irrational.
[Hint. Both (a) and (b) involve some arithmetic. Do not be afraid to use a calculator or a computer program in an appropriate fashion.] 8. Find a polynomial in Q[X] with
0'
as a zero, where
0'=~-q'2 and then deduce that [Hint. Calculate
0'3
0' ~
Q by applying the Rational Roots Test.
and keep an eye on
0'
at the same time.]
23
Algebraic Preliminaries
9.
The Fundamental Theorem of Arithmetic says that every natural number n > 1 can be written as a product of prime numbers and, except for the order of the factors, the expression of n in this form is unique. Thus n has a unique expression
where Pl ,P2," . ,Pk are distinct prime numbers and ai, a2, . . . , ak are positive integers. (a) Use the Fundamental Theorem of Arithmetic to prove that if r and 8 are natural numbers such that a prime P is a factor of the product 1'8, then P must be a factor of r or of 8 (or of both). [Hint . Express l' and 8 as products of primes and then use the uniqueness part of the Fundamental Theorem.] (b) Let 1', 8, and b be positive integers such that rand 8 have no common factors except 1 and -1. Use the Fundamental Theorem of Arithmetic to show that if r is a factor of bs , then r must be a factor of b. (c) Let r , 8, b, and In be positive integers such that rand 8 have no common factors except 1 and -1. Show that if r is a factor of bs'", then r must be a factor of b. 10.* Use the Rational Roots Test to prove that the number sin is, sin 20°) is irrational.
g (that
[Hint . First. use the formulae cos2 ()
sin( () + 1» cos(() +
1»
+ sin 2 () = 1 ,
= sin () cos 1> + cos () sin 1> ,
= cos () cos 1> -
sin () sin 1> .
to write sin 3() in terms of sin () (without any cos terms). Next, put () =
g, x = sin e, and use sin J =
4.]
24
Fam ous Im possibilit ies
Appendix to Chapter 1 We record here formal definit ions of ring, field, vector space, and of relat ed te rms . If you are not familiar wit h these conce pts, we recommend that you read Sections 23, 24 and 36 of [JF] or Chapters 1 and 2 of [AT] or the early parts of Ch ap ter 3 of [AC). l.A.1 Definitions. A group (G, + ) is a set G together with a binar y ope ration + such that
+ g2 E G; if gI, tn , g3 E G , then gl + (g2 + g3) = (gl + g2) + g3;
(a) if gt, g2 E G , then gl (b)
(c) there is an element 0 E G such that g+O
= 0+ 9 = 9 for all 9 E G ;
(d) for every element 9 E G , there exists an element g' such that 9 + g' = g' + 9 = O. The element 0 is called the identity of the group G . T he element g' in (d) is called the inverse of g. The group G is said to be abelian or commutative if gl + g2 = g2+g1 for all gl, g2 E G. • We shall be more interested in sets with two binary operations . l.A.2
Definitions.
+ and . such that
A ring is a set R with two binary operations
(a) R together wit h the operation (b) if
rI, 1'2
E R , then
rl .1'2
+ is an ab elian gro up ;
E R;
(c) if rl ,r2 ,r3 E R, t hen rdr2.1'3) =
( 1'/ .1'2). 1'3;
(d) if 1'1,1'2,1'3 E R, then 1'1. (1'2
+ 1' 3 )
= 1'1·1'2
+ 1'1·1'3
an d
(1'2
+ 1'3) .1'1 =
1'2.1' 1
The addi tive identity of R is de noted by O. The ring R is said to be a commuiaiiue ring if 1'1.1'2 = 1'1,1'2
E
+ 1'3.1'1 .
1'2.1' 1
for all
R.
A ring wit h unity is a ring R with an element 1 (called the unity of R) such that 1. 1' = 1'.1 = r for all r E R. An element r of a ring (R, + ,. ) is sai d to be a zero-divisor if r =j:. 0 and there exists an element 8 =j:. 0 in R such that 1'. 8 = 0 or 8.1' = O.
25
Algebraic Preliminaries
A ring (R, +,.) is said to be an integral domain if it is a commutative ring with unity and it contains no zero-divisors. A subset 8 of a ring (R, +,.) is said to be a subring of R if (8, +,.) is itself a ring. -
1.A.3 Definitions. A ring (IF, +, .) is said to be a field if IF \ {O} together with the operation . is an abelian group. A subset 8 of a field (IF, +, .) is said to be a subfield of IF if (8, +, .) is itself a field. The additive identity of a field F is usually denoted by 0, and the multiplicative identity of IF is denoted by 1. Note that while every field is an integral domain, there are integral domains which are not fields: for example, the ring 7L of integers is an integral domain which is not a field.
l.AA Definitions. A set V together with two operations + and . together with a field IF is said to be a vector space over the field IF if (a) V together with the operation
+ is an
abelian group;
(b) if A E IF and v E V, then A.v E V; (c) if AI, A2 E IF and v E V, then (AI + A2)'V = AI.V + A2.V; (d) if VI, V2 E V and A E IF, then A,(VI + V2) = A.VI + A.V2; (e) if 1 is the multiplicative identity of IF, then Lv = v for all
V
EV;
(f) if AI, A2 E IF and v E V , then (AIA2)'V = AdA2'V), We refer to the elements of IF as the scalars and the multiplication of an element of IF by an element of V (for example, A.V in (b) above) as scalar multiplication: The elements of V are called vectors. The identity of the group (V, +) is called the zero vector and is denoted by 0. The operation + is often called vector addition. A subset 8 of the vector space If over th e field IF is said to be a subspace of V if 8 together with the operations + and . of V is a vector space over the field IF. -
l.A.5 Definitions. If V is a vector space over a field IF and 8 is a subset of V, then the set span(8) =
{AI .VI
+...+ A".V" : n ;::: 1 and each
Ai E IF, Vi E S}
is called the span of 8 over IF or the linear span of 8 over IF. If span(8) = V then 8 is said to span V over IF .
-
26
Famous Impossibilities
l.A.6 Definitions. If V is a vector space over a field IF and S is a subset of V, then S is said to be a linearly independent set over IF if the zero vector, 0, can be written only as a trivial linear combination of vectors in S ; that is, for VJ, V2, "" V n E S,
+ A2,v2 + .,.+ An.vn Al = A2 = ... = An = 0. 0=
Al,Vl
implies that S is said to be linearly dependent over IF if it is not linearly independent over IF. •
I.A.7 Definitions. If V is a vector space over a field F and S is a nonempty subset of V which is linearly independent over IF and which spans V over IF, then S is said to be a basis of V over IF. If S is finite, then the number of elements of S is called the dimension of V over F. • Additional Reading for Chapter I General algebra books which include the background information we assume about rings, fields, and vector spaces (and which also deal with geometrical constructions) include [AC], [JF] and [LS], while [AT] is a good introduction to vector spaces. Deeper results about irrational numbers can be found in [GH] (which is the classic reference on number theory) and in [IN1] and [IN2]. [GH] contains a proof of the Fundam ental Theorem of Arithmetic and is also a good reference on prime numbers. [AC] A. Clark , Elem ents of Abstract Algebra, Wadsworth , Belmont , Californi a, 1971. (JF] J.B . Fraleigh, A First Course in Abstract Algebra, 3rd edition, Addison-Wesley, Read ing, Massachusetts, 1982. [GHJ G.H. Hardy and E.M. Wright , An Introduct ion to the Theory of Numbers, Clarendon, Oxford, 1960. [IN1] I. Niven, Numbers: Rational and Irrational, Random Honse, New York, 1961. [IN2] I. Niven, Irrational Numbers, Carus Mathematical Monographs, No. 11, Mathematical Association of America, 1967. [LS] L.W. Shapiro, Introduction to Abstra ct Algebra, McGraw-Hill, New York, 1975. [AT] A.M. Tropp er, Linear Algebra, Thomas Nelson, London, 1969.
CHAPTER 2 Algebraic Numbers and Their Polynomials
Straightedge and compass constructions can be used to produce line segments of various lengths relative to som e preassigned unit length. Althougll th e lengths are all real numbers, it turns out that not every real number can be obt ained in this way. Tb e lengths which can be constru cted are teth er special. As th e first step towards classifying th e lengths which can be constructed, this chapter introduces the concept of an algebraic number (or more specifically of a number which is algebraic over a field). Each sucli number will satisfy many polynomial equations and our immediate goal is to choose the simplest one.
27
28
2.1
Famous Impossibilities
Algebraic Numbers
Numbers which lie in R but not in Q are said to be irrational. Wellknown examples are ..;2, e and 1r. The irrationality of ..;2 was known to the ancient Greeks whereas that of e and 1r was proved much later. Although these three numbers are all irrational, there is a fundamental distinction between ..;2 and the other two numbers. While ..;2 satisfies a polynomial equation with coefficients in Q, no such equation is satisfied by e or 11" (which we shall prove in Chapter 7.) For this reason ..;2 is said to be an algebraic number whereas e and 1r are said to be transcendental numbers. The precise definition is as follows. 2.1.1 Definition. A number a E C is said to be algebraic over a field F ~ C if there is a nonzero polynomial f(X) E F[X] , such that a is a zero of f(X) ; that is, there is a polynomial
f(X)
= ao + al X + ...+ an X n
whose coefficients ao, aI , ... ,an all belong to F, at least one of these coefficients is nonzero, and f( a) = O. • Observe that for each field F, every number a in F is algebraic over F because a is a zero of the polynomial X - a E F[X]. This implies that e and 1r are algebraic over R even though they are not algebraic over Q, as noted above. 2.1.2
Examples.
(i) The number ..;2 is algebraic over Q because it is a zero of the polynomial X 2 - 2, which is nonzero and has coefficients in Q. (ii) The number .y2~ is algebraic over Q because it is a zero of the polynomial X l 2 - 72, which is nonzero and has coefficients in Q.
•
In order to show that a number is algebraic, we look for a suitable polynomial having that number as a zero.
29
Algebraic Numbers
2.1.3
Proof.
Example. Let
0'
The number 1 + .,fi is algebraic over Q.
J
= 1 + .,fi.
We want to find a polynomial , with Cl as a zero, which has coefficients in Q. This suggest s squa ring to get rid of t he squa re roots: 0'2
= 1 + 2./2 + 2,
which is no good as ./2 st ill a ppea rs on th e right hand side. So we go back and isolat e ./2 on one side before squa ring!
Isolating V2 gives Squaring both sides gives
0' -
1=
Vi.
(0: - 1)2 = 2, 0:2 - 20: - 1 = O.
and so
Thus 0: is a zero of the polynomial X 2 - 2X - 1, whi ch is nonzero and has coefficients in Q . Hen ce 0' is algebraic over Q. • 2 .1.4
Other Versions.
It is useful to b e able to recognize the defini ti on of "algebraic over a field F" when it appears in different guises : t hus 0' E C is algebraic over F if and only if
(i) th ere exists f (X ) E F[X] sucli that f (X ) i- 0 and f (o:) = 0,
OR (ii) th ere is a posi tive int eger n and numbers no t all zero, such tllat 0.0
0.0, a j , ... , a n- I, a n
in IF ,
+ a I 0' + a 2O'2 + . . . + a,, _IO' n- j + anO' = 0 . 11
The last statement may ring a few bells! Wher e have you seen it b efor e? Well, to put things in context recall that, since IF is a subfield of C, we can regard C as a vector space 011er IF. (See the paragraph preceding Definition 1.1.2.) Now the numbers
are all eleme nts of C, and hen ce ca n be regarded as vectors in t he vector space C over IF. The coefficients aO,a j,a2, . .. ,a n -],a n, on t he other hand, are all in F so we can regard them as scalars . Thus, we can write (ii ) in the alternative way:
30
Famous Impossibilities
(iii) there is a positive integer n such that the set of powers
{I ,0:,0'2 , ... ,0:n-I ,0'n} of the number 0' is linearly dependent over F .
•
You will often meet the terms "algebraic number" and "transcendental number" where no field is specified. In such cases the field is taken to be Q. We formalize this below.
Definitions. A complex number is said to be (i) an algebraic number if it is algebraic over Q;
2.1.5
(ii) a transcendental number if it is not algebraic over Q.
•
Exercises 2.1 1.
Verify that each of the following numbers is algebraic over the stated field: (a)
/2 ~ over Q; /2 + J3 over Q;
(b) (c) ..[iF over IR; (d) i over Q; (e) 2.
/2 + J3i
over R.
(a) Write down three different polynomials in Q[X], each of degree 2, which have J5 as a zero. (b) Write down three different polynomials in Q[Xj, each of degree greater than 2, which have J5 as a zero . (c) Write down three different polynomials in Q[X], each of degree 100, which have J5 as a zero.
3.
Prove that if 0' E C is algebraic over Q then so is 20'.
4.
Prove that if 0' E C is algebraic over a subfield F of C then so is -0'.
5.
Prove that if 0' is a nonzero complex number which is algebraic over a subfield F of C, then 1/0' is also algebraic over F.
31
Algebraic Numbers
6. Show that every complex number is algebraic over R 7. * Let a be a positive real number, and let F be a subfield of R. Prove that a is algebraic over F if and only if ..fa is algebraic over IF .
2.2
Monic Polynomials
An algebraic number such as V2 will be a zero of many different polynomials. Ultimately we want to pick out from all these polynomials one which is, in some sense, the best. Here we make a start in this direction by restricting attention to monic polynomials, defined as follows:
2.2.1 Definition. A polynomial f(X) = ao + a1X + .. . + anX" in F[X] is said to be monic if its leading coefficient an is 1. Thus, for example, X2 - 2 is a monic polynomial whereas 3X2 - 6 is not. Note that both these polynomials have V2 as a zero and, for our purposes, the polynomial X 2 - 2 is somewhat nicer than 3X 2 - 6. The following proposition shows that, in the definition of "algebraic over a field", we can always assume the polynomial is monic.
2.2.2 Proposition. If a complex number a is a zero of a nonzero polynomial f(X) E IF[X] then 0' is a zero of a monic polynomial g(X) E F[X] with degg(X) = degf(X) . Assume that a is a zero of a polynomial f(X) i= O. Since 0, some coefficient must be nonzero and (because there are only finitely many coefficients aj) there must be a largest i such that aj i= O. Let n be the largest such i. Hence
Proof.
f(X)
i=
f(X)
= ao + a1X + a2X2 + ... + anX n
where an i= O. We choose
g(X) =
~ f(X). an
Thus g(X) is monic, has 0' a') a zero and the same degree n as f(X). All the coefficients of g(X) are in F, moreover, since IF is a field. -
32
Famous Impossibilities
Exercises 2.2 1. If 0' is a zero of 3X3 - 2X + 1, find a monic polynomial with coefficients in Q having 0' as a zero.
2. If 20'3 - 1 = a zero.
J3, find
a monic polynomial in Q[X] which has 0' as
3. (a) Write down three different monic polynomials in Q[X] which have .ys as a zero. (b) Write down three different monic polynomials in Q[X], each of degree 100, which have .ys as a zero. (c) How many monic polynomials in Q[X] have
2.3
.ys as
a zero?
Monic Polynomials of Least Degree
Even if we restrict attention to monic polynomials, there are still a lot of them which have v'2 as a zero. For example, v'2 is a zero of each of the polynomials
X 4 _ 4, (X 2 - 2)2 , (X 2 (X 2 _ 2)(X I00 + 84X 3 + 73)
-
2)(X 5 + 3),
all of which are in Q[X] . What distinguishes X 2 - 2 from the other polynomials is that it has the least degree. If a number 0' is a zero of one monic polynomial then (as suggested by Exercises 2.2 #3) there are infinitely many monic polynomials of arbitrarily high degree having 0' as a zero. Although there is no maximum possible degree for such polynomials, there is always a least possible degree. (For example, if we know 0' is a zero of a monic polynomial of degree 12, we can be sure that the least possible degree is one of the finitely-many numbers 1,2,3, . . . ,11,12 .) The following proposition shows that by focusing on the least possible degree, we pick out a unique polynomial.
2.3.1 Proposition. If 0' E C is algebraic over a field F ~ C then among all monic polynomials f(X) E F[X] witu f(O') = 0 there is a unique one of least degree.
33
Algebraic Numbers
Proof. Since a is algebraic, there is (by Proposition 2.2.2) a monic polynomial f(X) E F[X] with f(a) = O. Hence there is one such polynomial of least possible degree. To prove there is only one such polynomial, suppose there are two of them, say heX) and heX), each having the least degree n, say. Thus heX) = ao + alX + + an_1x n- 1 + x n heX) = bo + blX + + bn_1Xn- 1 + X n • Hence if we put f(X) = heX) - heX) we get f(X)
= Co + C1X + ... + cn_1X n- 1 ,
where
c;
= aj -
bj •
~ant to prove that !I(X) :: h(X) . To do this we show that f(X):: O.
Clearly f(X) E F[X], f(a) = 0 and either f(X) is zero or f(X) has degree < n. If f(X) i= 0 then, by Proposition 2.2.2, there is a monic polynomial which has the same degree as f(X) and which has a as a zero. But this is impossible since n is the least degree for which there is a monic polynomial having a as a zero. • It follows that f(X) = 0, and hence heX) = heX). The above theorem enables us to assign to each algebraic number a unique polynomial. This polynomial is so important that we introduce some terminology and notation for it.
2.3.2 Definitions. Let a E C be algebraic over a field F ~ C. The unique polynomial of least degree among those polynomials f(X) in F[X] satisfying (i) f(a) = 0 and (ii) f(X) is monic is called the irreducible polynomial of a 011er F. This polynomial is denoted by irr(a, F). Its degree is called the degree of a over F and is denoted by deg(a, F).
•
34
Famous Impossibilities
The word "irreducible" means "cannot be reduced" and its use in the above definition stems from the idea that we cannot reduce the degree below that of irr(a, F) if we still want to have a polynomial with the desired properties (i) and (ii). Later you will find that irr(a, F) is also irreducible in a different sense. Thus the use of the word "irreducible" is doubly justified. 2.3.3 Example. over Q is X 2 - 2.
We prove that the irreducible polynomial of
../2
Proof. It is clear that this polynomial has ../2 as a zero , that its coefficients ar e in Q and that it is monic. It remains to show there is no polynomial of smaller degree with these properties. If there were such a polynomial it would be a + X,
for some a E Q .
But then we would have so that ../2 Hence
= -a E Q, which
is a contradiction.
irr(..[2, Q)
=X2 -
and therefore deg(..[2, Q) = 2.
2
•
Note that, although the irreducible polynomial of ../2 over Q is X 2 - 2, its irreducible polynomial over R is X -../2. This is because X - ../2 is in IR[X] but not in Q[X] . The idea in the worked example above is quite simple - find the polynomial you think is of least degree and, to show it really is of least degree, consider all smaller degrees . This can be far from straightforward if the polynomial in question has degree greater than 2. For example, you would suspect that irr(.y2, Q) is X 7 - 2. But to prove that .y2 is not a zero of any monic polynomial in Q[X] of degrees 1, 2,3, 4, 5 or 6 would be very difficult (to say the least) with the techniques you know at present. Special techniques have been developed for these sorts of problems; you will meet some of them in Chapter 4.
Algebraic.Numbers
35
Exercises 2.3 1. (a) Write down the irreducible polynomial of y'5 over Q and then prove your answer is correct.
(b) Write down deg( y'5, Q) . 2. In each case write down a nonzero polynomial f(X) satisfying the stated conditions: (a) f(X) is a monic polynomial over Q with (b) f(X) is a polynomial over Q with momc.
J2
J2 as a zero.
as a zero but is not
(c) f(X) is a polynomial of least degr ee such that
f(X) E R[X]
and
f( V2) = O.
(d) f(X) is another polynomial satisfying conditions (c). (e) f(X) E Q[X] and has both
J2 and v'3 as zeros .
3. (a) In each case write down th e unique monic polynomial f(X) of least degree which has the stated property:
v'3 as a zero. and has v'3 as a zero .
(i) f(X) E IR[X] and has
(ii) f(X) E Q[X]
(b) Explain why the polynomials you gave in part (a) have least degree. (c) What do your answers to part (a) tell you about (i) irr( v'3, IR) and deg( v'3, IR) ?
(ii) irr( v'3, Q) and deg( v'3, Q)? 4. Assume a is algebraic over IF. Let n be the least degree of all monic nonz ero polynomials f(X) in F[X] with f(a) = 0 and let m be the least degree of all nonz ero polynomials g(X) (monic or not) in IF[X] with g(a) = O. Explain why m = n. (Which theorem or proposition from an earlier section do you use?) 5. Let IF be a subfield of IR and let a , b, c E IF with c positive. (a) Show that a in IF[X].
+ bjC
is a zero of a monic quadratic polynomial
(b) What can be said about the degree of a
+ bjC over IF?
36
Famous Impossibilities
6. In accordance with # 4, 5,7 of Exercises 2.1, if a E R is algebraic over F then so ar e - a , ..[0: when a > 0, and l/a when a =I- O. (a) What is the relationship between deg(a , F) and deg( - a , IF )? (b) Prove that deg(..[O:,F) ~ 2deg(a ,F). (c) Is the following state ment tru e or false? (Justify your an swer. ) If a =I- 0 is algebraic over F then deg(a , F) = deg
(±'F) .
7. Let F ~ C and let a E C . In each case decide wheth er the st ate ment is true or false (and justify your an swer). (a) If deg(a, F) (b) If deg(a, IF)
= 2 then a rt. IF. = 2 th en a 2 E IF .
8. If a E C is algebra ic over a su bfield IF of C and { I , a , a 2, a 3 }
is linearl y indep end ent over IF, what can be said about deg(a , IF)? 9. Let IF be a subfield of C and let a be a com plex number. Assume that the set {I , 0: , 0: 2 a 3 , a 4 } is linearl y dep endent over F . (a) Prove tha t a is algeb raic over F . (b) What can be sa id about deg(o:, IF )? (Justify your answer.) 10. Let IF b e a subfield of C and let a E C. Prove that deg(a , IF) = 1 if and only if 0: ElF . 11. Is the following st a teme nt true or false? (Justify your answer.) If a, j3 and aj3 are algebra ic over Q then
deg(o:j3 , Q) = deg(o:, Q) deg(j3, Q). 12. Let IE ~ IF be subficlds of C and ass um e that j3 E C is algebraic over E. (a) Prove that j3 is algebraic over IF also. (b ) Prove that deg( j3 , IE ) 13. Let E
~
~
deg(j3, IF ).
IF be subfields of C such th at [IF : IE] = 5. Let j3 E IF .
(a ) What is the maximum numb er of elements which ca n be in a subset of IF which is linearl y independen t over IE?
37
Algebraic Num bers
(b) Stat e why the following powers of {3 ar e all in IF: 1, {3, {32, {33, {34,{35. (c) Ded uce from (a) and (b) that (3 is algebraic over E. (d) What can be sa id about (i) deg({3, E)?
(ii) deg({3, F)?
14. Let F be a subfield of C and let 0' E C. Let f (X ) E F[X] be a monic polynomial wit h f (O') = O. If there are polynomials g(X) , h(X) in IF[X ], each of degree ~ 1, such that f (X ) = g(X) h(X), show that f (X ) is not irr (O', IF ).
Additional Reading for Chapter 2 We have introduced irr(a, F ) with very few preliminary results ab out polynomials. However , tr eatments of irr(n, F) in most other book s depend heavily on techniqu es for showing that a par ti cular polynomial cannot be redu ced (th at is, factori zed furth er) . For thi s reason , you may find it confu sing to read ab out irr( n ,F) in oth er book s before you have finished reading ou r Cha pter 4. A discussion of numb ers algebra ic over a field is given ill th e first two sections of Chapter XIV of [GB]. For some histo rical fact s about transcend ent al nu mber s see Section 5.4 of [IN 1]. [G B] G. Birkhoff and S. Mac Lane, A S ur l1ey of Mod ern A lgebra, New York , Macmillan , 1953. [IN1] I. Niven, Nu mbers: Ration al and Irrational, Rand om House, New York , 1961.
CHAPTER 3
Extending Fields
If IF is a subfield of C and 0' is a complex number which is algebraic over IF, we show how to construct a certain vector space IF (0') which contains 0' and wliicli satisfies
IF
~
F(O')
~
C.
This vector space is then shown to be a subfield of C. Thus, from the field IF and tile number 0', we have produced a larger field IF (0'). Fields of the form IF (0') are essential to our analysis of the lengths of those line segments wliicu can be constructed with straightedge and compass.
39
40
3.1
Famous Impossibilities
An Illustration: Q( J2)
As Q is a subfield of C, we can consider C as a vector space over Q, taking the elements of C as the vectors and the elements of Q as the scalars. 3.1.1
Definition.
The set Q(J2) ~ C is defined by putting
Q(J2)
= {a + bJ2:
a,b E Q}.
-
Thus Q( J2) is the linear span of the set of vectors {I, J2} over Q and is therefore a vector subspace of Cover Q. Hence Q( J2) is a vector space over Q. 3.1.2 Proposition. Tile set of vectors {I, vector space Q( J2) over Q.
J2} is
a basis for tile
Proof. As noted above, this set of vectors spans Q( J2). Hence it is sufficient to establish its linear independence over Q. To this end, let a, bE Q with
a + bJ2
= o.
(1)
If b =1= 0, then J2 = -alb, which is again in Q as Q is a field. This contradicts the fact that J2 is irrational. Hence b = O. It now follows from (1) that also a = O. Thus (1) implies a = 0 and b = 0 so that the set of vectors {I, J2} is linearly independent over Q. -
Note that the dimension of the vector space Q( J2) over Q is 2, this being the number of vectors in the basis 3.1.3
Proposition.
Q(J2) is a ring.
Proof. To show that Q( J2) is a subring of C, it is sufficient to check that it is closed under addition and multiplication. The former is obvious while the latter is left as a simple exercise. 3.1.4
Proposition.
Q( J2) is a field.
Proof. To show that this subring of C is a subfield, it is sufficient to check that it contains the reciprocal of each of its nonzero elements.
Ext ending Fi eld s
41
So let x E Q()2 ) b e such that x
f
O. Thus
= a + bV2 ol' b f o. It follows a - bV2 f 0 x
wher e a,
se Q and a fO
that
by t he lin ear ind ep enden ce of t he set {I , )2}. Thus 1
1
x - a+b)2 1
- a + b)2
a - b)2 a - b)2
= (a 2 ~ 2b2) +
C2=b2b2) V2
whi ch is ag a in an eleme nt of Q( )2), since a/ (a 2_ 2b2) and -b/ (a 2- 2b2) are b oth in Q. • The followin g proposi tion gives a way of describing Q( J2") as a field with a certain prop er ty. The proof, like that of Prop osit ion 1.1.1 , in volves proving a set inclu sion.
3.1.5 Proposition. Q()2 ) is the sma llest field containing all the n umbers in the field Q and the number )2. Proof.
Let IF b e a ny field containi ng Q and )2.
It is clear from what has been said ear lier in t his section t hat Q( J2)
is a field which contai ns bot h Q and J2. To show it is the s m lIest J such field we sha ll prove t hat
Q(J2) ~ F.
To prove that Q()2 ) ~ IF we shall show t hat
if x
E Q( V2), then
1:
E IF.
To prove t h is we start by ass um ing the hyp othesis of (2) . Let x E Q( J2"); t hat is ,
x
= a + bV2
for some a , b E Q.
(2)
42
Famous Impossibilities
~im is to prove that x E F. ;h:t ~ is a field.
To do this we shall use the fact
II
~
II
By assumption, ...;'2 E F. Also a, b E IF as IF is assumed to contain Q. Hence b...;'2 E IF as F is closed under multiplication. So a + b...;'2 E IF as F is closed under addition. Thus x E F, as required.
•
Because Q(...;'2) is a field, the theory developed in Chapter 2 for a field F ~ C can now be applied in the case IF = Q(...;'2) . The following example illustrates this. 3.1.6 Example. The number degree 2 over this field.
v'3 is algebraic over Q(...;'2) and has
Proof. The polynomial X 2 - 3 has coefficients in Q, and hence also in Q(...;'2). It is monic, moreover, and has v'3 as a zero . Hence v'3 is algebraic over Q(...;'2). Suppose X 2-3 is not the irreducible polynomial of v'3 over Q(...;'2). Then the irreducible polynomial is a monic first degree polynomial
a+X for some a E Q(...;'2). Hence a+ v'3 that
= 0 and so v'3 = -a, which implies
v'3 = c+ dV2
for some c, d E Q. Squaring both sides gives
3 = c2 + 2../icd + 2d 2 • If we have while if
c=O 2d 2
= 3 which leads to the contradiction that )3/2 is rational,
d=O
we get the contradiction that v'3 is rational. Hence cd gives 2 d2 In 3-c-2 v2 = 2cd E Q,
1=
0, which
43
Extending Fields
which is also a contradiction. Thus our original assumption that X 2 - 3 is not the irreducible polynomial of J3 over Q(.;2) has led to a contradiction. Hence X 2 - 3 is the irreducible polynomial of J3 over Q( V2); that is, irr( V3, Q( V2)) and hence
=X2-
3
•
deg( V3, Q( V2)) = 2.
- - - - - - - - - - Exercises 3.1 1. Verify that each of the following numbers is in Q( V2) by expressing it in the form a + bV2, where a,b E Q. (i) (2 + 3.;2)(1 - 2V2) ,
(ii) (1 + 3V2)/(3 + 2V2) , (iii) .)3 + 212. sides.J
[Hint. Let this equal a
+ bV2
and square both
2. Show that the number .)1 + 12 is not in Q( V2). [Hint. Let this number equal a+bV2, square both sides and derive a contradiction.J 3. Show that the zeros in C of the polynomial X 2 + 2X - 7 are both in Q( V2) . 4. Verify that the number degr ee over this field.
J5 is algebraic
over Q(.;2) and find its
5. Verify that the number J144 is algebraic over Q(.;2) and find its degree over this field. 6. Complete the proof of Proposition 3.1.3, begun in the text. [Hint. Assume that x, y E Q(.;2) and then spell out explicitly what this means. Hence show that .T y E Q( .;2).J
v:m
7. Verify that is algebraic over each of the fields Q and Q( .;2), for each integer m 2: O. State in which cases it is true that deg(vm, Q)
= deg(vm,
Q(V2)) .
44
8.
Famous Impossibilities
Let a, b E Q. Show that if a + b/2 is a zero of a quadratic polynomial in Q[X] then its "conjugate" a - b/2 is also a zero of this polynomial.
9.* Prove the result of Exercise 8, but without the restriction that the polynomial is quadratic.
3.2
Construction of F (a)
In the previous section we constructed, from the field Q and the number /2, a vector subspace Q( /2) of C by putting
Q(V2)
= {a + bV2 : a ,b E Q}.
We then found that Q( /2) is closed under multiplication (as well as addition) and hence is a ring. Next Q( /2) was shown to contain the reciprocal of each of its nonzero elements, and hence it is also a field. Finally we showed that Q( V2) was the smallest subfield of C containing all the numbers in Q together with the number V2. In this section we generalize the construction of Q( /2) by constructing a subset F(a) of C for any subfield F of C and any number a E C which is algebraic over F. We begin by defining IF(a) and then showing that this set is in turn a vector subspace, a subring, and then a subfield of C. Our final goal is Theorem 3.2.8, which says that IF(a) is the smallest subfield of C containing a and all th e numbers in IF. Throughout this section, we assume that a is algebraic over IF.
3.2.1 Definition. Let F be a subfield of C and let a E C be algebraic over F with deg(a, F) = n . The extension of F by a is the set F(a) ~ C where F(a)
= {bo + b10: + . . . + bn _ 10: ,,- 1 : bo, b1, ••• , bn - 1 ElF}.
•
Thus F (a) is the linear span over IF of the powers 1,
0: ,
2
a , . .. , a
n-l
and so is a '(lector subspace of Cover F . In this section we show that IF(a) is a field - indeed it is the smallest field containing F and a. These results are proved in Theorems 3.2.6 and 3.2.8.
45
Extending Fields
Why did we stop at the power an-I? The answer is given by the following result, which shows that any further powers would be redundant. . 3.2.2 Proposition. positive powers of a:
The set F(a) contains all the remaining
Proof. Since n = deg(a, IF), a is a zero of a monic polynomial of degree n in F[X]. Hence Co
+ cIa + ... + clI_la
ll
1
-
+
all
= 0
for some coefficients CO,CI, ... ,CII_I E IF. Thus (1)
and each -Cj ElF, as IF is a field. Hence , by the definition of IF(a) (Definition 3.2.1), all
E F(a).
If we multiply both sides of (1) by a we see that
(2)
Now a, ... , a,,-I are all in IF(a) (by definition) and we have just shown that an E IF(a). Thus, because IF(a) is a vector subspace of C, it follows from (2) that a ll + 1 E F(a) . Now multiply both sides of (1) by a 2 and then proceed in the same way, thereby showing that a ll+ 2 E IF(a).
It is clear that by proceeding in this way we can show that the powers " ," ,\..( " ,11+1 ,Ll "."+2 ,
(....{.
all belong to IF(a).
•• •
•
Thus, in our definition of F(a) we have, in some sense, included "enough" powers of a. Have we included "too many"? An answer can be deduced from the following result.
46
Famous Impossibilities
3.2.3 Proposition. If n is the degree of a over IF, then the set of vectors {I, a, a 2 , • •• ,an-I} is linearly independent over IF. Proof. Suppose on the contrary that this set is linearly dependent; so there are scalars co, CI, . .. ,Cn-I E IF ,not all zero, such that
Among the coefficients co,cI, . . . , Cn- I pick the one, say Ci, farthest down the list which is nonzero. Dividing by this coefficient gives
( ~~) + (~:) a + ... + (C~~ I )
i I
a -
+ ai = a
where i ~ n - 1, and the coefficients are still in the field IF . Hence a is a zero of a monic polynomial in IF[X] with degree smaller than n (which was the least possible degre e for such polynomials) . This contradiction shows that our initial assumption was false. Suppose we had tried to take S = {bo + bia + ...+ bn _ 2a n - 2
:
bo, . .. , bn - 2 E IF}
as a candidate for a ring (or field) containing IF and a. Then (if n ~ 2), a and a n - 2 are both in S but their product a n - I is not in S by the above proposition. Thus S is too small and so we need all the powers 1, a, ... , a n - I in IF(a) for it to be a ring or a field. 3.2.4 Theorem. [Basis for F(a) Theorem] Let IF be a subfie1d of C and let a E C be algebraic over IF with deg(a, IF) = n. Then the set of vectors {I , a, a 2 , .• • , an-I} is a basis for tlie vector space IF(a) over F . In particular this vector space lies dimension n , the degree of a over F . Proof. By Definition 3.2.1 , this set of vectors spans the vector space F(a) over IF and, by Proposition 3.2.3, the set of vectors is linearly independent. Hence it is a basis for the vector space and the number n of elements in the basis is the dimension of the vector space. As F(a) is a subset of C, its elements can be multiplied together. This leads to the following proposition.
47
Extending Fields
3.2.5
Proposition.
IF (0') is a ring.
Proof. To show IF(a) is a subring of C, it is sufficient to show it is closed under addition and multiplication. The former is obvious while the latter will now be shown. Consider any two elements in IF(a), say p = b + blo: + b a 2 + ... + bll_Ia n- 1 E IF(a)
o
2
and
q = Co + cIa + C2a2 + ... + cn_Ia n- 1 E IF(a), where the b's and c's belong to F. If we multiply these two elements together, we get a linear combination of powers of 0', with coefficients still in the field IF. Because all positive powers of 0' are in IF(a) (by Proposition 3.2.2) and because IF(o:) is closed under addition and scalar multiplication, it follows that the product pq is also in IF(a). At long last we are in a position to establish that IF(a) is a field.
3.2.6 Theorem. Let IF be a subfield of C and let 0' E C be algebraic over IF . Then IF (0') is a field . Proof. In view of Proposition 3.2.5, what remains to be shown is that 1/(3 is in IF (0:) for every nonzero (3 in IF(0'). So let (3 be a nonzero number in 1F(0:). Firstly note that {I, (3,(32, .. . , (3"} is a set of n + 1 numbers which are all in IF (0'), since IF (0') is closed under multiplication. Because IF (0') is an n-dimensional vector space over IF, this set must be linearly dependent, which means that there are scalars do, d l , ... ,dk in IF (not all zero) with k ::; n such that (3) If do = 0 in (3), we could divide by (3 (or multiply by 1/(3) to reduce the number of terms in (3). By repeating this if necessary, we see that (3) can be assumed to hold with do =1= O. If we multiply (3) by -1/do we have -1 + el(3 + e2(32 + ...+ ek(3k = 0, where each e, = -dildo is in IF since IF is a field. Thus 1 = el(3 + e2(32
+ ...+ ck(3k = (3(el + e2(3 + ... + ek(3k-I).
It follows that 1/(3
= el + e2(3 + ... + ek(3k-l,
which is in F (0') since the ei and (3i are in IF (0:) and IF (0') is a ring. -
48
Fam ous Impossibili ties
In simple cases the method used to prove Theorem 3.2.6 can be used to find a formula for 1/(3, as the following example shows.
3.2.7 Example. In Q( V2), if (3 = 1 + 2V2 then (32 = 9 + 4V2 and it is easy to see that (32 - 2{3 - 7 = O. This can be rewritten as (3({3 - 2) = 7 so tha t 1/(3
= ((3 -
1 2 2)/7 = -"7 + "7 V2.
-
3.2.8 Theorem. [Smallest Field Theorem] Let IF be a subfi eld of C and let 0' E C be algebraic over F . Th en IF (0' ) is the smallest field containing 0' and all th e numbers in F. Proof.
This is analogous to the proof of Proposition 3.1.5.
-
The Smallest Fi eld Theorem gives a way of describing IF( O') which is just as useful for many purposes as the definition of IF(O') (D efinition 3.2.1 ). Hence, in solving problems involving F (O') you should keep both Definition 3.2 .1 and Theorem 3.2.8 in mind and the n use whichever seems m ore appropri a te for the particul ar pr obl em . In some bo oks the roles of our Smallest Field Theorem and our Definiti on 3.2.1 are reversed , t he state me nt of the former appearing as a definition and the la t ter as a theorem. Our treatme nt is perhaps more concret e and so hopefully easier for those studyi ng the topic for the first t ime . Our approach assumes, however , that cv is algebraic over IF. If this is no t the case , our definition of F( O') is no t applicabl e. Hen ce, in such cases, we would use the stateme nt of Theorem 3.2. 8 as our definition. The followin g example is a typi cal application of the Small est Field Theorem.
3.2.9 0' E C.
Example.
F(cv 2 ) ~ F (O') for each subfield IF of C and each
Proof. Let F be a subfi eld of C and let 0' E C. By the Smallest Fi eld Theorem , F (cv) contains 0' and IF . Hen ce F (O') contains 0'.0' = 0'2 since, bein g a field , IF(O') is closed under multiplication. This shows that F (0') is a field cont aining 0'2 and IF. But F(o:2) is the smallest such field . Therefore F(0'2) ~ F(o:), as required. _
49
Extend ing Fi eld s
- - - - - - - - - Exercises 3.2 - - - - - - - - 1. What does the definition of F (Q') say in each of the following cases?
(i) degfo , F) (ii) degf o , F)
= 1, = 2,
(iii) degfo , F) = 3. 2. (a) What do es the definiti on of F( Q') tell you in the sp ecial case where F = Q and Q' = V3 ? (b) What do Propositions 3.2 .2, 3.2.3 and Theorem 3.2.4 t ell you in this sp ecial case? (c) Verify directly that (i) th e set {I , V3} is linearly independen t over Q,
(ii) Q( V3) contains the product (2 + V3)(3 + V3) and the qu oti ent (2 + .,13)/ (3 + .,13). 3. What is the degree of V2 over R? Whi ch familiar field is R(V2 )? 4. What is the degree of i over R? Which familiar field is R(i)? 5. (a) Stat e why 1 + V2 E Q ( V2) and why V2 E Q (l
+ V2).
(b) Prove that Q (l + V2 ) = Q( V2 ) by showing that each field is a subset of the ot her. (You may use the Small est Fi eld Theor em .)
6. (a) Write down the irreducible polynomial of followin g fields: (i) Q,
(b) Stat e the degree of
V3 over
each of these fields.
7. Find irr( V2 , Q(l + J2)) and deg( V2 , Q (l ans wer .)
rf.
of th e
(iii) Q(V3).
(ii) Ill ,
8. (a) Prove that V2
V3 over each
Q( V3).
(b) Show that V2 is algebraic over Q( V3). (c) Wri te down irr ( V2 , Q(V3)). (d) Justify your answer to (c). (Use (a) .)
+ V2) ). (Justify your
50
Famous Impossibilities
9.
(a) What does your answer to Exercise 8(c) tell you about deg( V2, Q( V3))? (b) Let F denote Q( V3). Use (a) to write down a basis for the vector space F (V2) over F.
10. Complete the proof of Proposition 3.2.5 by showing that F(O') is closed under addition. 11.
Let F be a subfield of C and let 0', f3 E C. (a) Prove that if F(f3) ~ F(O') then f3 E F(O'). (b) Prove the converse of the result stated in part (a). [Hint. Use the Smallest Field Theorem.] (c) Deduce that F (0:) = IF (f3) if and only if 0' E IF (f3) and f3 E IF(0').
12. Let f3 = 1 + {Y2. Follow the method used in the proof of Theorem 3.2.6 and in Example 3.2.7 to obtain a formula in Q( ij2) for 1/f3. 13.
Let IF be a subfield of C and let 0' be a nonzero complex number. Show that if 1/0' can be expressed in the form do + dlo: + ...+ (hO'k
for k 2: 1 and d, E F, then
0:
is algebraic over F.
[Remark. It follows that if F(O') is defined to be the linear span of all positive powers 1,0',0:2 , • .• of 0: (much as in Definition 3.2.1) then IF(O') is a field precisely when 0' is algebraic over IF .] 14.* Find all the fields IF such that F (J2) = Q( J2) . (Justify your answer.) 15.* Describe the field Q(7r) as explicitly as possible. [Note that 7r is not algebraic over Q and so Definition 3.2.1 is not applicable. Instead, Q(7r) is defined as the small est subfield of C which contains Q and 7r.] 3.3
Iterating the Construction
The starting point for the previous section was a field IF ~ C and a number 0' E C which was algebraic over F. From these ingredients a new field was constructed, IF(O') ~ C.
51
Extending Fields
We can now take the field F(a) and a number (3 E C which is algebraic over F (a) as the starting point for a further application of the construction process to get a further new field F(a)((3) ~
c.
This process can be repeated as often as we like to give a "tower" of fields, each inside the next,
Q
~ F ~ F(a) ~ F(a)((3) ~
F(a)((3)(T)
~
...
~
c.
3.3.1 Example. Q( v'2)(.;3) is the linear span over Q( v'2) of the set of vectors {I, .;3}. This set, furthermore , is a basis for the vector space Q( v'2)(.;3) over Q( v'2). Proof.
First note that by the solution to Example 3.1.6 ,
irr(V3, Q(V2)) = X 2 - 3 and hence deg( V3, Q( V2)) = 2. It now follows from Definition 3.2.1 that
Q(V2)(V3)
= {x + V3y : x,y E Q(V2)} ,
which is just the linear span of the set of vectors {I, .;3} over Q( v'2). That this set of vectors forms a basis follows from Theorem 3.2.4. • The tower of fields
Q ~ Q( V2) ~ Q(V2)( V3) invites us to consider Q( v'2)(.;3) as a vector space over Q also. This leads to the following example. 3.3.2 Example. Q( v'2)(.;3) is the linear span over Q of the set of vectors {I , v'2,.;3, v'2.;3} . Proof.
By the previous example
Q(V2)( V3) = {x + V3y: x, y E Q( V2)} = {(a + bV2) + V3(c+ dV2): a,b,c,d E Q} = {a + bV2 + cV3 + dV2V3 : a,b,c,d E Q} which expresses it as the required linear span.
•
52
Famous Impossibilities
One might guess from the above example that the set of vectors {I , V2, V3, V2V3} is in fact a basis for the vector space Q( V2)(V3) over Q. One of the aims of the next section is to prove a theorem from which this result will follow very easily.
Exercises 3.3 1. Find a spanning set for Q( V2)( i)
(a) as a vector space over Q( V2),
(b) as a vector space over Q. 2. Simplify 1F(0:)(1+0:), where IF is a subfield of C and 0: E C. (Justify your answer.) 3. Write down three different extension fields of Q which contain
V2.
4. If IF is a subfield of C and 0:, (3 E C then IF (0:, (3) is defined to be the smallest subfield of C which contains all numbers in IF and also the numbers 0: and (3. Use this definition to show that, if 0: and (3 are algebraic over IF, then
IF(a)((3)
= 1F(0:, (3) = 1F((3)(a)
where, as in this section, IF(o:)(;3) means the smallest field containing all numbers in the field 1F(0:) and the number (3.
3.4
Towers of Fields
The aim of this section is to produce a theorem which is of vital importance in questions involving a tower
consisting of three distinct subfields IE, IF and K of C. Implicit in this set-up are three different vector spaces and the theorem to be proved will give a precise relationship between their dimensions. The three vector spaces arising from the tower are as follows. Since IE is a subfield of IF, we may take IF as the vectors and E as the scalars to give the vector space (i) IF over IE. Likewise there are the vector spaces
Extending Fields
53
(ii) K over IF, and (iii) IK over IE. It may be helpful to see all these vector spaces together on a single diagram, as shown below .
IF over IE
IK over IF
IK over IE Here IF plays a schizophrenic role: looking back we regard its elements as vectors, but looking forward we take them as scalars. The theorem on dimensions will follow easily from the following theorem. 3.4.1 Theorem. of subfieIds of C,
[Basis for a Tower Theorem]
Consider a tower
If the vector space F over IE has a basis {(I'\ , ... ,(I'm} and the vector space IK over IF has a basis {/3h ... , /3,, }, then the set of vectors
... , . .. ,
forms a basis for the vector space IK over IE.
Proof. The following diagram may help you to keep track of where the vectors come from. IE
~
IF QjEF
c
K ajt3iEK t3iEK
Firstly we show that the given set of vectors spans the vector space IK over IE. To do this, let k E IK.
Famous Impossibilities
54
~jm is to prove that k is a linear combination of the OJ{3i's II :i:~ ~coefficients in E. ~
II
Because the /3's form a basis for Kover F, there exist n scalars il, ... , In E F such that k=
n
L
i=1
f;/3i.
But since the os form a basis for F over E, there exist scalars (1 ~ i n, 1 ~ j m) in E such that
s
s
m
Ii = L
j=1
eijD'j'
(1) eij
(2)
Substituting (2) in (1) gives k= =
t
(f
eij Q'j) /3i i=1 j=1 n m e ij(Q'j/3i) . i=1 j=1
LL
Thus the Q'j/3/s span the vector space Kover E. Secondly we show that these vectors are linearly independent over E. Suppose eij (1 ~ i ~ n, 1 ~ j ~ m) are elements of E such that n
m
L L eij Q'j/3i = o.
(3)
i=lj=1
Our aim is to prove that all the e's are zero.
We can rewrite (3) as
L em L
n i=1
'= 1
eij
Q'j
) e. = O.
But in this sum, the coefficients of the /3/s (each coefficient is a sum of the eij Q'/s) are elements of F and the /3 's are linearly independent over F. Hence, for 1 ~ i ~ n , m
L eij Q'j =
j=1
O.
Extending Fields
55
But the o ''s are linearly independent over IE, which means that, for 1 ::; j ::; m, eij = O. This establishes the required linear independence. Thus the 0:/3 i 's span the vector space K over IE and are linearly independent over IE. Hence they form a basis for the vector space II< o~rE . 3.4.2 Example. of vectors
The vector space Q( V2)( /3) over Q has the set
as a basis. Proof.
Consider the tower
Q ~ Q( V2) ~ Q( V2)( V3). By the Basi s for 1F(0:) Theorem (Theorem 3.2.4) and Example 2.3.3 , the vector sp ace Q (V2) over Q has a basis {I , V2} while by the same theorem and Example 3.1.6, the vector space Q( V2)( /3) over Q( V2) has a basis {I , /3}. Hence, by the Basis for a Tower Theorem (Theorem 3.4.1), the vector space Q(V2)( /3) over Q has the basis
{I , V2,V3,
V2V3}.
-
By counting the numbers of elements in the bases for the vector spaces occurring in Theorem 3.4.1 we can get a relationship between their dimensions. This gives us the following very useful theorem. 3.4.3 Theorem. [Dimension for a Tower Theorem] a tower of subfie1ds of C,
Consider
If the vector spaces IF over IE and II< over IF Irave finite dimension then so does K over IE and
[II< : IE]
= [K : F][IF : IE]
or, in eltettietive notation, dimj, II<
= dim« II<
. dimj, IF.
56
Famous Impossibilities
Proof. There are altogether mn of the af3's in Theorem 3.4.1 and these form a basis for K over IE. This vector space therefore has dimension mn. Since the dimension of a vector space must be an integer, Theorem 3.4.3 has the following corollary.
3.4.4 Corollary. Under the conditions of Theorem 3.4.3, [K : F] is a factor of [K : IE] and [F: IE] is a factor of [K : IE].
EI
As an application of this corollary, observe that there cannot be any field F such that
with [IF : Q] = 3 since, by Example 3.4.2 ,
[Q( Y'2)( V3) : Q]
=4
and 3 is not a factor of 4.
Exercises 3.4 1. How many vector spaces can be constructed from a tower of four distinct fields 1E~1F~6~K
by using larger fields as vector spaces over smaller fields? 2. Consider a tower of subfields of C, IE~IF~K.
(a) What is [K : IE] if [K : F] (b) Is it possible that [K : IE] answer.)
= 2 and
= 3 and
[F : IE] [IF : IE]
= 6?
= 2?
(Justify your
3. Give an example of a tower of three distinct subfields of C,
IE such that [K : IE] = 4.
~
IF ~ K,
Extending Field s
57
4. By usin g t he tower Q ~ Q( V3) ~ Q ( V3)(V2) write down a basis for the vector space Q( V3) (V2) over Q.
Ex ercises 5 t o 10 will refer to the tower
5. (a) Is the set {l ,.,fl, J3, J6} linearl y independent over Q? Why?
[Hint. Use your an swer to Ex ercise 4.]
(b) Does .,fl + J3 E Q ( J6 )? Wh y?
(c) Does J6 E Q(.,fl + J3)? Wh y? [Hint. Calculate (.,fl + -/3)2.J
6. Which of your an swers to Ex ercise 5 would you use
(i) t o validate t he above tower? (ii) to show t hat Q ( J6)(.,fl + J3) = Q(.,fl + J3)? 7. (a) Show that the number .,fl +
J3 is algebraic
over Q( .;6).
(b) Wri te down irr (.,fl + J3, Q( J6)) and state which fact from Ex ercise 5 you would need to use to justify your ans wer . (c) Deduce the valu e of (d ) Deduce the valu e of
J3) : Q( .;6)]. [Q (.,fl + J3) : Q ( J6)J.
[Q( J6)(.,fl +
8. Use your an swer to Ex ercise 7(d) and th e Dim ension for a Tower Theorem to deduce (i) [Q(.,fl + J3) : Q], (ii) a basi s for th e vect or space Q( .,fl + .)3) over Q. 9. Let (3 E Q(.,fl + .)3).
(a) Use the fact that any m + 1 vectors in a vect or space of dimension m arc linearl y dep end en t , and your ans wer to Exer cise 8(i), t o show that {I , (3, (32, (33,(34} is linearly dependent over Q.
58
Famous Impossibilities
(b) Deduce from (a) that j3 is algebraic over Q. (c) What can you say about deg(j3, Q)? 10. (a) Use your answer to Exercise 8(ii) and the dimension part of the Basis for F(a) Theorem (Theorem 3.2.4) to deduce the value of deg( J'i + 13, Q). [Hint. Remember that any two bases of a vector space contain the same number of vectors.] (b) Use (a) and the Basis for F(a) Theorem to write down a basis for Q( J'i + 13) over Q. Is the fact that this is different from the one in Exercise 8(ii) a problem?
Exercises 11 to 15 will refer to the tower Q ~ Q( Y'2) ~ Q( 12).
11. (a) Show that V'2 is algebraic over Q. Is it also algebraic over
Q(J'i)? (b) Does V'2 E Q(J'i)? (Justify your answer.) (c) Does J'i E Q( V'2)? (Justify your answer.) 12. Which of your answers to Exercise 11 would you use (ii) to validate the above tower? (ii) to show that Q( J'i)( V'2) = Q( V'2)? 13. (a) Write down the polynomial irr( V'2, Q( J'i)) and state which fact from Exercise 11 you would use to justify your answer. (b) Deduce th e value of (c) Deduce the value of
[Q( J'i)( V'2) : Q( J'i)]. [Q(V'2): Q(J'i)].
14. Use your answer to Exercise 13(c) and the Dimension for a Tower Theorem to deduce (i) the value of
[Q( V'2) : Q],
(ii) a basis for the vector space Q( V'2) over Q, (iii) the polynomial irr( V'2 , Q).
Ext ending Fields
59
15. Let F be a subfield of C and let a be a real number which algebraic over IF of degree n .
IS
(a) What is [IF (a) : F] ? (b) What is the maximum number of vectors in a linearly independent subset of the vector space F (a ) over F ? (c) Assume now that (3 E F(a) . (i) Show that (3 is algebraic over F .
(ii) Use a suitable tower to prove that th e degree of (3 over IF is a factor of that of a over IF . (d) Use your answe r to (c) to show that every number in Q( V"2) is algebra ic over Q and has degree 1, 2 or 4 over Q.
Additional R eading for Chapter 3 Our t reat ment of F(a) uses relat ively little algebraic machinery. Oth er t rea t ments rely on t he algorithm for finding th e greatest common divisor of two polynomials (see, for exa mple, Sect ion 110 of [AC], Sect ion 2.2 of [CHI, or Section 10.1 of [LSD or th ey rely on results a bout ma ximal ideals in polynomial rings (see, for exa mple, Section 35 of [.JF], or Chapte r 11 of [W GJ). Th e proof of th e Basis for a Tower Th eorem is quite st anda rd and can also be found in Section 97 of [AC], Section 38 of [JF] , and Sect ion 2.2 of [CH]. [AC] A. Clark, Eleme nts of Abstmct Alqebra, Wadsworth , Belmont , Ca lifornia, 1971. [JF ] J .B. Fraleigh , A First Course in A bstract Alqebra, 3rd edit ion, Addison-Wesley, Reading , Massachusetts, 1982. [WG] W .J . Gilbert, Modern A lqebra with App lication s, Wiley, New York , 1976. [CH] C.R. Hadlock , Field Th eory and its Classical Problem s, Ca rus Math emati cal Monogra phs, No. 19, Mathemati cal Association of America, 1978. [LS] L.W . Shapiro, Introduction to Abstmct Algebra, McGr aw-Hill, New York , 1975.
CHAPTER 4 Irreducible Polynomials
In our earlier definition of th e irredu cible polynomial of a number, th e word "irred ucible" was intend ed to convey the idea thet. the degree of th e polynomial "could not be reduced Iurtli et", In this chap ter it will be shown tllRt this polynomial is also "irreducible" in the sense that it "cannot be fa ctorized furth er ". This will lead to a practical technique for finding tlie irreducible polynomial of a number. Witll the earlier definition we were able to show, for example, th at th e irreducible polynomial of..j2 over Q is X 2 - 2. Th e techniques to be developed in this cliep te: will enable us to move beyond quadratics and to show, for example, tllat irr(~,Q) = X
3
-
2.
This will feature lat er in our proof of th e impossibility of doubling th e cube.
61
62
Famous Impossibilities
4.1
Irreducible Polynomials
In this section the concept of an "irreducible polynomial " will b e defined. You have already met this phrase in the combin at ion "t he irreducible polynomial of a number a". Whereas the earlier concept was defined in terms of "least degree" , we shall now turn to the idea of "cannot be fur ther factorized". The precise relationship b etween an "irred ucible polynomial" and "the irreducible polynomial of a number a" will be explained later, in Section 4.3. For the moment, however, the main thing is that you do not confuse the two conce p ts.
4.1.1 Definition. Let F b e a field. A polynomial f(X) E F[X] is said to b e reducible O1ler F if ther e are polynomials g(X) and h(X) in F [X] such that (i ) ea ch has degr ee less than tha t of f(X ), and (ii) f(X) = g(X)h(X).
-
Thus a polynomial is reducibl e if it can b e factorized in a sui table way.
4.1.2
Example.
The polynomial X 2
-
2 is reducible over R.
Proof. Observe t hat X 2 - 2 = (X - v'2 )(X + v'2), wh er e each of the factors b elongs to R[X] and has degr ee less t han tha t of X 2 - 2.-
4.1.3
Example.
The polynomial X 2
-
2 is no t reducible over Q .
Proof. To see this , suppose, on the contrary, that X 2-2 is reducible over Q. This means that X2
-
2 = (aX
+ b)(eX + d)
for some a , b, e, d E Q . Clearly neither a nor e can ' be zero. Evaluating b oth sid es at v'2 gives
0= (ah Hen ce v'2 rational.
+ b)(eh + d ).
= -bla or -die, whi ch gives t he
contradiction that
v'2 is -
Irreducible Polynomials
63
4.1.4 Example. Con stant polynomials and pol ynomials of degree 1 ar e never reducible. _ Linguistically sp eaking, th e word "irreducible" should simply mean not reducible, because the pr efix i!: means not. (For example, i!:replaceable means "not replaceable" , i!:responsible means "not responsible" , i!:religious means " not religious".) This is nearly true in th e following definition, but th ere is a slight twist in that constant polynomials ar e excluded .
4.1.5 Definition. Let F be a field. A polynomial f(X) in F[X] is said to be irreducible over F if it is not reducible over IF and it is not a constant. _ Thus a polynomial is irr edu cibl e if it cannot be factorized suitably. The reason for excluding constant pol ynomials in this definition is that, without this exclusion, certain theorems which appear later would need to be stated in a more cumbersome way. In terms of "irreducible", the examples given earlier show that X2 X2
-
2 2
is not irreduc ible over R, is irreduci ble Oller Q.
Note that if a polynomial is irreducible over a field , th en it is irreducibl e over all subfi elds of that field. Mor e pr ecisely, if IE is a subfield of F and f(X) E IE [X] is irreducible over F , then f(X) is irreducible over IE. Notice also that constant polynomials are neither reducible nor irreducible; but all other polynomials are either reducible or irreducible over a given field.
Exercises 4.1 1. In each case state (giving reasons) whether the pol ynomial is reducible, irreducibl e or neither over th e given field. (a) the pol ynomial X 2 + 3X + 2
(i) over R,
(ii) over Q;
Famous Imp ossibilities
64
(b ) the polynomial X 2 + 1 (i) over Q ,
(ii) over C ,
(iii) over
(iv) over
l 2,
l 3;
(c) the polynomial 10 over Q; (d ) the polynomial X 2 - 2
(i) over Q ,
(ii) over Q( J2).
2. Do es t he fact that 3X 2+3 = 3(X 2+1) in Q[XJ mean that 3X 2+3 is reducible over Q? 3. (a) Is X3 - 1 irreducible over Q? (b) Is X3 (c) Is
+ 1 irreducible
X666 -
over Q?
1 irreducible over Q?
4. Let f(X) E F[XJ be irreducible over IF and let g(X), heX) in F[XJ b e such that f(X) = g(X) heX) . Show that eithe r g(X) or heX) is a const ant polynomial. 5. Assume t hat Q E C is alg ebraic over a field IF ~ C and that f (X) = irr(Q, F ). Show t hat f(X ) is no t reducibl e over F. [Hint. Use the fact t ha t f (X ) is of least p ossible degree amongst t he p olynomials in F[XJ with Q as a zero , and t he fact t ha t if pq = 0 for p, q in C the n eithe r p = 0 or q = O.J
4.2
Reducible Polynomials and Zeros
In this section the relationship b etween a polynomial having a zero and b eing reducible will be explored. This relationship will form the basis of our technique for proving the irreducibility of certain polynomials. As a first step in this direction, the relationship b etween having a zero and having a factor of degree 1 will be explore d . 4.2.1 Definition. Let F be a field. A polynomial f(X) E F[XJ is said t o have a fa ctor of degree 1 in IF[XJ if
f (X ) = (aX
+ b) g(X )
whe re a, s e F with a =/= 0 and wher e g(X ) E IF[X].
•
Irr edu cible Polynomi als
65
4.2.2 Theorem. [Fact or Theorem] Let F be a field. A p olynomial J(X ) E F[X] hes a factor of degree 1 in F [X] if and only if J (X ) has a zero in F. Proof. Assume firstly that J(X ) has a factor of degree 1. With not ation as in Definition 4.2.1 , it follows t hat
-bla E F is a zero of J(X ) sin ce
J( -bla ) = (a( -bla) + b)g(-bla) =0. Conversely, assume tha t 0: E F is a zero of J(X ). We aim to show that X -
Q
is a factor of f (X ).
If we divide J(X ) by X - 0: t hen (by the Division Theorem, Theorem 1.3.2 ) there exist q(X ), reX) in F [X ] with
J(X ) = (X - 0:) q(X ) + 1'(X) and
eithe r r eX) = 0 or deg r eX)
(1)
< deg(X - 0:) = 1.
Since r eX) = 0 or its degree is less tha n 1, r eX ) must be a constant p olynomial c E F ~ F[X], which mean s we can rewri te (1) as
J(X ) = (X - 0:) q(X)
+ c.
If we substitute X = 0: in to this and use th e fact that J(a ) we have 0= J(o:) = (0: - o:)q(o: ) + c = 0 + c = c.
(2)
= 0 (why?),
Thus, from (2), J(X ) = (X - o:)q(X ) which means J(X ) has the factor X - a of degree 1 in F[X] . • The mo st us eful theo rem for dealing wit h po lynomials of degree 2 or 3 is the following one.
Famous Impossibilities
66
4.2.3 Theorem. [Small Degree Irreducibility Theorem] Let F be any field. Let f(X) in F[X] luive degree 2 or 3. Then f(X) is reducible over F if and only if f(X) Iias a zero in F .
Proof. Let f(X) E F[X] have degree 2 or 3. Assume firstly that f(X) is reducible over F , so that f(X) = g(X) h(X)
for some nonconstant polynomials g(X), h(X) E F[X]. Since the degrees of g(X) and h(X) must add up to 2 or 3, one (or both) of these degrees must be 1. Hence, by Theorem 4.2.2, one of them must have a zero in F; hence f(X) must also have a zero in F. To prove the converse is easy: assume f(X) has a zero in IF and then apply Theorem 4.2.2 . • 4.2.4
Example.
The polynomial 2X3 - 5 is irreducible over Q.
Proof. By the Rational Roots Test (Theorem 1.4.1) the only possible zeros in Q of the polynomial are 1
5
± 1, ± 2' ± 2' ± 5, none of which works (that is, none gives zero when substituted into
2X3 - 5). Thus the polynomial has no zeros in Q. Since its degree is 3, the Small Degree Irreducibility Theorem is applicable and shows that f(X) is irreducible over Q. • It is important to observe that the Small Degree Irreducibility Theorem would not be true if we removed the restriction about "degree 2 or 3" . For example, the quartic
is reducible over Q but has no zero in Q. Fortunately we do not need to worry about the irreducibility of quartics (or higher degree polynomials) in order to prove the impossibility of the three famous geometric constructions.
67
Irreducible Polynomials
Exercises 4 .2 1. Use the Rational Roots Test and either the Factor Theorem or
the Small Degree Irreducibility Theorem to decide which of the polynomials listed below is irreducible over the stated field: (a) X 3 - 2X
(b) X3
+ lover Q, + 2X + lover Q,
(c) X3 - 5 over Q. 2. In which of the following cases can the technique of Exercise 1 be used to decide if the polynomial is irreducible over the given field?
+ 2X3 + 2X2 + 2X + lover Q, 4 (b) X + 2X3 - 2X 2 + 2X + lover Q , (c) X3 - V2X + lover Q, (a) X4
(d) X 2
-
2 over Q,
(e) X2 - 2 over Q(V2). 3. In each case find a zero in C of the given polynomial and then use the method of proof of Theorem 4.2.2 to write down a factor of degree 1 in qX] of the given polynomial.
(a) lOX 2 + 10, (b) X 2 - 4X + 5. 4. In each cas e give an example of a field F and a polynomial f(X) in F[X] such that f(X) has no zeros in F and yet f(X) is not irreducible over F. (a) f(X) is constant. (b) f(X) is not constant. Does this contradict the Small Degree Irreducibility Theorem? 5. In each cas e decide whether the statement is true or false, where F is a field . (Justify your an swers.) (a) If f(X) E F[X] is reducible over F then f(X) has a zero in F . (b) If f(X) E F[X] has a zero in F then f(X) is reducible over F . (c) If f(X) E F[X] has degree 4 and has a zero in IF, then f(X) is reducible over IF. (d) If f(X) E F[X] has degree 2 or more and has a zero in IF, then f(X) is reducible over F .
68
Famous Impossibilities
6. In each case decide whether the statement is true or false, where IF is a field. (Justify your answers.) (a) If f(X) E F[X] has degree 6 and is irreducible over IF, then f(X) does not have a zero in F . (b) If f(X) E F[X] is irreducible over IF then f(X) does not have a zero in F. (c) If f(X) E F[X] has degree 2 or more and is irreducible over IF, then f(X) does not have a zero in IF.
Irreducibility and irr( 0, IF)
4.3
The purpose of this section is to state and prove a theorem which links the idea of "irreducible polynomial" (as in Definition 4.1.5) to the idea of "irreducible polynomial of a number" (as in Definition 2.3.2). 4.3.1 Theorem. [Monic Irreducible Zero Theorem] Let IF ~ C be a field and let 0: E C be algebraic over IF. The following conditions on a polynomial f(X) E IF[X] are equivalent:
(i) f(X) = irr(o:, IF), (ii) f(a) = 0 and f(X) is monic and irreducible over IF . Before proving this theorem, we give some examples to show how the theorem can be used to check that a guess for irr( 0: , IF) is indeed the correct one. 4.3.2
Example.
irr( J2, Q) = X2 - 2.
Proof. Clearly X 2 - 2 has J2 as a zero and it is monic. By the Rational Roots Test, the only possible zeros in Q are ±1, ±2, none of which works . Because this polynomial has degree 2, it follows from the Small Degree Irreducibility Theorem that it is irreducible over Q. The Monic Irreducible Zero Theorem now gives the required result. Actually, this example was done earlier (as Example 2.3.3) . The advantage of the Monic Irreducible Zero Theorem is that it lets us handle cubics as well.
69
Irreducible Polynomials
4.3.3
Example.
irr( -Y2, Q)
= X 3 - 2. -Y2 as a zero and it
Proof. Clearly X3 - 2 has is monic. By the Rational Roots Test, the only possible zeros in Q are ±l, ±2, none of which works. Because this polynomial has degree 3, it follows from the Small Degree Irreducibility Theorem that it is irreducible over Q. The Monic Irreducible Zero Theorem now gives the desired result. Note that the same argument could not be used for quartics because of the requirement on the degree of the polynomial contained in the Small Degree Irreducibility Theorem. Note also that since we are appealing to the Rational Roots Test, the above technique for finding irr( 0', F) works only in the case IF = Q. We now complete this section by proving the theorem. Proof of Theorem 4.3.1. notation is introduced:
P
= {p(X)
To facilitate the proof, the following
E IF[X] : p(O')
= 0 and p(X) is monic}.
We first show that (i) implies (ii). Assume that (i) is true. By Definition 2.3.2 this means that
f (X) is the polynomial of least degree in
P.
Thus the first two statements in (ii) of the theorem hold and it remains to prove that f (X) is irreducible over IF. Suppose f(X) were not irreducible over F. Since it is not constant, it must be reducible over IF and so there are polynomials g(X) and h(X) in IF[X] such that
f(X)
= g(X) h(X)
where the degrees of g(X) and h(X) are both less than that of f(X) . We may assume that both of these polynomials are monic. Evaluation of both sides at 0' gives, furthermore,
0= g(O') h(O') and so, C being a field, this implies g(0')
g(X) E P
or
= 0 or h(0') = O.
h(X) E P.
Thus either
Famous Impossibilities
70
But this is a contradiction since f(X) had the least degree in P. Hence f(X) is irreducible over F. So (i) does imply (ii). Next we show that (ii) implies (i). Assume that (ii) is true. Thus f(X) E P and so degree f(X) 2: degree irrfo , F) . By the Division Theorem (Theorem 1.3.2),
f(X) = irr(a, F) q(X) where q(X) and l'(X) E F(X) with r(X) degree r(X)
< degree
+ r(X),
(1)
= 0 or irr(o-, F) .
Evaluation of the equation (1) at a gives
0= O.q(a)
+ l'(a)
so that r(a) = O. If r(X) i= 0 we can divide r(X) by its leading coefficient to get a monic polynomial in P whose degree is less than that of irr(a, F) , which is a contradiction. Hence l'(X) = 0 and so by the equation (1), f(X) = irr(a, F) q(X) . Thus q(X) must be a constant, since otherwise f(X) would not be irreducible over F . Since both f(X) and irr( a, F) are monic, this means q(X) is 1. •
Exercises 4.3
1. Find irr( J5 + 1, Q) and irr(.y3 - 2, Q) and then use the method of Examples 4.3.2 and 4.3.3 to justify your answer. 2. (a) Guess irr( iff, Q) and deg( iff, Q).
(b) Now use the definition of Q( iff) to get an explicit description of this set. (c) Give a basis for the vector space Q( iff) over Q. (d) Give a careful proof, with full justification for each step, for your answer to part (a) .
Irreducible Polynomials
71
3. Suppose that ..[7 E Q( iff) . (a) Deduce that Q(..[7) ~ Q( iff). (b) Thus we have a tower
Q ~ Q(..[7) ~ Q( iff).
Derive a contradiction from this tower. (c) What can you now conclude? 4. Note the obvious tower
Q ~ Q( iff) ~ Q( iff) ( V7). (a) Guess irr(..[7, Q( iff)) and deg(..[7, Q( iff)) . (b) Prove your answer to part (a) using the result of Exercise 3. (c) Hence give a basis for the two vector spaces Q( iff) ( V7)
Q( iff) ( V7)
over over
Q( iff) ,
Q,
and state their dimensions. 5. In each of the following cases show that the number sin 0 is algebraic over Q and find its irreducible polynomial and its degree over this field.
(a) 0 = 7r/6, (b) O=7r/3, (c) 0 = 7r/18. [Hint. Use the identity sin(30)
= 3sinO -
4sin 30 .]
6. Suppose that IE is an extension field of Q, and 0: E IE is such that
2(i + 30:2 + 40: + 3 = 0 . (a) Write down a monic polynomial f(X) in Q[X] such that 0: is a zero of f(X). (b) Can we find irr(o:, Q) from the information given? Why?
4.4
Finite-dimensional Extensions
Let K be an extension field of F such that the vector space Kover F is finite-dimensional. Must numbers in K be algebraic over F and, if so, what can we say about deg(o:, F) for 0: in K? The following theorem answers these questions.
Fam ou s Impossibilities
72
4.4.1 Theorem. Let F be a subfield of a field K with [K : F] = n . Th en every numb er a in K is algebraic over IF and deg(o-, F ) ~ n. Proof. Let a E K. Because [K : FJ = n , every set of (n + 1) numbers in II< must be linearly dep endent over F. Now {1, a, .. . , a "}
is such a set and thus there exist co , CI, that
• • . c"
E F , not all zero , such
+ cIa + ...+ cna" = O.
col
If we let
p(X)
= Co + clX + ...+ cnX"
then p(X) is a nonz ero polynomial in F[X] which has a as a zero. Thus (by definition) , a is algebraic over IF. It follows easily (using Proposition 2.2.2) that deg(a , F) ~ ti sin ce degp (X) ~ n. 4.4.2 Corollary. Let F be a subfield of C and let a be a com plex number whicll is algebraic over IF of degree n. E very number in IF(a ) _ is then algebraic over F and has degree ~ n . 4.4.3 Theorem. Let F be a subfield of a field IE . Th e set of numbers in IE wllich are algebraic over F is a subfield of IE . Proof. Let a, /3 E E be algebraic over F . We must show that a + /3, a - /3, a/3 and (prov ided /3 i= 0 ) al/3 are all algebraic over IF. We conside r the tower IF ~ IF (a ) ~ IF (a )(/3). Since 0' is algebraic over IF , IF (a) is a finit e-dimension al exte nsion of F. Since /3 is algebraic over IF it is also algebraic over IF(a) and hence IF(a)(/3) is a finit e-dimensional extension of F(a) . Thus, by the Dimension for a Tower Theor em (Theorem 3.4.3) , we see that 1F(0')(/3) is a finite-dimensional extension of IF and so, by Theorem 4.4.1 , every element of IF(a) (/3) is algebraic over IF. This completes the proof since a + /3, 0' - /3, 0'/3 and (provided /3 i= 0) 0'1/3 are all in IF (a )(/3). Recall that algebraic numbers ar e complex numbers which are algebraic over Q. 4.4.4
Corollary.
TIle algebraic numbers are a subfield of C.
-
Irreducible Polynomials
73
Exercises 4.4 1.
Use Theorem 4.4.1 to show that every complex number z is algebraic over R. What can you say about deg(z , R)?
2.
(a) Show that, with the hypotheses of Theorem 4.4.1, deg(o', IF) is a factor of n. (Consider the tower IF ~ IF(O') ~ K.)
(b) If a E Q( -Y2), what can you say about deg(O', Q)? 3.
Explain why the number 17 -Y2 - 3.;2 is in the field Q( -Y2) and then use Theorem 4.4.1 to prove that this number is algebraic over the field Q.
4.
(a) Explain why Q( i)(.;2) is a finite-dimensional extension of Q. (b) Use Theorem 4.4.1 to prove that (3i algebraic over Q.
+ 2.;2)/(1
- 4.;2i) is
5. * Let K be an extension field of R such that every element a in K is algebraic and of degree at most 2 over R. Prove that K = R or C. [Hint . Assume deg( 0', R) = 2. By completing the square in irr(o-, R), show that there are real numbers band e such that (3 = (a - b)/e satisfies (32 = -1.] 6.* Suppose that there exists an extension field K of R such that [K: R] =3. (a) Prove that there exists an clement a in K which is algebraic and of degree 3 over R [Hint. Use Exercise 5.] (b) Now note that every cubic polynomial in R[X] has a zero in R. (Proof via the Intermediate Value Theorem, as in [MS], page 103.) (c) Explain why (a) and (b) contradict the original supposition. [Remarks: elementary vector algebra. The set of all vectors in the plane can be regarded as the set R2 of all ordered pairs of real numbers (a, b). The vector space R2 can be made into a field by adopting the definition of multiplication from C; thus
(a, b)(e, d)
= (ae -
bd,ad + be).
Elements of the form (a,O) then form a "copy" of R in R2. The set of vectors in three-dimensional space, however, behaves quite differently. This set can be regarded as the set R3 of all ordered
74
Famous Impossibilities
triples of real numbers (a, b, c). Exercise 6 shows that it is impossible to make the vector space R3 into an extension field of IR, regardless of how we choose the vector multiplication in R3. Barehanded proofs of this result are included in [HH] and [RY], while Chapter 7 of [IH] contains a discussion of a more general result known as the Theorem of Frobenius.]
Additional Reading for Chapter 4 In Chapter 4 we developed techniques for proving that polynomials of degr ee 3 or less a re irreducible over Q. Example 31.6 in PF] will show you the sorts of computations that polynomials of degree 4 and higher can involv e. There are also somewhat general techniques which are not restri cted to such small degree polynomials a nd which apply over fields other than Q. See, for example, Section 31.2 of [JF], Sections 101- 107 of [AC], Chapter 9 of [WG], Section 2.1 of [CH], and Section 2.6 of [LS]. Th e problem of deciding, in general, whether a polynomial is irreducible over a given field is a difficult one . New, rath er deep techn iques , amenab le to computer use, have been developed recently, and much work is still being done on the problem . See, for example , the chapter "Factorization of polynom ials " by E. Kaltofen in [CAl . You may now wish to refer to other treatments of in ( 0, F) su ch as those given in Section 35 of [JF], Section 109 of [AC] and Section 2.2 of [Cll], Our Theorem 4.3.1 is the link between our treatment and that in these books. When discussing irr(o,F), we have usually taken F to be a subfield of the complex number field C and o to be a compl ex number because this is all that our im possibility proofs in Chapter 5 require. You should note that nearly all of our results about irr(o,F) generalize easily to th e case where F <;;; E is a ny tower of fields and (} E E. Results about irr(o,F) are st ated in t his more general form in [AC] and [JF] , for example. [CAl Computer Algebm : Symboli c (lnd Algebraic Computation, ed . B. Buchberger, G .E . Collins, R . Loos and R . Alb recht , Spr inger, Vienna. 2nd edi tion , 1983. [AC] A. Clark, Elements of Abstract Algebra, Wadsworth, Belmont, California, 1971. [JF] J .B. Fraleigh , A First Course in Abstract Algebra, 3rd edition, Addison-Wesley, Reading , Massachusetts, 1982. [WG] W .J . Gilbert, Modern Algebra with Applications, Wiley, New York, 1976. [CH] C.R. Hadlock, Field Theory and its Classical Problems, Carus Mathematical Monographs , No. 19, Mathematical Association of America, 1978. [HH] H. Hold en , "Rings on R2 " , Mathematics Magazine, 62 (1989), 48-51. [Ill] LN . Herstein, Topics in Algebm, Blaisdell, New York , 1964. [LSJ L.W. Shapiro, Introduction to Abstract Algebm, Mcflraw- Hill, New York, 1975. [MSJ M. Spivak, Calculus, W .A. Benjamin , New York, 1967. rRY] R.M . Young, "When is Rn a field?", The Mathematical Gazette, 72 (1988), 128-129.
CHAPTER 5
Straightedge and Compass Constructions
In the previous chapters we developed tile algebraic machinery for proving that tile three famous geometric constructions are impossible. In this chapter we introduce some geometry and start to show the connection between algebra. and the geom etry of constructions. We begin by showing you how to do some basic constructions with straightedge and compass, and then 110w to construct line segments whose leugths are products, quotients or square roots of ones already constructed. Strict rules, passed down from tile Greek geometers, must be followed in performing a construction. An important part of this chapter involves spelling out these rules very precisely. This leads to the notion of constructible number, on wbicl: the impossibility proofs in the next chepte:
75
76
5.1
Famous Impossibilities
Standard Straightedge and Compass Constructions
In this section we show you how to do some basic constructions with straightedge and compass. In each of these constructions, we are given some points and some lines passing through these points (and possibly some circles) . We can construct new lines and circles using the straightedge and compass as described in 1 and 2 below. 1. The straightedge may be used to draw a new line, extended as far as we like, through any two points already in the figure.
2. The compass may be used to draw new circles in two ways. (a) Put the compass point on one point in the figure and the pencil on another such point and draw a circle or an arc of a circle .
• (b) Place the compass point and pencil as in (a) but then move the compass point to a third point in the figure before drawing the circle (or an arc) with this third point as centre.
We get new points where the new line or circle meets other lines or circles . In the rest of this section we show how you can use a straightedge and compass to perform several constructions. These constructions will be used later as building blocks for other constructions.
77
Geometrical Constructions
5.1.1
Bisecting a line segment.
Purpose. To construct the midpoint
e
of a given line segment AB .
D
X I I I I I
I I I
I I I
I I I
Ie I I I
I I I I I I I I I
X E
Method. (i) Put the compass point on A and extend the compass until its pencil is exactly on B . Now draw an arc above AB and another arc below AB. (ii) Put the compass point on B and extend th e compass until its pencil is exactly on A. Now draw arcs to int ersect the arcs in (i) . Call the points of intersection D and E respectively. (iii) Join D and E with the straightedge. Where DE meets AB is the required point e. • [That e is the midpoint of AB seems obvious from the symmetry of the figure, and we give no formal proof of this (or of the validity of the constructions in the rest of this section) . Readers familiar with Euclidean-geometry proofs may be interested in providing their own proofs, using facts about congruent triangles.]
78
Famous Impossibilities
5.1.2
Transferring a length (using a compass).
Purpose. Given a line segment A B an d a (longer) line segment C D , to construct a point P on C D such that the line segments A B and C P have equal lengths.
~B A
C
D Method.
(i) P ut the compass po int on A and extend the compass un til its pencil is exactly on B .
(ii) P ut the compass point on C and with the same radius as in (i), draw an arc to cut C D at P . T hen P is the required point. 5 .1.3
-
B isecting an a ngle.
Purpose.
To construct a line DC which bisects a gi ven angle ADB .
D
Method.
(i) With centre D and radius DB, draw an arc to meet DA (extended if necessary) at D .
(ii) Draw arcs with centres D and B , and radius BD , to meet at a point C. Then DC bisects angle ADB.
-
79
Geom etrical Constructions
5.1.4
Constructing an angle of 60°.
Purpose. Given a line segment OA, to cons truc t a line OB so tha t angle AOB = 60°. I I I
I
' B
I I I I I I I
0'---------' Method.
Draw arcs with centres 0 and A , and radius OA , which meet at a • point B . Then angle AO B = 60°. 5 .1.5
Constructing an angle of 90°.
Purpose. Given a line segmen t OA, to construct a line OB so that angle AOB = 90°.
B
X
Q(
0 I I I I I I I
X
C
A
80
Famous Impossibilities
Method. (i) With centre 0 draw an arc of radius OA which meets AO extended, at the point Q. (ii) With centre A and radius AQ, draw arcs, one above the segment AQ and the other below AQ . (iii) With centre Q and the same radius as in (ii), draw arcs to meet those in (ii) at points Band C respectively. (iv) Join Band C, using the straightedge. This line passes through 0 and angle AOB = 90°. [Note that steps (ii) -(iv) are the same as those for bisecting the line segment AQ. J 5.1.6
Copying an angle.
Purpose. Given an angle ABC and a line segment DE, to construct a line EF so tlie: the angles ABC and DEF are equal.
A
E
R
,.--------1----
D
\ \ \
\ \
\ \
B
\ \ \ \
F\ C
\ \ \ \
Method. (i) With centre B and radius BC draw an arc to meet AB (extended if necessary) at P. (ii) With centre E and the same radius as in (i), draw an arc to meet ED (extended if necessary) at R. (iii) With centre R and the radius PC, draw an arc to meet the arc drawn in (ii) at F. Then angles ABC and DEF are equal. -
81
Geometrical Constructions
5.1.7
Constructing a line parallel to a given line.
Purpose. Given points A and B and a point C not on the line through A and B, to construct a line segment CD wbica is parallel to
AB.
C........",,..-------D
A--------""'-"'-B Method.
(i) Use the straightedge to draw the line joining C to B. (ii) Use the angle copying construction given in 5.1.6 above to construct CD so that angles ABC and DCB are equal. Then CD is parallel to AB. •
5.1.8
Use of Compass and Straightedge.
In all of the above constructions, the straightedge has been used in strict conformity with the rules laid down by the ancient Greeks, who imagined the straightedge as being free of any markings. Thus, although it could be used to draw straight lin es, it could not be used to measure distances between points or to transfer lengths. Our use of the compass, on the other hand, calls for a more versatile compass than that allowed by the Greeks. The extra versatility lies in our assumption that the compass can be used to draw circles in both of the ways described in 2(a) and 2(b) at the start of Section 5.1, whereas the compass imagined by the Greeks could only be used in the first of these two ways. By allowing 2(b) , we are assuming that our compass can be opened to a specified radius (as determined by the two points) and then picked up from the paper and used to draw a circle of this radius around a third point. We have used the compass this way in steps (i) and (ii) of 5.1.2 above, for example. The Gr eeks presumably thought of their compass as "collapsing" as soon as it was lifted from the paper and so it could not be used directly to transfer lengths.
82
Famous Impossibilities
It turns out, however, that it does not matter whether we use a noncollapsing compass (as in the constructions 5.1.2 and 5.1.6 above) or the apparently less powerful collapsing compass of the Greeks. It can be shown that any construction which can be performed with a noncollapsing compass can be performed also with a collapsing compass. Use of the collapsing compass may, of course, involve taking more steps in the construction. For the details, see Exercise 10 below.
Exercises 5.1 1. Given two points A and B and a point C not on the line through A and B, show how to construct (using only straightedge and compass) a line through C which is perpendicular to AB.
• C A--------B 2. (a) On a sheet of paper construct a line segment A'B' having exactly the same length as AB shown below:
A--------B (b) Construct a line segment having one quarter of this length. For which positive integers n can you construct, in a similar way, a line segment whose length is times that of AB?
k
3. (a) On a sheet of paper construct an angle A'O'B' which is equal to the angle AOB shown below: A
o B
Geometrical Constructions
83
(b) Construct an angle which is one quarter of this angle. (c) For which positive integers n can you construct, in a similar fashion, an angle which is ~ times the angle AGB? 4. (a) Construct an equilateral triangle using straightedge and compass. (b) Which angle have you trisected in part (a)? How would you trisect an angle of 90°? (c) Can you trisect, with straightedge and compass, an angle of 60°? [Hint. Re-read Section 0.3.] 5. (a) Draw on a sheet of paper a line segment about 2 em long and, taking this as your unit of length, construct line segments of lengths V3 and 1 + V3. [Hint. What is sin(600)?] (b) Appearances can be deceiving! Show that
[Thus you have already constructed in part (a) line segments having these horrible looking lengths.] 6. How would you construct a line segment of length J2? 7. Find positive integers a, b, c, d (different from those in Exercise + bV3 = c + d/3. 5(b)) such that
Va
8. (a) Given an angle AGE , describe how you would construct (using only straightedge and compass) an angle which is approximately equal to one third of angle AGB, with an error which is less than one eighth of the original angle. [Hint. Find an integer a with the fraction a/8 sufficiently close to 1/3.] (b) As for (a), but now suppose that the error is to be less than one sixteenth of the original angle. (c) Assume now that k is a positive integer. Describe a method of approximately trisecting an angle, using only straightedge and compass, which gives an error of at most 2- k of the original angle.
84
Famous Impossibilities
9. Verify that the constructions 5.1.1, 5.1.3 and 5.1.4 can be performed exactly as described, using a collapsing compass. 10. First read through the construction set out below and then do (a) and (b) below. (a) Prove that the circle constructed at step (iii) has the required properties (thereby proving that the construction achieves its purpose). (b) Explain why the construction shows that a collapsing compass can be used to draw any circle that a noncollapsing compass can draw.
Purpose. Given points 0, A and B, to construct a circle with centre 0 and radius AB witu a collapsing compass.
___-
P
- ___
Q Method. (i) Draw one circle with centre 0 and radius equal to the length of OA and another circle with centre A and radius OA. Let P and Q be the points where these two circles intersect. (ii) Draw a circle with centre P and radius PB and another circle with centre Q and radius equal to QB. These circles intersect at B and also at another point which we call R. (In the diagram above, only parts of these circles are shown.) (iii) Draw the circle with centre 0 and radius OR . This is the required circle.
85
Geometrical Con structions
11. The figur e b elow shows how to construct a line segment CD which is parall el to a line AB by usin g a "rusty compass" , that is, one which never alters its radius.
D
B
Wh y is CD par allel to A B ? (For more on this fascinating topic see Ch ap ter 17 of [MG].)
5.2 5.2.1
Products, Quotients, Square Roots Constructing a product.
Purpose. Given line segments of lengths 0' and {3, to construct a line segment of length 0'{3 .
B
o
x
Method. (i) Draw two lines which intersect in a single point O. (ii) On one of these lines, construc t a point A such that the length of OA is 0' . (iii) On the ot her line construct a p oint U such that the length of OU is 1, and a po int B such that the length of OB is {3.
Famous Impossibilities
86
(iv) Construct (as in 5.1.7 above) a line through B parallel to U A and meeting the line OA, extended if necessary, at X.
Result. The line segment OX has length 0:{3. Proof. Let x be the length of OX . Because the triangles a X B are similar, x {3 = 0: 1 and so x = 0:{3. 5.2.2
a AU and
•
Constructing a quotient.
Purpose. Given line segments of lengths 0' and {3 line segment of length 0'/{3 .
=I 0,
to construct a
U
o
.A
X
Method. (i), (ii) and (iii) are as in 5.2.1. (iv) Construct (as in 5.1.7) a line through U parallel to BA and meeting OA (extended, if necessary) at X. Result.
The line segment OX has letigtl: 0'/{3.
Proof.
Let x be the length of OX. By similar triangles,
x
1
= {3 0' and so
•
87
Geometrical Constructions
5.2.3
Constructing square roots.
Purpose. Given a line segment of lengtli segment of lengtli
va.
0'
> 0, to construct a
x
U"-L-_~-'-------------1"'"'A
...- 1 ~)"""!-------O' -----~
Method. (i) Draw a line segment OA of length U so that 0 U has length 1.
0'
and extend it backwards to
(ii) Bisect the line segment UA (as in 5.1.1) and construct a semicircle with the midpoint as centre and radius half the length of UA . (iii) Erect a perpendicular to U A at 0 (as in 5.1.5) which intersects the semicircle at a point X , say.
Result. The line segment OX ues leugtu
va.
Proof. Let x be the leng th of OX. The angle UXA is a right angle, being the angle subtended by a diameter of the circle at a point on the circumference. Hence th e triangles UOX and XOA are similar. It follows that x
1
which gives :1:
0'
= :1:
=)0:.
•
Exercises 5.2 1. (a) St art with a line segment which you define to have length 1 (a segment about 2 ern long should be ideal) and construct, with straightedge and compass, line segments of length
(i) 4,
(ii) 1/3,
(iii) 7/5.
88
Famous Impossibilities
(b) Would you now revise your answer to Exercises 5.1 #2(b)? (c) For which positive rational numbers r can you construct (with straightedge and compass) a line segment of length r? 2. Start with a line segment you define to have length 1 and then construct, with straightedge and compass, segments of length (i)
vI2,
(ii)
VI + vI2,
(iii)
V2 - vI2,
(iv) ~.
3. Suppose you are given a line segment of unit length. Convince yourself that you could construct (but do not bother to carry out the actual constructions accurately) line segments of lengths (i) V3/7 + 6V5
,
(ii) 3 V7/3
+ 15/7 .
4. In the figure IOAI = lOBI = 1, angle AOB = 36°, r is the length of AB and the angles P AO and P AB are equal. A
r
o B
(a) Assume that you are given line segments of lengths 1 and r . Show how you would use straightedge and compass to construct an angle of 36°. (b) By calculating the sizes of several angles in the figure, show that triangles 0 AB and APB are similar (isosceles) triangles and hence calculate r . (c) Starting from a suitable line segment of unit length, construct (with straightedge and compass) a segment of length r. 5. Use Exercise 4 to construct (with straightedge and compass only) a regular pentagon (figure with 5 equal sides and angles) inscribed inside a circle. (Make the radius of the circle about 4 cm.)
89
Geometrical Constructions
5.3
Rules for Straightedge and Compass Constructions
A fairly precise description of the way the straightedge and compass can (and cannot) be used in performing constructions has already been given in the Introduction and, in more detail, at the start of Section 5.1. The purpose of this section is to spell out in complete detail the very precise rules (as passed down from the Greek geometers) which must be followed in these constructions. In each of these construction problems, a set of points is given at the start, say {Po, PI, .. . ,'p'n}, which we shall call the initial set of points for the construction. Some lines and circles may also be given. Finitely many new points
(and extra lines and circles) may be added, using straightedge and compass, to achieve the desired result. We are not permitted to put points, lines and circles randomly on the page - rather we must base new lines and circles on points given at the start or already constructed. New points are where these lines and circles meet other lines and circles . Thus the precise rules are that, at any stage of the construction process, we may add new points (and lines and circles) to the old ones already in existence by performing an operation of one of the following two types:
5.3.1
Construction Rules.
(i) Draw a line throvqb. two old points nand Pj to get new points where this line, extended if necessary, intersects other lines and circles.
p.,.
•
...--- ... ,,, ,
,, ,
,
\
/
r
90
Famous Impossibilities
(ii) Draw a circle with centre at an old point Pi and radius equal to the distance between two old points Pj and Pk to get new points where this circle intersects other lines and circles.
....
-,
-, \
..-
........ -,
•
Rule (i) tells us very precisely how the straightedge is to be used, while rule (ii) does the same for the compass. These rules tell us the only ways we are allowed to enlarge the set of points, lines and circles. Note that as a special case , P, may be the same as Pi in rule (ii). This permits us to draw a circle with centre at an old point Pi and radius equal to the distance between Pi and another old point Pk •
....
....
.... \
-,
....
"
..-
Thus rule (ii) does allow us to use the compass in the two ways 2(a) and 2(b) described at the start of Section 5.1. Case 2(a) is when Pi = Pj and 2(b) is when Pi and Pj are different. In summary then, in any construction problem, begin with an initial set of points (and lines and circles), and enlarge the set of points using finitely many operations of types (i) and (ii) to achieve the desired result. As you would expect, all of the constructions in Sections 5.1 and 5.2 have been carried out according to these rules. In Example 5.3.2
91
Geom etri cal Construct ions
below, we spell this out for the construction in 5.1.1 and leave you (see Ex ercises 5.3 #1 -3) to check thi s for the rest. of the constructions .
5.3.2 Example. The construction "bisect a line segment" given in 5.1.1 follows t he rul es just given . Proof. The ini ti al set of points for the construction is {A ,B} . Hence, in the notati on above we may pu t Po = A and PI = B . The construction proceeds by the drawing of two circles with centres A and B resp ectively, and radius equal to the distan ce betw een the points A and B. Thus these circles ar e of th e ty pe permitted by rule (ii), and so the new p oints D and E , being points where the circl es intersect , can be obtained by two op erations of typ e (ii). Hence we may put P2 = D and P3 = E . The final stage in the cons truction involves dr awing t he line DE joining these two points t o get the point C where this line intersect s the line AB . Hence C can be obt ained by an operation of type (i), and so we may cho ose P4 = C. The desir ed midp oin t C has thus be en obtaine d from the set of initial points by successi ve operations of the ty pes (i) and (ii) . We conclude this section by re-examining the fam ous construction problems in the light of the construction rul es given above.
5.3.3 Doubling the Cube. We ar e given a cube and asked t o construct a cube with double the volume . \Ve first translate this into the following equivalent pr oblem in two dim ensions. Given two points Po and PI whose distan ce ap ar t is equal to one 'sid e of the origin al cube, we are asked to construct two points Pi and P, whose distance apa rt is exactly {Y2 times the distance between Po and PI-
r,
Po d PI Pi {Y2d The qu estion is whether or not this is possible using operations of ty pes (i) and (ii) only. -
92
Famous Impossibilities
5.3.4 Squaring the Circle. As in 5.3.3, this is equivalent to being given just two points Po and PI (distance apart equal to the radius of the circle) and being asked to construct two points Pi and P, whose distance apart is exactly ..j1i times the distance between Po and Pl .
As always , the question is whether or not this is possible using only finitely many operations of types (i) and (ii). •
5.3.5 Trisecting an Angle. We are given the points Po, PI and P2 , which determine the angle. We are then asked to construct (using only operations of types (i) and (ii)) a point Pi such that angle PiPOPl is exactly one third of angle P2POPl •
Po
• Exercises 5.3 1. Verify that the construction 5.1.2 (transferring a length) can be performed by successive operations of types (i) and (ii) of 5.3.1.
93
Geometrica.l Constructions
2. Verify that the construction 5.1.3 (bisection of an angle) can be performed by successive operations of the types (i) and (ii) . 3. Satisfy yourself that all the constructions in Sections 5.1 and 5.2 can be performed by successive operations of types (i) and (ii).
5.4
Constructible Numbers and Fields
In analysing construction problems, it often helps to introduce a convenient unit of length. For example, in analysing the problem of doubling the cube, a convenient unit of length is the length of one side of the given cube. Doubling the cube (see 5.3.3 above) is then possible if, given points Po and PI one unit apart, we can construct points Pi and Pj whose distance apart is ij2, using only operations of types (i) and (ii). This leads to the following definition. 5.4.1 Definition. Let 1 be a real number with absolute value 11/. Then 1 is said to be constructible if we can construct points Pi and Pj whose distance apart is 111 units by: starting from an initial set of points {Po, PI} whose distance apart is 1 unit and then performing a finite number of operations of types (i) and (ii) of 5.3.1. • Clearly, doubling the cube is possible if and only if the number ij2 is constructible, while squaring the circle is possible if and only if V7f is constructible. The following two theorems and the related discussion make it clear that the set of constructible numbers is quite large. They also mark the place where geometry and algebra begin to overlap. 5.4.2 Theorem. [Field with Square Roots Theorem] The set CO N of all constructible numbers is a sllbfield of R Furthermore (a) all rational numbers are in CON, and (b) if 0' E CON and
0'
> 0 tliea
..;a E
CON .
Proof.
~m to show that CON is a subfield ofR; that is, the oper:Jation s
II
~~ _a~dition, subtraction, multiplication and division (except by 0) can be performed without restriction in C ON .
94
Famous Impossibilities
Let a and 13 be in CON. Hence line segments of lengths lal and can be constructed by successive operations of types (i) and (ii) of 5.3.1, starting from a segment of length 1 unit. Because we can transfer lengths (by 5.1.2), it is easy to see that segments of lengths [o + 131and la - 131 can be constructed from the above two segments. (If, for example, the segment AB has length a and segment CD has length 13, we can extend AB and use 5.1.2 to construct a point E so that BE has length 13 and AE has the desired length a + 13.) We can also construct, using operations of types (i) and (ii), segments of lengths la131 and lal131 (if 13 i= 0), by 5.2.1 and 5.2.2. Thus the numbers a + 13, a - 13, a13 and al13 (if 13 i= 0) are all constructible and so are in CON. Hence CON is a field.
1131
We now prove that all rat ional numbers ar e in CON.
Since we are given a segment of length 1, we can construct segments of lengths 1 + 1 = 2, 2 + 1 = 3 and so on, from which we see that we can construct segments of length equal to any positive integer. Hence, by 5.2.2, we can construct a segment of length equal to any desired positive rational number min (with m , n EN) . Because of the use of I/'I in Definition 5.4.1, it follows that all rational numbers (negative as well as positive) are in CON . (Alternatively, the result that Q ~ CON follows from Proposition 1.1.1 and the fact that CON is a field.) Finally, by 5.2.3, if a E CON and a> 0, then ,jO: is constructible.
•
By applying this theorem repeatedly we see that each of the numbers 1+1=2,
5,
2
5'
is constructible. Thus we see that every real number which is obtained from Q by performing successive field operations and taking square roots is constructible. The following theorem expresses this precisely, in the notation of field extensions.
95
Geometrical Constructions
5.4.3 Theorem. [Successive Square Roots Give Constructibles] A real number I is con structible if th ere exist positive real numbers I I , 12 , .. . ,In sucl: th at I I E IF I where IF I = Q , 12 E 1F 2 where 1F 2 = F I ( v0i),
I n E IF n where F n = Fn- I(Vl n-I) , and, finally, I E F n+ 1 where IF n +1 = IF n (.tYn").
Proof. This follows immediately from th e Fi eld with Squ are Root s Theorem . Notice that th ere is a tower of field s
implicit in this theorem. This tower will play an import ant role in the impossibility pr oofs in the next chapte r. In the followin g example, the fields IF I, 1F 2 , . .. and the numbers , 1,,2, . .. relevant to a particular I are indicated.
5.4.4
Example.
Let
VS -
3y'2 I = 5V2 + (1 _ y'2) . Prove that I is cons truc tible by using t he "Successive Square Ro ot s Giv e Constructibles" Theorem.
Proof. r=;;:-:ust prod uce a posi tive integer n and posit ive rea l nUE lbers J ~1.'_~2' . . . , In as in the statement of t he" Successive Square Root s Give Cons t ructi bles" T heorem suc h t hat I E F n + t = F n ( vr;;-).
II
Put
II =2 12 = =
E
IF I
where
IF I
=
Q,
E
1F 2
where
1F 2
=
IF I ( ~) .
E
1F 3
where
1F 3 = 1F 2 ( ..fi2).
S-3V2 S-3~
Hen ce 1 =5 ~ + 1 ..fi2 v0i I I
96
Famous Impossibilities
Thus we have produced positive real numbers 1'1,1'2 which satisfy the hypothesis of the "Successive Square Roots Give Constructibles" Theorem. Hence I' is constructible. In this chapter we have concentrated on showing that there are many different constructible numbers. In the next chapter (where we prove the impossibility of the three famous constructions) we take the opposite point of view and concentrate on showing that there are many real numbers which are not constructible. Indeed, we will prove there that the only constructible numbers are those given by the "Successive Square Roots Give Constructibles" Theorem. That is, we prove that any number which cannot be obtained from Q by performing successive field operations and taking square roots is not constructible. This result, which we call the "All Constructibles Come From Square Roots" Theorem, is the converse of Theorem 5.4.3 .
Exercises 5.4
V5 -
1. Let I' = 3/2 + 3/2. Prove in the following two different ways that I' is constructible: (a) firstly, by applying the Field with Square Roots Theorem, (b) secondly, by applying th e "Successive Square Roots Give Constructibles" Theorem. 2. Show that the zeros in R of the polynomial 2X 4 - 6X 2 - 3 are all constructible real numbers. [Hint. This is a quadratic in X 2 .] 3. Given constructible numbers 0' and /3, describe how you would use 5.1.2 to construct a segment of length 10' + /31 in each of the following cases:
0' and /3 have (ii) 0' and /3 have (i)
the same sign , opposite sign.
4. [This exercise shows that, in some sense, if we could use infinitely many operations of types (i) and (ii), every real number would be constructible.]
Geome trical Constructions
97
(a) Show that every real number can be approximated arbitrarily closely by a rational number. (b) Show that every real number can be approximated arbitrarily closely by a constructible number.
Additional R eading for Chapt er 5 Dasic straightedge and compas s constructions , including bisection of line segments and angles, are given in Chapter 4, Lesson 8 of [IIJ]j proofs that the constructions achieve their intended purpose are includ ed. A discussion of collapsing and modern (or noncollapsing) compas ses is given in Section 4.1 of [lIE]. Th e constructions for sums, differences, products, quoti ents and square roots of const ruct ible numb ers are given in Sect ion 39.1 of [JF ]. Th e rules governing the use of straightedge and compass are explained on pages 7 and 8 of [ER]. Thos e who have previously met the constructions of bisectin g a line segment , bisecting an angle, etc , will notice that our constructions do not use arbi t rary choices. Phrases which are common in elementary geomet ry textbooks such as "draw a suitable arc" or "choose a suitable radius" are absent from our presentation. For a discussion of such ind et ermin at e const ruct ions see page 13 of [lIII]. It is interest ing to not e th at all const ructions that can be performed using st ra ightedge and compass can also be perform ed using a compass alone. It is also tru e t ha t all such constructions can be perform ed using a st raightedge and a single circle, or even an arc of a circle, tog eth er with it s cent re. For a discussion of th is work of J ean Victor Poncelet , J acob Steiner, Lorenzo Mascheroni and Georg Mohr see Chap te r 17 of [MG], and also [AI<] and [JS]. A very different probl em from that considered in th is book is t hat of squaring th e circle using scissors. More precisely, the problem is to cut a circle (or rat her , a disc) int o finit ely many pieces using scissors (cutting along Jordan ar cs) and then to reassemble th e pieces into a square of the same area. It is proved in [LD], however , th at thi s const ruction is impossible. Wheth er th e construction is possible when more weird "pieces" are allowed than t hose obtained by cutt ing with scissors is described in [SW] as an unsolved problem . As mentioned in [RR] this probl em has been solved recent ly with a proof th at the const ruction is possible. The analagous probl em in three (or higher) dimensio ns is solved by th e Ban ach Tar ski (Paradox) Theorem. An exposition of this astonishing th eorem appears in [RF ]. [LD] L. Dubins, M.W . IIirsch and J . Karu sh, "Scissor Congruence" , Israel Journa l of Mathematics, 1 (1963),239-247 . [lIE] II. Eves, A Survey of Geomet ry, Allyn and Dacon, Boston , Massachu setts, 1972. [JF] J .D. Fraleigh ,.Ii First Course in Abstmct A lgebra, 3rd edition, Addi son-Wesley, Reading, Massachusetts, 1982. [R F] R.M. Fren ch, "T he Bana ch-Ta rski Th eorem" , Th e Math ematical Int elligencer, 10 (1988) , 21-28. [MG] M. Gardner, Math ematical Circus, Penguin Dooks, Middl esex, England, 1981.
98
Famous Impossibilities
[EH] E.W. Hobson, Squaring the Circle, Cambridge University Press, 1913; reprinted in Squaring the Circle, and Other Monographs, Chelsea, 1953. [HH] H.P. Hudson, Ruler and Compass, Longmans Green, 1916; reprinted in Squaring the Circle, and Other Monographs, Chelsea , 1953. [I1J] H.R . Jacobs, Geometry, Freeman, San Francisco, California, 1974. [AK] A.N. Kostovskii, Geometrical Constructions Using Compasses Only, Blaisdell, New York, 1961. [RR] R . Ruthen, "Squaring the Circle", Scientific American, 261 (1989), 11-12. [JS] J. Steiner, "Geometrical Constructions with a Ruler Given a Fixed Circle with its Center", Scripta Mathematica, 14 (1948), 187-264. [SWJ S. Wagon, "Circle-Squaring in the Twentieth Century", The Mathematical Intelliqencer, 3 (1981),176-181.
CHAPTER 6
Proofs of the Impossibilities
H/ithin this clinp te: it finally becomes clear w11Y tlie tlitee famous geometrical constructions are impossible. You are now, at long last , in a position to see tlie solutions of problems wliicl: defied tiie world's best taetlieuuiticinsss for over two tlIousand years. Th e key to tlie solutions lies in combining the geometrical ideas from Cheptet 5 witl: th e algebraic ideas from earlier cliep ters . Our first task is to show t1wt th ere are no cons tructible numbers ot her tluu: tho se already found in Cliepte: 5. T11is enables us to S110W tluu. every construc tible number is algebraic over Q and has a degree whicu is a power of 2. Hre are tbex: abl e to sho w tliet, if doubling tbe cube or trisecting an arbit rary angl e were p ossibl e, tliete would luive to exist a constructible number whose degree is not a p ower of 2. This cont radiction mean s tluit we can be certa in thnt these cons truct ions (Problems I and II) are impossibl e. lFe com plete tlie proof of tlie impossibility of squaring the circle (P ro blem III) by showing in Clieptet 7 that 11" is not alge braic over Q.
99
100
Famous Impossibilities
6.1
Non-Constructible Numbers
In Chapter 5 we found that real numbers like
which are obtain ed from elements of Q by successively applying field operations and taking square roots, are all constructible. For the purposes of this chapter, however, it is of the utmost importance to know whether we can go in the reverse direction: can every constructible number be expressed in terms of repeat ed sq7wre roots and field operations, starting from numbers in Q ? The following theorem shows that th e answer is "yes" .
6.1.1 Theorem. [All Constructibles Come From Square Roots] If the real number "( is constructible, tben tliete exist positive real numbers "(1 ,"(2 , ... ,"(n sucl: tluit "(1
E IF I where IF I = Q,
"(2
E F 2 whete 1F 2 = IF I (v0!),
"(n E IF n where IF n = IFn-I(J"(n-t), and, finally, "( E IF n+1 where F n+ 1 = IF n( ~).
Proof. We postpone this rather long proof until Section 6.3. The idea of the proof is that when you intersect lines and circles, th e worst that can happen is that you need extra square roots to describe the coordinates of the points of intersection. The example below shows this happening in a particular case . 6.1.2 Example. Start from two points Po = (0,0) and PI = (1,0), at a distance 1 apart, as in the definition of a constructible number (Definition 5.4 .1). Draw two circles with centres at Po and PI respectively, each with radius POPI . Show that the coordinates of each of the points of intersection, P2 and P3 , of the two circles involve a single square root.
Proofs of the Impossibilities
Proof.
101
The circles have equations
x 2 + y2 = 1, (x - 1)2 + y2 = 1.
(1) (2)
The points of intersection satisfy both equations. Subtracting (1) from (2) gives x = 1/2 and substituting in (1) gives y = ±.;3/2. Thus the two points of intersection are P2 = O '~) and P3 = (~, -~) , which involve nothing worse than a square root, .;3.
•
We now apply the "All Constructibles Come From Square Roots" Theorem to show that every constructible number must be algebraic over Q and must have degree over Q which is a power of 2. This result , which is the key to the impossibility proofs, enables us to b e certain that many numbers are not constructible.
6.1.3 Theorem. [Degree of a Constructible Number Theorem] If a real number "i is constructible, tlieu , is algebraic over Q and deg( "i, Q) = 25 for some integer s 2: o. Proof.
Let "i be constructible and let "iI, . . . , "in be as in the "All Constructibles Come From Square Roots" Theorem. The number .JYi is a zero of the polynomial X 2 - "ii, which is in Fi[X] since "ii E F i. Hence, by Definition 2.3.2, deg(.JYi, F;)
= 1 or
2
Famous Impossibilities
102
and since F i+! = Fi( v0i) it follows from Theorem 3.2.4 that [Fi+! : Fd = 1 or 2
(1~i~n).
Repeated application of the Dimension for a Tower Theorem (Theorem 3.4.3) to the tower Q = F! ~ F 2 ~ F 3 ~ ... ~ F Il+!
shows that
[FIl+! : Q] = [FIl+! : FIl] [FIl : FIl-d . .. [IF 2 : IFd = 2u
,
for some integer u
~
O.
It follows from Theorem 4.4.1 that "'( is algebraic over Q. Also, by considering the tower
Q ~ Q("'() ~ F Il +! we see that deg("'( ,Q) is a factor of [IF Il+! : Q]. Hence
for some integer s
~
•
O.
It should be noted, perhaps with a little surprise, that the converse of the above result is false. A counterexample to the converse is given as Example 13-18 in [WG]; this example describes a specific number "'( which is not constructible but which is algebraic over Q with deg("'(, Q) = 22•
Exercises 6.1 1. Let "'( = 2 +
J3 + .)5.
(a) Prove that "'( is constructible by using the Field with Square Roots Theorem (Theorem 5.4.2). (b) Prove that "'( is constructible by using the "Successive Square Roots Give Constructibles" Theorem (Theorem 5.4.3). [Hint. Show how to choose
"'(I, "'(2,
etc.]
(c) What does the Degree of a Constructible Number Theorem now tell you about the degree of "'(?
Proofs of the Impossibilities
103
2.
Combine the "Successive Square Roots Give Constructibles" and the "All Constructibles Come From Square Roots" theorems into one theorem.
3.
Use the Degree of a Constructible Number Theorem and the fact that irr( ij5, Q) = X 3 - 5 to prove that
4.
.ys is not constructible.
(a) Use the Degree of a Constructible Number Theorem to prove that the number 1 + .ys is not constructible. Include a full proof that deg(l + .ys, Q) is what you claim it to be . (b) Use the result of Exercise 3 and the fact that the constructible numbers form a field to give an alternative proof that the number 1 + .ys is not constructible.
5.
In Example 6.1.2 in the text, let P4 be the point of intersection of the line segments POPl and P2P3 . (a) Write down the equation of the circle with centre P2 and radius P2P4 • (b) Show that this new circle intersects the two circles already constructed in four points all of whose coordinates lie in the field Q( J3)( /39).
6. ** Show that the the set of all constructible numbers is countably infinite. [Hint . Recall that a set S is said to be countably infinite if there exists a function f : 1\1 ~ S which is one-to-one and onto.]
6.2
The Three Constructions are Impossible
At long last, we are in a position to see why the constructions in the Introduction are impossible. In each case we shall show that if the construction were possible, then there would exist a constructible number which is not algebraic or whose degree over Q is not a power of 2. Because the existence of such a constructible number would contradict the Degree of a Constructible Number Theorem (Theorem 6.1.3), we can be certain that each construction is impossible.
Famous Impossibilities
104
6.2.1
Problem I - Doubling the cube.
A cube whose volume is 2 must have sides whose lengths are ij2. Hence doubling the cube amounts to constructing a line segment whose length is ij2 (starting from a line segment of length 1 and using only straightedge and compass) . Thus if the cube could be doubled, then ij2 would be a constructible number. By Example 4.3.3, however, irr(~, Q) = X 3
and so deg(~, Q)
-
2
= 3,
which is not a power of 2. This shows that ij2 is not a constructible number, and so the cube cannot be doubled.
6.2.2
Problem II - Trisecting an arbitrary angle.
If we could trisect every angle then, in particular, we could trisect an angle of 60°. But since 60° is an angle which can be constructed (see Section 5.1), it follows that we could then construct an angle of 20° also. Hence, using a right-angled triangle (see Exercises 6.2 #2 below), we could also construct a line segment of length cos(200). It is easy to show, however , that the irreducible polynomial of cos(200) over Q is a cubic (sec Exercises 6.2 #3 below for the details) . Hence deg(cos(200), Q) = 3
and so cos(200) is not constructible, as 3 is not a power of 2. Thus it is not possible to construct an angle of 20°. You should note that the argument just given shows that there is one angle (namely 60°) which cannot be trisected. But it is easy to deduce from this result (as, for example, in Exercises 6.2 #4 below) that there are infinitely many angles which cannot be trisected. Note also that the argument given above that the angle 60° cannot be trisected, relics crucially on the fact that we can construct an angle of 60°. Indeed, the argument given cannot be used to decide whether it is possible to trisect other angles (such as 10°) which cannot themselves be constructed (starting just from a pair of points Po and Pd.
Proofs of the Impossibilities
6.2.3
105
Problem III - Squaring the circle.
A circle of radius 1 has area equal to 7r square units, and so a square with the same area would have sides of length ft. Thus if squaring the circle could be done with straightedge and compass, it would follow that ft is constructible. This would mean (by Theorem 6.1.3) that ft is algebraic over Q, and hence that 7r = ft.ft is algebraic over Q by Theorem 4.4.4. We complete the proof of the impossibility of squaring the circle in Chapter 7 in which we show that 7r is transcendental (that is , not algebraic over Q).
6.2.4
Other constructions which are impossible.
In [CR] it is shown that not every angle can be divided into five equal parts, with straightedge and compass. As a generalization, we indicate a proof below (in Exercises 6.2 #7 and 8) that, for any given positive integer n which is not a power of 2, it is impossible to divide an arbitrary angle into n equal parts. Although there are proofs for the impossibility of trisecting angles which do not involve the full algebraic machinery developed in this book, it is remarked in [CR] that such proofs are not so easy to adapt to proving further impossibilities such as that of quintisecting an arbitrary angle.
Exercises 6.2 1. Decide in each case whether the number is constructible and give reasons . You may assume .;Y2 is not constructible.
V2
(i) + ..;2, (iv) ..;2 .;Y2,
(ii)
vr2,
(v)
..;2 + .;Y2.
(iii) ij2,
2. (a) Assume that an angle fJ has been constructed. Show how you would construct lengths sin fJ and cos fJ from it, using straightedge and compass. (b) Assume, conversely, that a length sin fJ has been constructed. (i) Show algebra ically that cos fJ is then constructible. [Hint. Use the Field with Square Roots Theorem.] (ii) Show geometrically how to construct cos fJ.
Famous Impossibilities
106
(iii) Show how you would use straightedge and com pass to construct the angle 0 (0 < 0 < ~) from sin 0 and cos O. 3.
(a) Use the identity cos(30) = -3 cos 0 + 4 cos 3 0 to show that cos(200) is a zero of the polynomial 8X 3 - 6X - 1. (b) Deduce that cos (20°) is not a constructible number. Which part of Exercise 2 now shows that an angle of 20° cannot be constructed' with straightedge and compass? (c) Which angle have you proved cannot be trisected?
4.
(a) Show that an angle of 30° cannot be trisected . [Hint . Note that an angle of 60° cannot be trisected.] (b) Repeat for an angle of 15°. (c) Show that there are infinitely many angles which cannot be trisected.
5.
(a) Let 0: E IR be a zero of the polynomial p(X) = X3 in Q[X]. Prove each of the following :
+ 2X + 1
(i) p(X) is irreducible over Q. (ii) p(X) = irr(o: , Q). (iii)
0:
is not constructible.
(b) Can we assert that 1 - (\' + (\'2 6.
=I O?
Why?
Let 0: and {3 E IR. In each case decide whether the sentence is true or false. Give reasons. (a) If 0: and {3 are constructible then so is 0: + (3. (b) If 0:
+ {3 is constructible then so are (\' and (3.
(c) If 0:{3 is constructible then so are 0: and (3. (d) If 0' + {3 and 0'{3 are both constructible then so are 0: and {3. [Hint for (d). Can you express 0'{3?]
0:
and {3 in terms of 0' + (3 and
7.* In this exercise you will prove the Cos(nO)-formula: For each prime number n > 2 there are integers ai, a3, .. . , a n-2, an such that
cos(nO) =
al
cosO + a3 cos30
+ ... + a n-2 cosn- 20 + an cos no
for all 0 E IR, where an = 2n - 1 and tile remaining coefficients a1, a3, ... ,a n-2 all have n as a factor.
107
Proofs of the Impossibilities
(a) Using the De Moivre Theorem (see [WL]) and the Binomial Theorem show that, if we let c = cos 0, then cos (nO) = c" + G)cn- 2(c2 - 1) + (~)cn-4(c2 - 1)2 + ...+ (n~1)c(c2 - 1)~. The following parts of this exercise assume that the right hand side of this formula is rearranged to give a polynomial in c. (b) Show that the coefficient of c" in this polynomial is
and then use the Binomial Theorem to show that this equals
~((1 + It + (1 + (-1))")
= 2"-1.
(c) By evaluating both sides of the equation in (a) at suitable 0, show that the constant term of the polynomial is zero. (d) State why the prime n is a factor of C) for 1 ~ r ~ n - 1. Deduce that the coefficients of the remaining powers of c in the polynomial in part (a) all have n as a factor. 8.* Eisenstein's Irreducibility Criterion states : the polynomial
is irreducible over Q if tbete exists a prime number p such that p is not a factor of an, but p is a factor of aD, .. . , an-I, and p2 is not a factor of aD. (For a proof see [JF], Theorem 31.4.)
(a) Let n > 2 be a prime number. Choose 0 so that cos(nO) = n:1 and then deduce from the Cos( nO)-formula and the Eisenstein Irreducibility Criterion that deg( cos 0, Q) = n. (b) Deduce that, for each positive integer n which is not a power of 2, there exists an angle which cannot be divided into n equal parts via straightedge and compass constructions.
108
6.3
Famous Impossibilities
Proving the "All Constructibles Come From Square Roots" Theorem
In order to describe the points (and the associated lines and circles) which appear when the operations (i) and (ii) of 5.3.1 are applied, we work with the coordinates of these points. To do so, we introduce the following definitions.
6.3.1
Definitions.
Let F be a subfield of R.
A point is an F-point if both of its coordinates are in F. A line is an F-line if it passes through two F-points. A circle is an F-circle if its centre is an F-point and its radius is the distance between two F-points. -
6.3.2
Example.
(a) (2,1) is a Q-point, because both its coordinates belong to Q. (b) (1, V2) is a Q( V2)-point. (c) {(x,y) : y = x} is a Q-line, because it contains the two Q-points (0,0) and (1,1) . (d) {(x,y): y = V2x} is a Q(V2)-line but not a Q-line. (e) {(x , y) : x 2+ y2 = 4} is a Q-circle, because it has the Q-point (0, 0) as its centre while its radius is the distance between the Q-points (0, 0) and (0,2). _ (f) {(x, y) : x 2 + y2 = 2} is a Q( V2)-circle. If we intersect a Q-line with a Q-circle, do we always get a Q-point? The following example shows that the answer is "no" .
6.3.3 Example. The Q-line {(x , y) : y = x } intersects the Q-circle {(x, '!J) : x2 + y2 = 4} at the two points (V2, 12") and (-12", - 12"). These points are not Q-points. However, they are Q(12")-points. -
Proofs of the Impossibilities
109
6.3.4 Remarks. Assume that we are performing a construction as described in Section 5.3 and that we have already obtained points Po, PI , .. . .Pi, Let F be a subfield of R containing the coordinates of all the points Po, Ph ... , Pk so that these points are all F-points (from Definition 6.3.1). If we perform an operation of type (i) or (ii) of 5.3.1, we might obtain several new points Pk+l, ... , Pk+t . The lines and circles which appear in (i) and (ii) are F-lines and F-circles, and so each of the new points obtained is at the intersection of (1) two F-lines,
or
(2)
an F-line and an F-circle,
or
(3)
two F-circles.
The next lemma tells us that, at worst, the coordinates of these new points involve extra square roots. -
Let F be a subfield of R. (a) If two F-lines intersect in a single point, then the point is an F-point. (b) Given an F-line and an F-circle, there exists a positive number Q' E F such that the points of intersection (if any) of this line and circle are F( val-points. (c) Given two F-circles, there exists a positive number Q' E F such that any points of intersection of these two circles are F( val-points. 6.3.5
Proof.
Lemma.
Note that F-lines have equations of the form
dx
+ ey + f
= 0
(1)
for d, e, f in F , while F-circles have equations of the form x
2 + y2 + px + qy + r = 0
(2)
for p, q, r in F . (a) To find where two F-lines meet, we solve two simultaneous equations of the form (1). This can be done just using field operations +, -, ., /, so the solutions are both in F .
110
Famous Impossibilities
(b) Finding where a line and a circle meet amounts to solving two equations, one of the form (1) and the other like (2). This is easily done by using the - b ± Vb 2 - 4ac 2a
formula for solving the quadratic ax 2 + bx + c == O. Since, at worst, square roots are introduced, we obtain F( yIa)-points for some positive 0' E F. (c) The two F-circles X2 + Y2 + PIX + qIY 2 2 x + y + P2X + q2Y
+ rl = + r2 =
0, 0
meet where the first of thes e and the F-line
meet. So this case follows from (b).
•
With these preliminaries out of the way, we can now prove the "All Constructibles Come From Square Roots " Theorem. Proof of Theorem 6.1.1. Assume that l' is constructible so that, by the definition of constructible number (Definition 5.4.1), there exists a set of points
constructed as in Section 5.3, with 11'1 equal to the distance between two of these points. Because the initial points Po and PI are at a distance 1 unit apart, we can introduce coordinate axes so that Po = (0,0) and PI = (1,0). Note that the coordinates of Po and PI are in Q.
Initially, the points Po , PI are Q-points. Put F I = Q. Assume that for some k satisfying 1 ~ k ~ n - 1, the points Po, Pi, .. . .P; are Fk-points , where F k is a subfield of R.
111
Proofs of the Impos sibilities
It follows from the Construction Rules in 5.3.1 that the next point, Pk+I , will be at the intersection of
(1) a pair of F k-lines, or (2) an Fk-line and an Fk-circle, or (3) a pair of Fk-circles. Hence , by Lemma 6.3.5, Pk+1 is an Fk+l-point where
Since F k
~
F k+ I, this implies further that
Po, PI, ... , Ph Pk + 1 are Fk+I-points.
From the boxed statements, it now follows by mathematical induction on k that the numbers 1'1 , . .. , 1'n are as described in Theorem 6.1.1. It also follows that 11'1 , being the distance between two of the points PO,PI , .. . .P; (which are Fn-points) , will be th e square root of an element of F n. Thus, for some 1'n E F n with v« > 0,
• Exercises 6.3 1. [This illustrates case (a) of the proof of Lemma 6.3.5.] Show that the point where the two Q( V2)-lines
V2) x + y + 3 - V2 = 0, 2V2x + (4 + V2) y + 1= °
(1 +
meet is a Q( V2)-point.
112
Famous Impossibilities
2. [T his is the general proof of case (a) of the proof of Lemma 6.3.5.] Consider two IF-lines
+ elY + ft d2x + e2Y + h
dlx
= 0
(1)
= 0
(2)
(with d 1, d2, e r, e2, ft, h E IF) which are not parallel. (i) If ei i= 0, use (1) to get an expression for Y, substitute this into (2) and hence find where the two lines meet. Explain why the intersection point is an IF-point. (ii) If el = 0, show that the lines meet at an IF-point. (iii) In (i), you needed to divide by d1e2-d2 el or something similar. Why can you do this? 3. [This illustrates case (b) of the proof of Lemma 6.3.5.] Find where the Q-circle
x 2 + y2 _ 4x - 2y - 4
=0
and the Q-line
x -y+3=0 meet. For which a is this point of intersection a Q( J(i)-point ? 4. Fill in the details of a general proof of case (b) of the proof of Lemma 6.3.5.
Additional Reading for Chapter 6 Many algebra textbooks contain a proof of th e impossibility of th e three geometrical constructions. You may like to compar e our treatment with those in [DB], [Aej, [JF) , [WGj, [CHI and [LS). A self-contained account of Problem III is given in [EH]. The algebraic machinery developed in this book can also be used to characterize those regular polygons which can be constructed using straightedge and compass. See, for example , Section 48.2 of [JF), Sections 135-138 of [AC] and Sections 2.5 and 2.7 of [CHI. [BB) B. Bold, Famous Problems of Geometry and How to Solve Them , Dover, New York, 1969. [AC] A. Clark, Elements of Abstract Algebra, Wadsworth , Delmont , California, 1971. [JF] J .B. Fraleigh , A First Course in Abstract Algebm , 3rd edition, Addison-Wesley, Reading, Massachusetts, 1982.
Proofs of the Impossibilities
113
[WGj W.J . Gilbert , Modern Algebm with Applications, Wiley, New York, 1976. [CHj C.R . Hadlock, Field theory and its Classical Problems, Carus Mathematical Monographs, No. 19, Mathematical Association of America , 1978. [EH] E.W . Hobson, Squaring the Circle, Cambridge University Press, 1913; reprinted in Squaring the Circle, and Other Monogmphs , Chelsea, 1953. [WLj W. Ledermann, Complex Numbers, Library of Mathematics, Routledge and Kegan Paul , London, 1965. [LS] L.W. Shapiro, Introduction to Abstmct Algebm, McGraw-Hill, New York, 1975.
CHAPTER 7 Transcendence of e and
7r
Th e main purpose of this c1lapter is to prove tllat th e number 7r is transcend ental, th ereby com pleting th e proof of the impossibility of squaring th e circle (Probl em III of th e Introdu ction). We first give th e proof the: e is a transcendental number, wlii cu is somewha t easier. This is of considerable interest ill it s own right, and its proof in troduces m an y of the ideas which will be used in the proof for n . "Vith th e aid of some more algebra - tile theory of sym metric p olyn omials - we can then m odify the proof for e to give the pro of for tt . Th e proof for e uses elementary facts a bout in tegration. ( We expect yo u are familiar wit h tilesc facts from a co urse on calculus.) Th e proof for 7r uses in tegration of com plex -valued functions, whicb we in trodu ce in this chapter.
115
116
7.1
Famous Impossibilities
Preliminaries
This section contains the preliminary results which are needed for our proofs that e and 7r are transcendental numbers. The extra preliminaries which are used only in the proof for 7r will be given later.
Factors and Prime Numbers Let a and b be integers. Recall that b is said to be a factor of a if there is an integer e such that a = be. The integer 0 is exceptional in that it has every integer as a factor (if a = 0 then every b satisfies a = be for e = 0) . Another way to say this is that if a is an integer which fails to have some integer b as a factor, then a -I O.
This statement provides a way of proving that an integer is not zero, and it will play an important role in our proof. This leads us to consider ways in which we can show that a given integer fails to have some other given integer as a factor. One consequence of the definition of a factor is that if a is nonzero and has b as a factor then lal ~ Ibl. Hence if b
> lal
and a
-I 0,
then b cannot be a factor of a.
Another consequence of the definition of a factor is that if an integer b is a factor of two integers, then it is also a factor of their sum, their difference and their product . Hence if b is a factor of an integer al bui. b is not a factor of an integer a2, then b is not a factor of al + a2
as otherwise b would be a factor of the difference (al + a2) - al = a2. Recall that an integer P is said to be a prime if p > 1 and if p has no positive factors other than 1 and p itself. The Fundamental Theorem of Arithmetic is stated in Exercises 1.4 #9. It can be shown from this theorem that if a prime number P is a factor of the product ab of two integers a and b, then p must be a factor of a or a factor of b (or a factor of both a and b). From this it is easy to deduce another useful test for failure to be a factor of a product:
117
Transcen dence
if a prime P is not a fa ctor of any of the in tegers , am th en it is no t a factor of th eir produ ct
aI , a 2, a laz
a m'
Another fact whi ch we shall need later is given by the following proposition, whose proof was known to Euclid.
7.1.1
Proposition.
Th er e are infini tely many prime numbers.
Proof. To prove this we note that there is at lea st one prime number (2 - for exam ple) and we shall prove that given any finit e se t of prime numbers (1) ther e is always another prime numb er whi ch is not in the se t . To this end conside r t he integer K
= PIP2 .. ' PlI + 1.
(2)
Each Pi is a factor of t he first te rm on t he right-hand side of (2), but not of the sec ond te rm, 1. Hen ce Pi is no t a factor of K. But K do es hav e a prime factor q by t he Fundamen tal Theorem of Ari thmetic. Thus ther e is a prime q whi ch is not in t he se t (1), as we wish ed to show. • This means t hat, given any integer m , no m a tter how large, we can b e sure t ha t ther e is a prime (indee d, infin itely many primes) larger t han m . We use this fact frequently in this chapter.
Calculus We expec t that all of our readers will have met the following properties of integrals in a calculus course. (Proofs can b e found in many texts: see for example Theorems 5 and 6 in Chapter 13 of [MS].)
7.1.2 Theorem. and let a, bE IR .
Let f ,9 end h-i , . . . .h.; be functi ons from IR to IR
(i) [Linearity of Integra ti on] If d l , . . . , dll E IR and if each of th e following int egrels exists , then
Famous Impossibilities
118
(ii) [Integration by Parts] If f and 9 are functions sucii that each of the following integrals exists, then
l f(x)g'(x)dx = f(b)g(b) - f(a)g(a) -l J'(x)g(x)dx.
•
We shall use these results only in the cases where the functions are polynomial or exponential functions , or products thereof; the integrals of such functions always exist. The proof of the next result depends on the idea of the degree of a nonzero polynomial form, which was explained in Chapter 1. 7.1.3 Lemma. Let g(X) and h(X) be polynomials whose coefficients are real numbers. If
g(x)
= h(x)e- X
for all x E IR
(3)
then g(X) and h(X) are both equal to the zero polynomial. Proof. It follows from (3) that h(:r) both sides gives
= g(x)e
X
and so differentiating
for all x E R. If we multiply both sides by g(x) and then replace the factor eXg(x) on the right-hand side by h(x), we find that
h'(x)g(x) = (g(x) + g'(x) )eXg(x) = g(x)h(x)
+ g'(x)h(x)
for all x E II\t
Rearranging this equation and interpreting it in terms of polynomial forms gives (4) g(X)h(X) = h'(X)g(X) - g'(X)h(X). If either of the polynomials g(X) or h(X) is zero, then so is the other by (3); hence it only remains to prove the lemma in the case where both g(X) and h(X) are nonzero polynomials. Now the degree of g'(X) is either undefined or is one less than the degree of g(X), and similarly the degree of h'(X) is either undefined or is one less than the degree of h(X). Hence the equation (4) gives a contradiction because the degree of its right side either is undefined or is less than the degree of g(X)h(X). Thus the case in which both of the polynomials g(X) and h(X) are nonzero cannot occur. •
Transcendence
119
The following lemma concerning the limit of a sequence will also be needed. 7.1.4
Lemma.
For eecli real number
C
Proof. Let m be an integer such that m integers n 2: m, C
-=- -
C
C
Cn
2: 0,
>
lim - = O.
n~oo
1 and m
C C
<--
- 1 2
1
C
1
2: 2c. For all
C
1 2 "'m-1 mm+1"'-;
n!
n!
1
''' m- 1 2 2 ''' 2
d
=
C · t 1le constant -cc- . . . - w1lere d IS . 1 2 m -1
Since
d
+1 2n- m
approaches 0 as n approaches
00,
~
so does " n.
•
Expanding a polynomial in powers of (X - r) Let g(X) be a polynomial over a field so that g (..,,r) ," =
Co
+ CI"'''" + C2"'''"v2 + ... + C V
vm
m "'''"
(5)
for some integer 111, 2: 0 and coefficients Co, CI, C2, . . • , Cm in the field. It is often very useful to know that, given r in the field, we can express g(X) in the form
where the coefficients bo, bl, b2 , • • • .b.; are also in the field . Whereas (5) expresses the polynomial as a linear combination of powers of X, (6) expresses it as a linear combination of the powers of (X - r) and is called the expansion of g(X) in powers of (X - r}, A general procedure for expanding a polynomial in powers of (X - r ) is to expand each monomial Xi in powers of (X - 1') using the Binomial Theorem, and then collect like terms . 'Ve illustrate this procedure in the next example. Note that the coefficients b, in (6) turn out to be polynomials di(X) evaluated at r; that is, b, = di (I').
120
Famous Impossibilities
7.1.5
Example.
To expand the polynomial (over IR)
g( X) = 3 + 4X + 7X 2 + X 3 in powers of (X - 1'), we write
g( X) = g(X - r + r ) = 3 + 4[(X - 1') + 1'] + 7[(X - 1') + 1']2 + [(X - 1') + 1']3 = 3 + 4[(X - 1') + 1'] + 7[(X - 1')2 + 21'(X - 1') + 1'2] + [(X - 1')3 + 31'(X - 1')2 + 31· 2(X - 1') + 1'3] = (3 + 41' + 71'2 + 1,3) + (4 + 141' + 31,2)(X - 1') + (7 + 31') (X - 1')2 + (X - 1')3. Thus we have shown that the polynomial g(X) can be expanded in powers of (X - 1') to give .
g(X)
= do(1') + dl(1')(X -
1') + d2(1')(X -
1,)2
+ d3(1')(X -
1')3
where do(1'), dl(1'), d2(1'), and d3(1') are the values at r of polynomials do(X), . . . ,d3(X) given by do(X) = 3 + 4X + 7X 2 + X 3, dl(X) = 4 + 14X + 3X 2,
d2(X) = 7 + 3X, d3(X) = 1.
In particular we have, as in (6),
g(X) = bo + bl(X - 1') + b2(X where bo = do(1'),
+ b3(X b, = dl(1'), b2 = d2(1') , and b3 = d3(1') . 1,)2
1')3
-
The general result , whose proof is obvious from the above example, is as follows.
7.1.6 Proposition. If g(X) is a polynomial of degree at most n, whose coefficients are integers, tlwn there exist polynomials do(.Y), ... ,dn(.Y)
with integer coefficients and degree at most n such tllat, for all r E IR, g(X) = do(1')
+ dl(1')(X -
1')
+ ... + dn(1')(X -
r}".
-
Note that the same polynomials do(X), ... ,dn(X) work for all values of 1'.
Transcendence
121
The following result confirms that when two polynomials in (X - r) are equal, we can equate coefficients (just as we can for ordinary polynomials, which is the case I' = 0), 7.1.7
Lemma.
Co + CI(X - r)
Let r E IR and bi.c, E R for i E {O, 1, . . . , n }. If
+ ...+ c,,(X -
r)"
= bo + bl(X -
r)
+ ...+ b,,(X -
r)" ,
then
Co = bo, CI = bl , Proof.
w-
... ,
C" = b".
have
Co + CI(X - I')
+ ...+ cll(x -
r)" = bo + bl(x - r)
+ ...+ b,,(x -
r)"
for all x E IR. Substituting:r = r gives Co = boo Differentiating with respect to x and substituting x = I ' then gives CI = b-: Repeating this a further n - 1 times gives the required result. Approximating Real Numbers by Rational Numbers
That we can approximate any real number a by rational numbers is well-known. We say that the set of rational numbers Q is a dense subset of the real line IR; that is, given any interval containing a , there is a rational number which lies in that interval. More formally this means that, given any real number a and a "small" number C > 0, we can find integers 111 and All and a number CI such that Icd < C and MI a= M +CI ·
(7)
This says that there is a rational number MdM at a distance of Icd < C from the given real number a. For our purposes, a more relevant type of approximation, is suggested by the following question: Given a "sm all " number e, can we find integers 1\1 and 1\II and a nonzero number CI with ICII < C such that (8)
Thus we are asking not only that the distance from a to MIl M be small, but also that it be small even when multiplied by M, the denominator of the approximating fraction.
122
Famous Impossibilities
7.1.8 Example. It is easy to see that the number e can be approximated as in (8). To see this first note that the well-known Taylor series for the exponential function gives 1
1
1
1
e = 1 + -I'. +. 2' + ... + ,+ (n+. 1)' n. For each integer n
~
+ ...
1, let 1
1
1
An = 1 + -1'. +. 2' + ... + ,. n. Hence
e = An
1
[1
1
[
1
+ (n + I)! 1 + n + 2 + (n + 2) (n + 3) + ...
~ An + (n
1
1
+ 1)1 1 + 2 + 22 + ...
]
]
-A 2 n+(n+1)! Given e > 0 choose n to be any integer> 2/6, so that 2/n < c. Put M = n! and M; = (n!)A n. You should now be able to see how to choose Cl so that (8) holds. We leave the details as Exercises 7.1 #8.• Can all real numbers a be approximated as in (8) by rationals? That the answer to this question is NO is clear from the following proposition.
7.1.9 Proposition. Let a be a real number with the following property: for eecli real number C > 0 there are integers M and M l , and a nonzero real number Cl witl: Icd < e such that
M, +c1
a=---
Ai
Then the real number a is irrational. Proof. Suppose, contrary to the conclusion of the proposition, that a is a rational number, say a = p/q where p and q are integers and
s > O.
Let e = 1/(2q). By the hypothesis of the proposition there exist integers M and kI1 , and a nonzero real number C1 with IC11 < c such that p M, + C1 = and hence pA1 - q.M1 = qe-: q M The left hand side of the last equation is an integer, while 0 < Iqcd < ~ . This contradiction shows that a cannot be rational. •
123
Transcendence
From Example 7.1.8 and Proposition 7.1.9 it follows that e is an irrational number. Note also that requiring Cl to be nonzero is crucial in the proof of the above proposition. By requiring Cl to be nonzero, we are not allowing the rational approximation MdM to equal the number a. A very useful idea which this proposition suggests is that the "better" a real number can be approximated by rational numbers (not equal to it), the "worse" that real number must be. Here we are thinking of the rational numbers as "good", the irrational numbers which are algebraic as "bad", and the irrational numbers which are not algebraic as "worse" . We shall follow up this idea in the next section.
Exercises 7.1 1. Let n be a positive integer. Verify that each of the following numbers has (n - I)! as a factor.
(i) n!.
(ii) n! + (n
+ I)! + ... + (n + m)! where m
is a positive integer.
(iii) (2n)!. 2. Prove, from the definition of factor, that if an integer b is a factor of another integer a which is not zero, then lal ;::: Ibl.
3. Prove, from the definition of factor, that if an integer b is a factor of each of the integers al and a2 then b is also a factor of the integers al + a2, al - a2, and ala2. Is b always a factor of aJ/a2?
> n. In each of the following cases state whether the integer always has tri as a factor. (Justify your answers.) (i) mn - n.
4. Let m and n be integers with m
(ii) nm - m. 5. Let p be a prime number and let m be a positive integer such that m < p. Show that p is not a factor of
(i)m(m-l); (ii) m! ; (iii) (m!)P.
Famous Impossibilities
124
6. Let f(x) = ao+alx+ . . .+anx n be a polynomial function with real coefficients . The result of differentiating this polynomial i times in succession is denoted by f(i). It is called the i-th derivative of the polynomial f. Show that f(n+I)(x) = 0 for all x E R. 7. (a) Express f(X) = 2-4X +3X 2-7X3 as a polynomial in (X -r) . Write down the formulae for the polynomials do(X), . . . , d3(X) in Proposition 7.1.6. (b) Repeat (a) with f(X) = co+clX + C2X2+C3X3+C4X4 where Co,· • . , C4 E I. 8. Complete th e details of Example 7.1.8. Why is M 1 an integer? 9. (a) Show that the rational number 3/7 can be approximated as in (7) of Section 7.1. [Hint. What can M 1 and M be if E: = 1/100?] (b) Show that every rational number can be approximated as in (7) of Section 7.1. 10. Would Proposition 7.1.9 be true if the word "nonzero" were omitted? (Justify your answer.) 11. Prove that each of the numbers cos(1) and sin( 1) is irrational by using the Taylor series for the functions cos and sin.
7.2
e is Transcendental
An important property of the number e is that, if g(x) = eX for all x E JR , then l( x) = g(x) for all x. Indeed this fact distinguishes the number e from all other positive real numbers (as shown in Exercises 7.2 #1 below) . We shall also use the fact that eX+Y = eXeY for all real numbers x and y (a property of e which is shared by all positive reals). Our proof that the number e is transcendental (see Definition 2.1.5) shall begin by supposing, on the contrary, that e is algebraic over Q. We let
125
Transcendence
denote irr(e,Q) , the irreducible polynomial of e over Q. Then we have m 2: 1 and a; E Q for each i , and Now clearly ao i= 0 as otherwise X would be a factor of the polynomial t(X), which is irreducible (and which is not just X since e i= 0). If we now multiply (*) by the product of the denominators of the rational numbers ao, al , .. . , am-l we get Co + ere + ... + cme m = 0
(1)
where Co, cl, .. . , cm are int egers such that Co i= 0 and Cm i= o. In du e course, we shall derive a contradiction from this.
Idea of Proof: M's and e'S To show that (1) leads to a contradiction, we shall use the idea (mentioned in Section 7.1 ) that the "worse" a number is, the "better" it can be approximated by rationals. As we have seen in Example 7.1.8 and Proposition 7.1.9, the irrationality of e follows very simply from the following property: For each E > 0 there is a nonzero number El with IEll < E and integers M and M, such that e=
Ml+El
M The aim now is to extend this "close approximation" by rationals to each of the remaining powers of e in the equation (1) . The aim is thus: Given E > 0, to show there are numbers El , ... , Em all less than E in absolute value, and integers AI, AIl, ... , AIm sucli that e=
M l + El
AI
2
e =
!vI2 + E2 AI '
m
e =
AIm + Em M .
Substituting these powers of e into (1) and rearranging the terms gives
Because M is an integer and so is each c, and each Mi , the first sum on the left side of (2) is an integer. If we can show this integer is nonzero we shall get the desired contradiction: we can choose each E j so small that the second sum on the left side of (2) has its absolute value less than ~ and hence is unable to cancel out the first sum.
Famous Impossibilities
126
It now remains to show how to find suitable numbers
El, •.• , Em
and M, M 1 , . . • , 111m •
Producing the €'s and the M's from integrals The following lemma shows how an integer (namely k!) arises from an integral involving the exponential function. This lemma therefore suggests the possibility of using integrals to construct the required integers 111,1111, ... ,Mm.
7.2.1 Lemma. For eedi integer k ~ 0 there are polynomials gk(X) and hk(X) with real coefficients such thet, for all r E R,
(a)
= k! - e-rgk(1'), 1')ke-Xdx = hk(1') - e-"k! .
{ xke-xdx {(x -
(b)
Proof. This can be proved by mathematical induction on k, using integration by parts (see Theorem 7.1.2). • Motivated by the above lemma, we shall aim at getting the integers M and M 1 , • • • ,114m from integrals of the form (3)
for r E {I, 2, . . . , m}, where f is a polynomial function. In the following example, we show how to apply the above lemma to such integrals.
7.2.2 Example. Let f(X) = 3 + 4X - 10X 3. By linearity of integration (Theorem 7.1.2), and then Lemma 7.2.1(a), for all r E R,
{ f(x)e-Xdx = {(3 + 4x - lOx3)e-Xdx
1.e- xdx + 4 for xe-xdx - 10 for x 3e- xdx = 3(0! - e-rgo(1')) + 4(l! - e- rgl(1')) - 1O(3! - e- rg3(r)) = (3.0! + 4.1! - 1O.3!) - e- r(3g0(I') + 4g1(1') - lOg3(1')) = 3 for
where go(X), gl,Y) and g3(X) are polynomials. Hence
{ f(x)e-Xdx where M
= 3.0! + 4.1! -
=M
- e-rG(r)
10.3! = -53 and G(X) is the polynomial
G(X) = 3go(X)
+ 4g1(X) -
10g3(X).
•
127
Transcendence
The next lemma generalizes this example. 7.2.3 Lemma. Given a polynomial f(X) with real coefficients, there is a unique real number M and a unique polynomial G(X) with real coefficients sucl: tluu, for all r E R, { f(x)e- x d.1: = M - e-rG(r). Indeed M and G(X) are unique in tlie strong sense that, if PI (X) and P2(X) are polynomials witli real coefficients sucli that { f(x)e-Xdx = P1(r) - e- rP2 (r ) for a117' E R, then P 2(X) a constant polynomial).
= G(X)
And P1(X)
=M
(so that PI(X) is
Proof. The existence of M and G(X) follows as in Example 7.2.2 by expanding f(X) in powers of X, using linearity of integration (Theorem 7.1.2(i)) and then applying Lemma 7.2.1(a) . To prove uniqueness, we assume that the two formulae in the statement of the lemma hold. Then, for all 7' E IR ,
AI - e-rG(r)
= PI(r) -
e- r P2(r)
and hence M -P1(7') = e- I ' ( G(r)-P2(7' )) . It follows from Lemma 7.1.3 that M - PI(X) and G(X) - P2(X) must both be the zero polynomial, which gives PI(X) = M and P2(X) = G(X) , as required. We now use this lemma to define the numbers we need for the transcendence proof. 7.2.4
Definition.
In Lemma 7.2.3, choose
f to be the polynomial
f(X) = (XP-\,(X - l)P(X - 2)1' . .. (X - m)p p-1.
(4)
where p is a prime number (which we shall later take to be rather large) and m is the integer occurring in (1) . Further, choose (i) M and G(X) to be the number and polynomial given by Lemma 7.2.3, (ii) M; = G(r) for r E {I, 2, .. . , m}, and (iii) Cr = e r {f(x)e-Xdx for r E {1,2, ... ,m}.
-
128
Famous Impossibilities
Note that, from (4), the degree n of f(X) is n
= mp+p-1.
(5)
Also, by Proposition 7.1.6, there are polynomials do, ... , dn with integer coefficients and degree at most n such that, for all r E R,
f (X)
= (p
~ I)! (doe r) + dI (r ) (X -
1') + ... + d; (r ) (X - rt) .
(6)
7.2.5 Lemma. With the notation as in (6), if r E {1,2, ... ,m} then doer) = dI(r) = .. . = dp_I(r) = O. Proof. Fix I' E {I, 2, ... , m}. Because (X - r)P is a factor of f(X) by (4), we have
f(X) = (p ~ l)!(X - r)Pg(X) for some polynomial g(X) with integer coefficients. If we expand g(X) in powers of (X -1') (see Proposition 7.1.6) and multiply each resulting term by (X - r)P we see that
f(X) = (p ~ I)! (bp(X - r)p + bp+I(X - r)p+1 + ... + bn(X - rt) (7) for some real numbers bp , • • • .b.; By Lemma 7.1.7 we can equate the coefficients of 1, (X - 1'), ... , (X - 1')" in (7) and (6). Since the coefficients of 1, (X - 1'), . .. , (X - I' )p-I are zero in (7), they must be zero in (6) also . -
Properties of M, M l , .. . , Mill and
El, . . . , Em
We now derive properties of the numbers which were introduced in Definition 7.2.4.
7.2.6 Lemma. (a) The number AI is an integer which does not have the prime p as a factor when p > m. (b) There is a polynomial G I with integer coefficients and degree at most n such that G(l') = p G I (l') for r E {I, 2, . .. , m}. In particular, M I , . . . , A!m are a11 integers witli the prime P as a factor . Proof. (a) The polynomial f(X) defined in (4) can be expanded in powers of X so that we can write
f(X)
= (p ~ I)! (ap_IXP-I + apXP + ...+ a"X")
(8)
129
Tr anscend ence
where ap_I, . . . , an are integers with a p- l = ±(m!)p =f O. By the linearity of integration (Theorem 7.1.2(i) ), it follows from (8) that
{ f (x)e-Xdx
_- Jor (p _1 I)! ( a _ l x + apx +. . . + anx n) e - xdx p
1'- 1
I'
= (p _1 I)! ( a p- l Jot' x 1'-1 e-Xdx
+ a pJot' x I' e-xd x + ... + an Jorr x n e-xdx ).
By Lemma 7.2.1(a), each of the integrals in the above sum is the difference of two terms, the second of which involves e- r . Hence , by the method used in Example 7.2.2, { f(x)e- Xdx is the difference of two sums, the first of which is
(p ~ I)! (a p- 1(p - I)! + app!
+ .. .+ ann!)
whil e the second is of the form e-rx polynomial in r. By (i) of Definition 7.2.4 and th e uniqueness part of Lemma 7.2.3 , M is this first term. Henc e, dividing (p - I)! into each term in the above sum gives M = a p - l + app+ + ann(n -1 ) .. . (p + l)p. Sin ce the coefficients ap - l , aI" , an ar e all int egers, the number M is an integer. Also the prime p is a factor of every te rm in the above sum except the first term a p- l = ± (m! )P , which cannot cont ain p as a factor when p > m (by Ex ercises 7.1 #5); hence p cannot be a factor of M when p > m. (b) [Here we use the expansion of f(X) in powers of X - r , as given in (6) above.] By the linearity of integration , it follows from (6) that
{ f(x)e-Xdx
= (p ~ I)! (d O(7') {
e- xd.7.: + d 1(r) {( x - r)e-Xdx +
... + dn(r) {(;r - 7't e-xdx) . By Lemma 7.2.1 (b) , each of the integrals in th e above sum is the difference of two terms, th e second of which involves e- r . Hence for f (x) e-xdx is th e difference of two sums. The second of th ese is
(pe~"l) ! (dO(7' ) + d1(r )1! + ...+ dn(r )n!)
(9)
130
Famous Impossibilities
and the first is a polynomial in r with real coefficients. By the uniqueness part of Lemma 7.2.3, G(r) is equal to the polynomial part of (9) ; that is,
In the special cases where r E {I, 2, ... , m}, we know from Lemma 7.2.5 that do(r) = d1(r) = .. . = dp_1(r) = O. Hence G(r) = (p ~ I)! (dp(r)p! + ...+ dn(1')n!). If we divide (p - I)! into each term we see that G(1') = pG1(r) where
G1(X) = dp(X)
+ dp+1(X)(p + 1) + ... + d,,(X)n(n -
1) .. . (p + 1).
Notice that G1(X) is a polynomial with integer coefficients and degree at most n since each d;(X) is such a polynomial. Hence G1(r) is an integer (since r is) and so, by (ii) of Definition 7.2.4, M; = pG1(r) is _ an integer having p as a factor. The next lemma shows that the € 's defined in Definition 7.2.4 can be made arbitrarily small by choosing p large enough.
7.2.7 Lemma. If f(X) is given as a Innctiot: of p by (4) , and {1 ,2, .. . ,m}, then
r E
Proof.
Let r E {I, 2, . . . , m} and let x E [0,1'] , Using (4) gives
lJ(x)e- XI < If(x)1
= ~
p-l
(px_ l)!I(x - l)P(x - 2)P . . . (x - m)pl .p- l
(;'C_ l)!(x + l)P(x + 2)P .. . (x + m)P r P- 1
< (p _ 1)!(1' + l)P(1' + 2Y··· (1' + mY 1
CP
= r (p -
I)!'
131
Transcendence
where C = 1'(1' + 1)(1' + 2) ... (1'
I{
+ m)
is independent of p. But
f(x)e-Xdxl < (length of interval) x (upper bound for If(x)e-xl) 1 CP = I' X - -:---.,..,. r(p-1)r
Hence the conclusion of the lemma follows from Lemma 7.1.4 .
•
With the aid of the above lemmas, it is now an easy task to prove tha t e is transcendental.
7.2.8
Theorem.
e is a transcendental number.
Proof. Suppose on the contrary that e is algebraic. Then, as shown at the start of this section (see (1) at the beginning of this section) , there is an integer m ~ 1 and integers Co, c!, C2, ... , Cm such that Co # 0, Cm # 0 and (10) Let M and M,., En for r E {I, 2, . . . , m}, be as in Definition 7.2.4 . From parts (iii) , (i) and (ii) of that definition respectively, it follows that e-rE,. = { f(x)e-Xdx = M - e- r G (1') = 1\1- «ru, and hence e
,.
=
!vI,>+ Er !vI .
Substituting these results into (10) gives
[coM and
+ cIM! + ... + cmM m] + [CIE! + ... + CmEm] = O.
(11)
We now choose the prime number P to be larger than each of m Ieol. As p > m it follows from Lemma 7.2.6(a) that
the integer 111 does not luiue p as a factor. As p > leo I it follows that Co does not have p as a factor and hence (as explained at the beginning of Section 7.1) that
coAi does not have P as a factor. But each of !vII, A12 , ••• , AIm is an integer having p as a factor by Lemma 7.2 .6(b). Hen ce p is not a factor of the sum
coM +cI!vII + .. . +cmMm , which is therefore a nonzero integer. (12)
Famous Impossibilities
132
On the other hand, if we choose p sufficiently large (which is possible by Proposition 7.1.1), then by Definition 7.2.4(iii) and Lemma 7.2.7 we get
lelcl
1
+ ...+ cmcml < 2'
(13)
But (12) and (13) contradict (ll). So our supposition is wrong. Hence e is a transcendental number. -
7.2.9 Remark. Now that the proof is complete, you might like to reflect on the reasons for the various features in the definition of f(X) in (4). (i) The (p - I)! in the denominator means that er can be made arbitrarily small by taking p large enough. (See Lemma 7.2.7.) (ii) The factor Xp-I ensures that, despite the (p - I)! in the denominator of f(X) , M is an integer. The exact power p - 1 (rather than a higher power) means that M is not divisible by p if p > m. (See the proof of Lemma 7.2.6(a). See also Exercise 3 below .) (iii) The factors (X -l)P, . .. , (X - m)P ensure that, despite the (p -1)! in the denominator, each of Af1, .. . , A1m is an integer. (See the proof of Lemma 7.2.6(b).) (iv) All of the factors (X - l)P, . . . , (X - m)P are required so that we can deal with all the exponents 1,2, ... ,m in (10).
Exercises 7.2 1. [This exercise proves the claim made at the beginning of Section 7.2 that e is distinguished from all other positive real numbers by the fact that the derivative of eX is eX.] Let C be a positive real number and assume that g(x) = c" for all x E IR. If g'(x) = g(x) for all real x, prove that e = e. [Hint. c" = (e1ney = exine.] 2. Prove Lemma 7.2.1. 3. Show that Lemma 7.2.6 would be false if we omitted the factor Xp-I in the definition of the function f(X) given in (4) of the text.
133
Tran scendenc e
4. Show t hat Lemma 7.2.6 would be false if we replaced the factor X p- I by the fact or X P in the definition of f (X ) given in (4) of the te xt. 5. (a) Prove that e2 is a transce ndent al number . [Hint. Use Theorem 7.2.8 and the definiti on of algebraic.] (b) Prove that e2 + 3e + 1 is a nonzero transce ndental number. (c) Prove that if f (X ) is a nonzero pol yn omial with rational coefficients, then f (e) is a nonz ero transcendental number. 6. Prove that
ve is transcendental.
[Hint. Use Corollary 4.4.4 .]
roo xke-xdx = k. !. r ,E.+~ e- g(1') = 0 for each p olynomial
7. Deduce from Lemma 7.2.1(a) that [Hint. You may assume that
g(X) with real coefficients .]
Jo
8. (a) Deduce from Definiti on 7.2.4, Lemma 7.2.3 and the hin t for Ex ercise 7 that
M =
fooo f (x )e-xdx .
(b) Using (a) dedu ce from Definition 7.2.4 and Lemma 7.2.3 that
M,. = e r
1 f (x )e-Xdx. 00
9. Prove by inducti on that if f (X ) = ao + alX
+ ... + anxn
then
for each r E R, where f (i)(l') is the i-th derivative of f evaluate d at r. [Hint. For each integer m , take the inducti on state me nt to be
10. (a) Prove that e" is irr ational for all r E N. (b) Deduce from (a) and the formula just before (11) in the te xt that 10 1, • . • , Em ar e nonz ero. Why didn 't we need to use this fact in the proof of Theorem 7.2.8 (in cont rast to the pro of of Propositi on 7.1.9 where 10 1 being nonzero is essential)?
134
7.3
Fam ous Impossibilities
Preliminaries on Symmetric Polynomials
The proof that 7f is tra nscendental (to be given in Section 7.6) is sim ilar in many respects to that given in Section 7.2 for e. There are, however , two to pics we mu st in troduce before the proof, firstly sy mmetric polynomials (covered in t his section) and secondly int egrals of complex-valued fun cti ons (discussed in Section 7.5) . We are conce rned here with pol ynomials in seve ral ind eterminates X\,X 2 , •. • , X n . Of course, XIX~+3X3 = 3X3+X~XI , for example, since the order of multiplying ind et erminates or adding t erms makes no differen ce. 7.3.1
Example.
= 3, two such polynomials are x~xixj + 3X I + 3X2 + 3X 3,
With n
f(XI, X 2 , X 3) = g(X I, X 2, X 3 ) = X?X 2 + xix3 .
•
Permutations and Symmetry We are interested in wha t happens to such p olyn omials when t he indeterminates are permuted amongst themse lves, for exam ple, replacing X I by X 2 , X 2 by X 3 and X 3 by XI . If we apply this permuta tion to become , resp ectively,
f and
9 in the exam ple above they
ft (X I, X 2 , X 3) = xixixl + 3X2 + 3X 3 + 3Xt, g!CX" X 2 , X 3 ) = xix3 + xix l • Not ice that f and I, are equal polyn omials bu t 9 and gl are not . You can see that, wha tever permuta ti on of th e indetermina tes is made in I , th e resulting polynomial is equal t o f. We say t hat f is a sym me tric polynomial bu t that 9 is not. The precise definition is given as Definition 7.3.3 below. Before giving th at definition we need to say a little m ore about permutations. Intuitively a permutation of a set, for example the set {X I ,X2 ,X3 } of indet erminates ab ove, is just a rearra nge me nt of the order of the elemen ts . Mor e formall y, a permuta tion p of a set S is a one- to- one and onto mapping p : S -+ S. In the exa mple ab ove,
p(X d = X 2 ,
p(X 2 ) = X 3
and
p(X 3 ) = XI'
As you probably kno w, the re a re 6 (= 31) permu ta tions of any set such as {Xt,X2 ,X3 } with 3 elements . (We suggest you write the m all
135
Transcendence
down. Do not forget the so-called identity permutation which leaves all elements unchanged.) More generally there are n! permutations of a set with n , elements. Clearly if f is a polynomial in the n indeterminates XI, X 2 , •. . , X n and P is any permutation of the set, we get a new polynomial (which may be denoted by fP) by applying P to all the Xi'S in f. 7.3.2 Example. Let PI be the permutation of {X I,X2,X3 } which interchanges XI and X 3 and leaves X 2 fixed, and let P2 be the permutation given by P2(Xr) = X 3 , P2(X2) = XI, P2(X3 ) = X 2. Then, with f and 9 as in Example 7.3.1 above,
gPl(XI, X 2, X 3 ) = jP2(X I,X2,X3 ) =
xlx2 + xix l , xlxfxi + 3X3 + 3X I + 3X2.
You might like to write down f P1(XI,X2,X3 ) and gP2(X 1,X2,X3 ) •
•
7.3.3 Definition. A polynomial f(X 1,X2, .. . ,Xn) in indeterminates XI, X 2, • • • , X; is called symmetric if, for all permutations P of {X 1,.X2 , • •• ,.\n}, we have
• The next lemma contains two routine consequences of this definition. We leave the proof as Exercises 7.3 #7. 7.3.4 Lemma. (i) If f(XI, .. . , Xn) and g(XI, ... , Xn) are symmetric polynomials in X I, .. . , X n then so are
f(X 1 , .•• , Xn) + g(XI, ... , XII) , f(XI,"" Xn) - g(X 1, ••• , Xn) , f(X 1 , . • • , XII) g(X I , •• • , Xn), (their sum, difference, and product). (ii) If h(Yi , , Ym ) is any polynomial in indeterminates YI, . . . , Ym and if gl(XI, , XII)" . . , gm(X I, ... , Xn) are symmetric polynomials in X I, . . . ,XII then
is also symmetric in XI, . .. , XII '
•
Famous Impossibilities
136
Note that "quotient" is not included in (i) of Lemma 7.3.4 since the quotient of two polynomials is in general not a polynomial.
Elementary Symmetric Functions There is one way of obtaining several interesting and important symmetric polynomials which we illustrate in the case n = 3. Consider
If we expand this and collect the coefficients of the powers of the indeterminate Y we obtain
Notice that the three polynomials obtained (Tl(XI, X 2, X 3)
= Xl + X 2 + X 3,
(T2(X l , X 2, X 3) = X IX2 + X IX3 + X 2X3, (T3(X l , X 2, X 3) = X IX2X3, are all symmetric. This is not surprising since clearly the value of the expression (*) (from which they are derived) remains unchanged after any permutation of the X's. The three polynomials obtained are called the elementary symmetric functions in 3 indeterminates. This can be extended to any n in the following definition.
7.3.5 Definition. The n elementary summetric [unctions (Tl, (T2 , . . . , o ; in indeterminates XI, X 2, ... , X n are the coefficients respectively of the powers yn-l , yn-2, ... , yO in the expansion of
that is, (Tl(Xl , X 2,.··, X n) = Xl + X 2 + ... + X n, (T2(X l,X2, ... ,Xn) = X IX2 + X IX3 + + XIX n+ X 2X;J + + X 2X n +
.. .+ Xn-lX n,
•
137
Transcendence
7.3.6
Example.
(i) For n
= 2 we have
al (Xl, X 2)
= Xl + X 2 and a2(Xj, X 2) = X IX 2.
(ii) For n = 4 we have
= XI + X 2 + X 3 + X 4 , a2(Xj, X 2, X 3 , X 4 ) = X UY2 + X IX3 + X IX4 +
al(Xj, X 2, X 3 , X 4 )
X 2X3 + X 2X4 + X 3X4 ,
a3(X I , X 2, X 3, X 4 ) a4(X l , X 2, X 3 , X 4)
= X lX2X 3 + X lX2X 4 + X IX3X4 + X 2X3X4 , = X lX2X3X4 .
m
(iii) Note that, when n = 4, al has terms, a2 has has G) terms and a 4 has terms.
G)
m= 6 terms, a3
(iv) In general, ak(Xj,X2, .. . ,X,,) has (~) terms.
-
7.3.7 Remark. The elementary symmetric functions provide a connection between the zeros of a polynomial and its coefficients. To see this connection, let f(} ") = a" Y"
+ a,,_1 }nl-I + .. . + al },r + aD
be a polynomial of degree n (that is, (l" ::J 0) with indeterminate Y and coefficients in a field IF. Assume also that f has n zeros 1'1,1'2, .. . ,1'n in some (possibly larger) field IE so that
It follows immediately from the definition of the elementary symmetric functions (Definition 7.3.5) that this expands to
f(Y) =a n }n , - ( l n a l ( 1'I, · · · , 1'" )}nl-I
... + (-lta,Pn(')'I,""
Thus from (**)
al(1'I , a2(1'1,
+
(ln
a 2(1'1, · · · , 1'n)y n - 2 -
1',,).
,1'n) = - an- I/ a" , , ~/,,) = a"-2/a",
and so on down to
Thus each elementary symmetric function evaluated at the zeros of (**) is plus or minus the quotient of a coefficient divided by the leading coefficient. Note also that all of these quotients are in IF (even though the 1"S need not be in IF) . -
Famous Impossibilities
138
The elementary symmetric functions are also important because, as we prove below in Theorem 7.3.10, every symmetric polynomial can be expressed in terms of (indeed, as a polynomial in) the elementary symmetric functions. The next example illustrates this. 7.3.8
Example.
For
f
in Example 7.3.1,
where h(Yl , Y2 ) = Y? + 3Y2 and a3 and al are two of the elementary symmetric functions in the indeterminates X}, X 2 , X 3. • In Example 7.3.8, it was convenient to write just a3 rather than a3(X l,X2 ,X3). Similarly, in what follows, we shall often omit th e X's from the a's. However, you should always remember that the a's depend on indeterminates X}, X 2, • . • , X n •
xiI
7.3.9 Definition. The degree of a monomial X~2 . .. X~n, where iI, i 2, ... , in are non-negative integers, is defined as the sum of the exponents, i l + i 2 + . . . + in, and the degree of any nonzero polynomial is the maximum degree of the monomials involved in it. • In Example 7.3.8, note that the degree of h (which is 2) is no greater than the degree of f (which is 6), and that h has integer coefficients. The following theorem generalizes this example. 7.3.10 Theorem. [Fundamental Theorem on Symmetric Functions] Every symmetric polynomial g, witu coefficients in a field IF, in the indeterminates Xl, X 2 , • . • , X n can be written as a polynomial h (with coefficients in IF) in the n elementary symmetric functions. Moreover,
(i) the degree of h is no more than the degree of g, and (ii) if 9 lies integer coefficients then so does h.
•
Before proving the theorem, we describe an algorithm for constructing such a polynomial h. Later we shall prove that the algorithm works; this will prove the theorem. We first need another definition.
139
Tr an scendence
7.3.11
Definition.
· i lX-2i 2 ' .. v in CX I
-"" n
Consider two nonzero terms ancI
dv il v i z .A I J'\. 2 . . .
xtn
(c,d E F ).
\Ve say that the first of these is of high er orde-r th an the second (or, equivalently t hat the second is of lower order than t.he first) if, at. t.he first p osi t.ion, say k , in which the i's and j's differ , we have ik > j k. • For example, X l X~X3 is of higher ord er than each of x l X 2X3 and 1ower onIer tl Ian J\.IJ'\.2 "\r2"\r3-\.3' v
v X 25XI30' W 111'1 e ...v·I.1.A "\r5 v 3lu :IS 0 f .AI 2 J\.
7.3.12 Algorithm. [Fundamental Algorithm for Symmetri c Functions] The aim of the algorithm is as follows: Given a non zero sy m m etric polynomial 9 in n ind et erminates X" X 2 , •• . , X n sat isfying the hypotheses of the Fundam en tal Th eorem on Symm etric Functions, to construct a polyn omi al h satisfying the conclusion of tlie: theorem . St ep O. Ini tially set
e= 1 and 91 = 9 (which is not zero) .
Step 1. (a) Find the te rm of highest order in
9f,
say
(CE F,c f O) . (b) Wi th c and i i, i 2 , • • • , in as in (a) , set
(c) Put
ge+I(X I , X 2, . .. , Xu ) (d) Increas e e by 1.
= ge(X" X 2,.·· , X u) -
he(a t, 0'2, ·. ·, an).
(e) If ge f 0 th en go back to (a ) and rep eat St ep 1; ot herwise pro ceed to St ep 2 bel ow. Step 2. Put h
= hi + h 2 + ... + he- I.
•
Before pro ving th at thi s algorithm works, we give an example of its use.
140
Famous Impossibilities
7.3.13 Example. We apply the above algorithm to the symmetric polynomial (in n = 2 indeterminates)
(See Example 7.3.6 for formulae for Step O. Set
e=
al
and a2')
1 and put 91 = 9.
Step 1. (a) The term of highest order in 91 is 3Xr X 2.
= 3yl-IY;l = 3Y?Y2 • 92(XI,X2) = 91(XI,X2) - h l (a l , a2) = 3XfX2 - 2xrxi + 3X tXg -
(b) hl(YI , Y2 )
(c)
3(X I
+ X 2)2(X tX2)
" '2 ,\,2
= - 8",'1.1",'1.2'
(d) Set
e=
2.
(e) As 92 -1= 0, return to (a) and repeat Step 1 with 92 in place of 91. Step 1. (a) The term of highest order in 92 is -8Xr xi. (b) h2(YI , Y2) = _8y?-2y;2 = -81'22 . (c) 93(XI,X2)
(d) Set
e=
(e) Since 93 Step 2. Set
= 92(XI,X2) -
h 2 (al , a2)
= - 8x r x i + 8xrxi = o.
3.
= 0 go on to Step 2. h = hI + h2 ; that is, h(Yt , Y2 ) = 3Y?Y2 -
8Yl .
As a check, note that
and also that deg h = 3 < 4 = deg 9 and that the polynomial h has integer coefficients. Thus h satisfies the conclusion of Theorem 7.3.10.• To prove that the algorithm always works we need the following lemmas.
141
Transcendence
7.3.14 Lemma. Let g(X I , ... , Xn) be a symmetric polynomial over a field IF and let
(c E IF,
C
¥= 0)
(1)
be tile term of highest order in g. Then
(2) Proof. If, say, ik < ik+1 contrary to the conclusion of the lemma, then we apply to 9 the permutation p which just interchanges the indeterminates Xk and Xk+ 1 . The resulting polynomial gP (using the notation in Definition 7.3.3) then contains the term (3)
which is of high er order than (1) by Definition 7.3.11. But 9 is a symmetric polynomial so the term (3) must be in 9 = o" , which contradicts our choice of (1) as the term of highest order in g. Thus (2) holds. •
Proof. The term of highest order in a product of polynomials is the product of the terms of highest order in each polynomial. The terms of highest order in 0"1,0"2, .. . . o; are respectively
Hence the term of highest order in the product is X Ii l -
iZ( "\'
v
·""1--'-2
)iZ-i3
(V V
V )in _
. . . ·'-1·'-2··· ·'-n
-
"\' il viz
· '- 1 ·'-2 .. .
Xin
II'
•
Proof of Theorem 7.3.10. To prove the theorem we prove that the Algorithm 7.3.12 always produces a polynomial h with the properties stated in the theorem. If 9 =1= 0 is a symmetric polynomial in XI , X 2 , · • • ,Xn , we let DEN denote the number of monomials (not necessarily occurring in g) which are of lower order than the highest term occurring in g. (Recall that a monomial is a term X:t with coefficient 1.) Our proof is by mathematical induction on D . If D = 0 then 9 is a constant polynomial and it is left as an easy exercise to prove that the algorithm works in this case .
xiI .. .
142
Famous Impossibilities
Now let kEN and assume that the algorithm works for all symmetric polynomials 9 with D :::; k. Consider now a polynomial 91 with D = k + 1, and let its term of highest order be
(4) where c E F is nonzero. We wish to prove that the algorithm works for 9\ .
The first pass through Step 1 of the algorithm gives
(5) and 92(X1> X 2, . . . ,XII) = 91(X I , X 2, ... , XII) - h l (<7I, <72, .· ., (711) ,
(6)
Now (5) defines hi as a polynomial since i 1 2: i 2 2: ... 2: ill by Lemma 7.3.14. Thus, by (ii) of Lemma 7.3.4, h l (<71, " " (711) is symmetric in Xl, ... , XII and hence, by (i) of Lemma 7.3.4, so is g2. By Lemma 7.3.15, moreover, the two polynomials on the right-hand side of (6) have the same term of highest order and so these terms cancel each other out in (6). If 92 is zero, it is an easy exercise to check that the choice h = b, satisfies the requirements of the theorem. Hence we consider below only the case where 92 =F O. If m2 is the term of highest order in 92, then m2 is oflower order than ml (since the terms of highest order cancel each other out in (6)) and so the D for g2 (the number of monomials of lower order than m2) is less than k+ 1 which is the D for gl (the number of monomials of lower order than mI). Thus the inductive assumption applies to 92 and so the algorithm gives a polynomial h*, say, such that 92('-\1> '. \2,
,XII) = h*(<71, <72, ... ,(7 11 ) ,
Hence by (6), gl(X 1, X 2, ,XII) = 11(<71, <72, ... , (711) where h = hi +h*. To see that the "moreover" parts of the theorem are true, note simply that (i) the degree of hi (Y1> '}~l) in (5) is i l while the degree of gl is at least i 1 + i2 + + ill, so that the degree of hI is not more than the degree of 91, and
143
Transcendence
(ii) if gl has integer coefficients then c in (4) is an integer, which means, by (5), that hI has integer coefficients. The "moreover" results now follow easily. (Formally, each of these results requires a separate proof by induction.) •
7.3.16 Corollary. Let IF be a field and let f(X) be a polynomial of degree n with coefficients in IF and with n zeros Q'I , Q'2, ... , Q'n in some extension field E of F . If 9 is any symmetric polynomial in XI, X 2 , ••• , X n witl: coefficients in F then g(Q'I, Q'2,.·., Q'n) E F.
Proof. By the Fundamental Theorem on Symmetric Functions, g(X I,X2 , •.. ,Xn ) is equal to 12(01,02, . . . , O n ) where 12 is a polynomial with coefficients in IF. It follows that
= h({3I,{32, . .. , (3n) for i = 1,2, .. . , n. But, by Remark 7.3.7,
g(Q'I,Q'2 , . . . ,Q'n)
where (3j = OJ(Q'I, Q'2, . .. , Q'n) each of the {3's is in IF (being the quotient of two of the coefficients of J) and thus g( Q'I, Q'2, ... , Q'n) is in IF. • The next proposition plays a vital role in our proof (in Section 7.4 and Section 7.6) that 1r is transcendental.
7.3.17 Proposition. Let F be a field and let t(X) be a polynomial of degree n witl: coefficients in IF and with n zeros Q'I, Q'2, ... ,Q'n in some extension field E of IF. Assume that k is an integer between 1 and n , and let be all the sums of exactly k of tlie Q' 's. TllCn there is a monic polynomial tk(X) of degree m with coefficients in IF which has 1'1,1'2, · · ·, 1'm as its zeros . 7.3.18 namely
Example. 1'1 = Q'I
+ Q'2,
Let n = 3 and k = 2. Then there are three 1"s, 1'2 = Q'I
+ Q'3,
and
1'3 = Q'2 + Q'3 ·
It is clear that t2(X) must be
t2(X) = (X -1'd(X - 1'2)(X - 1'3) 3 2 = X - (71 + 1'2 + 1'3 )X + (7n2 + 1'n3 + 1'21'3)X - 1'n21'3·
144
Famous Impossibilities
Each coefficient can be expressed in terms of the a's; for example, you can check that
If you repeat this for other coefficients, you will see that each of the coefficients is a symmetric polynomial (with coefficients which are in F since they are integers) in the a's, and so is in F by Corollary 7.3.16. This is the essence of the proof of Proposition 7.3.17. •
Proof of Proposition 7.3.17. It is clear from the properties required of tk(X) that we must define it by
We must prove that each of its coefficients is in F . Now, as in Remark 7.3.7, each of the coefficients of tk(X) is a symmetric polynomial evaluated at (,on , 12, ... "m). Thus it is sufficient to prove that, if h is any symmetric polynomial in m indeterminates and with coefficients in F then h(,l, 12, ... "m) E F. Accordingly, let h be any symmetric polynomial in m indeterminates and with coefficients in IF . We introduce n indeterminates X 1,X2,,, ,,Xn and let Yl,Y2, .. . ,}"~n denote the m distinct sums of the X's taken k at a time. Then h(Yi, Y2 , .• • , Y m ) can be expanded out to give a polynomial g(X 1,X2, ••• ,Xn ) in the X's
It is easy to see that if we permute the X's we also permute the Y's (since a sum of k X's remains such a sum after permutation of the X's , and every such sum comes from another such sum after permutation). Thus g(XI, X 2 , ••• , X n ) is a symmetric function in XI, X 2 , • • • ,Xn and has its coefficients in IF. But, from the connection between the Y's and X's and between the I 'S and a's, we have
which is in IF by Corollary 7.3.16.
•
145
Transcendence
Exercises 7.3 1. Write down formulae for the 6 permutations PI, P2, ... ,P6 of
{Xl, X 2 , X 3 } . If f(XI,X 2,X3) = xlx2 + xix3, write down fPi(X l, X 2, X 3) for i = 1,2, . . . ,6. 2. Write down two polynomials of degree 4 in Xl, X2 which are symmetric and two which are not. 3. Assume that f(X) in C. Calculate
= 2X3 -
3X 2+4X -1 has three zeros aI , a2, a3
4. Apply Algorithm 7.3.12 to express each of the following symmetric polynomials as polynomials in the relevant elementary symmetric functions .
(a) 3XrX2
+ 6xl -
4XrX? + 12X1X2 + 6xi
(b) 5xrxix3 - 2XrX2X3 2XIX2Xj.
+ 3X IX? 2x1xix3 + 5x lxixj + 5XrX2Xj-
5. Complete Example 7.3.18 by expressing the other coefficients of t2(X) in terms of the a's and checking that they are symmetric in the a 's. 6. Consider the proof of Proposition 7.3.17 in the case n
= 4, k = 2.
(i) Write down the six 1"s. (ii) Write down the coefficient of the X ·5 term in
Express this in terms of the a's and check it is a symmetric function of the a's, as claimed in the proof of Proposition 7.3.17. 7. (a) Prove (i) of Lemma 7.3.4. [Hint. If P is any permutation of {XI, of applying P to f(Xl, ,Xn ) + g(X l,
fP(X l, ... , Xn)
+ gP(XI,
,Xn) .]
, X n} then the result ,Xn) is
146
Famous Impossibilities
(b) Prove (ii) of Lemma 7.3.4. [Hint. If
t(X I, . .. , Xn) = h(gl(X 1, ••• , X n), then, for any permutation p of {X I ,
tP(XI, ... , Xn)
,gm(X I , . .. , Xn)) ,
X n},
= h(gf(X I, .. . , XII)"", g~n(XI,'"
, XII)).J
(c) Can you see how to give an alternative proof of (ii) of Lemma 7.3.4, this time using (i) of Lemma 7.3.4?
7.4
1r
is Transcendental - Part 1 Complex Exponentials
In our proof that 1r is transcendental, we need to consider e Z where z is a complex number. 7.4.1
Definition.
If z = x
by eZ 7.4.2
+ iy
where x, y E IR , then e Z is defined
= eX(cos(y) + i sin(y)) .
•
Example. e H i = e(cos(l) + isin(l)) , eill" = eO(cos 1r + i sin 1r) = 1(-1
+ Oi) = -1.
This latter fact will be the starting point for our proof that 1r is transcendental. It is also easy to check that e Z1eZ2 = eZI +Z2 for all ZI and Z2 in C. (Use the expansions of COS(YI + Y2) and sin(YI + Y2).) • An equation for rr via symmetric polynomials In the next proposition we need t o use the well-known result that every polynomial with complex coefficients can be written as a product of factors of degree 1. More precisely, if g(X) E qX] has degree nand leading coefficient c E C then g(X) can be written as for complex numbers AI , A2' ... , An which are , of course, the zeros of g(X) in C. This result is known as the Fundamental Theorem of Algebra (see, for example, Theorem 6 in Chapter V, Section 3 of [GB]).
147
Transcendence
7.4.3 Proposition. Supp ose 7f is algebraic over Q . Th en tliete exist integers m ;::: 1, q ;::: 1, b ;::: 1 an d n onzero com plex numbers f31, f32, ... ,f3m sucu that (1)
and the polynomial
h(X ) = b(X - ,BI)(X - (32) .. . (X - f3m)
(2)
has in teger coefficients. Proof.
Suppose
7f
is alge braic . i rr
Becau se th e eq uat ion e = - 1 (which relat es 1f to a n intege=--) -. ra t her th an j ust 1f in t he ex ponen t, we consider t he irredu cible polyno mial over Q of i1f rather than of 1f • i 1f
Sin ce the complex numb er i is algebra ic over Q it follows from Cor ollary 4.4.4 t hat. t he com plex nn mber in is a lgebraic (over Q). If t(X ) de notes the irreducible polynomial of in over Q t he n t( X) is a mo nic polynomial with ra ti onal coe fficients such that t ( i7r) = O. The Fundamen t al Theorem of Algebra te lls us t hat t( X) factor s com pletely over C so that
t(X ) = (X -
0'1
)(X -
0'2) . . .
(X -
O'k)
for some com plex numb ers 0'1,0'2 , ... , O'k. Of course one of t hese os must be equal to it: and we ca n suppose t hat 0' 1 = in , Bu t ei'll" = -1 and hence
(e"l
+ 1)(e"2 + 1) . . . (e"k + 1) =
O.
(3)
You may be puzzled as to why we include th e te rms
(1'0 2 + 1), . .. , (1'0' + 1) in (3) . T hese a re essentia l because (a s you will see la ter ), by includi ng t he factors corres pon ding to all the zeros of j , we are ab le event ually to obtain symmet ric polyn omials a nd the n integers.
If we multiply ou t t he left-hand side (LHS) of (3) , t he re are 2k te rms. If we use t he fact t hat e l eZ2 = e ZI + Z2 t he n we get 2k - 1 te rms of t he
Famous Impossibilities
148
form e? (where 1 is a sum of one or more a's) and a single term equal to 1 (the product of all the l's in (3)). For example, if n = 3 the LHS of (3) is eO'! +0',+0'3 + eO" +0', + eO" +0'3 + eO"+O'3
II
+ eO" + eO" + eO'3~
Thus (3) can be rewritten as e'Yl
+ e'Y2 + ... + e'YN + 1 = a
(4)
(where N = 2k - 1).
~we make our first lise of results from Section 7.3 about symmetric polynomials.
II ~
Now we have met something like the formation of the 1'S from the a's before in Proposition 7.3.17. Using that result and its notation, (i) there is a monic polynomial tl(X) with rational coefficients which has all the a's as zeros , (ii) there is a monic polynomial t2(X) with rational coefficients which has all the sums of two a's as zeros, and so on, finishing with (iii) a monic polynomial tk(X) with rational coefficients which has all the sums of k a 's as zeros. Thus if then T(X) is a monic polynomial with rational coefficients which has all the N (= 2k - 1) 1'S as it zeros. This means that T(X)
= (X -
1d(X - 12) . . . (X - 1N)'
(5)
It may happen that some of these 1'S (sums of a's) are in fact zero . (For example, it may happen that al + a2 + a3 = 0.) We have no way of knowing if this happens and, if so, how often it happens. But we can allow for it by saying that, if there are ql such 1'S equal to zero, we can rewrite (4) as
e13\
+ e,82 + ... + e13m + q = 0
149
Transcendence
where (31 ,132, ' . . ,13m are all nonzero complex numbers (the /"s which are nonzero) and q = ql + 1 is a positive integer, which proves (1) . Note also that m must be at least 1 since, if m = 0, (1) would say that q = which is impossible. From (5)
°
is a polynomial with rational coefficients and so
(X - 13I) .. . (X - 13m)
(6)
is also a polynomial with rational coefficients. We can multiply the polynomial in (6) by a suitably large positive integer b (to cancel out all denominators there) to obtain a polynomial
h(X)
= bX m + bm_1X m-1 + =
[)(X - 13I)(X - (32)
+ bo
(X - 13m)
with integer coefficients. (We have omitted the subscript from the leading coefficient bm to simplify several formulae in Section 7.6.) •
Exercises 7.4 1. Prove from Definition 7.4.1 that e Zte Z2 = e Zt +Z2 for all
2. If
7.5
Z
= x + iy where x , y E IR, show that
Zl
and
Z2
E C.
lezi = e".
Preliminaries on Complex-valued Integrals
Before we can complete the proof of the transcendence of 7.6) we need to introduce integrals of the form
7r
(in Section
l j(x)dx
where f is a complex-valued function (that is, a function f : IR -. C) and a, b are real numbers. First we define integrals of this kind in terms of integrals of real-valued functions (such as you have learned about in your elementary calculus courses). We also define derivatives of complex-valued functions in terms of derivatives of real-valued functions.
Famous Impossibilities
150
7.5.1 Definition. Assume a, b are real nnmbers and a complex-valued function. For any x E IR we can write
f(x) = ft(x)
f : IR
~
C is
+ ih(x)
where fl(X) and h(x) are real numbers, which gives us two real-valued functions h : R ~ Rand h : R ~ R. (i) We define
t f(x)dx = t h(x)dx + i t h(x)dx (provided the two normal real-valued integrals on the right-hand side exist). (ii) We define the derivative J'(x) as
J'( :r)
= ff(x) + if~(3;)
(provided the two real-valued derivatives on the right-hand side exist) .
•
7.5.2
Example.
Consider
f(x)
f : IR
~
C given by
= (x + 2i)2
for x E R.
Then
f(x) = (x 2 - 4) + 4xi so that here fl(X) = x 2 - 4 and h(x) = 4x. Thus, by the definitions above,
h\x + 2i)2dx = l(x 2 - 4)dx 5 = --
3
and
J'(x)
= ff(x)
+ i l4xdx
+ 6l '
+if~(x)
•
= 2x + 4i.
Many of the properties of integrals and derivatives with which you are familiar carryover to these integrals and derivatives of complexvalued functions.' In the next proposition we collect some of the •
The subject of Complex Analysis, which is taught near the end of most undergraduate
courses, covers integrat ion and differentiation of more general functions
f : C ......
C. Because
we expect that most of our readers will not yet have taken such a course, we have chosen to introduce in this section the special case
f : R ......
C, basing our development on material
familiar from elementary (real) calculus courses . Of course, any readers familiar with complex analysis will find the material in th is section easy special cases of what they already know .
Transcendence
151
properties we need below. 7.5.3 Proposition. Let t, g , F, hi," " h u : IR --+ C be complexvalued functions and let a, b E IR and c, d l, ... ,dn E C. Then , provided the relevant integrals and derivatives below exist, (i) l cf(x)d.1: = c l (ii) l\f(x)
f(:r)d.1:;
+ g(x))dx = l
f(x)dx
+l
g( :r)d.1:;
(iii) [Linearity] l
(dlhl(x)
+ ...+ dnhn(x))dx =
d1 l h 1(x)dx
+ ... + d n l
hu(x)dx;
(iv) [Fundamental Theorem of Calculus] if F'(x)
= f( :1:)
then l
f(x)
= F(b)
- F(a) ;
(v) [Integration by Parts] l
Proof. and c =
Cl
f(x)g'(x)dx = f(b)g(b) - f(a)g(a)
-i
f'(x)g(x)dx .
First we do (i) in some detail. Let f(x) = ft(x) + ih(x) where ft(:r), 12(:1:), Cl and Cz are real. Then
+ czi,
and so, by Definition 7.5.1, l cf(x)dx = l(Clft(.T) - czh(:I:))dx
+ i l(clh(x) + c2fl(x))dx.
The right-hand side of (i) is easily seen to be the same. Part (ii) follows similarly, while (iii) follows easily from (i) and (ii) . The remaining results also follow easily from Definition 7.5.1 and the corresponding properties of real-valued functions. We suggest you write out the details of at least some of these other parts (see Exercises 7.5 #3 below) . 7.5.4 Lemma. Let r E C and consider 9 : IR --+ C given by g(u) = e- u r for tt E IR. Then g'(u) = -re- W ' for all tt E IR . Proof.
If r
= 1'1 + 1'zi (1'1,1'2 real)
then, from Definition 7.4.1,
g(tt) = e-lIrl(cos(-ttr2) + isin(-ttl'z)) and the result follows easily from the definition of derivative in Definition 7.5.1. -
Famous Impossibilities
152
We also need the following lemma which gives an upper bound for the modulus of an integral. (Recall that the modulus Izi of the complex number z = c + di is given by Izi = -/c2 + d 2 . )
7.5.5
If(x)1
Lemma.
Assume
f :R
--t
C and a, b, c E R are such that
+ i12(.r.),
where fl(x) and 12(x) are real.
:s c for all x E [a,bj where a < b. Then It f(x)dxl :s 2c(b - a).
Proof. Let f(x) = ft(x) Then, if x E [a, b],
1!J(x)1 and similarly
112(:1:)1
:s J!J(x)2 + 12(x)2 =
:s c. Now, from
lJ(x)1 :s c
(i) of Definition 7.5.1,
Il f(:1:)d:1:! (l !J(x)dx) + (l 12(x)dx) :s Il fl(x)dx!+ It 12(x)dx! =
:s c(b -
2
a) + c(b - a)
2
(*)
( **)
= 2c(b - a)
as required. (Note that (*) follows because, as is easily seen by squar11'1 + lsi for all 1',8 E R, while (**) follows because ing, -/1'2 + 8 2 1!J(x)1 c and 112(x)1 c.) •
:s
:s
:s
- - - - - - - - - - Exercises 7.5 1. Evaluate the complex-valued integral fo\x
+ i)3dx.
2. If f(x) = (x + i)3 for all x E R, use (ii) of Definition 7.5.1 to calculate I' (x) . 3. Complete the proof of Proposition 7.5.3, being careful to indicate which properties of integrals of real-valued functions you use . 4. Complete the proof of Lemma 7.5.4. 5. Let f : R --t C be given by f(u) = ue- ui for all u E IR. Use integration by parts to calculate fJ f (u )du.
Transcendence
7.6
11"
153
is Transcendental - Part 2
Proposition 7.4.3 is the beginning of our proof that 1r is transcendental. Recall that we proved there that, if 1r is algebraic, then there exist integers m ~ 1, q ~ 1, b ~ 1 and nonzero complex numbers 131 ,132, . . . ,13m such that e(31
+ ef32 +...+ e(3m + q =
(1)
0
and the polynomial
h(X) = b(X - (31)(X - (32) . . . (X - 13m)
(2)
has integer coefficients. Multiplying out h(X) gives
h(X) = bo + blX +...+ b",_lX m -
1
+ bX"'
(3)
where bo, .. . , bill-I, b are integers and the constant term
is nonzero since the band f3's are all nonzero. Now that the above equations have been obtained, the rest of our proof that 1r is transcendental is going to look very much like the proof for e given in Section 7.2. We are going to derive a contradiction from (1) just as we derived a contradiction from the corresponding result ((1) of Section 7.2) for e. To this end we shall show how to construct certain numbers M; A1(31 ' . . . ,AI(3m and c(31' . .. ,c(3m by using integrals , as we did in Section 7.2 ill the proof for e. The integrals we shall use have the form
fal f(ur)e-u,onlu where r is a complex number. Interested readers might like to compare this wi th the integrals
(1' E R) we used in Section 7.2. Although it is not used in our proof below , you may be interested to note that if we make the formal substitution x = ttl' in (**) we obtain (*).
154
Famous Impossibilities
Producing the M's and the s 's We state and prove below four lemmas needed to define the M's and the e'« and to derive their properties. Each of these lemmas is very similar to one of the lemmas in Section 7.2. The first lemma is the analogue of Lemma 7.2.1. 7.6.1 Lemma. For eedi integer k 2: 0 tl:ere are polynomials glc(X), hlc(X) witl: real coefficients sucli that, for all r E C,
(a) (b)
1o\ur)lce-Urrdu = k! - e-rglc(r),
1o\w,- lye- m'rdu = hk(r) -
e-rk!.
Proof. This can be proved as for Lemma 7.2.1 by mathematical induction on k; using the relevant integration by parts rule (see (v) of Proposition 7.5.3) . Lemma 7.5.4 gives the rule needed for differentiating (or antidifferentiating) e"?" with respect to u. • The next lemma is the analogue of Lemma 7.2.3. 7.6.2 Lemma. Given a polynomial f(X) witll real coefficients, there is a unique number 111 and a unique polynomial G(X) with real coefficients suclx that , for a111' E C, 1
10
f(ttr)e- W 'I'du = M - e-rG(r) .
Indeed M and G(X) are unique in the strong sense tlIat, if Pl(X) and P2(X) are polynomials \ritll real coefficients such tllat (I
io
f(ur)e-W'rdu
for all r E C, then P 2(X)
= G(X)
= Pl(r) -
,
e-' P2(r)
and P 1(X)
=M.
Proof. The proof is very similar to that of Lemma 7.2 .3. For the existence part, use linearity as in (iii) of Proposition 7.5.3 instead of (i) of Theorem 7.1.2, and then use Lemma 7.6.1(a) instead of Lemma 7.2.1(a). The uniqueness part follows the proof in Lemma 7.2.3 exactly since in this case
holds for all r E C and so , in particular, for all r E R
•
155
Transcendence
As in the proof for e, our choice of the numbers AI etc involves using a particular choice of polynomial in the integrand. 7.6.3 Definition. polynomial
In Lemma 7.6.2 we choose f(X) to be the
(4) where the polynomial h(X) is given by (2) and (3) and where p is a prime bigger than the integers q, band Ibol occurring in (1), (2) and (3). We let n = mp + p - 1, which is the degree of f(X). We then choose (i) M and G(X) to be the number and polynomial given in Lemma 7.6.2 , (ii) M,. = b"G(l') (iii) e. = bne"
for r E {,gl,'" ,,B,,,}, and
fal f(ur)e- w 'l'du
Properties of M, M{31'
for r E {,BI, ... ,,Bm}. " .,
•
M{3m and e{31" . . , e{3m
The next lemma is the analogue of Lemma 7.2.6. 7.6.4 Lemma. (a) The number 111 is an integer which does not have the prime p as a factor wueti p > Ibol, where bois as in (3) above . (b) There is a polynomial G I (X) wi tl: integer coefficients and of degree at most n such that G(r) = p G I (1') for r E {,BI, .. . , ,Bm}. Proof. The proof of each part (a) and (b) is very similar to the corresponding part of Lemma 7.2.6. Here we merely indicate the changes that are required and leave the details as Exercises 7.6 #3. To prove (a), replace x by ur and d.1: by du, and use Lemmas 7.6.1(a) and 7.6.2 in place of Lemmas 7.2.1(a) and 7.2.3. Recall that bo i 0 in (3). To prove (b), firstly expand f(X) in powers of (X - 1'), with r E C, using the obvious complex analogue of Proposition 7.1.6. Again replace x by ttl' and use Lemmas 7.6.1(b) and 7.6.2 in place of Lemmas 7.2.1(b) and 7.2.3 . That each of d o(1'), ... ,dl'-l(l') is zero if r is in {,BI' ,B2, ... ,,Bm} follows just as in Lemma 7.2.5, using the complex analogues of Proposition 7.1.6 and Lemma 7.1.7 . •
156
Famous Impossibilities
Note a significant difference between Lemma 7.2.6(b) and Lemma 7.6.4(b). In the latter we make no claim that G1(r) is an integer for r E {,8h. " ,,8m}' Although G1(X) has integer coefficients, the ,8's are complex numbers and so we do not expect G1(r) to be an integer (unlike Section 7.2 where r E {l,2, . . . ,m} so r is an integer, guaranteeing that Gt(r) E Z). In the proof for Jr we shall use symmetric polynomials to show that the sum
is an integer (even though the individual G t(,8i)'S may not be).
Note that a sequence {zn} of complex numbers converges to 0 as n ~ 00 if the sequences of real and imaginary parts of Zn converge to 0; this is equivalent to saying that the sequence {Iznl} of moduli converges to O.
Our final lemma is the analogue of Lemma 7.2.7. 7.6.5 Lemma. Let f(X) and n be as in Definition 7.6.3. For r E {,I31,' .. ,,13m} let Cr be as in Definition 7.6.3, so that
Tllen
lim e; = O.
p-->oo
Proof. Let tt E [0,1] and let x = ttl' where r is the complex number r = 1'1 + i1'2 (where 1'1,1'2 E R). By definition,
and so IeI'I (2),
= e'".
Similarly le-w'l
= e-
Ur t
< eh l. Also, from (4) and
157
Transcendence
Hence, using Lemma 7.5.5, Ibne
r
10
1
f(ttl')e-
lI r
n/ul
:::; bmp+p-lerl(2 length of interval) x (upper bound for If(ul')e-url'l)
: :; bmp p-1 +
e'·1.2.1.
~~I:I;~ (11'1 + 1131If ... (1 1'1+ 113mIfehlll'l
2erl+lrll
CP
b
(p-1)!
< -...,.......--,..---:-:-
2 where C = bm + Ir l(lr l + 113t1) ... (11'1 + Il3ml). Note that C is independent of p (and so are band Lemma 7.1.4, p-oo lim e, = o.
1'1) '
Hence by
•
Given the above results, the long awaited proof of the transcendence of 7r is relatively straightforward. 7.6.6
Theorem.
7r
is a transcendental number.
Proof. Suppose 7r is algebraic. We shall derive a contradiction. We have already proved various consequences of this assumption and restated them at the start of this section. We let p be any prime which is larger than all of q, band Ibol occurring in (1), (2) and (3), and define the polynomial f(X) and the integer n as in (4) above . It follows from Lemma 7.6.4 that there is an integer M not divisible by p and a polynomial G1(X) with integer coefficients and degree at most n such that, for r E {131,132, .. . ,l3m}, (5)
We now int roduce the numbers J\fp " . . . ,Afpm and
Let M and M,., e.; for each r in {131, 132 , ... ,13m}, be as in Definition 7.6.3. From (iii) of Definition 7.6.3 and Lemma 7.6.4(b), (6)"
Famous Impossibilities
158
and by (iii) of Definition 7.6.3,
10,.
= bne" 10
1
!(ul')e- u"l'du.
(7)
Then, from (5), (6) and (7), we have
M,. + 10,.
= pbnGI (1' ) + bner(M -
e- r pG I (l'))
= bnerM
and so (8)
Unlike the case for e, here the numbers M 131, ••• , M 13m may not be integers - indeed all that we can be sure of is that they are complex numbers since the {l's and hence the r in (6) are complex . We only get the integer we need by adding them all up and then recognizing that there is a symmetric polynomial involved .
Now GIC"Yd + G I (X 2 ) + ...+ G 1 (X11\) is clearly a symmetric polynomial in XI, X 2 " . . , X11\ and so it follows from the Fundamental Theorem on Symmetric Functions (Theorem 7.3.10) that
where H is a polynomial with integer coefficients and degree at most n. Substituting 13; for Xi (i = 1,2, . . . , m) we see that
where Itj is the i-th elementary symmetric function a, evaluated at (13\, 132, ... ,13m)' But, by (2) and (3) and Remark 7.3.7, Iti = ±bm_;/b.
Since the degree of H is at most n, each term on the right-hand side of (9) is equal to an integer divided by some power (no bigger than n) of b (since bo, &1, ... , b11\-1 and b are all integers). Thus for some integer d, and so, from (6),
is an integer divisible by p.
159
Transcendence
~that we know t ha t th is slim is an intege r with p as a - .
II
~e can choose p large enough to get a cont ra diction in essent ially the sa me way as in the proof for e.
Substituting (8) into (1) gives
[qMbn
+ M rh + ...+ M{1mJ + [C{11 + ...+ c{1mJ = O.
(10)
But, since the integer M is no t divisible by the prime p a nd since p is larger than the integers q and Ibl, it follows tha t q Mb" is an integer not divisible by p. This me ans that the term in the first [ J of (10) is a n integer not divisible bY]J and hence is a nonzero int eger. On the other hand , by Lemma 7.6 .5, t he modulus of eac h term inside the second [ J of (10) can be made arbit rarily small by choosing p large enough, and so the modulus of this second [ J term in (10) can be made less than 4 (for exam ple) by choosing p large enongh . This is the contradiction we promised. Hence
1r
is transcendental.
7.6.7 Corollary. edge and compass.
-
Squarin g th e circle is not po ssible with streigbt-
Proof. This follows immediately from Theorem 7.6 .6 and the remarks in Section 6. 2.3.
Exercises 7.6 1. Complete the details of the proof of Lemma 7.6.1, b eing care ful to chec k whi ch properties from Section 7.5 you use, 2. Rep eat Exer cise 1 for Lemma 7.6.2 . 3. Complete the det ail s of the proof of Lemma 7.6.4 . State a nd prove the complex analogues of Propositions 7.1.6 and Lemma 7.1. 7 whi ch you need . Prove the analogue of Lemma 7.2.5 whi ch you need.
160
Famous Impossibilities
Additional Reading for Chapter 7 The idea of using integration to prove that e is transcendental was introduced in Hermite's original proof in [ChH]. This proof for e was extended by Lindemann in [FL] to give a proof that 1r is transcendental. Simplified versions of the original proofs have been given by many authors; except for [FD], they all use the polynomials f(X) given in Definitions 7.2.4 and 7.6.3. They all use results about symmetric polynomials, moreover, in the proofs for x , Two variants of the proof stem from Hilbert in [DH] on the one hand, and from Hurwitz in [All] on the other. Altliough Hurwitz gives only the proof for e, his method was extended later to 1r by Niven in [IN3]. In Hilbert's approach, the integral of f(x)e- X is obtained by first expanding the polynomial f(x) in suitable powers, and then integrating term by term . This seems essentially easier than Hurwitz's approach in which the integral is expressed as the sum of the successive derivatives of f( x) , left in factorized form . In the proof for z , however, Niven's approach is more elementary th an Hilhert's in that it avoids use of integration in the complex plane (and the Cauchy integral theorem). Our proof is an attempt to combine the advantages of both of these approaches. References which follow Hilbert 's approach are [AD], [FlO] and [MS] - although the last gives only the proof for e. References which follow the Ilurwitz-Niven approach include [CH], [IN2] and [IN3]. In an attempt to make the proof even more elementary, a number of authors have removed integrals entirely from their proofs and replaced them by sums, as in [P G], [Ell], [GH], [FK] and [OS]. Spivak remarks that such elementary proofs are often harder to understand than the original proofs, inasmuch as the essential structure of the proof has been obscured just to remove a few integral signs. What do readers think? An approach to symmetric functions which is similar to ours is contained in [WF] and [CH]. A proof of the Fundamental Theorem on Symmetric Functions which uses a doub le induction, rather than the concept of order of a monomial, is given in [J A] and [AC]. [JA] J _Archbold, Algebra, 4th edition, Pitman, London, 1970. [AB] A. Baker, Transcendental Number Theory, Camb ridge University Press, Cambridge, 1975. [FB] F . Deukers, J _P. Bezivia and P. Robba, "An Alternative Proof of the Lindemann Weierstrass Theorem", American Matliematical Monthly, 97 (1990), 193-197. [GD] G. Birkhoff and S. Macl.ane, A Sl/T'vey of Modern Algebra, Macmillan, New York, 1953. [AC] A. Clark, Elements of Abstract Algebra, Wadsworth, Delmont, California, 1971. [WF] W.L. Ferrar, Higher Algebra, Clarendon, Oxford, 1958. [PG] P. Gordan, "Transcendenz von e und -r", Mntbemotische Annalen, 43 (1893),222-224 . [AH] A. Hurwitz, "Beweis der Transcendenz der Zahl e", Mathemotische Annalen, 43 (1893) , 220-222. [ClI] C.R. Hadlock, Field Theory and its Classical Problems, Carus Mathematical Monographs, No. 19, Mathematical Association of America, 1978. [ChlI ] Ch. Hermite, "Sur la fonction exponentielle" , Comptes Rendus des Seances de l'Academie des Sciences Paris, 77 (1873), 18- 24.
Transcendence
161
[DH] D. Hilbert, "Uber die Transcendenz der Zahlen e und 11''', Moihematische Annalen, 43 (1893),216-219; reprinted in Gesammelte Abhandlungen FoU, Chelsea, 1965. [EH] E.W . Hobson, Squaring the Circle, Cambridge University Press, 1913; reprinted in Squaring the Circle and Other Monographs, Chelsea, 1953. [GHl] G.H. Hardy , A Course of Pure Mathematics, 10th edition, Cambridge University Press, Cambridge, 1952. [GH] G.H. Hardy and E.M: . Wright, An Introduction to the Theory of Numbers, 3rd edition, Clarendon, Oxford, 1954. [FKl] F. Klein, Elementary Mathematics from an Advanced Standpoint, (vol 1: Arithmetic, Algebra and Analysis), Dover, New York, 1948. [FK] F. Klein, Famous Problems of Elementary Geometry; reprinted in Famous Problems and Other Monographs, Chelsea, 1962. [FL] F.L. Lindemann, "Uber die Zahl 11''', Mothematische Annalen, 20 (1882), 213-225. [IN2] I. Niven, Irrational Numbers, Carns Mathematical Monographs, No .Ll , Mathematical Association of America, 1963. [IN3] I. Niven, "The transcendence of 11''' , American Mathematical Monthly, 46 (1939), 469471. [DS] D.E . Smith, The /listory and Transcendence of 11'; reprinted in W.A. Young, Monographs on Topics of Modern Mathematics Relevant to the Elementary Field, Dover, 1955. [MS] M. Spivak, Calculus, Benjamin , New York, 1967.
CHAPTER 8 An Algebraic Postscript
We have completed the proof the: the thre e famous constructions are impossible. In this chapter we introduce some purely algebraic results which extend results in tile earlier chapters. These new results will enable you to construct fields that you have not seen before. If ex is a complex number wbicl: is algebraic over a subfield F of C, we have been able to show that F(ex) is a field, but we have, as yet, introduced no effective way of calculating reciprocals of numbers in F(ex) . In this clispter, we describe a way of constructing fields which gives, as a by-product, an elgoritlun for cnlculetuig suci: reciprocals . The material in this chapter is optional as it is, to some extent, a digression from our main theme of "impossibilities". We return to this theine in tile short concluding Chapter 9.
163
164
Famous Impossibilities
The Ring IF[X]I'(x )
8.1
We describe a set IF[X]p(x) of polynomials and define addition and multiplication operations for these polynomials. It will turn out that IF[X]p(X) is a field with these operations. 8.1.1 Definitions. Let IF be any field and let p(X) be a polynomial in IF[X] which is irreducible over IF and which has degree n . Define the set IF[X]/l(X) by putting
IF[X]p(x)
= {ao + alX + ... + an_IX n - 1 : ao, al, . . . ,an-l ElF}.
We also define addition and multiplication in IF[X]p(:<) : for each f(X) and g(X) in IF[X]/l(X), define (i) f(X) EElp(X) g(X) = f(X) + g(X) , (ii) f(X) @p(X) g(X) = remainder on dividing f(X)g(X) by p(X).Notice that, as a set, IF[X]p(x) depends only on the degree n of the polynomial p(X) . The set IF[X]p(x) contains all polynomials in IF [Xl of degree less than n (together with the zero polynomial). Addition EElp(X) in F[X]p(x) is just ordinary addition of polynomials while multiplication @p(X) depends on the polynomial p(X). Indeed, multiplication is done modulo p(X) in the same way that multiplication @n in Zn is done modulo n (as described in Section 1.1). The usual product f(X) g(X) may have too high a degree for it to belong to IF[X]p(x); to bring down the degree we remove multiples of p(X). 8.1.2
and
Example. In the set of polynomials Q[X]XL2 we have (X 2+3) EElXL 2 (2X+1)=X 2+2X+4 (X 2 + 3) @X3- 2 (2X + 1) = X 2 + 6X + 7
since (X 2+ 3)(2X + 1) = 2X 3 + X 2 + 6X + 3 which leaves a remainder of X 2 + 6X + 7 when divided by X 3 - 2. _ 8.1.3
Proposition.
(IF[X]I'(X), EElp(X) , @p(X)) is a ring.
Proof. Addition is just ordinary addition while multiplication is closely related to ordinary multiplication. The proof of the various ring axioms follows closely the arguments which one would use to prove that (Zn, EEl n ,@n) is a ring. The details are left for you to consider. -
165
An Algebraic Postscript
In fact IF[X]p(x) is always a ring, even if p(X) is reducible over IF. The irreducibility of p(X) over IF is only needed (as we shall see in the next section) to prove that IF[X]p(x) is a field. This is also analogous to 7!..n, which is always a ring, but only a field if n is a prime number. Notice that all the constant polynomials are in IF[X]p(x) so that IF ~ F[X]p(x) . Indeed IF[X]p(x) is obviously a vector space over F with a basis {I, X, .. . , xn-l} and dimension n .
Exercises 8.1 1. In Q[X]XL2' calculate
(a) (X 2 - 3X
(b) (X 2 -
+ 2) 3X + 2)
EBXL 2 (2X 2 + X
®X3- 2
+ 4), (2X2 + X + 4).
2. (a) Explain why X 2 + 1 is irreducible over R (b) In R[XL\"2+1 ' calculate
(i) (3 + 4X) EBX2+1 (2 - X),
(ii) (3 + 4X) ®X2+1 (2 - X),
(iii) (a+bX) EBX2+1 (c+dX), (iv) (a+bX) ®X2+1 (c+dX) . (c) Do the formulae in (b) (iii), (iv) look familiar? (If not, write down formulae for addition and multiplication of two complex numbers.) 3. (a) Show that in the ring Q[XJxLll the element X-I is a zero divisor. [Hint. Write X 3 -1 as a product of two polynomials of smaller degree, and then multiply those polynomials modulo X3 - 1.] (b) Is X3 - 1 irreducible over Q? Is Q[XL\"LI a field?
8.2
Division and Reciprocals in F[X]I'(x)
Now that the set of polynomials F[X]p(x) has been given the structure of a ring, it makes sense to ask whether this ring contains the reciprocal of each of its nonzero elements. A clever little algorithm can be used to show the existence of reciprocals, thereby giving an affirmative answer to our question. The algorithm depends on the validity of the following lemma, which may seem paradoxical at first sight . It says that you can reduce the degree of a polynomial by mnltiplying it by another polynomial!
166
Famous Impossibilities
You must remember, however, that it is not the usual multiplication of polynomials which is involved, but rather multiplication "modulo
p(X)".
8.2.1 Lemma. [Degree Reducing Lemma] Assume a(X) is a nonconstant polynomial in F[X]p(x) , If p( X) is irreducible over F then there exists a polynomial q(X) in F[X]p(x) sucli that the polynomial a(X) 0 p (x
)
q(X)
is nonzero and l1
Proof. By the Division Theorem (Theorem 1.3.2), we can divide p(X) by a(X) to get a quotient q(X) and remainder r(X), both in IF[X], such that
p(X)
= a(X) q(X) + r(X)
(i) r(X) = 0 (ii) degr(X) < dega(X).
and either or
But if (i) holds then, by (*),
p(X)
= a(X) q(X).
Since, however, a(X) is a nonconstant polynomial in IF[X]p(X), 0< dega(X) < degp(X) and so (**) contradicts the irreducibility of p(X). Thus (i) is false and hence
r(X) f= 0 and (ii) holds. Thus degr(X) < dega(X). From (*) it now follows that degp(X)
= deg (a(X)q(X)) = dega(X) + degq(X).
167
An Algebraic Postscript
This shows that degq(X)
< degp(X) and hence q(X) E F[X]p(x).
Finally note that, by (*), the remainder on dividing a(X) q(X) by p(X) is -r(X). Hence
a(X) @p(X) q(X)
=
-r(X) .
The conclusions of the lemma now follow from the boxed statements. Note that there is an algorithm implicit in the above proof: To get the "degree reducing factor" q( X) you dillide the irreducible polynomial p(X) by the polynomial a(X) whose degree you wish to reduce. The quotient q(X) and the remainder r(X) then satisfy
a(X) @p(X) q(X)
=
-l'(X)
where r(X) is nonzero and has lower degree than a(X) . We are now able to give an algorit.hm for calculating reciprocals in F[X]p(x), For brevity, we use the symbol @ to stand for the multiplicat.ion @p(X) in F[X]/l(X)'
8.2.2 Algorithm. [Calculation of Reciprocals] Assume that a(X) is a nonzero polynomial in the ring F[X]P(x) where p(X) is irreducible over F. Construct, using the Degree Reducing Lemma (Lemma 8.2.1), polynomialsql(X) , q2(X), q3(X), .. . in F[X]p(x) such that a(X) @ ql.Y), a(X) @ ql(.X) @ q2(X), a(X) @ ql(X) @ q2(X) @ q3(X), and so on, have successively lower degrees. Hence eventually we get
where c
-f: 0
is a constant in the field F. If we put
d(X) =
!c ql(.X) @ q2(X)
@
q3(X)
@ .. . @
then d(X) is the reciprocal of a(X) in F[X]I'(x) ,
qm(X),
168
Fam ous Impossibilities
Proof.
It is eas y to see t hat
a(X) 0 d(X )
•
= 1.
T he p olynomial d(X ) so constructed , bein g t he reciprocal of a(X ) in the ring F[X]I'(X), is deno ted by
l/a(X ) (a lt ho ugh this notation does not indicate the dep enden ce of t he answer on the polynomial p(X )). The following rough and ready descript.ion of the above algorithm for obtaining l/a(X ) m ay b e helpful when you apply it to sp ecific exam ples.
(A) Find successive reducing fa ctors for a(X) until you get a con stan t. (B) Multiply all th e reducing fa cto rs togeth er (modulo p(X )) and th en divid e by th e con st ant. Th is qives you l/a(X ). 8.2.3
Find the reciprocal of X 2 + 1 in Q[X] X3-2.
Example.
x 2 + 1.
(A) W e fi nd successive reducinq factors for 1. To find a reducin g fact or for X 2 into X 3 - 2 to get
+
1, divide t his p olynomial
X 3 - 2 = (X 2 + l )X + (- X - 2). Hen ce
(X 2 + 1) Thus X
+2
0 );3_ 2
X
=
X+2 .
is the result of applying the redu cinq factor X.
+ 2 , divide
2. To find a reducin g factor for X into X 3 - 2 to get
X3 Hen ce
(X
-
2 = (X
+ 2)
+ 2)(X 2 -
0 );3-2
(X 2
-
2X
2X
+ 4) -
+ 4) =
this polynomial
10.
10.
Thus t he constant 10 is the resul t of applying a fur ther redu cing fa ct o-r X 2 - 2X + 4.
An Algebraic P ostscript
169
(B) W e m ultiply th e reducuu; fa ctors and divide by th e cons tan t. This gives 1 X 2
+1
It now follows easily that F[X]p(x) is a field (provided p(X) is irredu cibl e over IF).
8.2.4 Theorem. If p( X ) is irreducible over IF th en IF[X]p(x), with th e op erations in Definition s 8.1 .1, is a field. Proof. We know from Secti on 8.1 that it is a ring. It is clearly commutative and contains 1. By Algori thm 8.2.2, it is closed under di vision by nonz ero elements. Hence it is a field. -
This gives us an impor tan t method for constructing new fields find a pol ynomial p(X ) which is irr educible over a field IF and construct IF[X]p(x ). For example, you may have though t tha t there could be no field with four elements since the obvious contender 1: 4 is not a field. We can, however , use the F[X]),(X) construction to exhibit a field with four element s. 8.2.5 Example. [A Fi eld with Four Element s] We shall apply Theorem 8.2.4, with IF = 1: 2 and p(X) = X 2 + X + 1.
t= °
t=
Because p(O) and p(l) 0, it follows from the Small Degree Irreducibility Theorem (Theorem 4.2.3) that p(X) is irreducible over 1: 2 . Now
has just four elements it is a field.
0, 1, X , 1 + X
and, by the above theorem ,
It is easy to dr aw up tables for the two operations in thi s field.
Famous Im possib ilities
170
The tables ar e shown below. EElp(X )
0
1
X
1+ X
0
0
1
X
1
1
0
i -i
l+X X
X
X
0
1
l+X
l+ X
l+X X
1
0
@ p( X )
0
1
X
l+ X
0
0
0
0
0
1
0
1
l +X
X
0
X
X l+X
l+ X
0
l+X
1
X
x
1
Note also t hat (as you might expect from Theorem 3.2.4) [E : l 2] = deg p(X ), each of these numbers being equa l to 2.
•
- - - - -- -- - - E x er cises 8.2 1. In the rin g Q[X JxL 2 find success ive reducing fact ors for the polyn omi al X 2+X +1 and hence find the reciprocal of this element.
2. (a) In the rin g R[X L\ 2+1 find
(i) 1/ (3 + 4X) , (ii) l / (a + bX) , if a and b ar e not both zero. (b) Does your answer to (a)(ii) agree with th e formula for l /(a +bi) in C? (See par t (c) of Exercises 8.1 # 2.) 3. Which of t he following are fields? (J ustify your ans wers .)
(a) Q[X ]X3-2X+l, (b) Q[X ]X3+2X+l, (c) Q(q'2)[X Jx3_2'
171
An Algebraic Postscript
4. In Example 8.2.3, check the very last step of the calculation and then verify directly, using multiplication modulo X 3 - 2, that the answer given for the reciprocal of 1 + X 2 is indeed correct.
5. Find a polynomial of degree 2 in l3[X] which is irreducible over l3 ' Follow the method of Example 8.2.5 to construct a field with
nine elements.
6. Show how to construct a field with eight elements. [Hint. You will need a polynomial of degree three in l2[X] which is irreducible over l2']
8.3
Reciprocals in IF(a)
We return to the situation in the earlier chapters where 0' is a complex number which is algebraic over a subfield IF of C, with irr(a, IF)
= p(X)
and
deg(O', IF)
= n.
Although we were able to prove in Chapter 3 that IF (0') is a field , we had no efficient way of expressing 1/{3 in the form Co
+ Cl
a
+ . . . + Cn-l a,,-1
(C; E IF)
for an arbitrary nonzero (3 in F(O'). We can now do this very easily, however, by using the algorithm from Section 8.2. If you look at. the definition of F(O') (Definition 3.2.1) and at the definition of IF [X]I'(X) in Section 8.1, you will notice a certain similarity. We bring this out by defining the evaluation mapping
-+
IF(O')
by putting
Note that
Famous Impossibilities
172
8.3.1
P ropos it io n .
Th e map cPa is a field isomorphism.
P roof. That cPa is a one-to-one and onto map follows routinely from the fact that {1, 0', ... , O'n-l } is a basis for F (0') over IF (see Exercises 8.3 #8) . It is also easy to show that cPa pr eserves addition in th e sens e that
for all po lynomials f(X) and g(X) in F[Xl. What about multiplication? This is a bit more com plicat ed. By the Division Theorem (Theorem 1.3.2), t here exist polynomials q(X) and 1"(X) in IF[Xl with
f(X) g(X) = p(X) I](X) + l'(X)
(1)
< degp(X) . If we replace X by 0' in
where 1"(X) = 0 or deg l'(X) (1) we get
f(O')g(O') = p(O')q(O') + 1"(0') = 0 .1](0') + 1" (0')
= 1"(0').
(2)
Bu t by the definition of multiplication in the field F[Xlp(x ),
l'(X)
=
f(X) 0 p (x ) g(X).
Henc e in the equation (2),
while right hand side
=
cPa(J (X ) 0 p(x
)
g(X)) .
These two sides ar e equal, which implies that the map cPa prese rves multiplication. We can now put the algorithm for reciprocals in IF[X ]p(X ) to work in IF(O') since these two fields are isomorphic. 8.3.2 Example. the field Q( ~) .
Find th e reciprocal of the number 1 + (~f m
173
An Algebraic Postscript
Note firstly that , because irr( {Y2, Q) = X 3 - 2, it follows that
or, more formally,
Hence
1 1 + (ij2)2
1
=
(1 +1X2) (-
as
~X2 + ~X +~)
= -~5 ({Y2)2 + ~ 5
({Y2) ·
by Example 8.2.3
+~. 5
•
This concludes our algebraic postscript in which we have shown how to construct new fields (including finite ones) and have given an efficient way of calculating reciprocals in fields of the form IF[X]p(X) and F(o:) .
- - - - - - - - - - Exercises 8.3 1. For which choice of the polynomial p(X) is the field Q( ij2) isomorphic to the field Q[X]p(X) ?
2. Check (using multiplication in Q( ij2)) the answer obtained in Example 8.3.2. 3. (a) Write out explicitly each of the sets Q( ij2) and Q[X]XL2' (b) Write down
(i) a typical element of the latter set, (ii) its image under the map
and show both in the appropriate sets in the following diagram.
Famous Impossibilities
174
Q(~)
(c) Which theorem from Chapter 3 guarantees that each element of Q( ~) is uniquely expressible in the form
Co
+
CI ij2
+
C2( ij2)2
with each c, in Q? (d) Apply the map cP ~ -I (the inverse of the map cP ~) to the above element. 4. Let f(X)
Q[X].
= X 2 + 1 and
g(X)
= 2X + 3,
which are elements of
(a) State why f(X) and g(X) belong to Q[X]XL2' (b) Calculate the following sum and product in the ring Q[XL\03_2'
(i) f(X) EBXL 2 g(X) , (ii) f(X) ®X3- 2 g(X) . (c) Use the fact that
An Algebraic P ostscript
175
8. Prove t hat t he map cPa defined in the text is one-to-one and onto . [Hint. Use t he fact that {I , 0', ... , O'n -1 } is a basis for IF (0') over IF .J
Additional Reading for Chapter 8 T he field F[X ]p(x ) is usually constru cted as a facto r ring of the ring F [X ] of polynomials. See, for exam ple, Section 35 of [JF] an d Chap te r 10 of [WG]. Recipro cals of elements in F( Q) are often calcula ted using t he algorit hm for finding th e greatest commo n div isor of two polynomi als, as indicated , for exa mple, in Sect ion 110 of [AC] and Section 2.2 of [CIl lo Finit e fields have been completely cha ra cterized: see, for exam ple, Section 45 of [JF] an d Chapter 11 of [WGJ. [AC] A. Clark, Eleme nts of Abstract Alqebra, Wadsworth , Belmon t , California, 1971. [JF] J .B. Fra leigh, A First Cou rse in A bstract A lqebro, 3rd edition, Addison-Wesley, Reading, Massac husetts, 1982. [WGJ W.J . Gilbert , Mode rn A lgebra with A pplicatio ns, Wiley, New York, 1976. [CR ] C.R. Hadlock, Field Theory and its Classical Problems, Carus Mat hematical Monogra phs, No. 19, Mat hemati cal Association of America , 1978.
CHAPTER 9 Other Impossibilities
and Abstract Algebra
This book has introduced some basic ideas from the part of abstract algebra wbicu deals with fields. Use of these ideas made it relatively easy for us to prove th e impossibility of the constructions mentioned in the Introduction. It may now come to you as a pleasant surprise that your newly acquired ideas about fields have Iurtbet applications in proving the impossibility of solving otliet famous problems. Three sucli problems are discussed below. H'hile the first problem comes from geometry, the other two come from eletneutery algebra and elementary calculus respectively.
177
178
9.1
Famous Imposs ibilities
Construction of Regular Polygons
A polygon is said to be regular if its sides all have the same length and its angles are all equal. Constructing a regular polygon with n sides amounts to dividing an angle of 360 0 into n equal parts. The ancient Greeks were able to construct, with straightedge and compass, regular polygons with 3 sides and 5 sides, but were not able to construct one with 7 sides. It was, of course, trivial for them to construct a polygon with 2m sides, for any integer m ~ 2, as this involves merely repeated bisection. They also proved that if it is possible to construct regular polygons with r sides and 8 sid es, then it is also possible to construct one with 1'8 sides, provided that the integers rand 8 have greatest common divisor 1. Thus, taking r = 3 and 8 = 5, they knew it is possible to construct one with 15 sides. No further progress was made on the problem of constructing regular polygons for over 2000 years until in 1796 the nineteen year old Carl Friederich Gauss amazed the mathematical world by constructing a regular polygon with 17 sides. He did this by showing that the equation x l 7 - 1 = 0 can be reduced to a finite set of quadratic equations. So the regular polygon with 17 sides can be constructed with straightedge and compass. Gauss showed, furthermore , that a regular polygon with n sides can be constructed whenever n is a prime number of the form for some integer l: ~ O. Prime numbers of this form are called Fermat primes. (Notice that 17 = 222 + 1 is a Fermat prime.) From the results mentioned above, it follows that a regular polygon with n sides can be constructed if
where i ~ 0 is an integer and PI, P2 , ... , Pi are distinct Fermat primes. What if n does not have this form ? The answer was provided by Pierre Wantzel in 1837, who proved that, for n not of this form, the construction of a regular polygon with n sides in impossible. The proof of this impossibility requires little more than the ideas about fields already given in this book. Books which cover this topic in detail are listed in the "Additional Reading" at the end of this chapter.
179
Other Impossibilities
9.2
Solution of Quintic Equations
Our next problem is concerned with solving polynomial equations, that is, equations of the form
(ai E R) such as, for example, 2x 3
-
5x 2 + 2x - 11 = O.
Such equations arose early in the history of mathematics: the ancient Babylonians solved many particular quadratic equations, while the Arabs discovered a general formula for the solutions during the Middle Ages. Their formula is equivalent to the one used today by school children to solve a.T 2 + In: + c = 0 :
x-
-b± Jb 2 - 4ac 2a
(a =? 0).
During the Renaissance, Italian mathematicians discovered formulae for solving cubic and quartic equations. Thus, for example, to solve a cubic of the special form x 3 + p.T = q, they used a formula which, in modern notation, reads
Cases in which the quantity under the square root sign is negative puzzled them, however, as complex numbers were not properly understood at that time. The above formulae are said to provide solutions by radicals of the original equations. The idea is that the solutions are obtained from the coefficients in the equations by applying successive field operations and extractions of roots. Attempts were made to find similar formulae to solve an arbitrary quintic equation, but none of these attempts succeeded. In 1824, Niels Henrick Abel, after mistakenly believing that he had found the solution, proved that it is impossible to solve an arbitrary quintic by radicals. Soon afterwards, the youthful genius Evariste Galois established a necessary and sufficient condition for a polynomial of arbitrary degree to be solvable by radicals. He was then able to extend
Famous Impossibilities
180
Abel's result and to show that, for each integer n ~ 5, it is impossible to solve an arbitrary equation of degree n by radicals. Although a complete discussion of these results would require a lot more abstract algebra than has been developed in this book, it is possible to indicate how the problem can be expressed in terms of field extensions. To this end, let f(X) be a polynomial in F[X] where F is a subfield of C. Instead of referring to the equation f(x) = 0 as being "solvable by radicals", we apply this phrase to the polynomial f(X) itself, the precise definition being as follows:
9.2.1 Definition. A polynomial f(X) E F[X] is said to be solvable by radicals O1/er F if there is a sequence of complex numbers CI, C2, C3," " Cn and a sequence of positive integers i\, i 2 , i 3 , • •• , in such that il Cl ElF i2 C2 E F(cd C3 i3 E IF(cd(C2) in
E IF(cd(C2) ''' (cn-d and the zeros in C of f(X) are all contained in the field Cn
•
IF(CI)(C2)'" (cn-d(c n) .
This definition has been expressed in terms of integer powers of complex numbers in order to avoid the ambiguity which would arise from the use of roots of complex numbers. The following theorem expresses, in a precise way, the fact that it is impossible to solve an arbitrary quintic or higher degree equation by radicals.
9.2.2
Theorem.
There exist polynomials of every degree n
~
5
wbicli are not solvable by radicals over Q.
The proof consists of choosing suitable polynomials and verifying that they are not solvable by radicals. The choice given in [eR] is 2X5 - 5X 4 + 5 for n = 5 and, for n ~ 5, the result of multiplying this polynomial by xn-5. The proof that these polynomials are not solvable by radicals, however, involves use of the field and group theory developed by Galois. Books which cover this topic in detail are listed in the "Additional Reading" at the end of this chapter.
181
Other Impossibilities
9.3
Integration in Closed Form
A topic which in the past has tended to dominate the undergraduate curriculum in elementary calculus is that of finding indefinite integrals in "closed form"; that is, in terms of functions which are usually defined in introductory calculus courses. This topic is usually presented as an art rather than as a science: each integral comes out if the student is clever enough (or lucky enough!) to guess a suitable trick. There are , however, some integrals which look quite innocent and yet refuse to "come out". One such example is e x2dx.
J
The impossibility of finding this integral in closed form was proved by Joseph Liouville, who, in the years 1833-1841, laid the foundations for the study of integration as a science, rather than an art. At first sight one might guess that, of all the topics in the undergraduate curriculum, none could be further removed from abstract algebra than that of finding integrals in closed form - not so, however! In the "suggested reading" section of [MS], written in the mid 1960's, Spivak mentions that polished algebraic treatments of Liouville's theorems had recently been obtained. One such treatment is given in [MR], which includes, in particular, an algebraic proof of the impossibility of finding the above integral in closed form. Even the most cursory reading of [MR] will reveal its dependence on ideas about fields similar to those developed earlier in this book: alqebraic and transcendental elements, extensions of fields, degree of one field Ol1er another, irreducible polynomials of elements, touters of fields.
A full understanding of [MR] requires a knowledge of functions of a complex variable, because the elements of the fields involved are no longer numbers, but functions defined on subsets of C. As one might guess, fields are involved at the very outset, in setting up the problem in a precise way.
Famous Impossibilities
182
The most concrete evidence of the recent work in this area can be seen in a number of computer algebra packages including Reduce?', Mathernatica" and MACSYMATM which are available on various computers ranging from microcomputers to mainframes. These packages contain implementations of the algorithm due to R.H. Risch [RR] for integrating an important class of functions in closed form. When you type in one of these functions the package tells you whether or not the function can be integrated in closed form and, if so, displays the answer.
Additional Reading for Chapter 9 The constructibility of regular polygons is studied in Sections 135-138 of [AC], Section 48 of [JF] and Section 10.2 of [LS]. Each of these books gives an account of Galois theory, while Sections 139-149 of [AC] give an extensive account of the application of Galois theory to the solution by radicals of polynomial equations of low degree. A recent re-examination of the short, but action-packed, life of Galois can be found in [TR]. The article by A.C. Norman in [CAl gives an account of recent algorithms for integration in closed form , suitable for implementation on a computer, while [JD] presents the underlying mathematical theory. [CAl Computer Algebra: Symbolic Algebraic Computation, ed. B. Buchberger, G.E . Co1lins, R. Loos and R. Albrecht, Springer-Verlag, Vienna, 2nd edition, 1983. [AC] A. Clark, Elements of Abstract Algebra, Wadsworth, Belmont, California, 1971. [JD] J.n . Davenport, On the Integmtion of Algebraic Functions, Lecture Notes in Computer Science, 102, Springer-Verlag, Berlin, 1981. [JF] J.B . Fraleigh, A First Course in Abstract Algebra, 3rd edition, Addison-Wesley, Reading, Massachusetts, 1982. [CH] C.R . Hadlock, Field Theory and its Classical Problems , Carus Mathematical Monographs, No. 19, Mathematical Association of America, 1978. [MR] M. Rosenlicht, "Integration in Finite Terms" , American Mathematical Monthly, 79 (1972), 963-972. [TR) T. Rothman, "Genius and Biographers: the Fictionalization of Evariste Galois", American Mathematical Monthly, 89 (1982) , 84-106. [RR] R.n. Risch, "The Solution of the Problem oflntegration in Finite Terms", Transactions of the American Mathematical Society, 76 (1970), 605-608 . [LS] L.W. Shapiro, Introduction to Abstract Algebra, McGraw-Hill, New York, 1975. [MS] M. Spivak, Calculus, W.A. Benjamin, New York, 1967.
Index
abelian group, 24-25
calculus , 115, 117, 149, 177, 18 1
abstract algebra, 3
coefficient, 13-15, 137, 144. See also lead ing coefficient, equa ting coefficients.
Ahmes ,2 algeb ra, i, 3, 7, 75, 115
- integer, 120, 128, 130, 147
algeb raic .v-vi, 7, 14, 75, 99, 103, 105, 181
- rat ional, 148
algebraic number, 27-28, 30-31, 33,72, 99,147,153, 157
commutative, 10
algebraic over a field, 27-33, 39, 42, 44, 48,50-51,68,72,10 1-102 , 124,147, 171
commutative ring , 13,24-25
comm utative group , 24
Algorithm
compass, 1,3-4 ,75-76,78,81. See also st raightedge and compass .
- Calculat ion of Reciprocals, 167
- collapsing, 82
- Division, 17-19
- noncollapsing, 82
- Euclidean, v
- rusty, 85
- Fundamental Algorithm for Symmetric Functions, 139
comp lex analysis, 150
All Constructibles Come from Square Roots Theorem, 96, 100-101 , 108, 110
complex-valued function , 115, 149-150
complex number, 8, 16- 17, 146, 156 complex-valued integral, 149
approximation, 121, 123, 125
computer algebra, 182
Archimedes, 2
CON,93-94
basis, v, 11, 26, 40, 52-53, 55-56
constan t, 69
Basis for F(a ) Theorem, 46
constant polynomial , 63-65 , 70
Basis for a Tower T heorem, 53
constructible number, 4, 75, 93-96, 99, 101, 103- 104, 110
Binomial Theorem , 107, 119
constructi ng a line par allel to a given line . See Standard Constructions.
bisect, 3,77-78,80,91 bisecting a line segment. See Standard Const ructions.
constructing a product. See Standard Constructions .
bisecting an angle. See Standard Const ructions .
constructing a quot ient. See Standard Constructions.
C,8
constructing a square root . See Standard Constructions.
Calc ulation of Recipro cals, 167
183
Famous Impossibilities
184 constructing an angle of 60° or 90°. See Standard Constructions. Construction Rules, 89 -rule (i), 90 -rule (ii), 90
duplicating the cube, 1. See also doubling the cube . e, vi, 28,115,122-125, 131-132, 160
e", 146 Eisenstein's Irreducibility Criterion, 107
copying an angle . See Standard Constructions.
elementary symmetric function , 136-138
cubic, 4, 68, 104
equating coefficients, 121, 128 Eratosthenes, 1
De Moivre 's Theorem, 107
Euclid , 117
deg !(X), 14
Euclidean Algorithm, v
deg( Q, F), 33
expansion in powers, 119-120, 127-129
degree, 138
exploration, vi
-least, 32-34, 62, 70
exponential function, 118, 122, 126
-of a number over a field, 33, 42, 46, 99, 101
extension field, 8, 71, 74, 143, 181 extension of F by
Q,
44
-of divisor, 18 -of extension field, 12, 181
F[Xlp (x ), 164
-of polynomial, 14, 17,20,46,61-62, 118, 120, 138, 143, 145
Fvcircle, 108-109,111
-of remainder, 18
F-point, 108-109, 111
Degree of a Constructible Number Theorem, 101
factor, 20-21, 23, 56, 62, 116-117, 129, 131, 159
Degree Reducing Lemma , 166-167
- common, 20-21, 23
Delian problem, 1
- of degree 1, 64-65
dense, 121
Factor Theorem, 65, 67
dependent. See linearly dependent .
factorize, 4, 61-63
F·line, 108-109, 111
derivative, 149-150
Fermat prime, 178
differentiation , 118, 121
field, v , 8-11, 24-25, 39-44, 46-47, 50, 56, 93-94, 163-165, 169, 177
dimension , v, 4, 11-12,26,40,46,52,56, 165
-finite, 9, 13
Dimension for a Tower Theorem , 5.5, 102
- smallest , 41, 48
dividend, 18-19
field of scalars, 10
divis ion, 10, 165, 169, 171
Field with Square Roots Theorem, 93, 96
-of polynomials, 17
finite field, 9, 13
Division Algorithm , 17-19
finite-dimensional, 4, 11-12,55,71-72
Division Theorem, 19,65, 166, 172
finite-dimensional extension, 71-72
divisor, 18-19
Fundamental Algorithm for Symmetric Functions, 139-142, 145
doubling the cube , 1,61,91,93,99,104
Ind ex
185
Fundamental Th eorem of Algebr a, 146-147
-closed form , 181
Fundament al Theorem of Arithmetic , 21, 23,116-117
- linearity of, 117, 126-127, 151
Fundamental Th eorem of Calc ulus, 151
inverse, 24
Fundamental Theorem on Symmetric Functions, 138-139 , 143, 158, 160
irr (a, F ), 33
geometrical ,99 geometrical construct ions , 16, 75
integration by parts, 118, 126, 151
irrational numb er, 7,20,23 ,122-123 ,125, 133 irreducible, 34, 61, 63, 68-69,164-166, 169
- collapsing compass, 84
irreducible polynomial, 62-63, 68, 125
- doubling the cube, 2, 91, 93, 99,104
- over subfield, 63
- quintisect ing an angle, 105
irredu cible polynomial of a numb er, 33-34 , 42-43 ,61-62 ,68,104 , 125, 147, 181
- regular polyg on, 2 -rules for, 75, 89, 91 - squa ring the circle, 2, 92, 105 -straightedge and compas s, 3,76,81 -83, 104-105 -three famous, 66, 75, 103 - t risect ing an angle , 3, 83, 92, 104 geometry, v, 3, 75 -Euclidean, 77 Greek , 1-2 ,4, 75,81-82,89 group, 24
isomorphism, 171, 173 leading coefficient , 14, 31, 137. See also coefficient. limit, 119, 130, 156 Lindemann, 2, 160-161 linear algebra, v linear combin at ion, 11, 119 ·- t rivial, 11, 26 linear span . See span . linearity of int egration . See integration .
higher order, 139
linearly dependent, 11, 26, 30, 36, 47, 72
identity, 24-25
linearly independent , v , 11, 26, 36, 40, 46, 54-55
-additive, 24-25
lower order, 139, 141
-multiplicative, 25
lowest terms , 20
impossibility, v-vi, 3-4, 61, 66, 74-75 95-96, 99, 101, 105, 112, 115, 159, 163, 177-179, 181
MACSYMA, 182
independent. See linearly independent.
Mathematica, 182
indeterminate, 14, 134, 138
mathematical induction, 111, 126, 133, 141, 143, 154, 160
induction. See mathematical induction.
midpo int , 77
initial set of poi nts, 89-90, 93
modulo, 164, 166
integral , 150, 153
modulus, 152
integral domain , 13, 25
Monic Irreducible Zero Theorem (MIZT) , 68-69
integration, 115, 126, 181
186
monic polynomial, 31-3 ·1, 68-70 , 143, 147-148
Famou s Impossibilities
R[X], 14
monomial, 18, 138, 141
rational number, 4, 8, 20-21, 93, 121, 123, 125
N ,10
Rational Roots Test (RRT) , 7, 20-21 , 67-6 8
non-constructible numb ers, 100
real number, 4, 8, 21, 94, 121, 123
one-to-on e mapping, 134 onto mapping, 134
real-valued function, 149, 151 reciprocal , 40, 44, 163, 165, 167, 170-172 Reduce, 182
n , v-vi, 2-5, 28 , 92 , 105 , 115-116, 134 , 143 146-147 , 149, 153, 156-15 7, 160-161
reduced , 34, 61
pairs of fields, 11
reducing factor , 167-170
permutation, 134, 136
regular polygon, 1-2, 178, 182
reducible , 62-63 , 165
- identity, 135
remainder , 17-19 , 164, 166-167
polynomial, 7, 13-1 5, 118. See also monic polynomi al.
Rhind Papyrus , 2
- of degree 2 or 3, 65
ring , 8, 10, 13-16, 24, 40,44,46-47 ,164 , 169
- over a ring, 13-14
- finite, 17
polynomial form, 13-1 5, 118
ring of int egers, 25
polynomial function, 13, 15, 118, 126
ring with unity, 24-2.5
polynomial ring , 14-1 5
ruler, 1,3
prime , 116-117 , 127-12 9, 157,165, 178 Problem I, 1-4 , 104. See also doubling th e cube. Problem II, 1-4 ,104. See also tri secting an angle . P roblem III, 1-4,99 , 11.5. See also squaring th e circle. proofs, v-vi , 4 Ptolemy, I
scalar, 10-11 ,29,40,46-47,52- 53 scalar multipli cati on, 10-12 ,25 schizophrenic, 53 Small Degree Irreducibility Theorem (SDIT) ,66-69 Smallest Field Th eorem, 48 solution by rad icals, 179, 182 solvable by rad icals, 179-1 80 span, 25, 40, 44, 46, 50-51, 53-55
Q,8
square root s, 100, III
quadratic, 61, 110
squaring th e circle, v , 1,4 ,92 ,99, 105, 11.5 , 159, 161
quadrature, 1-2 quintic equation , li9
Standard Const ruct ions, 76
quintisec ting an angl e, 105
- a ngle of 600 , 79, 104
quotient , 17-19, 166-1 67
- angle of 900 , 79 - bisect ing a line segment, 77
R,8
- bisecting an angle, 78
187
Ind ex -copying an ang le, 80
tower of fields, 5 1-5 2, 55, 95, 102, 181
- line parallel to a given line, 81
transcendental , v- vi, 4, 28 , 115 , 124, 132 , 153, 181
- opera t ions of typ e (i ), 90-93, 109 - operat ions of type (ii), 90-93, 109 - prod uc t , 85 - quotient , 86 - square root, 87 - t ransferring a length , 78 st raighte dge, 3-4, 75-77, 80-8 1 st ra ightedge and compass, 104-105, 159
transcendental number, v, 30, 124, 131, 157 transferring a length . See St and ard Const ruct ions . t risect ing an a ngle, 1-4 , 92,104 trivial linear combina tio n, 11, 26 uniqueness, 127,1 29, 154
subfield , 8-9 , 11, 16,25,39- 40,44, 52,93 - smallest, 8, 44
vector, 10,25, 29,40,46 , 51- 53
subring, 25,40,44 ,47
vector addition, 10-11 ,2.5
subs pace , 25,40,44-4 5
vecto r spa ce, v, 4, 8, 10- 11, 24- 26, 29, 39-40, 46-4 7, 51-56, 165
Successive Squ are Root s Give Const ruct ibles Th eorem , 95-96
vector sub sp ace. See subspace.
symmet ric polynomi al , 115, 134-1 36, 138, 141, 143, 146- 148, 156, 1.58. See also elementa ry sy mmet ric function .
Want zel,2
symmet ry, 134
Z, lO
Tay lor series, 122 term, 139
zero of a polynomial , 20-21, 28-29, 31, 34, 64-66,68-69, 137, 143,145, 147- 148
Zn. 9
term of highest order, 139, 141
zero polynomial , 14, 118
tetragonid zein , 2
zero vector, 11, 25-26
thr ee-dim ensional , 11
zero-divisor, 10, 13,24- 25