INTRODUCTION TO SET THEORY
PURE AND APPLIED MATHEMATICS A Program of Monographs, Textbooks, and Lecture Notes
EXECUTIVE EDITORS Earl 1. Tafi Rutgers University New Brunswick, New Jersey
Zuhair Nashed University of &laware Newark, Delaware
EDITORIAL BOARD M. S. Baouendi University of Calijbrnia, San Diego Jane Cronin Rutgers University Jack K. Hale Georgia Institute of Technology
Anil Nerode Cornell University Dunald Passman University of Wisconsin, Madison Fred S. Roberts Rutgers University
S. Kobayashi Uniwrsity of Cal~$ornia, Berkeley
Gian-Carlo Rota Massachusetts Institute of Technology
Marvin Marcus University of Calflornia, Santa Barbara
David L. Russell Virginia Polytechnic Institute and State University
W. S. Mmsey Yale University
Walter Schempp Universitat Siegen
Mark Teply University of Wisconsin, Milwa ukee
MONOGRAPHS AND TEXTBOOKS IN PURE AND APPLIED MATHEMATICS 1. K. Yano, Integral Formulas in Riiannian Geometry (1970) 2. S. Kobayashi, Hyperbolic Manifdds and Momorphic Mappings (1970) 3. V. S. Vladimimv, Equations of Mathematical Physics (A. Jefkey, ed.; A. Littlewood, trans.) (1970) 4. 8. N. Pshenichnyi, Necessary Conditions for an Exfremum (L. Neustadt, translation ed.; K. Makowski, trans.) (1971) 5. L. N a k i et al., Functional Analysis and Valuation Theory (1971) 6. S. S. Passman, Infinite Group Rings (1971) 7. L. DMhft:Group Representation Theuy,Part A: Ordinary Representation Theory. Part B; M u l a r RepresentationTheuy (1971,1972) 8. W. Boothby and G. L. Weiss, eds.,Symmetric Spaces ( l 972) 9. Y. Matsushima, Differentiable Manifolds (E. T. Kobayashi, trans.) (1972) 10. L. E. Ward, Jr., Topdogy (1972) 1l. A. Babakhanian, Gohomological Methods in Group Theory (1972) 12. R. Gilmer,Multiplicative Ideal Theory (1972) 13. J. Yeh, Stochastic Pmesses and the Wiener Integral (1973) 14. J. &m-Neto, Introduction to the Theory of Distributions (1973) 15. R. Larsan, Functional Analysis (1973) 16. K. Yano and S. Ishihara, Tangent and Cotangent Bundles (1973) 17. C. Prucesi, Rings with PdynomialIdentities (1973) 18. R. Hennann, Geometry, Physics, and Systems (1973) 19. N. R. Wallach,Harmonic Analysis cm Homogeneous Spaces ( l 973) 20. J. Dieudonnd, Introduction to the Theuy of Formal Groups (1973) 21. l, Vaismen, Cohomdogy and Differential Forms (1973) 22. 8.-Y. Chen, Geometry of SubmanifoMs (1973) 23. M. Marcus, Finite Dimensional Multilinear Algebra (in two parts) (1973, 1975) 24. R. Larsen, Banach Algebras (1973) 25. R. 0.Kyala and A. L. Vitter; eds., Value Distribution Theory: Part A; Part B: Defiat and Bezout Estimates by Wilhelm SWl(1973) 26. K. 8. Stdarsky, Algebraic Numbers and m a n t i n e Approximation (1974) 27. A. R. Magid, The Separable GaloisTheory of Commutative Rings (1974) 28. 8. R. McDonald, Finite Rings with Identity(1974) 29. J. Satake, Linear Algebra (S. Koh et al., trans.) (1975) 30. J. S. W a n . Localization of NoncommutativeRings (1975) 31. G. Klambauer. Mathematical Analysis (1975) 32. M. K Agoston, Algebraic Topology (1976) 33. K. R. Goodearl, Ring Theory (1976) 34. L. E. Mansfield, Linear Algebra with Geomebic Applications (1976) 35. N. J. Pullman, Matrix Theory and Its Applications (1976) 36. 8. R. McDaneld. Geometric Algebra Over Local Rings (1976) 37. C. W. Groetsch, Generalized Inverses of Linear Operators (1977) 38. J. E. Kucrkowski and J. L. Gersting, Abstract Algebra (1977) 39. C. 0.Christenson and W. L. Voxman, Aspeds of Topdogy (1977) 40. M. Nagata, Field Theuy (1977) 41. R. L. Long, Algebraic Number Theuy (1977) 42. W. F. Plb#br,Integrals and Measures (1977) 43. R. L. Wwe&n and A. Z p n d , Measure and Integral (1977) 44. J. H- Curtiss, Introduction to Funcbionsof a Complex Variable (1978) 45. K. Hdmcek and T. Jech, lnhdudicm to Set Theory (1978) 46. W. S. Massey, Homdogy and c o h m b g y Theory (1978) 47. M. Marcus, Introductionto Modem Algebra (1978) 48. E. C. Young, Vector and Tensor Anatysis (1978) 49. S. 8. Nadkr, Jr., Hyperspacasof Sets (1978) 50. S. K. &gal, Topics in Group Klngs (1978) 51. A. C. M. van Row, Non-Archimedean Functional Analysis (1978) 52. L. Corwin and R. Szczadm, Calculus in Vector Spaces (1979) 53. C. Sadosky, Interpolation of Operators and Singular Integrals (1 979) 54. J. Cmin, Differential Equations (1980) 55. C. W. Gmtsch, Elements of Applicable Functional Analysis (1980)
56. 57. 58. 59.
1. Vaisman, Foundations of Three-Dimensional Eudidean Geometry (1980) H.I.Fmedan, Deterministic Mathematical Models in Population Emlogy (1980) S. B.Chae, Lebesgue Integration (1980) C. S. Rees et al., Theory and Applications of Fourier Analysis (1981) 60. L. Nachbin, Introduction to Functional Analysis (R. M. Arm, trans.) (1981) 61. G.Onech and M. Omch, Plane Algebraic Curves (1981) 62. R. Johnsonbaugh and W. E. Pfaffenberger, Foundations of Mathematical Analysis (1981) 63. W.L. Voxman and R. H. Goetschel, Advanced Calculus (1981) 64. L. J- Corwin and R. H. Szczadm, Multivariable Calculus (1982) 65. V. I. Istrdtescu, t ntroduction to Linear Operator Theory (1981) 66. R. D. Jdwinen, Finite and Infinite Dimensional Linear Spaces (1981) 67. J. K. Beem and P. E. Ehrlich, Global Lorentrian Geometry (1981) 68. D.L. Amacost, The Structure of Lcxally Compad Abelian Groups (1981) 69. J. W. h w e r and M. K. Smith, eds., Emmy Noether: A Tribute (1981) 70. K. H. Kim, Boolean Matrix Theory and Applications (1982) 71. T. W. Wmting, The Mathematical Theory of Chromatic Plane Ornaments (1982) 72. D. B.GauM, Differential Topdogy (1982) 73. R. L. Faber, Foundations of Eudidean and Non-Eudidean Geometry (1983) 74. M. Carmeli, Statistical T h q and Random Matrices (1983) 75. J. H.Camrfh et al., The Theory of Topdogical Semigroups (1983) 76. R. L. Faber, Differential Geometry and Relativity Theory (1983) 77. S.Barnett, Pdynomials and Linear Control Systems (1983) 78. G. Ka@lovsky, Commutative Group Algebras (1983) 79. F. Van Oystaeyenand A. Verschmn, Relative lnvariants of Rings (1983) 80. 1. Vaisman, A First Course In Differential Geometry (1984) 81. G. W.Swan, Applications of Optimal Contrd Theory in Biomedicine (1984) 82. T. Pet* and J. D. Randall, Transformation Groups on Manifdds (1984) 83. K. Goebel and S. Reich, Uniform Convexity, Hyperbdic Geometry. and Nonexpansive Mappings (1984) 84. T. Albu and C. Ndstdsescu, Relative Finiteness in Module Theory (1984) 85. K. Hhacek and T. Jech, Introduction to Set Theory: Second Edition (1984) 66. F. Van Oysbeyen and A. Verschoren, Relative Invariants of Rings (1984) 87. 8.R. McDmald. Linear Algebra Over Commutative Rings (1984) 88. M. Namh, Geometry of Projective Algebraic Curves (1984) 89. G. F. Webb. Theory of Nonlinear Age-Dependent Population Dynamics (1985) 90. M. R. Bremner et al., Tables of Dominant Weight Multiplicities for Representations of Simple Lie Algebras (1985) 91. A. E.Fekete, Real Linear Algebra (1985) 92. S. B.Chae, Holamorphy and Calculus in Normed Spaces (1985) 93. A. J.,e n iJ Introduction to Integral Equations with Applications (1985) 94. G.Kap'lovsky, Projective Representations of Finite Groups (1985) 95. L. Nariciand E. Beckenstein, Topdogical Vector Spaces (1985) 96. J. Weeks, The Shape of Space (1985) 97. P. R. Gribik and K. 0.Kortanek, Extremal Methods of Operations Research ( 1985) 98. J.-A. Chao and W. A. Woycrynski, eds.. Probability Theay and Harmonic Analysis (1986) 99. G. D. Cmwn et al., Abstract Algebra (1986) 00. J. H. Carmth et al., The Theory of Topological Semigroups, Vdume 2 (1986) 01. R. S.Doran and V. A. Beffi, Characterizations of C*-Algebras (1986) 02. M. W. Jeter, Mathematical Programming (1986) 03. M. Anman, A Unified Theory of Nonlinear Operator and Evolution Equations with Applications ( l 986) 04. A. V e r s c h n , Relative lnvariants of Sheaves (1987) 05. R. A. Usmani, Applied Linear Algebra (1987) M. P.Blass and J. fang, Zariski Surfaces and Differential Equations in Characteristic p > 0 (1987) 07. J. A. Reneke et al., Structured Hereditary Systems (1987) 08. H. Busemann and 8.B. Phadke, Spaces with Distinguished Geodesics (1987) 09. R.Harfe. Invertibility and Singularity for Bounded Linear Operators (1988) 10. G.S. Ladde et al., Oscillation Theory of Differential Equations with Deviating Arguments (1987) 11. L. Dudkin etal., IterativeAggregationTheory(1987) 12. T. Okuh, Differential Geometry (1987) 13. D.L. Stancl and M.L. Stancl, Real Analysis with Point-Set Topotogy (1987)
114. 115. 116. 1 17. 118. 1 19. 120. 121.
T. C. Gad, lntroduction to S t d a s t i c Differential Equations (1988) S. S. Abhyankar, Enumerative Combinatorics of Young Tableaux (1988) H. Strade and R. Famsteiner, Modular Lie Algebras and Their Representations (1988) J. A. Huckaba, Commutative Rings with Zero Divisors (1988) W. D. Wallis, Combinaton'alDesigns (1988) W. Wipshw, Topdogical Fields (1988) G. Karpilovsky, Field Theory (1988) S. Caenepeel and F. Van Oystmyen, Brauer Groups and the Cohomdogy of Graded Rings (1989) 122. W. Kozlowski, Modular Function Spaces (1988) 123. E. Lowen-Cdebunders, Function Classes of Cauchy Continuous Maps (1989) 124. M. Pave/, Fundamentals of Pattem Recognition(1989) 125. V. Lakshmikantham et al., Stability Analysis of Nonlinear Systems (1989) 126. R. Sivaramakrishnan, The Classical Theory of Arithmetic Functions (1989) 127. N. A. Watson, Parabolic Equations on an Infinite Strip (1989) 128. K. J. Hastings, Introduction to the Mathematics of Operations Research (1989) 129. 8. Fine, Algebraic Theory of the Bianchi Groups (1989) 130. D. N. Dikranjan et al., Topdogical Groups (1989) 13l.J. C. Mowan /I, Point Set Theory ( l990) 132. P. Biler and A. Wlkowski, Problems in Mathematical Analysis (1990) 133. H. J. Sussmann. Nonlinear Controllability and Optimal Contrd (1990) 134. J.-P. Flomns et al., Elements of Bayesian Statistics (1990) 135. N. Shell, Topological Fields and Near Valuations (1990) 136. B. F. Dodin and C. F, Martin, lntroduction to Differential Geometry for Engineers (1990) 137. S. S. Hdland, Jr.. Applied Analysis by the Hilbert Space Method (1990) 138. J. Okninski, Semigroup Algebras (1990) 139. K. Zhu, Operator Theory in Function Spaces (1990) 140. G. B. Price, An Introduction to Multimplex Spaces and Functions (1991) 141. R. 8. Darst, Introduction to Linear Programming (1991) 142. P. L. Sachdev, Nonlinear Ordinary DifferentialEquations and Their Applications (1991) 143. T. Husain, Orthogonal Schauder Bases (1991) 144. J. F o m , Fundamentals of Real Analysis (1991) 145. W. C. Bmwn, MatricesandVecbr Spaces (1991) 146. M. M. Rao and L.D. Ren, Theory of Orlicz Spaces (1991) 147. J. S. Gdan and T.Head, Modules and the Structures of Rings (1991) 148. C. Small, Arithmetic of Finite Fidds (1991) 149. K. Yang, Complex Algebraic Geometry (1991) 150. D. G. Hoffman et al., Coding Theory (1991) 151. M. 0. Gonzdlez, Classical Complex Analysis (1992) 152. M. 0.Gonzdlez, Complex Analysis (1992) 153. L. W. Baggett, Functional Analysis (1992) 154. M. Sniedovich, Dynamic Programming (1992) 155. R. P. Agamal, Difference Equations and lnequalihies (1992) 156. C. Bmrinski, Biorthogonality and Its Applications to Numerical Analysis (1992) 157. C. Swattz, An Introduction to Fundional Analysis (1992) 158. S. 8. Nadler, Jr., Continuum Theory(1992) 159. M. A. Al-Gwaiz, Theory of Distributions (1992) 160. E. Peny, Geometry: Axiomatic Developments with Problem Solving (1992) 161. E. Castilh and M. R. Ruiz-Cob, Functional Equations and Mdelling in Saence and Engineering (1992) Jn i Integral and Discrete Transforms with Applications and Error Analysis 162. A. J. ,e (1992) 163. A. Charlier et al., Tensars and the Cliord Algebra (1992) 1 . P. Bilerand T. Nadzieja, Problems and Examples in DifFerentialEquations (1992) 165. E. Hansen, Global Optimization Using lnteival Analysis (1992) 166. S. Guem-DelabMm, Classical Sequences in Banach Spaces (1992) 167. Y. C. Wong. Introductory Theory of Topological Vector Spaces (1992) 1M. S. H. Kulkamiand 8. V, Limaye, Real Function Algebras (1992) 169. W. C. Bmwn, Matrices Over Commutative Rings (1993) 170. J. Loustauand M. Dillon, Linear Geometry with Computer Graphics (1993) 171. W. V. Petryshyn. Approximation-Sdvability of Nonlinear Functional and Differential Equations (1993) 172. E. C. Young, Vector and Tensor Analysis: Second Edition (1993) 173. T. A. Bick, Elementary Boundary Value Problems (1993)
174. M. Pavel, Fundamentals of Pattern Recognition: Seoond Edition(1993) 175. S. A. Albewrio et al., Nmcommutative Distributions(1993) 176. W. Fulks, Complex Variables(1993) in.M. M. Rao, Conditional Measures and Applications(1993) 178. A. Janicki and A. Wem, Simulation and Chaotic Behavior of a-Stable Stochastic P m s s e s (1994) P. NeitfaanMkiand D. Tiba, Optimal Control of Nonlinear Parabdic Systems (1994) J. Cmin, Differential Equations: Introduction and Qualitative Theory, S@
Edition
(1994)
S. HeikkMand V. LBkshmikantham, Monotone Iterative Techniques for Discontinuous Nonlinear Rfferential Equations (1994) X. Mao, Exponential Stability of Stochastic Differential Equabions (1994) B. S. Thornson, Symmetric FVoperties of Real Functions (1994) J. E. Rubio, Optimirationand Nonstandard Analysis (1994) J. L. Bueso et al., Compatibility, Stability, and Sheaves (1995) A. N. Mjcheland K. Wang, Qualitative Theory of Dynamical Systems(1995) M. R. Damel, Theory of Lattice-Ordered Groups (1995) 2.Naniewicz and P. D. PanagEotopoulos, Mathematical Theory of Hemi~liational Inequalities and Applications(1 995) L. J. Cornin and R. H. Szczadw, Calculus in Vector Spaces: Second Edition(1 995) L. H. Erbe et al., Oscillah Theuy for Functional Differential Equations(1995) S. Agaian et al., Binary Polynomial Transforms and Nonlinear Digital Filters(1995) M.I.Gil', Norm Estimations for Operation-Valued Functions and Applications(1995) P. A. Gn'Ilei. Semigroups: An lntroduction to the Structure Theory (1995) S. Kichenassamy, Nonlinear Wave Equations (1996) V. F. Kmiov, Global Methods in Optimal Control Theory (1996) K. I. Beidaret al., Rings with Generalized Identities (1996) V. I.Amautov et al., Introduction to the Theory of Topological Rings and Modules
(1996) G.Sierksma, Linear and Intqpr Programming (1996) R. Lassar, Introduction to Fourier Series(1 996) V. Sima, Algorithms for Linear-QuadraticOptimization (1996) D. Redmond, Number Theory(1 996) J. K. Beem et al., Global Lmntrian Geometry: Second Edition(1 996) M. Fontana et al.. Prijfer Domains (1997) H. Tanabe, Functional Analybc Methods for Partial Differential Equations(1997) C. Q.Zhang, Integer Flows and Cyde Covers of Graphs(1997) E. Spiegeland C. J. Olhnell. Incidence Algebras(1997) 8. Jakubczyk and W. Respondek, Geometry of Feedback and Optimal Contrd(1998) T. W. Haynes et al., Fundamentals of Domination in Graphs (1998) T W. Haynes et al., Domination in Graphs: Advanced Topics(1998) L. A. D'Alotto et al., A Unified Sgnal Algebra Approach to TwDimensional Parallel Dgital Sgnal Processing (1998) F. Halter-Koch, Ideal Systems(1 998) N. K. Govil et al., Approximation Theory(1 998) R. Cms, MultivaluedLinear Operators (1998) A. A. Martynyuk. Stability by 'liapunovls ~ a t r i xFunction Method with Applications
(19981 A . ~ a v i nand i A. Yagi, Degenerate Differential Equations in Banach Spaces (1999) A. lllanes and S. Nadler, Jr., Hyperspaces: Fundamentals and Recent Advances
(1999)
G. Kato and D. Stnrppa, Fundamentals of Algebraic Micrdocal Analysis (1999) G. X.-2. Yuan, KKM Theory and Applications in Nonlinear Analysis (1999) D. Mdmanu and N. H. Pavel. Tangency, Flow Invariance for Differential Equations, and Optimization Problems(1999) K. Hrbacek and T. Jech, Introduction to Set Theory, Third Edition(1999) G. E. Kolosov, Optimal Design of Control Systems(1 999) A. I.Prilepko et al., Methods for Solving Inverse Problems in Mathematical Physics
(1999)
INTRODUCTION TO SET THEORY Third Edition, Revised and Expanded
Karel Hrbacek The City College of the City Universijr of New York Ne W Y&, New York
Thornas Jech The &nnsylvania State University University Park, Pknnsylwania
Library of Congress Cataloging-in-Publication
Hrbacek, Karel. Introduction to set theory / Karel Hrbacek, Thornas Jech. - 3" ed., rev. and expanded. p. cm. -(Monographs and textbooks in pure and applied mathematics; 220). Includes index. ISBN 0-8247-7915-0 (alk. paper) 1. Set theory. I. Jech, Thomas J. 11. Title. 111. Series. QA248.H68 1999 5 1 1 -3'22--dc2 1 99-15458 CIP
This book is printed on acid-free paper.
Headquarters Marcel Dekker, Inc. 270 Madison Avenue, New York, N Y 10016 tel: 2 1 2-696-9000; fax: 2 1 2-685-4540
Eastern Hemisphere Distribution Marcel Dekker AG Hutgasse 4, Postfach 8 12, CH-4001 Basel, Switzerland tel: 4 1-6 1 -26 1-8482; fax: 4 1-61-26 1-8896
World Wide Web http:/lwww.dekker.com The publisher offers discounts on this book when ordered in bulk quantities. For more information, write to Special Sales/Professional Marketing at the headquarters address above.
Copyright O 1999 by Marcel Dekker, Inc. All Rights Reserved. Ne~therthis book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing fiom the publisher. Current printing (last digit) 1 0 9 8 7 6 5 4 3 2 1
PRINTED IN THE UNITED STATES OF AMERICA
Preface to the 3rd Edition Modem set theory has grown tremendously since the first edition of thia book was published in 1978, and even since the second edition appeared in 1984. Moreover, many ideas that were then at the forefront of research have become, by now, important tor>lain other branches of mathematics. Thua, for example, combinatorial principles, such as Principle Diamond and Martin's Axiom, are indispensable tools in general topology and abstract algebra. Non-well-founded sets turned out to be a convenient setting for semantia of artificial as well as natural languages, Nonstandard Analysis, which is grounded on structum constructed with the help of ultrdlters, has developed into an independent methodology with many exciting applications. It seems appropriate to incorporate some of the underlying set-theoretic ideas into a textbook intended as a general introduction to the aubject. We do that in the form of four new chapters (Chapters 11-14), expanding on the topics of Chapter 11 in the second edition, and containing largely new material. Chapter 11 presents fdters and ultrafiltera, develops the basic properties of closed unbounded and stationary aets, and culminates with the proof of Silver 'a Theorem. The first two aections of Chapter 12 provide an introduction to the partition calculus. The next two sections study trees and develop their relationship to Suslin'a Problem. Section 5 of Chapter 12 is an introduction to combinatorial principles. Chapter 13 is devoted to the measure problem and measurable cardinals. The topic of Chapter 14 is a fairly detailed study of well-founded and non-well-founded sets. Chapters 1-10 of the aecond edition have been thoroughly revised and reorganized. The material on rational and real numbers has been consolidated in Chapter 10, so that it does not interrupt the development of set theory proper. To preserve continuity, a wction on Dedekind cuts has been added to Chapter 4. New material (on normal forms and Goodstein sequences) has also been added to Chapter 6. A aolid basic course in set theory Bhould cover most of Chapters 1-9. This can be supplemented by additional material from Chapters 10-14, which are almost completely independent of each other (except that Section 5 in Chapter 12, and Chapter 13, refer to some concepts introduced in earlier chapters).
This page intentionally left blank
Preface to the 2nd Edition The fht version of this textbook was written in Czech in spring 1968 and accepted for publication by Academia, Prague under the title h o d do teorie m n o h However, we both left Czechdovakia later that year, and the book in set never appeared. In the following y e . we taught introductory COtheory at various uaiversitiea in the United Statea, and found it difficult to select a textbook for w in these m m . Some existing books are based on the "naive" approach to set theory rather than the axiomatic one. We consider some understanding of set-theoretic "paradoxes" end of undecidable propositions (such as the Continuum Hypothesis) one of the important goals of such a CO-, but neither topic can be treated honestly with a "naiven approach. Moreover, set theory is a natural choice of a field where students can first become acquainted with an axiomatic development of a mathematical discipline. On the other hand, all currently available texts presenting set theory from an axiomatic point of view heavily stress logic and logical formalism. Most of them begin with a virtual m i n i c o w in logic. We found that students often take a course in set theory before taking one in logic. More importantly, the emphasis on formalization obscures the essence of the axiomatic method. We felt that there was a need for a book which would present axiomatic set theory more mathematically, at the level of rigor customary in other undergraduate courses for math majors. This led us to the decision to rewrite our Czech text. We kept the original general plan, but the requirements for a textbook suitable for American colleges reaulted in the production of a completely new work. We wish to stress the following featurea of the book: l. Set theory is developed axiomatically. The reasons for adopting each axiom, ars arising both fn>m intuition and mathematical practice, are carefully pointsd out. A detailed discussion is provided in "controversial" cases, such as the Axiom of Choice. 2. The treatment is not formal, Logical apparatus is kept to a minimum and logical formalism is completely avoided. 3. We show that Iudomatic set theory is powerful enough to serve as an underlying framework for mathematics by developing the beginnings of the theory of natural, rational, and real numbers in it. However, we carry the development only as far as it is useful to illustrate the general idea and to motivate settheoretic generalizations of some of these concepts (such as the ordinal numbers end operations on them). Dreary, repetitive details, such as the proofs of all
vi
PREFACE TO THE SECOND EDITION
the usual arithmetic laws,are relegated into exercises. 4. A substantial part of the book is devoted to the study of ordinal and cardinal numbers. 5. Each section ia accompanied by many exercises of w y i n g difficulty. 6. The h a l chapter is an informal outline of some recent developments in set theory and their significance for other areas of mathematics: the Axiom of Constructibility, questions of consistency and independence, and large cardinals. No prooh are given, but the exposition is sufficiently detailed to give a nonspecialist mme idea of the problem arising in the foundatiom of set theory, methods used for their solution, and their effectson mathematics in general. The fimt edition of the book has been extensively used as a textbook in undergraduate and first-year graduate courses in set theory. Our own experience, that of our many colleagues, and ~uggestionaand critickm by the reviewem led us to consider some changes and improvements for the present aecond edition. Those turned out to be much more extensive than we originally intended. As a result, the book has been substantially rewritten and expanded. We list the main new features below. l. The development of natural numbers in Chapter 3 has been greatly simplified. It is now based on the definition of the set of all natural numbers as the least inductive set, and the Principle of Induction. The introduction of transitive sets and a characterization of natural numbers as those transitive sets that are well-ordered and inversely well-ordered by E is postponed until the chapter on ordinal numbers (Chapter 7). 2. The material on integers and rational numbers (a separate chapter in the first edition) has been c o n d e d into a single section (Section l of Chapter 5). We feel that m& students learn this subject in another course (Abstract Algebra), where it properly belongs. 3. A series of new sectiona (Section 3 in Chapter 5, 6, and 9) deals with the set-theoretic properties of sets of real numbers (open, d m d , and perfect sets, etc.), and provides interesting applications of abstract B& theory to real analysis. 4. A new chapter called "Uncountable Setsn (Chapter 11) has been added. It introduces some of the concepts that are fundamental to modern set theory: ultrafilters, closed unbounded sets, trees and partitions, and large cardinals. These topics can be used to enrich the usual one-semester course (which would ordinarily cover most of the material in Chapters 1-10). 5. The study of linear orderings has been expanded and concentrated in one place (Section 4 of Chapter 4). 6. Numerous other small additions, changes, and corrections have been made throughout. 7. Finally, the discussion of the present state of set theory in Section 3 of Chapter 12 has been updated.
Content S iii
Preface to t h e T h i r d Edition Preface to t h e Second Edition 1 Sets l Introduction to
2 3 4
Sets
.........................
Prop&iw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . TheAxioms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Elementary Operations on Sets . . . . . . . . . . . . . . . . . . .
2 Relations. E'unctions. and Orderings
1 2 3 4 5
Ordered Pairs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F'unctions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Equivalences and Partitions . . . . . . . . . . . . . . . . . . . . . Orderings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
S Natural Numbers l Introduction to Natural Numbers . . . . . . . . . . . . . . . . . . 2 Properties of Natural Numbm . . . . . . . . . . . . . . . . . . . 3 The Fkursion Theorem . . . . . . . . . . . . . . . . . . . . . . . 4 Arithmetic of Natural Numbers . . . . . . . . . . . . . . . . . . . 5 Operations and Structures . . . . . . . . . . . . . . . . . . . . . . 4 Finite. Countable. and Uncountable S e t s 1 Cardinality of Sets . . . . . . . . . . . . . . . . . . . . . . . . . . 2 FiniteSets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Countable Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Linear Orderings . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Complete Linear Orderings . . . . . . . . . . . . . . . . . . . . . 6 Uncountable Sets . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 Cardinal N u m b e r s 1 Cardinal Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . The Cardinality of the Continuum . . . . . . . . . . . . . . . . . 2
vii
viii
CONTENTS
8 Ordinai Numbers 103 . . . . . . . . . . . . . . . . . . . . . . . . . . l Well-Ordered Sets 103 . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Ordinal Numbera 107 . . . . . . . . . . . . . . . . . . . . . 3 The Axiom of Replmment 111 4 38nsfinite Induction and Recursion . . . . . . . . . . . . . . . . 114 5 Ordinal Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . 119 6 The Normal Form . . . . . . . . . . . . . . . . . . . . . . . . . . 124
7 Alephs 129 1 Initial Ordinals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 2 Addition and Multiplication of Alephs . . . . . . . . . . . . . . . 133 8 T h e Axiom of Choice 1 The Axiom of Choice end its Equivalents . . . . 2 The Use of the Axiom of Choice in Mathematics
.........
137
137 . . . . . . . . . 144
9 Arithmetic of Cardinal Numbers 1 Infinite Sums and Products of Cardinal Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Regular and Singular Cardinals . . . . . . . . . . . . . . . . . . . 3 Ekponentiation of Cardinals . . . . . . . . . . . . . . . . . . . . .
155
10 Sets of Real Numbers 1 Integers and Rational Numbers . . . . . . . . . . . . . . . . . . . 2 Real Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Topology of the Real L i e . . . . . . . . . . . . . . . . . . . . . . 4 Sets ofRRal Numbera . . . . . . . . . . . . . . . . . . . . . . . . 5 Bore1 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
171
11 Filtera and Ultrafilters 1 Filters and Ideals . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 UltrsGlters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 C l d Unbounded and Stationary Sds . . . . . . . . . . . . . . 4 Silver's Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . .
201
155 160 164 171 175 179 188
194 201 205 208 212
12 Combinatorial Set Theory 217 1 Ramsey's Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 217 2 Partition Calculus for Uncountable Cardinals . . . . . . . . . . . 221 Thea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225 3 4 Suslin's Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 5 Combinatorial Principles . . . . . . . . . . . . . . . . . . . . . . . 233 13 Large Cardinab 241 1 The Measure Problem . . . . . . . . . . . . . . . . . . . . . . . . 241 2 Large Cardinals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
CONTENTS
ix
14 The Axiom of Foundation
1 2 3
251
. . . . . . . . . . . . . . . . . . . . . . . 251 . . . . . . . . . . . . . . . . . . . . . . . . . . 256 .. .... . . . . . . . . . . . . . . . .. 2W
Well-Founded Relations Wd1-Founded S& Non-Well-Founded Sds
15 The Axiomatic Set Theory l The Zermel+Fraenkel Set Theory
.
2 3
287
. .
. .
.
With Choice. . . . . . . . . . . . . . . . . . . . . . . . 267 Coasiskncy and Independence . . . . . . . . . . . . . . . . 270 The Univerae of Set Theory . . . . . . . . . . . . . . . . . 277
Bibliography Index
.. ..
.
. .
This page intentionally left blank
Chapter 1
Sets Introduction to Sets The central concept of this book, that of a set, is, at least on the surface. extremely simple. A set is any collection, group, or conglomerate. So we have the set of all students registered at the City University of New York in February 1998, the set of all even natural numbers, the set of all points in the plane n exactly 2 inches distant from a given point P, the set of all pink e1ephant.s. Sets are not objects of the real world, like tables or stars; they are created by our mind, not by our hands. A heap of potatoes is not. R set of potsat.oc?s,the set of all molecules in a drop of water is not the same object ;is that drop of water. The human mind possesses the ability to abstract, to think of a variety of different objects as being bound together by some common propert,y. and thus to form a set of objects having that property. The property in question may be nothing more than the ability to think of these objects (as being) t,ogetl~er. Thus there is a set consisting of exactly the numbers 2, 7, 12, 13, 29, 34, and 11,000, although it is hard to see what binds exactly those numbers together, besides the fact that we collected them together in our mind. Georg Cantor, a German mathematician who founded set theory in a series of papers published over the last three decades of the nineteenth century, expressed it as follows: "Unter einer Menge verstehen wir jede Zusammenfassung M von bestimmten wohlunterschiedenen Objekten in unserer Anschauung oder unseres Denke~is (welche die Elemente von M genannt werden) zu einem ganzen." [A set is a collection into a whole of definite, distinct objects of our intuition or our thought. The objects are called elements (members) of the set.) Objects from which a given set is composed are called elements or members of that set. We also say that they belong to that set. In this book, we want to develop the theory of sets as a foundation for other mathematical disciplines. Therefore, we are not concerned with sets of people or molecules, but only with sets of mathematical objects, such as numbers. points of space, functions, or sets. Actually, the first three concepts can be defined in
CHAPTER 1 . SET'S
2
set theory as sets with particular properties, and we do that in the following chapters. So t h e only objects with which we are concerned from now on art. sets. For purposes of illustration, we talk about sets of numbers or points evt.11 hefore these notions are exactly defined. We do that, however, only ill ex:inlples. exercises arid problems, not in thc nl:tin body of theory. Sets of mnthemat,ic:al objects are, for example: 1.1 (a) (b) (c) (cl) (e)
Example T h e set of The set of T h e set of T h e set of The set of
all prime divisors of 324. a11 rlurribers divisible by 0. all continuous real-valued functioris on the interval (0.l ] . all ellipses with major axis 5 and eccentricity 3. all sets whose elements are rlatural riunlbers less than 20.
Examination of these and many other similar examples revtlals that sc?t.s with which mathematicians work are relatively simple. They i~icluclr!the set o f natural numbers and its various subsets (such as the set of all prime numbers). :ts well as sets of pairs, triples, and in general n-tuples of natural riumbers. Integcrs and rational numbers can be defirietl using only such sets. Real r~umberscall then be defined as sets or sequences of rational numbers. Mathematical analysis deals with sets of real numbers allcl functions on real numbers (sets of ordered pairs of real numbers), and in some investigations, sets of functions (11. cvrn sets of sets of functions are considered. But a working mathematician rarely eri(:ounters objects more con~plicatt?dthan that. Perhaps it is not surprising t h a t uncritical usage of "sets" renlote from "everyday experience" n ~ a ylead to contradictions. Consider for example the "set" R of all those sets which are rlot elrlncr~ts of themselves. In other words, R is a set of all sets X such that X 4 a ( E reads -;belongs to," 4 reads "does not belong to"). Let 11s now (ask whether R E R . If R R, then R is not an element of itself (because no eleme~itof R belongs t o itself), so R $ R; a contradiction. Therefore, ~~ttcessarily R 4 R. But tl~t.11 R is a set which is not an element of itself, and a11 such setasbelong t o R.P'!! conclude that R E R; again, a collt,ritdiction. Tlie argument call be briefly sulnmarized a follows: Define R by: .I. E R it' nritl o ~ i l yif X # X. Now consider X = R; by definition of R, R G R if itllci ~ 1 1 1 if~ R 4 R; a contradiction. A few comments on this argument (due to Bertrand Russell) rriight bc helpful. First, there is nothing wrong with R being a set of sets. Many sets wl~osc elements are again sets are legitimately enlployed in mathematics - set. Exanlple 1.1 - and do not lead t o contradictions. Second, it is easy t o give clx;~mpl(ts of elements of R; e.g., if X is the set of all natural numbers, then X 4 s (the set of all natural numbers is not a natural number) and so X E R. Third. it is riot so easy to give examples of sets which do not belong to R, but this is irrelevant. T h e foregoing argument would resr~ltin a contradiction even if there were 110 sets which are elements of themselves. (A plausible clindidatt! for a set wliiclh is a11 r~lenientof itself woulcl be the "st:t of all sets" V ; c:le~rlyV E C'. Howevtlr.
2. PROPERTIES
3
the "set of all sets" leads to contradictions of its own in a more subtle way see Exercises 3.3 and 3.6.) How can this contradiction be resolved? We :issunled that we have a sett R defined as the set of all sets which are not elements of t,hemselves. anci derivrd a contradiction as an immediate consequence of the definition of R. This can only mean that there is no set satisfying the definition of R. In other words. this argument proves that there exists no set wllose l~ienlberswould be precisely the sets which are not elements of themselves. The lesson contained in Russell's Paradox and other similar examples is that by merely defining a set we: clo not prove its existence (similarly as by defining a unicorn we do riot prove t,hat unicorns exist). There are properties which do not define sets; that is, it is not possible to collect all objects with those properties into one set. This observiitioll leaves set-theorists with a task of determining the properties which do dcfirle sets. Unfortunately, no way how to do this is k~lawn,and some results i n logic Theorenls discovered by K urt Giic.lt4) (especially the so-called In~omplet~eness seem to indicate that a complete answer is not cvcn possible. Therefore, we attempt a less lofty goal. We formulate some of the relatively simple properties of sets used by mathematicians as axioms, and then t a k e c a r e to check that all theorems follow logically from the axioms. Since the axioms are obviously true and the theorems logically follow from them, the theorems are also true (not necessarily obviously). We end up with a bodj. of truths about sets which includes, among other things, tlw basic properties of natural. rational, and real numbers, functions, orderings, etc.. but as far as is knowli, no contradictions. Experience has shown that practically all notio~ls~ ~ s ei tl li contemporary mathematics can be defined, and their rnatheniatical propcrti~s derived, in this axiomatic system. In this sense, the axiomatic set theory serves as a satisfactory foundation for the other branches of n~athematics. On the other hand, we do not claim that every true fact about. sets c:nn 11e derived from the axioms we present. The axiomatic system is not co~npletcin this sense, and we return to the discussion of the question of completeness in the last chapter.
Properties In the preceding section we introduced sets as collections of objects Iinving some property in common. The notion of property rnerit,s some analysis. Solne properties commonly considered in everyday life are so vague that they can hardly be admitted in a mathematical theory. Consider. for example. t h(?'-set of all the great twentieth century American novels.'' Different persons' judg~ilents as to what constitutes a great literary work differ so nluch that there is no generally accepted way how to decide whether a given book is or is not an element of the "set." For an even more startling example, consider the "set of those naturai numbers which could be written down in decimal not;ition9 (by "coul(l" we mean that someone could actually do it with paper and pencil). Clearly. 0 can be so
CHAPTER I. SETS
4
written down. If number n can be written down, then surely number n + 1 can also be written down (imagine another, somewhat faster writer, or the person capable of writing n working a little faster). Therefore, by the familiar principlt. of induction, every natural number n can be written down. But that is plainly absurd; to write down say 10lol" in decimal notation would require to follow 1 by 10'' zeros, which would take over 300 years of continuous work at a rate of a zero per second. The problem is caused by the vague meaning of "could." To avoid similar difficulties, we now describe explicitly what we mean by a property. Only clear. mathematical properties are allowed; fortunately, these properties are sufficient. for expression of all mathematical facts. Our exposition in this section is informal. Fkaders who would like to see how this topic can be studied from a more rigorous point of view can t:onsult some book on mathematical logic. The basic set-theoretic property is the membership property: " . . . is a11 element of . . . ," which we denote by E . So "X E Y" reads "X is an elen~c~lt of' Y1' or "X is a member of Y" or "X belongs to Y." The letters X and Y in these expressions are variables; they stand for (denote) unspecified, arbitrary sets. The proposition "X E Y" holds or does not. hold depending on sets (denoted by) X and Y. We sometimes say "X E Y" is a property of X and Y . The reader is surely familiar with this informal way of speech from other branches of mathematics. For example, "m is less than 71" is a property of m and n. The letters m and n are variables denoting unspecified numbers. Some m and n have this property (for example, "2 is less than 4" is true) but others do not (for example, "3 is less than 2" is false). All other set-theoretic properties can be stated in terms of membership with the help of logical means: identity, logical connectives, and q~ant~ifiers. We often speak of one and the same set in different contexts and f i ~ dit convenient to denote it by different variables. We use the identity sign "=" t o express that two variables denote the same set. So we write X = Y if X is thr. same set as Y [ X is identical unth Y or X is equal t o Y ] . In the next example, we list some obvious facts about identity: 2. l Example (a) X = X .
(b)
If X = Y, then Y
=
X.
( c ) If X = Y and Y = Z, then X = Z.
(d) If X = Y and X E 2, then Y E 2. (e) If X
=
Y and Z
E
X, then Z E Y.
(X is identical with X.) (If X and Y are identical, then Y and X are identical.) (If X is identical with Y and Y is identical with 2, then X is ider~tical with Z.) (If X and Y are identical and X belongs to Z, then Y belongs to 2.) (If X and Y are identical and Z belongs to X , then 2 belongs to Y.)
2. PROPERTIES
5
Logical connectives can be used to construct more complicated properties from simpler ones. They are expressions like "not . . . ," " . . . and . . . ," "if. . . , then . . . ," and " . . . if and only if . . . ."
2.2 Example (a) "X E Y or Y E X" is a property of X and Y. (b) "Not X E Y and not Y E X" or, in more idiomatic English, "X is not an element of Y and Y is not an element of X" is also a property of X and Y. (c) "If X = Y, then X E Z if and only if Y E Z" is a property of X , Y and Z. (d) "X is not an element of X" (or: "not X E X") is a property of X . We write X # Y instead of "not X Y" and X # Y instead of "not X = Y." Quantifiers "for all" ("for every") and "there is" ( "there exists" ) provide additional logical means. Mathematical practice shows that all mathematical facts can be expressed in the very restricted language we just described, but that this language does not allow vague expressions like the ones at, the beginning of the section. Let us look a t some examples of properties which involve quantifiers. 2.3 (a) (b) (c)
Example "There exists Y E X." "For every Y E X , there is Z such that Z E X and Z E Y." "There exists Z such that Z E X and Z # Y."
Truth or falsity of (a) obviously depends on the set (denoted by the variable) X . For example, if X is the set of all American presidents after 1789, then (a) is true; if X is the set of all American presidents before 1789, (a) becomes false. [Generally, (a) is true if X has some element and false if X is empty.] We say that (a) is a property of X or that (a) depends on the parameter X . Similarly, (b) is a property of X , and (c) is a property of X arid Y. Notice also that Y is not a parameter in (a) since it does not make sense to inquire whether (a) is true for some particular set Y; we use the letter Y in the quantifier only for convenience and could as well say, "There exists W E X ," or "There exists some element of X ." Similarly, (b) is not a property of Y, or Z , and (c) is not a property of 2. Although precise rules for determining parameters of a given property can easily be formulated, we rely on the reader's common sense, and limit ourselves to one last example. 2.4 Example (a) "YE X." (b) "There is Y f X ." (C) "For every X, there is Y E X."
Here (a) is a property of X and Y; it is true for some pairs of sets X , Y and false for others. (b) is a property of X (but not of Y), while (c) has no parameters. (c) is, therefore, either true or false (it is, in fact, false). Properties
CHAPTER l . SEI'S
6
which have no parameters (and are, therefore, either true or false) are callecl statements; all mathematical theorems are (true) statements. We son~etimeswish to refer to an arbitrary, unspecified propertry. 1%. 1 1 s ~ boldface capital letters to denote statements and properties and, if convenient.. list some or all of their parameters in parentheses. So A ( X ) stancts for any property of the parameter X , e.g.. (a). (L>), or (c) in Exa~nple2 . 3 . E ( X . Y ) is a property of parameters X and Y, e.g., (c) in Exari~ple2.3 or (a) ill Ex:lrrlplt. 2.4 or (cl)
"X E
Y o r X = Y or Y E
X.''
In general, P(X, Y,. . . , Z ) is a property whose truth or falsity deper~dsc111 parameters X, Y ,. . . , Z (and possibly others). We said repeatedly that all set-theoretic properties can be expressed in our restric:ted language, consisting of n~embershipproperty and logical means. However, as the developmerlt proc:eeds a11d rliore and more complicated t h m r t t ~ r arcb ~s proved, i t is practical to give names to various particular properties, i.e.. to Liefine new properties. A new symbol is then introduced (defined) to dt!l~ott,thcl property in question; we can view it ;is a shorthand for t.11~ explicit f'orlr~ul;itiori. For example, the property of I~einga subset is defined by 2.5
X
CY
if and or~lyif every elerrlent of X is an element of Y
"X is a subset of Y" (X
c Y) is a property of
X and Y. We can use it irl more cotnplicated formulations and, whenever desirable, rcpl;tcc? X Y by its tlefinition. For example, the explicit defirlition of
"If X
c Y ancl Y G Z, then X c 2.''
would be
"If every element of X is arl ttlt.merit of Y and every elenlent of Y is
ill1
element of 2, then every element o f X is an element of 2." It is clear that mathematics witllout definitions woulcl hcl possible. but ext:ctedingly clumsy. For another type of definition. consider the property P ( X ) : "There clxists no Y E X ." We prove in Section 3 that (a) There exists a set X such that P ( X ) (there exists a set X with no elements). (b) There exists at most one set X such that P ( X ) , i.e., if P ( X ) and P ( X f ) . then X = X' (if X has no t ~ l t ~ l ~ ~and e n t X' s ha,! no eler~lents,the11 X i i r ~ c l X' are identical). (a) and (b) together express the fact that there is a unique set X with thtb property P ( X ) . We can the11 givt? this set a name, say 0 (the ernyty set). arl
3. THE AXIOMS
7
The full meaning of "8 c 2" is then "the set X which has no elements is ;L subset of 2."We occasionally refer to 8 as the con,stant defined by the p r o p ~ r t y
P. For our last example of a definition, consider the property Q ( X , Y, 2) of X , Y, and 2: "For every U, U E Z if and only if U E X and U E Y . " We see in the next section that (a) For every X and Y there is Z such that Q(X, Y, 2). (b) For every X and Y, if Q ( X , Y, 2) and Q ( X , Y, Z'), then Z = 2'. [For every X and Y, there exists a t most one Z such that Q ( X , Y, Z).] Conditions (a) and (b) (which have to be proved whenever this type of definition is used) guarantee that for every X and Y there is a unique set. Z such that Q ( X , Y, 2). We can then introduce a name, say X n Y , for this ul~iquesc1t Z. and call X n Y the intersection of X and Y. So Q(X, Y, X n Y) holds. 1441 refer to n as the operation defined by the property Q.
3.
The Axioms
We now begin to set up our axiomatic system and try to make clear thc intuit.ive meaning of each axiom. The first principle we adopt postulates that our "universe of discourse" is postulate the t1xistenc:c~ not void, i.e., that some sets exist. To be concrcto. urcA of a specific set, namely the empty set. The Axiom of Existence
There exists a set whic:h Ilzts no elements.
A set with no elements can be variously describc?cl intuitively, e.g.. as the set of all U.S. Presidents before 1789, the set of all real numbers .r for n?hit:l1 22 - - 1 , etc. All examples of this kind describe orltt and the same set. nanlely the empty, vacuous set. So, intuitively, there is only one empty set. But we cannot yet prove this assertion. We need another postulate to express tllr fat-t that each set is determined by its elements. Let us see another example: X is the set consisting exactly of numbers 2, 3, ant1 5. Y is the set of all prirne numbers greater than 1 and less than 7. Z is the set of all solutions of the equation "X 1 0 ~ " 31x - 30 = 0. Here X = Y, X = Z, arld Y = Z, and we have three different ctcscriptioi~s of one and the same set. This leads to the Axiom of Extcr~sionality. The Axiom of Extensionality If every element of X is an element of Y mid every element of Y is an element of X , then X = Y.
Briefly, if two sets have the same elements, then they are identical. We can now prove Lemma 3.1.
CHAPTER 2 . SETS
8 3.1 Lemma There exists only one set with n o elements.
Proof. Assume that A and B are sets with no elements. Then every element of A is an element of B (since A has no elements, the statement "a C A implies a E B" is an implication with a false antecedent, and thus automatically true). Similarly, every element of B is an element of A (since B has no elenients). Therefore, A = B, by the Axiom of Extensionality. 3.2 Definition The (unique) set with no elements is called the crnpty set ar~ci
is denoted
O.
Notice that the definition of the constant 0 is justified by the Axiom of' Existence and Lemma 3.1. Intuitively, sets are collections of objects sharing some common property, so we expect to have axioms expressing this fact. But, as demoristrated by the paradoxes in Section 1, not every property describes a set; properties "X # X" or "X = X" are typical examples. In both cases, the problem seems to be that in order to collect all objects having such a property into a set, we already have to be able to perceive all sets. The difficulty is avoided if we postulate the existence of a set of all objects wit.11 a given property only if there already exists some set to which they all belor~g. Let P ( x ) be a property of X. For E B if and only if X E A and P(z).
The Axiom Schema of Comprehension
any set A, there is a set B such that
X
This is a schema of axioms, i.e., for each property P, we liave one axioni. For example, if P(x) is "X = X," the axiom says: For any set A, there is a set B such that and X = X . (In this case, B = A . ) If P ( x ) is
"5
4 X",
X
E B if and only if' z E A
X
E
the axiom postulates:
For any set A, there is a set B such that and X 4 X.
B if and only if
X E
A
Although the supply of axioms is unlimited, this causes no problems. sinw i t is easy to recognize whether a particular statement is or is not an axiorr~and sirice every proof uses only finitely many axioms. The property P ( x ) can depend on other parameters p , . . . . q; the correspon
only if
X
E R if
t111cl
3. THE AXIOMS
9
Pmoj, Consider the property P(x, Q ) of X and Q: "X E Q." Then, by the Comprehension Schema, for every Q and for every P there is a set R such that X G R if and only if X E P and P(x, Q ) , i.e., if and only if X E P and z E Q. ( P plays the role of A, Q is a parameter.) U 3.4 Lemma For every A, there is only one set B such that if X E A and P ( x ) .
X
E
B if and onlp
Proof, If B' is another set such that X E B' if and only if X E A and P(.x), then X E B if and only if X E B', so B = B', by the Axiom of Extensionality. We are now justified t,o introduce a name for the uniquely determined set B. 3.5 Definition { X E A I P ( x ) ) is the set of all
X
E A with the propertp P ( x ) .
3.6 Example The set from Example 3.3 could be denoted { X
P I X E Q}.
Our axiomatic system is not yet very powerful; the only set we proved to exist is the empty set, and applications of the Comprehension Schenla to the empty set produce again the empty set: {X E 0 I P ( x ) ) = 0 no matter u'hat. property P we take. (Prove it.) The next three principles postulate that some of the constructions frequently used in mathematics yield sets. The Axiom of Pair For any A and B, there is a set C such that only if X = A or X = B,
X
E C if and
So A E C and B E C?and there are no other elements of C. The set C is unique (prove i t ) ; therefore, we define the unordered pair of A and B as the set having exactly A and B as its elements and introduce notation {A, B) for the unordered pair of A and B, In particular, if A = B, we write { A ) i n s t ~ a dof { A ,A ) . 3.7 Example (a) Set A = 8 and B = 0; then (0) = (0,0) is a set for which 0 E {Q), and if X E {Q), then X = U. So (0) has a unique element 0. Notice that {U) # 8, since U G {g), but 0 4 0. (h) Let A = Q) and B = (0); then 0 E (0, (0)) and {Q))E (0, {0)), and 8 and
{U) are the only elements of (0, {g)). Note that O # {U, {a)), (0) # (0, (0)).
The Axiom of Union For any set S, there exists a set U such that rc E U if and only if X E A for sorrle A E S. Again, the set U is unique (prove it); it is called the union of S and denoted by U S. We say that S is a system of sets or a collection of sets when we want to stress that elements of S are sets (of course, this is always true - all our
CHAPTER I . SET'S
10
objects are sets - and thus the expressions "set" and "system of sets'' have t h e sarne meaning). The. union of a system of sets S is then a set of prtxcisely thoscx X which belong to some set from the system S. 3.8 Example (a) Let S = (0, (8)); X E U S if ard only if X E A for sorne A E and ordy if X E O or X z: {(P)). Therefore, X E U S if and only U S = (0)(b) U 0 = 8. (c) Let hi arld N be sets; X E U{I\/I,N ) if and only if X E M or 3: E The set U{hf, N ) is called the unior~of M and N ant1 is denoted M
S, is(... if' if z = VJ;
N. U N.
So we finally introduced one of the simple set-theoretic operations with which the reader is surely familiar. The Axiom of Pair and the Axiom of Union arc1 Ilecessary to define union of two sets (and the Axiom of Extensionality is nettrled to guarantee that it is unique). T11e union of two sets 11% the r~sualrnealiirig: U N if and only if x E A.1 o r X , E N . .c E
The Axiom of Union is, of course, much more powt?rful;it e~iablesus to forni unions of not just two, but of any, infinite, collection of sets. If A, B, and C are sets, we can now prove the existence and uniqueness of tlich set P whose elements are exactly A, B, and C (see Exercise 3 . 5 ) . P is denot,e
We write A
G
B to denote that. A is
;i
subset of B
3.11 Example (a) (9)) c { @ l { @ ) land {{@)l (8, (011. (b) 8 A and A A for every set A. ( C ) {XE A I P(x)} C A. (d) If A E S, then A C_ U S .
c
c
The llext axiom postulates that all subsets of a given set (:an be collec:tJe(l into one set. The Axiom of Power Set if aiid only if X S.
For any set S, there exists a set P such that X E P
Since the set P is again uniquely determined, we call the set of all sl~t)sets of S the power set of S arld denote it by P ( S ) .
3. THE AXIOMS 3.12 Example (4 P ( 0 ) = (81. (b) P({a}) = {a}}* ( C ) The elements of P ({a, 6)) are
0, { a ) , { h ) , and {a, h ) .
We conclude this section with another notational convention. Let P ( x ) be a property of X (and, possibly, of other parameters). If there is a set A such that, for all X , P ( s ) implies z E A. then ( X E .4 I P ( x ) ) exists, and, moreover, does not depend on A . That means that if A' is another set such that for all X , P(x) implies X E A', tllerl {z E A' j P(:z)) = { X E A I P ( x ) ) . (Prove it.) We can now define {X 1 P ( z ) ) to be the set { X E A ( P ( x ) ) , where A is ally set for which P ( x ) implies X E A (since it does not matter whicll such set A we use). {X ( P ( x ) ) is the set of all X with the property Pis). We stress olice again that this notation can be used only after it lias been proved that sonle A contains all X with the property P. 3.13 Example (a) {X I X C P and
X
E Q ) exists.
P ( z , P, Q ) is the property "X E P anct X E Q!': let A = P. Then Proof. P ( x , P, Q ) implies X E A. Therefore, { X I X E P and z E Q ) = { # rE P I X E P and X E Q ) = {XE P I z E Q ) is the set R from Example 3.3. O (b) {X I X = a or J: = b) exists; for a proof put A = (a. b); also show that { X In: = a or X = 6) = {a,b}. ( C ) {X 1 5 4 X) does not exist (because of Russell's Paradox): thus in this instance the notation { z I P(x)) is inadmissible.
Although our list of axioms is not complete, we postpone the introductior~ of the remaining postulates until the need for them arises. Quite a few cariccpts can be introduced and some theorems proved from the postulates we now h;lvc> available. The reader may have noticed that we did not guarantee existence of any infinite sets. This deficiency is removed in Chapter 3. Other axioms are introduced in Chapters 6 and 8. The complete list of axioms can be fourltl in Section 1 of Chapter 15. This axiomatic system was essentially formulated by Ernst Zermelo in 1908 and is often referred to as the Zemelo-Fraenkel uxiomntic system for set theory.
Exercises 3.1 Show that the set of all X such that X E A and X 4 B exists. 3.2 Replace the Axiom of Existence by the following weaker postulate: Weak Axiom of Existence Some set exists. Prove the Axiom of Existence using the Weak Axiom of Existence and the Comprehension Schema. [Hint:Let A be a set known to exist; coilsider {X E A l X # X ) . ]
CHAPTER 2 . SET'S
12
3.3 (a) Prove that a "set of all sets" does not exist. [Hint: if V is a set of all sets, consider ( X E V I X 4 X ) .] (b) Prove that for any set A there is some X 4 A. 3.4 Let A and B be sets. Show that there exists a unique set C such that. X E C if and only if either X E A and X 4 B or X E B and X 4 A. 3.5 (a) Given A, B, and C, there is a set P such that X E P if and only if x=Aor~=Bors=C. (b) Generalize to four elements. 3.6 Show that P ( X ) X is false for any X. In particular, P ( X ) # X for any X. This proves again that a "set of all sets" does not exist. /Hint: Let Y = { U € X [ u ~ u ) Y ;E P ( X ) b u t Y 4 X.] 3.7 The Axiom of Pair, the Axiom of Union, and the Axiom of Power Set can be replaced by the following weaker versions. Weak Axiom of Pair For any A and B, there is a set C such that A E C and B E C. Weak Axiom of Union For any S, there exists U such that if X E A and A E S, then X E U. Weak Axiom of Power Set For any set S, there exists P such that X c S implies X E P. Prove the Axiom of Pair, the Axiom of Union, and the Axiom of P o w r ~ Set using these weaker versions. [Hint: Use also the Co~nprehension Schema.]
c
4.
Elementary Operations on Sets
The purpose of this section is to elaborate somewhat on the notions introduced in the preceding section. In particular, we introduce simple set-theoretic: operations (union, intersection, difference, etc.) and prove some of their basic properties. The reader is certairily familiar with them to some extent and we leave out most of the details. Definition 3-10 tells us what it means that A is a subset of B (i7~cludedill B), A 5 B. The property 5 is called inclusion. It is easy to prove that, for any sets A , B, and C , (a) A 5 A. (b) If A B and B C A, then A = B. (c) If A 2 B and B G C , then A C C. For example, to verify (c) we have to prove: I f X E A, then s E C. But if X E A , then s E B, since A B. Now, X E B implies X E C, since B C C. So X E A implies X E C. If A C B and A # B, we say that A is a proper subset of B (A is prope7.ly contained in B) and write A C B. We also write B 2 A instead of A c B a11d B 2 A instead of A C B. Most of the forthcoming set-theoretic operations have been rr~entionedb e fore. The reader probably knows how they can be visualized using Venn &agrams (see Figure 1).
c
4. ELEMENTARY OPERATIONS ON SETS
Figure 1: Venn diagrams. (a) Intersection: Tile shaded part is A n B . (b) Union: The shaded part is A U B. (c) Difference: The shaded part is A - B. (d) Symmetric difference: The shaded part is A A B. (e) Distributive law: The shaded part obviously represents both A n ( B U C) and (A n B ) U (A n C). 4.1 Definition The intersection of A and B, A
n B, is the set of all
which which belong to both A and B. The union of A and B, A U B, is the set of belong in either A or B (or both). The diflerence of A and B, A - B, is the set of all X E A which do not belong to B. The s?ymmetP.ic diflerence of A and B, A A B, is defined by A A B = (A - B) U ( B - A ) . (See Examples 3.3 and 3.8 and Exercises 3.1 and 3.4 for proofs of existence and uniqueness.) X all X
As an exercise, the reader can work out proofs of some simple properties of these operations. Commutativity
Associativity
AnB=BnA AuB=BuA (AnB)nC=An(BnC) (AuB)uC=AU(BUC)
So forgetting the parentheses we can write simply A n B n C for the intersection
of sets A, B , and C. Similarly, we do not need parentheses for the rinioil arlcl for more than three sets. A n ( B U C ) = ( A n B )U ( A n C ) A u ( B n C )= ( A u B ) n ( A u C )
Distributivity
DeMorganLaws C - ( A n
B ) = (C - A) U (C - B )
C - ( A uB) = ( C - A ) n ( C - B ) Some of the properties of the difference and the symmetric difference arc
A n ( B - C ) = ( A n B )- C A - B =. O if and only if A C B AAA=@ AAB=BAA ( A A B ) A C =A A ( B A C ) Drawing Venn diagrams often helps one discover nikd prove these and sitiii1;ir relatiorlships. For example, Figure l i e ) illustrates the distributive law A n ( B LJ C ) = ( A n B ) U ( A n C ) . The rigorous proof proceeds as follows: We hwvv t o prove that the sets A n (B U C ) tirlrl ( A n B ) U ( A n C ) h:vt? t.he same eI~illt?rits. That requires us to show two facts: (a) Every element of A n ( B U C ) belongs to ( A n B ) U ( A n C ) . (b) Every element of ( A n B) U ( A n C ) belongs to A n ( B U C ) . To prove (a), let a E A ~ ( B u C )Then . a E A and also a E BUC. Therefore: either a f B or a E C. So a E A nilri a E B or a E A and a E C'. Tllis tinhiins t.hat it E A n B or a E A n C ; htantsc,finally, U E ( A n B ) U ( A n C ) . To prove (b), let a E ( A n B ) U ( A n C ) . Then a E A n B or a E A n L'. Irl the first case, a E A and a E B , so U E A and a E B U C . and a E A n ( B u C ' ) . 111the second case, U E A and U E C , SO again a E A arkcl cr. E B U C , arltl finally. U E An(3uC). U The exercises should provide slificierlt material for practicing similar ()l(.mentary arguments about sets. The union of a system of sets S was defined in the preceding section. IV(> now define the intersection S of a rlonernpty system of sets S : rr: E S if ;illil r~rllyif z E A for all A E S. The11 intersection of two sets is again ;i spcc*i;il case of the more genera1 operation: A n B = n { A ,B ) . Notice that wr do 11c1t define (78; the reason is that rLvc?rp:r belongs to all A E Q) (si11r:c there is 110 such A), so B would have to be i i set of all sets. We pustponc lilortl drtailvtl itivestigation of general unions and intersections until Chapter 2, where) ii ir~orc wit3ld y tkotatiolr becoirles available. Finally, we say that sets A i i t l r l B are dis~ointif' A n B = 8, Alorc. ge~i~riilly. S is a system of mutuully disjoint s e t s if A n B = Q) for a11 A, B E S sutnht l l i i t A # B.
n
n
n
4. ELEMENTARY OPERATIONS ON SETS
15
Exercises
4.1 Prove all the displayed formulas in this section and visualize them using 4.2
Venn diagrams. Prove: (a) A C B i f a n d o n l y i f A n B = A i f a n d o n l y i f A ~ B =B i f a n d o n l y ifA-B=@. (b) A B n C if and only if A G B and A C C. ( c ) B U C A if and only if B A and C A. (d) A - B = ( A u B ) - B = A - ( A n B ) . (e) A n B = A - ( A - B ) . (f) A - ( B - C) = ( A - B ) u ( A n C ) . (g) A = B if and only if A A B = 8. For each of the following (false) statements draw a Venn diagram in which it fails: (a) A - B= B - A . (b) A n B c A. (C) A B U C implies A G B or A C. (d) B n C A implies B C A or C A. Let A be a set; show that a "complement" of A does not exist. (The "complement" of A is the set of all s 4 A.) Let S # 8 and A be sets. (a) Set Tl = {Y E P ( A ) I Y = A n X for some X E S), and prove A n US = UT1 (generalized distributive law). (b) Set Tz = { Y E P ( A ) 1 Y = A - X for some X E S), and prove
c
4.3
c
4.4 4.5
c
c
c
c c
(generalized De Morgan laws). 4.6 Prove that r) S exists for all S # 8. Where is the assumption S # in the proof?
0 used
This page intentionally left blank
Chapter 2
Relations, Functions, and Orderings 1.
Ordered Pairs
In this chapter we begin our program of developing set theory as a foundation for mathematics by showing how various general mathematical concepts, suclh as relations, functions, and orderings can be represented by sets. We begin by introducing the notion of the ordered pair. If a and b arc sets. then the unordered pair {a, b ) is a set whose elements are exactly n and b. The "order" in which a and b are put together plays no role; {a, b ) = {b.a). For many applications, we need to pair a and b in a way making possible to "read off" which set comes "first" and which comes "second.'! We denote this ordered pair of n and b by ( a , b); a is the first coordinate of the pair (a, b), b is the second coordinate. As any object of our study, the ordered pair has to be a set. It should bc defined in such a way that two ordered pairs are equal if and only if their first, coordinates are equal and their second coordinates are equal. This guarantres in particular that (a, b ) # (b, a ) if a # b. (See Exercise 1.3.) There are many ways how to define (a, b) so that the foregoing conditioll is satisfied. We give one such definition and refer the reader to Exercise 1.6 for alternative approach. 1.1 Definition (a, b) = {{a), { a , b ) ) .
If a # b, (a, b) has two elements, a singleton { a ) and an unordered pair {a, b). We find the first coordinate by looking a t the element of { a ) . The second coordinate is then the other element of {a, b). If a = b , then (ata ) = {{(L), {a, a ) ) = {{a) ) has only one element. In any case, it seems obvious that both coordinates can be uniquely "read off" from the set (a, b). WP make this statement precise in the following theorem.
CHAPTER 2. RELATIONS, FUNCTIONS, AND ORDERINGS
18
1.2 Theorem (a, b) = (a', b') if and only if a = a' and b = b'.
Proof. If a = a' and b = b', then, of course, (a, b) = { { a ) ,{a, b)) = {{a'), {a', 6')) = (a', 6'). The other implication is more intricate. Let us assume that {{a), {a,b)) = {{a'), { a f , b f ) ) .If a # b, {a) = {a') and {a,b) = {a',bt). S O, first, a = a' and then {a, b) = {a, b') implies b = b'. If a = b, { { a ) ,{a. a ) ) = {{a)). So { a ) = {a'), {a) = {a', h'}, and we get a = a' = b', so a = a' and b = b' holds in this case, too. 0 With ordered pairs at our disposal, we can defirie ordered triples ("l
b, c) = ((a,b), c),
ordered quadruples (a, b, c, d) = ((a,b, c), d), and so on. Also, we define ordered "one-tuples" (a) = a. However, the general definition of ordered n-tuples has to be postponed until Chapter 3, where natural numbers are defined. Exercises
1.1 Prove that (a, b) E P ( P ( { a ,b))) and a , b E U(a, b). More generally. if a f A and b E A, then (a, b) f P(P(A)). 1.2 Prove that (a, b), (a, b, c), and (a, b, c, d) exist for all a , b, c, and d. 1.3 Prove: If (a, b) = (6, a), then a = b. 1.4 Prove that ( a ,b, c) = (a', b', c') implies a = a', b = b', and c = c'. S t . a t ~ and prove an analogous property of quadruples. 1.5 Find a, b, and c such that ((a, b), c) # (a, (b, c)). Of course, we could use the second set to define ordered triples, with equal success. 1.6 To give an alternative definition of ordered pairs, choose two different sets C3 and A (for example, U = 0, A = (0))and define
State and prove an analogue of Theorem 1.2 for this notion of ordered pairs. Define ordered triples and quadruples.
2.
Relations
Mathematicians often study relations between mathematical objects. Relations between objects of two sorts occur most frequently; we call them binary relations. For example, let us say that a line 1 is in relation R1 with a point P if I passes through P. Then RI is a binary relation between objects called lines a l i r i objects called points. Similarly, we define a binary relation Rz between positive
2. RELATIONS
19
integers and positive integers by saying that a positive integer m is in relation R2 with a positive integer n if m divides n (without remainder). Let us now consider the relation R: between lines and points such that a line 1 is in relation R', with a point P if P lies on 1. Obviously, a line l is in relation RI to a point P exactly when 1 is in relation R\ to P. Although different properties were used to describe RI and R:, we would ordinarily consider R , and R; to be one and the same relation, i.e., RI = R;. Similarly, let a positive integer m be in relation R; with a positive integer n if n is a multiple of m. Again, the same ordered pairs (m, n) are related in R2 as in R;, and we consider to be the same relation. R2 and A binary relation is, therefore, determined by specifying all ordered pairs of objects in that relation; it does not matter by what property the set of these ordered pairs is described. We are led to the following definition. 2.1 Definition A set R is a binary relation if all elements of R are ordered pairs, i.e., if for any z E R there exist s and y such that z = (x,y).
2.2 Example The relation R2is simply the set { t ( there exist positive intege.rs m and n such that z = (m, n ) and m divides n ) . Elements of R2 are ordered pairs
It is customary to write xRy instead of (X,y) f R. We say that relation R with y if s R y holds. We now introduce some terminology associated with relations.
.X
is in
2.3 Definition Let R be a binary relation. (a) The set of all s which are in relation R with some y is called the domain of R and denoted by dom R. So dorn R = { s I there exists y such that xRy ) . don1 R is the set of all first coordinates of ordered pairs in R. (b) The set of all y such that, for some X, z is in relation R with y is called the range of R, denoted by ran R. So ran R = { y ( there exists X such that :t-Ry ). ran R is the set of all second coordinates of ordered pairs in R. Both dom R and ran R exist for any relation R. (Prove it. See Exercise 2.1). (c) The set dom R U ran R is called the field of R and is denoted by field R. (d) If field R X , we say that R is a relation in X or that R is a relation between elements of X .
c
2.4 Example Let R2 be the relation from Example 2.2.
dom R2 = {m I there exists n such that m divides n ) = the set of all positive integers
CHAPTER 2. RELATIONS, FUNCTIONS. AND ORDERINGS
20
because each positive integer m divides some n, e.g., n = m; ran R2 =
1 there exists m such
that 7n divides n ) = the set of all positive integers (71
because each positive integer
YL is
divided by some m. e.g., by m =
7 ~ ;
field R:! = dom R2 LJran R2 = the set of all positive integers;
R2 is a relation between positive integers. We next generalize Definition 2.3. 2.5 Definition (a) The image of A under R is the set of all y from the range of R related R to some element of A; i t is denoted by R [ A J So .
R[A]= { y f ran R ( there exists
X
it1
E A for which xRp).
(b) The inverse image of B under R is the set of all X from the domain of R related in R to some element of B; it js denoted R-' [B].So
-
R-'[B]
{X
E dom R ( there exists I/ E
B for which x R y ) .
2.6 Example ~ ; ' [ { 3 , 8 , 9 , 1 2 } j= { l , 2,3,4,6,8,9,12); R1[{2}] = the set of' all even positive integers. 2.7 Definition Let R be a binary relation. The inverse of R is the set
R-
L
=
{z1 z
= ((L, g )
for some z ;md y such that (y.s)E R )
2.8 Example Again let
R2 = { t 1 t = (rn,n ) , rn ancl n are positive integers, and 7n divides 7 1 ) : tbc~ii R,' = {W1 W = ( n , m ) , and ( m , n ) Rz} ~ = {WI W = ( n , m ) , m and n are positive integers, and m divides n}. In our description of Rz, we use variable m for the first coordinate and variable n for the second coordinate; we also state the property describing R2 so that the variable m is mentioned first. It, is a customary (though not necessitr!.) practice to describe R;' in the same way. All we have to do is use letter 771 instead of n , letter n instead of m. and change the wording: l
R; = { W (
W
= {WI
W
(m, n ) , n , m are positive integers, and n divides 7 7 1 ) = ( m , n ) , m , n are positive integers, and m is a multiple of n ) =
Now R2 and R;' are described in a parallel way. In this sense, the inverse o f the relation "divides" is the relation "is a multiple." The reader may notice that the symbol Rb'IB] in Definition 2.5(b) for tile inverse image of B under R now also denotes the image of B under R - ' . Fortunately, these two sets are equal.
2. RELATIONS
21
2.9 Lemma The inverse image of B under R is equal to the image of B under
R- l. Notice first that dom R = ran R-' (see Exercise 2.4). Now, a: E Proof. dom R belongs t o the inverse image of B under R if and only if for some E B, (a:, y) E R. But ( X ,y) E R if and only if (y, a:) E R - ' . Therefore, a: belongs to the inverse image of B under R if and only if for some y E B, (y, X) E R- '. i.e.. U if and only if a: belongs to the image of B under R-'. In the rest of the book we often define various relations. i.e.. sets of ordered pairs having some particular property. To simplify our notation, we introduce the following conventions. Instead of {W
)
W
= (a:, y) for some X and y such that P ( x , y ) ) ,
we simply write Y) l P(x,Y)). For example, the inverse of R could be described in this notation as { ( X . ? j ) 1 (y, X ) +E R ) . [As in the general case, use of this notation is admissible only if we prove t h a t there exists a set A such that. for all a: and y. P ( x . y) implies (5, y) f A.] {(X,
2.10 Definition Let R and S be binary relations. The composition of R and S is the relation
So R =
{(X, z )
1 there exists y for which
(a:, y) E R and (y, 2) E S ) .
So (a:, z ) E S o R means that for some y, x h y and ySz. To find objects relat~cl to X in S o R, we first find objects y related t o X in R , and then objects related t o those y in S . Notice that R is performed first and S second, but notation S o R is customary (at least in the case of functions; see Section 3). Several types of relations are of special interest. We introduce some of them in this section and others in the rest of the chapter. 2.11 Definition T h e membership relation on A is defined by
€ A = ((a,b) I a f A, ~ E Aa n, d a E h ) . The identity wlation on A is defined by 1 d =~ {(a,b) I
~ A,E
b f A, a n d a = b ) .
2.12 Definition Let A and B be sets. The set of all ordered pairs whose first coordinate is from A and whose second coordinate is from B is called the cartesian product of A and B and denoted A X B. In other words,
A
X
B = {(a,b) I a . € A a n d b ~B}.
Thus A X B is a relation in which every element of A is related to every element. of B.
CHAPTER 2. RELATIONS, FUNCTIONS, AND ORDERINGS
22
It is not completely trivial t o show that the set A X B exists. However, one gets from Exercise 1.1 that if a E A and b E B, then ( a . b ) E P ( P ( A U B ) ) . Therefore,
A x B = { ( a , b ) f P ( P ( A u B ) ) I U E A a n d b e B). Since P ( F ( A U B ) ) was proved to exist, the existence of A X B follows from the Axiom Schema of Comprehension. [To be completely explicit, we can write,
A x B= We denote A readily:
{W
X
E P(P(AU
B ) ))
W
= (a, b) for some a E A and b E B).]
A by A 2 . The cartesian product of three sets can be introduced AxBxC=(AxB)xC.
Notice that
(using an obvious extension of our notational convention). A X A denoted A 3 . We can also define ternary relations.
X
A is usually
2.13 Definition A ternary relation is a set of unordered triples. More explicitly, S is a ternary relation if for every U E S, there exist X , y, and t such that. U = ( X ,y, 2). IE S G A ~ we , say that S is a ternary relation in A . (Note t h a t a binary relation R is in A if and only if R C A'.)
We could extend the concepts of this section t o ternary relations and also define 4-ary or 17-ary relations. We postpone these matters until Section 5 in Chapter 3, where natural numbers become available, and we are able to defint* n-ary relations in general. At this stage we only define, for technical reasons. unary relations by specifying that a unary relation is any set. A unary relation i n A is any subset of A. This agrees with the genera1 conception that a unary relation in A should be a set of l-tuples of elements of A and with the definition of ( X )= a: in Section 1. Exercises
2.1 Let R be a binary relation; let A = U(UR ) . Prove that (X, E R implies s E A and y E A . Conclude from this that dom R and ran R exist. 2.2 (a) Show that R-' and S 0 R exist. [Hint: R-' (ran R ) x (dorn R). S o R (dom R) X (ran S ) . ] (b) Show that A X B X C exists. 2.3 Let R be a binary relation and A and B sets. Prove: (a) R [ AU B] = RIA] U R [ B J . (b) R [ A n B] C R[A]n R [ B J .
v)
c
c
3. FUNCTIONS
23
c
(d) Show by an example that and _> in parts (b) and (c) cannot be replaced by =, (e) Prove parts (a)-(d) with R-' instead of R. (f) R-' [R[A]]2 A n dom R and R(R- l (B]] 2 B n ran R; give examples where equality does not hold. 2.4 Let R C X X Y. Prove: (a) R[X] = ran R and R-' [Y] = dom R. (b) If a $ dom R, R[{a}] = 0; if b $ ran R, R-' [{b)]= 0. (c) dom R = ran R-'; ran R = dom R-'. (d) ( R - ' ) - l = R. (e) R-' 0 R 2 IddomR;R 0 R-' 3 IdrauR. 2.5 Let X = (0,{g)), Y = P ( X ) . Describe (a) € v , (b) I ~ Y . Determine the domain, range, and field of both relations. 2.6 Prove that for any three binary relations R, S, and T
(The operation o is associative.) 2.7 Give examples of sets X , Y, and Z such that (a) X x Y # Y x X . (b) X X (Y X Z ) # ( X X Y) X Z. ( C) X 3 # X X X 2 (i.e., (X X X ) X X # X X (X X X)]. !Hint for part (c): X = {a) .] 2.8 Prove: (a) A X B = 0 if and only if A = 0 or B = 0. (b) (Al U AS) X B = (Al X B) U (A2 X B); A X (Bl U B2) = (A X B1) U (A X 82). (c) Same as part (b), with U replaced by n, -, and A.
3. Functions Function, as understood in mathematics, is a procedure, a rule, assigning to any object a from the domain of the function a unique object b, the value of the function a t a. A function, therefore, represents a special type of relation, a relation where every object a from the domain is related to precisely one object in the range, namely, to the value of the function at a. 3.1 Definition A binary relation F is called a junction (or mapping, correspondence) if aFbl and aFbz imply bl = b2 for any a , b l , and b2. In other words, a binary relation F is a function if and only if for every a from d o m F there is exactly one b such that aFb. This unique b is called the value of F at a and is denoted F ( a ) or F,. (F(a) is not defined if a 4 dom F.] If F is a function with dom F = A and ran F C B, it is customary to use the notations
CHAPTER 2. RELATIONS, FUNCTIONS, AND ORDERINGS
24
F
:
A
4
B , ( F ( a )I a
range of the function
F
A), (F, I a E A), (Fa)afA for the function F. can then be denoted { F ( a )1 a E A) or
'rl~r
T h e Axiom of Extensionality can be applied to functions as follows. 3.2 Lemma Let F and G be functions. anti F ( x ) = G ( x ) for all X E dom F .
F
=G
2f
and only zf d o ~ F n = do~G n
We leave the proof to the reader. Sitice functions are binary relations, concepts of domain, range. image. inverse image, inverse, and composition can be applied to them. We introduce several additional definitions. 3.3 Definition Let F be a function and A and B sets. ( a ) F is a function on A if dom F = A. ( b ) F is a function into B if ran F C B. (c) F is a function onto B if ran F = B. ( d ) Tlie restriction of the function F to A is the function
F T A = { ( a , b )E F
( U E
A).
If G is a restriction of F to sorne A, we say that F is an extenszon of G. 3.4 Example Let F = { ( X , 1 / x 2 ) I X # 0, X is a real number). F is a function: If aFbl and aFb2, bl = l/u%nd bn = l / a 2 , so bl = b2. Slightly stretching our riotatioilal conventions. we can also write F = ( l/.rL X is a real number, X # 0). The value of F at X , F ( x ) , equals l / x 2 . It is ; l function otl A , where A = { X ) s is n real number and ,2: # 0 ) = dotn F. F is it function illto the set of all real nuitll~ers,but not onto the set ut' all real n u r ~ ~ l ~ r r s . If L3 = { X I X is a real number tud L > 0). then F is onto B. If C = { : I . l 0 <: .E < l ) , tlien f [ C ]= { : E ) x > 1 ) i~rldf'-'[C] = { X 1 x 5 - 1 o r : c > 1 ) . Let us find the composition f o f : f
0
1 there is 2~ for which ( X , y ) E f and ( y , z ) C f } = { ( X . z) I there is y for which X # 0, y = 1 / x 2 ,and y # 0, z = = { ( X , z ) I X # 0 and z = X ' }
f =
=
{(X,z )
(X"
X
1/y2}
# 0).
Notice t h a t f o f is a function. This is not an accident. 3.5 Theorem Let f and y be functions. Then y o f is a functior~. g o j' i s defined at X if and only if f is defined at X and y is defined at f ( x ), 2 . e.,
dorn(g o f ) = dom f 17f-'[domg].
Also, (y o f ) ( X )
=
y ( f ( X ) ) for all
.T
E ciom(g o f ).
26
CHAPTER 2. RELATIONS, FUNCTIONS, AND ORDERINGS
Example Let f = (1/x2 ] X # 0);find f a ' . As f = { ( x , l / x " ) x # 0}, we get f - l = {(1/x2,X) 1 X # 0). f - l is not a function since ( 1 . -1) E f - l . ( 1 , l ) E f - l . Therefore, f is not one-to-one; ( 1 , l ) E f . ( - 1 , l ) E f . (b) Let g = (2x - 1 ) X real); find g - l . g is one-to-one: If 2 x l - 1 = 2x2 - 1. then 2x1 = 2x2 and X I = X ? . y = {(X,y) 1 y = 2x - 1, x real), therefore. g-l = { ( y , x ) I y = 22 - 1, z real).
As customary when describing functions, we express the second coordinate (value) in terms of the first:
Finally, it is usual to denote the first ("independent") variable x and the second ("dependent") variable y. So we change notation:
l
S-' = {(x,Y) Y =
x+l X +1 7 X, real} = (' r real).
3.10 Definition (a) Functions f and g are calIed compatible if f ( X )= y(x) for all X E dom f n dom g. (b) A set of functions F is called a compatible system of functions if any two functions f and y from F are compatible. 3.11 Lemma (a) Functions f and y are compatible if and only if f U y is a function. (6) f i n c t i o n s f und y are compatible if and only if f 1 (dom f n d o m g ) = g (dom f n domg).
We leave the easy proof to the reader, but we prove the following. 3.12 Theorem If F is a compatible system of functions. then U F is a functzun with d o m ( U F ) = U{dom f ) f E F ) . The function U F extends all f E F.
Functions from a compatible system can be pieced together t o form a single function which extends them alt. Clearly, U F is a relation; we now prove that it is a function. If Proof. (a, bl) f U F and (a, b z ) f U F, there are functions f l , f 2 E F such that ( a , bl) E f l and (a, bz) E f2. But f l and f2 are compatible, and a E dom f l n dorn f 2 ; so b l = f(a1) = f (a21 = b2. I t is trivial t o show that X f dom(U F ) if and only if X E dom f for some ffF. U 3.13 Definition Let A and B be sets. T h e set of all functions on A into B is exists; this is done in Exercise denoted B A . Of course, we have t o show that 3.9.
3. FUNCTIONS
27
It is useful to define a more general notion of product of sets in terms of functions. Let S = (S, I i f I ) be a function with domain I . The reader is probably familiar mainly with functions possessing numerical values; but for us, the values S, are arbitrary sets. We call the function (S, I i E I) an indexed system of sets, whenever we wish to stress that the values of S are sets. Now let S = (Sj1 i E I ) be an indexed system of sets. We define the product of the indexed system S as the set S = {f
If
is a function on I and f, E S, for all i
E I)
Other notations we occasionally use are
The existence of the product of any indexed system is proved in Exercise 3.9. The reader is probably curious to know how this product is related to the previously defined notions A X B and A X B X C. We return to this technical problem in Section 5 of Chapter 3. At this time, we notice only that if the indexed system S is such that Si = B for all i E I , then
The "exponentiation" of sets is related t o t'multiplication" of sets in the same way as similar operations on numbers are related. We conclude this section with two remarks concerning notation. U A and A were defined for any system of sets A (A # Q) in case of intersection). Often the system A is given as a range of some function, i.e.. of some indexed system. (See Exercise 3.8 for proof that any system A can be so presented, if desired.) We say that A is indexed by S if
n
where S is a function on I, It is then customary to write
and similarly for intersections. In the future, we use this more descriptive notation. Let f be a function on a subset of the product A X B. It is customary to denote the value of f at ( X ,y) f A X B by f( X ,y) rather than f ((X,Y)) and regard f as a function of two variables X and y.
CHAPTER 2. RELATIONS, FUNCTIONS, AND ORDERINGS
28 Exercises
3.1 Prove: If ran f C dom g, then dom(y o f ) = dorn f . 3.2 T h e functions f , : i = 1 , 2 , 3 are defined as follows:
fl
= (2x - l l X real),
f2
=
-
f3
3.3
3.4
(6 1 X > O), {l/x l z real,
X
# 0).
Describe each of the following functions, and determine their ciomaitls and ranges: f2 0 f l , f l 0 f ~ f ~, 0 . f ~f l U f3. Prove t h a t the functions f l , f.2, f 3 from Exercise 3.2 are one-to-one, atitl find the inverse functions. 111 each case, verify that dorn f , = ran( f , - l ) . ran f , = dom( f i L ) . Prove: (a) If f is invertible, f - o f = Idno,,,I , f o f - l = Id ,, f . (b) Let f be a function. If there exists a function g such t h a t y o f = IddOknf then f is invertible and f = y r ran f . If there exists function h such t h a t f 0 h = Id,,, f then f may fail t o be invertible. Prove: If f and y are one-to-one functions, y o f is also a one-to-one function, and (y 0 f ) - ' = f - ' 0 y-l. T h e images and inverse images of sets by functions have the properties exhibited in Exercise 2.3, but some of the inequalities cat1 now be replac.etl by equalities. Prove: (a) If f is a function, f l [ A n B] = f l [ A ] n f L 1 \ B ] . (b) If f is a function, f [A - B] = f [A] - f - [B]. Give an example of a fuliction f and a set A such t h a t f n A" f [ A. Show t h a t every system of sets A can be indexed by a function. [Hint: Take I = A and set S, = i for all i E A.] (a) Show that the set B A vxists. [Hint: E P(A x B).] (b) Let (S, I i E I ) be at] indexed system of sets: show t h a t S, exists. (Hint: S* 5 P ( I X UzcIS,).\ Show that unions and intersections satisfy the following general f o r ~ nof' the associative law:
'
-'
3.5
3.6
-'
3.7 3.8 3.9
-'
n,,,
nlE,
3.10
nEU 5.'
'
C E S aEC
if S is a nonempty system of nonempty sets. 3.11 Other properties of unions and intersections can be generalized similarly. De Morgan Laws:
4. EQUIVALENCES AND PARTITIONS
Distributive Laws:
3.12 Let f be a function. Then
!-'I
U Fa] aE A
=
U f-'[Fa], aEA
If f is one-to-one, then in the third formula can be replaced by 3.13 Prove the following form of the distributive law:
=.
assuming t h a t Fa,,, n Fa,bz= Q) for all U € A and bl , b2 € B , h l # b2- [Hint: Let L be the set on the left and R the set on the right. F ~ , , (c ~ )U ,, F ~ .hence ~; F ~ , ~ C( ~ , F ~ .=~ L, ) so finally, R C_ L. To prove t h a t L C R, take any n: E L. P u t ( a ,6) E f if and only if z E Falb,and prove t h a t f is a function on A into B for which X E (loEA so z E R.]
naEA
4.
naEA(u,,,
Equivalences and Partitions
Binary relations of several special types enter our considerations more frequently. 4.1 Definition Let R be a binary relation in A. (a) R is called reflexive in A if for all U A, a R u . (b) R is called spmmetric in A if for all a , b E A, uRb implies bRa. (c) R is called transitive in A if for all a , 6, c E A, u R b and bRc imply uR.cm. (d) R is called a n equivulence on A if it is reflexive, symmetric, and t,ransit,ivc. in A .
30
CHAPTER 2. RELATIONS, FUNCTIONS, AND ORDERINGS
4.2 Example
(a) Let P be the set of all people living on Earth. We say t h a t a person p is equivalent to a person q (p q ) if p and q both live in the same country. Trivially, = is reflexive, symmetric, and transitive in P. Notice that the set P can be broken into classes of mutually equivalent elements; all people living in the United States form one class, all people living in France are another class, etc. All members of the same class are mutually equivalent; members of different classes are never equivalent. T h e equivalence classes correspond exactly to different countries. (b) Define a n equivalence E on the set Z of all integers as follows: z E y if and only if X - y is divisibIe by 2. (Two numbers are equivalent if their difference is even.) T h e reader shouId verify 4.l(a)-(c). Again, the set. Z can be divided into equivalence classes under (or, as is customary to say, rnodulo) the equivalence E . In this case, there are two equivalence classes: the set of even integers and the set of odd integers. Any two even integers are equivalent; so are any two odd integers. But an even integer cannot be equivalent to an odd one. T h e situation encountered in the previous exampIes is quite general. Any equivalence on A partitions A into equivalence classes; conversely, given a suitable partition of A, there is a n equivalence on A determined by it. T h e foIIowing definitions and theorems establish this correspondence. 4.3 Definition Let E be an equivalence on A and let a E A. The equi~)ale~~cc~ class o f a rnodulo E is the set
4.4 Lemma Let a , b E A. (a) a i s equivalent to b rnodulo E if and only if [ a ]=~[bIE. (b) a is not equivalent to b rnodulo E if and only if [ a ]n~l b ] = ~ 4)-
Proof. i.e., xEa. By transitivity, x E a and ( a ) (1) Assume t h a t aEb. Let z aEb imply xEb, i.e., X E [bIE. Similarly, X E [bIE implies X E [a]E (bEa is true because E is symmetric). So [a]E = IbIE. (2) Assume t h a t [a]E = [b]E . Since E is reflexive, aEa, so a E [ a ] ~But . then a E [bIE,that is, aEb. (b) (1) Assume aEb is not true; we have to prove [ a ]f7 ~[bIE = Q). If not, there is X E [alEn IbIE; SO xEa and xEb. But then, using first symmetry and then transitivity, a E x and xEb, so aEb, a contradiction. (2) Assume finally that [a]E n [b]E = 4). If a and b were equivalent rnodulo E , aEb would hold, so a € [bIE. But also a E [aIE, implying [ u ] fl ~ [bjE # 0 , a contradiction. U
4. EQUIVALENCES AND PARTITIONS
31
4.5 Definition A system S of nonempty sets is called a partition of A if (a) S is a system of mutually disjoint sets, i.e., if C E S, D E S, and C # D, then C f~D = 8, (b) the union of S is the whole set A, i.e., US = A. 4.6 Definition Let E be an equivalence on A. The system of all equivalence classes modulo E is denoted by A/E; so A/E = { ( a J EI a E A).
4.7 Theorem Let E be an equivalence on A; then A/E is a partition of A.
Proof. Property (a) foIlows from Lemma 4.4: If ( u ]#~[bjE, then a and b are not E-equivalent, so f~ [bIE = Q). To prove (b), notice that U A/E = A because a E Notice also that no equivalence class is empty; surely a t least a E [a)& • We now show that, conversely, for each partition there is a corresponding equivalence relation. For example, the partition of people by their country of residence yields the equivalence from (a) of Example 4.2. 4.8 Definition Let S be a partition of A. The relation Es in A is defined by
Es = {(a, b) E A
X
A I there is C E S such that a
E
C and b E C).
Objects a and b are related by Es if and only if they belong to the same set from the partition S. 4.9 Theorem Let S be a partition of A; then Es is an equivalence on A.
Pmof. (a) Reflezivity. Let a E A; since A = US, there is C E S for which a E C , so ( a ,a ) E Es. (b) Symmetry. Assume aESb;then there is C E S for which a E C and b E C. Then, of course, b E C and a E C, so bEsa. (c) Transitivity. Assume aEsb and bEsc; then there are C E S and D E S such that a E C and b E C and b E D and c E D. We see that b E C n D, so C n D # 8. But S is a system of mutually disjoint sets, so C = D. Now we have a E C , c E C, and so aEsc. The next theorem should further cIarify the relationship between equivalence~and partitions. We Ieave its proof to the reader. 4.10 Theorem (a) If E is an equivalence on A and S = A/E, then Es = E. (b) If S is a partition of A and Es is the corresponding equivalence, th.en. A/Es = S.
32
CHAPTER 2. RELATIONS, FUNCTIONS, AND ORDERINGS
So equivalence relations and partitions are two different descriptions of the same "mathematical reality." Every equivaIence E determines a partition S = A / E . The equivalence Es determined by this partition S is identical with the original E. Conversely, each partition S determines an equivalence Es; when we form equivalence classes modulo Es, we recover the original partition S. When working with equivalences or partitions, it is often very convenient t o have a set which contains exactly one "representative" from each equivalence class.
4.11 Definition A set X C A is called a set of representatives for the equivalence Es (or for t h e partition S of A ) if for every C E S, X n C = { a ) for sotne a E C. Returning to Example 4.2 we see that in (a) the set X of a11 Heads of State is a set of representatives for the partition by country of residence. T h e set X = {0, l } could serve as a set of representatives for the partition o f integers into even and odd ones. Does every partition have some set of representatives'? Intuitively. the answer may seem t o be yes, but it is not possible t o prove existence of some such set for every partition on the basis of our axioms. We return t o this problem when we discuss the Axiom of Choice. For the moment we just say that. for rnany mathematically interesting equivalences a choice of a natural set of' representatives is both possible and useful. Exercises 4.1 For each of t h e following relations, determine whether they are reflexive. symmetric, or transitive: (a) Integer X is greater than integer y. (b) Integer n divides integer m. (c) z # y in the set of all natural numbers. (d) C and C in P ( A ) . (e) Q) in Q). (f) Q) in a nonempty set A. 4.2 Let f be a function on A onto B. Define a relation E in A by: uEb if and only if f (a) = f (b). (a) Show that E is an equivalence relation on A. (b) Define a function cp o r 1 .4/E onto B by p ( [ a j E ) = f (u.) (verify t.hilt ~ ( [ uE )] = cp([ul]E ) if [ a ]E = l a ' ] ~ ) . (c) Let j be the function on A onto A / E given by j ( u ) = [ u J ~ ; Show . that rp o j = f. 4.3 Let P = { ( r ,y) E R X R ] r > O), where R is the set of all real numbers. View elements of P as polar coordinates of points in the plane, and define a relation on P by: (r, y ) r- ( r ' , y ' ) if and only if r = r' and 7 - 7' is an integer multiple of 27r. Show that .- is an equivalence relation on P. Show that each equivalence class contains a unique pair (r.y ) with 0 5 y < 27r. T h e set of all such pairs is therefore a set of representativ~s for .-.
5. ORDERINGS
Orderings
5.
Orderings are another frequently encountered type of relation. 5.1 Definition A binary relation R in A is antisymm.etric if for all uRb and bRa imply a = b.
U,
b
A.
5.2 Definition A binary relation R in A which is reflexive, antisymmetric. and transitive is called a (partial) ordering of A. The pair ( A , R ) is called an ordered
set.
aRb can be read as "a is less than or equal t o b" or "6 is greater than or equal t o a" (in the ordering R). So, every element of A is less than or equal t o itself. If a is less than or equal to 6 , and, at the same time, b is less than or equal t o U , then a = b. FinaIIy, if a is less than or equal to b and b is less than or equal t o c, U has to be less than or equal to c. 5.3 E x a m p l e (a) 5 is an ordering on the set of all (natural, rational, real) numbers.
y and (b) Define the relation C A in A as follows: X C A y if and only if z X ,y E A. Then S A is an ordering of the set A. (c) Define the relation _>A in A as follows: X 2~ y if and only if z 2 y and X , y E A. Then 2~ is also a n ordering of the set A. (d) The relation ( defined by: n ( m if and only if n divides m is a n ordering of the set of all positive integers. (e) The relation IdA is an ordering of A. T h e symbols 5 or $ are often used t o denote orderings. A different description of orderings is sometimes convenient. For example. instead of the relation 5 between numbers, we might prefer to use the relation < (strictly less). Similarly, we might use C A (proper subset) instead of L A . Any ordering can be described in either one of these two mutually interchangeable ways. 5.4 Definition A relation S in A is asymmetric if aSb implies that bSa does not hold (for any a, b E A). That is, aSb and bSu can never both be true. 5.5 Definition A relation S in A is a strict ordering if it is asymmetric and
transitive. We now establish relationships between orderings and strict orderings. 5.6 T h e o r e m (a) Let R be a n ordering of A; then the relation S defined in A by
aSb i f and only i f aRb and a # b is a strict ordering of A.
CHAPTER 2. RELATIONS, FUNCTIONS, AND ORDERINGS
34
(b) Let S be a strict ordering of A; then the relation R defined in A by
aRb if and only if
aSb or a = b
is an ordering of A. We say that the strict ordering S corresponds to the ordering R and vice versa. Proof. (a) Let us show that S is asymmetric: Assume that both aSb and bSa hold for some a , b E A. Then also aRb and bRa, so a = b (because R is antisynlmetric). That contradicts the definition of aSb. We leave the verification of transitivity of S to the reader. (b) Let us show that R is antisymmetric: Assume that aRb and bRa. Becausc aSb and bSa cannot hold simultaneously ( S is asymmetric). we conclude that a = b. Reflexivity and transitivity of R are verified similarly.
5.7 Definition Let a, b E A, and let 5 be an ordering of A. We say that a and b are comparable in the ordering 5 if a < b or b 5 a. We say that a and b are incomparable if they are not comparable (i.e., if neither a 5 b nor b 5 a holds). Both definitions can be stated equivalently in terms of the corresponding strict ordering <; for example, a and b are incomparable in < if a # b and neither a < b nor b < a holds. 5.8 (a) (b) (c) (d)
Example Any two reaI numbers are comparable in the ordering 5 . 2 and 3 are incomparable in the ordering 1. Any two distinct a, 6 E A are incomparable in IdA. If the set A has at least two elements, then there are incomparable elements in the ordered set (P(A), C p ( A ) )
5.9 Definition An ordering .I: (or <) of A is called linear or total if any two elements of A are comparable. The pair (A, <) is then called a linearhj ordered
set. So the ordering 5 of positive integers is total, while I is not. 5.10 Definition Let B C_ A, where A is ordered by any two elements of B are comparable.
5. B is a chain in A if
For example, the set of all powers of 2 (i.e., {2', 2', 2', Z 3 , . . . ) ) is a chain in the set of all positive integers ordered by 1. A problem that arises quite often is to find a least or greatest element among certain elements of an ordered set. Closer scrutiny reveals that there are several different notions of "least" and "greatest."
5. ORDERINGS
35
5.11 Definition Let 5 be an ordering of A, and let B C_ A. (a) b E B is the least element of B in the ordering < if b < X for every X E B. (b) b E B is a minimal element of B in the ordering 5 if there exists no X E B such that X 5 b and X # b. (a') Similarly, b E B is the greatest element of B in the ordering < if. for every X € B,xIb. (b') b E B is a maximal element of B in the ordering . Iif there exists no X E B such that b 5 X and X # b. 5.12 E x a m p l e Let N be the set of positive integers ordered by divisibility relation 1. Then 1 is the Ieast element of N, but N has no greatest element. Let B be the set of a11 positive integers greater (in magnitude) than 1: B = 2 3 4 . . . } Then B does not have a least element in 1 (e.g., 2 is not the Ieast element because 2 1 3 fails), but it has (infinitely) many minimal elements: numbers 2, 3, 5, etc. (exactly a11 prime numbers) are minimal. B has neither greatest nor maximal elements. We list some of the properties of least and minimal elements in Theorem 5.13. The proof is left as an exercise.
5.13 T h e o r e m Let A be ordered by 5 , and let B C A. (a) B has at most one least element. (b) The least element of B (if it exists) is also minimal. (c) If B is a chain, then every minimal element of B is also least. The theorem remains true if the words "least" and "minimal" are replaced by "greatest" and "maximal", respectively.
<
5.14 Definition Let be an ordering of A, and let B C A. (a) a E A is a lower bound of B in the ordered set (A, <) if a 5 X for all X E B. (b) a E A is called an infimum of B in (A, 5 ) [or the greatest lower bound of B in (A, 5 ) )if it is the greatest element of the set of all lower bounds of B in (A, 5 ) . Similarly, (a') a E A is an upper bound of B in the ordered set (A, 5 ) if X 5 a for all
B. (b') a E A is called a supremurn of B in (A, <) [or the least upper bound of B in (A, 5 ) )if it is the least element of the set of all upper bounds of B in ( A ?5 ) . X E
Note that the difference between the least element of B and a lower bound of B is that the second notion does not require b E B. A set can have many lower bounds. But the set of all lower bounds of B can have a t most one greatest element, so B can have a t most one infimum. We now summarize some properties of suprema and infima.
CHAPTER 2. RELATIONS, FUNCTIONS, AND ORDERINGS
36
5.15 Theorem Let ( A , 5 ) be an ordered set and let B C A. (a) B has at most one infimum. (b) if b is the least element of B , then b is the infimum of B . (c) if b is the infimum of B and b E B , then b is the least element of B (d) 6 E A is an infimum of B in ( A , <) if and only if (i) b < x f o r a l l x ~ B .
(zi) ifbt
<X
for all
X E
B, then b1 < b.
The theorem remains true if the words "least" and "infimum" are replacott by the words "greatest" and L'supremum"and "<" is replaced by ">" in (i) arl(l (ii).
Proof. (a) We proved this in the remark preceding the theorem. (b) The least element b of B is certainly a lower bound of B. If b' is any 1owc.r
<
bound of B, bt b because b C B . So b is the greatest element of t , h s~r t of all lower bounds of B. ( c ) This is obvious. (d) This is only a reformulation of the definition of infimum. D We use notations inf(B) and sup(B) for t h e infimum of B and t.he suprcrrlu~ll of B, if they exist. If B is linearly ordered, we also use ~ n i n ( B )and rnax(B) t ( ~ denote the minimal ([east) and the maximal (greatest) elements of B, if the). exist . 5.16 Example Let 5 be the usual ordering of the set of real numbers; let, B1 = { X 1 0 < X < I ) , B2 = { X 1 0 5 X < l } , B3 = { X ( z > 0 ) : anti B4 = { X 1 X < 0). Then B1 has no least element and no greatest element, but ariy b 5 0 is a lower bound of B 1 , so 0 is the greatest lower bound of B1; i.e.. 0 = inf(B). Similarly, any b > l is an upper bound of B ] , so 1 = s u p ( B l ) . T h e set B2 has a least element; so 0 = min(B2) = inf(B2); it does not Iiczve w g r ~ a t e s eIement. t Nevertheless, sup(B2) = 1. The set Bg has neither a greatest dernent nor a supremum (actually Bs has no upper bound in 5 ) :of' colirsc.. inf(Bs) = 0. Similarly, B4 has no lower bounds, hence no infimurn. 5.17 Definition An isomorphism between two ortiered sets ( P . <) arlrl (Q. < j is a one-to-one function It with d o m i ~ i P ~ land range Q such that for all p , . p2 E F'
p1 < p2
if and only if
h ( p l ) 4 h(p;!)
If an isomorphism exists between (F', <<)and (Q, -i),then (P. <) and (Q, 4)art. isomorphic. We study isomorphisms in a more general setting in Chapter 3. At this yoi~lt,let us make the following observation.
5. ORDERINGS
37
5.18 Lemma Let (P, <) and (Q, 4)be linearly ordered sets, and Iet h be a oneto-one function with domain P and range Q such that h(pr) 4 h(p2) whenever p1 < PP. Then h is a n isomopphism between (P. <) and (Q, +).
Proof. We have t o verify that if p l , p2 E P are such that k(pI ) 4 h(pz), then p1 < p2. But if p1 is not less than p2, then, because < is a linear ordering: of P, either p, = p2 or p2 < pl. If p1 = p2, the11 h(pl ) = h(p2). and if p2 < 111:. then h(p2) 4 h(pl), by the assumption. Either case contradicts h(pl) + h(pa).
a
Exercises
5.1 (a) Let R be an ordering of A, S be the corresponding strict ordering of A, and R* be the ordering corresponding t o S. Show that R* = R. (b) Let S be a strict ordering of A, R be the corresponding ordering, and S* be the strict ordering corresponding t o R. Then S* = S. 5.2 State the definitions of incomparable elements. maxima[, minimal, greatest, and least elements and suprema and infirna in terms of strict orderings. 5.3 Let R be an ordering of A. Prove t h a t R - ' is also an ordering of A, and for B A, (a) a is the least element of B in R-' if and only if U is the greatest element of B in R; (b) similarly for (minimal and maximal) and (supremum and infimum). 5.4 Let R be a n ordering of A and let B C A . Show that R n B2 is an ordering of B. 5.5 Give exampIes of a finite ordered set (A. <) and a subset B of A so that (a) B has no greatest element. (b) B has no least element. (c) B has no greatest element, but B has a supremurn. (d) B has no supremurn. 5.6 (a) Let (A, <) be a strictly ordered set and b 4 A. Define R relation 4 in B = A U ( b ) as follows:
c
x+y
ifandonlyif
(x,yfAandx
Show that 3 is a strict ordering of B and 4 n A 2 =<. (Intuitively, 4 keeps A ordered in the same way as < and makes b greater than every element of A.) (b) Generalize part (a): Let (A1, < l ) and (A2, < 2 ) be strict orderings, AI n A2 = 0. Define a relation 3 on B = A I U A2 a s follows: x 4 y
ifandonlyif x , y ~ A ~ a n d z < ~ y or X,y E A2 and X < 2 y or X E A, and y E A2.
CHAPTER 2. RELATIONS, FUNCTIONS. AND ORDERINGS
38
a Show t h a t 4 is a strict ordering of B and 4 n ~ =: < l . + n ~ =
U
f .I:g
if and only if
f C_ g
<
(a) Prove that is a n ordering of Fn(X, Y ) . (b) Let F C_ Fn(X, Y). Show that sup F exists if and only if F is ;t compatible system of functions; then sup F = U F . 5.10 Let A # 0;let Pt(A) be the set of all partitions of A. Define a relation < in Pt(A) by S1 4 Sz if and only if for every C E S1 there is D E S2such that C
(We say that the partition Sr is a refinement of the partition S2if S*S S2 holds.) (a) Show t h a t 4 is an ordering. (b) Let S1,S2 E Pt(A). Show that {S1,S2)has an infimum. [Hiat: Define S = {C n D I C E S1and D E Sz).j How is the equivalence relation Es related t o the equivalences Es, and Es, ? ( c ) Let T C Pt(A). Show that inf T exists. (d) Let T P t ( A ) . Show that sup T exists. [Hint: Let T 1be the set. of all partitions S with the property that every partition from T is a refinement of S. Show that sup T' = inf T . ] Show that if (P, <) and (Q, 4 ) are isomorphic strictly ordered sets and < is a linear ordering, then 4 is a linear ordering. The identity function 011 P is an isomorphism between (P.<) and ( P . < ) . If h is isomorphism between (P,<) and (Q, +), then h-' is an isomorphism between (Q, +) and (P,<). If f is an isomorphism between ( P l , and ( P 2 ,<2), and if g is an isomorphism between (P2,< 2 ) and ( P 3 ,< 3 ) , then g o f is an isomorphism between ( P l ,< l ) and ( P 3 ,
c
5.1 1 5.12 5.13 5.14
cD
Chapter 3
Natural Numbers 1.
Introduction to Natural Numbers
In order to develop mathematics within the framework of the axiomatic set theory, it is necessary to define natural numbers. We all know natural numbers intuitively: 0, 1, 2, 3, . . . , 17, . . . , 324, etc., and we can easily give examples of sets having zero, one, two, or three elements: 0 has 0 elements. ( 0 ) or, in general, {a) for any a , has one element. (8, or (({0)), {({0)))), or, in general, {a,b) where a # b , has two elements, etc. The purpose of the investigations in this section is to supplement this intuitive understanding by a rigorous definition. To define number 0, we choose a representative of all sets having no elements. But this is easy, since there is only one such set. We define 0 = 0. Let us proceed to sets having one element (singletons): (01, {(g)), { (0, {g})); in general, {X). How should we choose a representative? Since we already defined one particular object, namely 0, a natural choice is (0). So we define
{a)),
Next we consider sets with two elements: (0, {0)), ((01, (0, (0)}), ((01, ((0)) ), etc. By now, we have defined 0 and l, and 0 # 1. We single out a particular two-element set, the set whose elements are the previously defined numbers 0 and 1: 2 = {O, l } = {0, {0}}. It should begin to be obvious how the process continues: 2) = (0, (01, (0, (0)) } 4 = { O l l J , 3) = ( 0 7 {@l,(0, {@l), 5 = {0,1,2,3,4) etc. 3=
{a*(01, {0,{0}}}}
CHAPTER 3. NATURAL NUMBERS
40
T h e idea is simply to define a natural number n as the set of all smaller natural numbers: 0 l , .. . , n - l . In this way, n is a particular set of n elements. This idea still has a fundamental deficiency. We have defined 0, 1 , 2 , 3, 4, and 5 and could easily define 17 and-not so easiIy-324. But no list of such definitions teIIs us what a natural number is in general. We need a statement, of the form: A set n is a natural number i f . . . . We cannot just say t h a t a set rt is a natural number if its elements are a11 the smaller natural numbers, because such a "definition" would involve the very concept being defined. Let us observe the construction of the first few numbers again. We defined 2 = (0, 1 ) . To get 3, we had t o adjoin a third element t o 2, namely, 2 itself:
Similarly, 4 5
3 u ( 3 ) = { 0 , 1 , 2 ) u( 3 1 , = 4 u (41, etc. =
Given a naturaI number n, we get the "next" number by adjoining one more element to n, namely, n itseIf. The procedure works even for 1 and 2: 1 = O U { O ) , 2 = 1 U { l ) , but, of course, not for 0, the least natural number. These considerations suggest the following. 1.1 Definition T h e successor of a set
X
is the set S ( x ) = X U ( 2 )
Intuitively, the successor S ( n ) of a naturaI number n is the "one bigger" number n+ l . We use the more suggestive notation n + l for S ( n ) in what follows. We Iater define addition of natural numbers (using the notion of successor) ill such a way t h a t n 1 indeed equals the sum of n and 1. Until then, it is just 21 notation, and no properties of addition are assumed or implied by it. We can now summarize the intuitive understanding of natural numbers &S folIows: (a) 0 is a natural number. (b) If n is a natural number, then its successor n + 1 is also it natural riunlb~i.. (c) All natural numbers are obtained by application of (a) and (b). i.e.. by starting with 0 and repeatedly applying the successor operation: 0. 0 + 1 = 1,1+1=2,2+1=3,3+1=4,4+1=5 , . . . etc.
+
1.2 Definition A set I is caIled inductive if (a) 0 E I. (b) If n E I , then ( n+ 1) E I. An inductive set contains 0 and, with each element, also its successor. A(.cording t o (c), an inductive set should contain all natural numbers. 'fhe precis? meaning of (c) is t h a t the set of natural numbers is a n inductive set which [:ontnins no other elements but natural numbers, i.e., it is the smallest i~~clucstivt. set. This leads t o the following definition.
1. INTRODUCTION T O NATURAL NUMBERS 1.3 Definition The set of all natural numbers is the set
N =
{X
IX
E I for every inductive set
The elements of N are called natural numbers. Thus a set if and only if it belongs to every inductive set.
I). X
is a natural number
We have to justify the existence of N on the basis of the Axiom of Cornprehension (see the remarks following Example 3.12 in Chapter l ) , but that, is easy* Let A be any particular inductive set,; the11 cleitrly
N =
{X E
A
I X E I for every inductive set I).
The only remaining question is whether there are any inductive sets at all. The intuitive answer is, of course, yes: the set of natural numbers is a prime example. But a careful look a t the axioms we have adopted so far indicates that existence of infinite sets (such as N ) cannot be proved from them. Roughly, the rwson is that these axioms have a general form: "For every set X, there exists a set Y such that . . . ,!' where, if the set X is finite, the set Y is also finite. Since the only set whose existence we postulated outright is 0,which is finite, all the other sets whose existence is required by the axioms are also finite. (See Section 2 of Chapter 4 for a more rigorous elaboration of these remarks.) The point is that we need another axiom. The Axiom of Infinity.
An inductive set exists.
Some mathematicians object to the Axiom of Infinity on the grounds that a collection of objects produced by an infinite process (such as N ) should not be treated as a completed entity, However, most people with some mathematical training have no difficulty visualizing the collection of natural numbers in that way. Infinite sets are basic tools of modern mathematics and the essence of set. theory. No contradiction resulting from their use has ever been discovered in spite of the enormous body of research founded on them. Therefore, we treat the Axiom of Infinity on a par with our other axioms. We now have the set of natural numbers N a t our disposal. Before proceeding further, let us check that the set N is indeed inductive. 1.4 Lemma N is inductive. If I is any inductive set, then N
c I.
Proof. 0 E N because 0 E I for any inductive I. If n E N , then n E I for any inductive I, so (72 + 1) E 1 for any inductive I, and consequently (n + 1) E N. This shows that N is inductive. Tllc second part of the Lemma follows immediately from the definition of N. The next step is to define the ordering of naturaI numbers by size. As it is our guiding idea to define each natural number as a set of smaller natural numbers, we are led to the following.
CHAPTER 3. NATURAL NUMBERS
42
1.5 Definition The relation < on N is defined by: m
< n if and only if m
E n.
It is of course necessary to prove that < is indeed a linear ordering and that the ordered set (N,<) actuaIly has the properties that we expect natural numbers t o have. The needed theory is developed in the rest of this chapter. Exercises
1.1
2.
X
c S(x) and there is no z such that c z C S ( x ) . X
Properties of Natural Numbers
In the preceding section we defined the set N of natural numbers to be the Ieast set such that (a) 0 E N and (b) if n E N, then (n + 1) E N . We also defined m < n to mean m E n. It is the goal of this section to show that the above-mentioned concepts really behave in the familiar ways. We begin with a fundamental tool for study of natural numbers, the wellknown principle of proof by mathematical induction. The Induction Principle. Let P ( x ) be a property (possibly with parameters). Assume that (a) P(0) holds. (b) For all n E N , P ( n ) implies P ( n 1). Then P hoIds for all natural numbers n.
+
Proof. This is an immediate consequence of our definition of N. The assumptions (a) and (b) simply say that the set A = ( n E N I P ( n ) ) is I7 inductive. N A follows.
c
The following lemma estabIishes two simple properties of naturaI numbers and gives an exampIe of a simple proof by induction.
2.1 Lemma (2) 0 5 n for all n E N. (iz) F o r a l l k , n ~N , k < n + l 2f and o n l y z f k < n o r k = n . (i) We let P ( x ) to be the property "0 5 X" and proceed to establish Proof the assumptions of the Induction Principle. (a) P(0) hoIds. P(0) is the statement "0 5 0," which is certainly true (0 = 0). (b) P ( n ) impIies P ( n + 1). Let us assume that P ( n ) holds, i.e., 0 n. This ) means, by definition of <, that 0 = n or 0 E n. In either case, 0 E n ~ { n = n + 1 , so 0 < (n + 1) and P ( n + 1) holds. Having proved (a) and (b) we use the Induction Principle to conclude that P ( n ) hoIds for all n E N, i.e., 0 5 n for all n E N. (ii) This part does not require induction. It suffices to observe that k E 0 n U { n ) if and only if k E n or k = n.
<
2. PROPERTIES OF NATURAL NUMBERS
43
The proof of the next theorem provides several other, somewhat more complicated examples of inductive proofs. 2.2 Theorem ( N , <) is a linearly ordered set. Pm0f. (i) The relation < is transitive on N. We have to prove that for all k,m,n E N , k < m and m < n imply k < n. We proceed by induction on n; i.e., we use as P ( x ) the property "for all k , m € N, i f k < m a n d m < X , then k < x . " (a) P ( 0 ) holds. P(0) asserts: for all k, m E N, if k < m and m < 0, then k < 0. By Lemma 2.l(i), there is no m E N such that m < 0, so P(0) is trivially true. (b) Assume P(n), i.e., assume that for all k, m E N, if k < m and m < n,then k < n. We have to prove P ( n l), i.e., we have to show that k < m and m < (n + 1) imply k < (n + 1). But if k < m and m < (n l ) , then by Lemma 2.l(ii) m < n or m = n. If m < n , we get k < n by the inductive assumption P ( n ) . If m = n, we have k < n from k < m. In either case, k < n + 1 by Lemma 2.l(ii). This establishes P ( n 1). The Induction Principle now asserts the validity of P ( n ) for all n E N ; this is precisely the statement of transitivity of ( N , < ) . (ii) The relation < is asymmetric on N. Assume that n < k and k < n. By transitivity, this implies n < n. SO we only have to show that the latter is impossible. We proceed by induction. Clearly, 0 < 0 is impossible (it would mean that 0 E 0). Let us assume that n < n is false and prove that (n + 1) < (n + 1) is false. If (n + 1) < (n + 1) were true, we would have either n + 1 < n or n + l = n ILemma 2.l(ii)]. Since n < n + 1 holds by Lemma 2.l(ii) and we have proved transitivity of < previously, we conclude that n < n, thus contradicting our inductive assumption (to wit, that n < n is false). We have now established both (a) and (b) in the Induction Principle [with P(x) being "X < X is false"]. We can conclude that n < n is impossible for any n E N. We now know that < is a (strict) ordering of N. It remains to prove (iii) < is a linear ordering of N. We have to prove that for all m, n E N either m < n or m = n or n < m. We proceed by induction on n. (a) For all m E N, either m < 0 or m = 0 or 0 < m. This follows immediately from Lemma Z.l(i). (b) Assume that for all m E N, either m < n or m = n or n < m. We have to prove an analogous statement with (n + 1) in place of n. If m < n, then m < (n + 1) by Lemma 2.1(ii) and transitivity. Similarly, if m = n then m < ( R + 1). Finally, if n < m, we would like to conclude that n 1 5 m . This would show that, for all m E N, either m < (n + 1) or m = ( n 1) or (n + 1) < m, establishing (b), and completing the proof. So we prove that if n < m, then (n + 1) 5 m holds for all m E N by induction on m [ r is ~
+
+
+
+
+
C H A P T E R 3. N A T U R A L NUMBERS a parameter; that is, we are going to apply the Inductiorl Principle to thv property P ( x ) : "If n < X, then n + 1 5 X"]. If m = 0, the statement .'if n < 0, then n l < 0" is true (since its assumption must be false). Assume P ( m ) , i.e., if n < m, then ( n 1) 5 m. To prove P(nt + l ) , assume that. 71 < m 1; then n < m or n m . If n < m, n + 1 m by the inductive assumption, and so n + 1 < m + l . 1 f n = m then of course n + 1 = nl + 1 . In either case P ( m + 1) is proved. We conclude that P ( m ) holds for ;ill m E N , as needed. Finally, we finish the proof of (iii) by observing that the assumptions ( a ) and (b) of the Induction Principle have now been established. 0
+
+
+
-
<
The reader should study the preceding proof carefully for its varied applications of the Induction Principle. In particular, the proof of part (iii) is t i r i example of "double induction": In order to prove a statement depending on two variables, m and n , we proceed by induction on one of them ( n ) ; t h t proof of the induction assumption (b) then in itself requires induction on the other variable m (for fixed n). See Exercise 2.13. Before proceeding further, we state and prove another version of the Inclt~ction Prirlciple that is often rnore convenient. The Induction Principle, Second Version. Let P(z) be a property (possiljtv with parameters). Assume that. for all 1%E N I
(*>
If P ( k ) holds for all k < n , then P ( n ) .
Then P holds for all natural numbers n. In other words, in order to prove P ( n ) for all n E N , it suffices to provv P ( n ) (for all n N) under the assumption that it holds for a11 smaller rlat.ural numbers.
Proof. Assume that (*) is true. Consider the property Q(n): P ( k ) liolcls for all k < n. Clearly Q(0) is true (there are no k < 0). If Q ( n ) holds. thtn Q ( n + 1) holds: If Q ( n ) holds, then P ( k ) holds for all k < I L , and consequttr~tl~ tdso for k = n (by (*)l. Lemma 2.l(ii) enables us to corlclude t h a t P ( k ) holds for all k < n + 1, and therefore Q ( n + l ) holds. By the Induction Pri1lc:iptc.. Q ( n ) is true for all n E N; since for k E N there is some n > k ({?.g..71 = k + 1 ) . we have P ( k ) true for all k E N, as desired. Cl The ordering of natural riumbers by size has an additional in~portantproperty that distinguishes it from, say, ordering of integers or ratior~alnumbers bv size. of a set A is a well-orderin,q if (\very nonempty subset of A has a Ieast elemerlt. The ordered set (.d. 4 ) is ttivn called a well-ordered set. 2.3 Definition A linear ordering
4
2. PROPERTIES OF NATURAL NUMBERS
45
Well-ordered sets form a backbone of set theory and we study then1 extensively in Chapter 6. At this point, we prove only:
2.4 Theorem ( N , <) is a well-ordered set.
Proo!. Let X be a nonempty subset of N; we have t o show that X 11% a least element. So let us assume that X does not have a least element and let us consider N - X. The crucial step is t o observe that if k E N - X for all k < n, then n E N - X: otherwise, n would be the least element of X. By t l ~ e second version of the Induction Principle we conclude that n E N - X holds for all natural numbers n [ P ( x ) is the property "X E N - X"] and therefore that 0 X = 0, contradicting our initial assumption. We conclude this section with another property of the ordering <.
2.5 Theorem I1 a nonempty set 01natural numbers has an upper bound ordering <, then it has a greatest element.
27,
th.r
c
P m o . Let A N , A # 0 be given; let B = (k E N J k is an upper bound of A); we assume that B # 0. By Theorem 2.4, B has a least element n, so n = sup(A). The proof is completed by showing that n E A. [See Theorem 5.15(c) in Cllapter 2. with "supremum" in place of "infimum," etc.] Trivial induction proves that either n = 0 or n = k 1 for some k E N (see Exercise 2.4). Assume that n # A: we then have n > m for all m E A. Since A # 0,i t means that n # 0. 'rl~<~refor
+
Exercises
2.1 Let n E N. Prove that there is no k E N such that n < k < n + 1. 2.2 Use Exercise 2.1 to prove for all m , n E N : if m < n , then 772 + 1 n . Conclude that m < n implies m + l < n 1 and that therefore the successor S ( n ) = n + 1 defines a one-to-one function on N. 2 . 3 Prove that there is a one-to-one mapping of N onto a propcr subset of N . [Hint: Use Exercise 2.2.1 2.4 For every n E N , n # 0, there is a unique k E N such that t~ = k + 1. 2.5 For every n E N, n # 0,1, there is a unique k E N such t h a t 1 1 = (k + 1) 1. 2.G Prove that each natural number is the set of all smaller natural numbers. i.e., n={mENIm
+
<
+
[Hint: Use induction to prove that all elements of a natural number are natural numbers.] 2.7 For all m , n E N m
if and only if
m c n.
CHAPTER 3. NATURAL NUMBERS
46
2.8 Prove t h a t there is no function f : N -, N such t h a t for all n E N , f (n) > / ( n 1). (There is no infinite decreasing sequence of natural numbers.) 2.9 If X C N , then (X, < X2} is well-ordered. 2.10 In Exercise 5.6 of Chapter 2, let A = N . b = N. Prove that 3 as defined there is a well-ordering of B = N LJ( N). Notice that X + y if and only if X E y holds for all X,y E B. 2.11 Let P ( x ) be a property. Assume that k E N and (a) P ( k ) holds. (b) For all n 2 k, if P ( n ) then P ( n + 1 ) . Then P ( n ) holds for all n 2 k . 2.12 (Finite Induction Principle) Let P ( x ) be a property. Assume that k E N and [a> P @ ) . (b) For all n < k, P ( n ) implies P ( n + 1). Then P(n) holds for all n 5 k. 2.13 (Double Induction) Let P ( x . y) be a property. Assume
+
(**) If P ( k , l) holds for all k, l E N such that k < m or (k = m and 1 < n), then P ( m , n) holds. Conclude that P (m, n ) holds for all m, n E N .
3.
The Recursion Theorem
Our next task is t o show how t o define addition, multiplication, and other familiar operations of arithmetic. To facilitate this, we develop a n important general method for defining functions on N. We begin with some new terminology. A sequence is a function whose domain is either a natural number or N. A sequence whose domain is some natural number n N is called a finite sequence of length n and is denoted (a, I i < n) or
a
1il
,.
- 1 )
or
( a ~ ,, a. . .~ , a n - l }
In particular, ()( = 0) is the unique sequence of length 0, the empty sequence. 'eq(A) = UneNA n denotes the set of all finite sequences of elements of A . (Prove t h a t it exists!) If the domain of a sequence is N, we call it an infinitt: sequence and denote it
So infinite sequences of elements of A are just members of A ~ The . not,ation simply specifies a function with an appropriate domain, whose value at i is a,. We also use the notation ( a , 1 i E N ) , { a , ) g o , etc., for the range of the sequence (a, I i C N). Similarly. { a , ( z < n ) or { a o ,a l , . . . , U,- 1 ) denotes t h e range of ( a , 1 i < n ) . Let us now corlsider two examples of infinite sequences.
3. T H E RECURSION THEOREM (a) The sequence s : N
-+
N is defined by s o = 1; Sn+l
(b) The sequence f : N
+
71
for all n E N .
N is defined by
fo = 1; f,+l = f ,
X
( n + 1) for a11 n E N
T h e two definitions, in spite of superficial similarity, exhibit a crucial difference. The definition of s gives explicit instructions how to compute s, for any X E N. More precisely, it enables us to formulate a property P such that S,
=y
if and only if
P ( x , y);
namely, let P be "either X = 0 and y = 1 or, for some n E N, z = n + 1 and y = n 2 ." The existence and uniqueness of a sequence S satisfying (a) then immediately follows from our axioms:
In contrast, the instructions supplied by the definition of f tell us only how to compute f , provided that the value of f for some smaller number (namely, X - 1) was already computed. It is not immediately obvious how t o formulate a property P, not involving the function f being defined, such that
f, = y
if and only if P ( x , y )
We might view the definition (b) as giving conditions the sequence f ought t o satisfy: " f is a function on N to N which satisfies the 'initial condition': fo = 1, and the 'recursive condition': for all n E N , f,+l = f , X (n + l)." Such definitions are widely used in mathematics; the reader might wish t o draw a parallel, e.g., with the implicit definitions of functions in calculus. However, a definition of this kind is justified only if it is possible to show that there exists some function satisfying the required conditions, and that there do not exist two or more such functions. In calculus, this is provided for by the Implicit Function Theorem. We now state and prove an analogous result for our situation. For any set A, any a E A, and any function g : A, there exists a unique infinite sequence f : N -+ A such that
T h e Recursion Theorem
A X N --+ (a) f o = a; (b) f , + ~ = g ( f n , n ) for all n E N.
In Example (b), we had A = N, a = 1, and g(v,u) = u X (u + 1 ) . The set a is the "initial value" of f . The role of g is t o provide instructions for computing fn+, assuming fn has been already computed.
CHAPTER 3. NATURAL NUMBERS
48
The proof of the Recursion Theorem c.onsists of devising an explicit definitioli of f . Consider again Example (b); then f,, is the ?L-factorial;; ~ n d explirit definition of f can be readily 1vrittt1n down:
fo=l
and
f m = 1 x 2 x . . . x ( m - Z ) x m i f m # O a n t l n ~N~.
The problem is in making ". . . '' pre<:ise. I t can be resolved by stating is the result of a computation
[l X I
X
2
X
X
t , I l i ~ tf',,,
( m - l)] X m.
A computation is a finite sequence starting with the "initial value" of f
itrlcl
repeatedly applying g . I11 the examplc above, the m-step computation t is ii fiilito sequence that is of length m 1 where to = Z and t k + l = tk X ( k 1) = ! j ( t k . k) for all k < m, k 2 0. The rigorous explicit definition of f then is:
+
f, = t ,
where t is an 7n-step computation (based on a
+
=
1 arid g ) .
The problem of the existence and uniqueness of f is reduced to the yrobletii of' showing that there is precisely orle m-step computat.ion for each m E N. We now proceed with the fornial proof of the Recursiori 'Thctort?ni. As t his theorem and its generalizations art? among the most irnyortarlt niethocls of' s t b t theory, the reader should study the preceding intuitive example and thc proot' itself carefully.
The existence o f f . A function t : ( r n + 1) -+ A is called an 111-step ProoJ computation based on a and g if t o = cz, and, for all k such that 0 5 k < I I Z . t k + = g ( t k , k ) . Notice that f N X A. Let
c
F
= {t E
P(N
X
A) [ t is an m-step cornputation for some natural nurnber
711.).
Let f = U F . 3.1 Claim f is a function.
It suffices to show that the syste~nof functions F is compatible-see Theorern 3.12 in Chapter 2. So let b , ZL E F, domt = TL E N , domu = m E N. Asst~rnc?. e.g., I L 5 rn; then n In, a11d it suffices t o show that tr, = I L ~for ill1 k. < I L . This can be done by induction ( i n the form stated i11 Exercise 2.12). Stlrt.1~. to = a = uo. Next let k be stlcti t t i i ~ tk + l < 71, ittld ~LSSUII~C:tr, = i l k . 'rIit?ri t k + l = g ( t k ,k ) = g ( u k , k ) = uk+~. Thus tk = u k for a11 /C < n.
c
3. T H E RECURSION THEOREM
49
3.2 Claim dom f = N; ran f 2 A.
c
We know immediately t h a t dom f N and ran f C_ A. To show that, dom f = N , it suffices to prove t h a t for each n E N there is an n-step computation t . We use the Induction Principle. Clearly, t = ((0,a ) ) is a 0-step computation. Assume that t is an n-step computation. Then the following function t + on ( n+ 1) I is a n (n + l)-step computation:
+
We conclude that each n E N is in the domain of some computation t E F, so N L U I E F domt = dom f . 3.3 Claim f satisfies conditions (a) and (b).
Clearly, fo = a since to = a for all t f F. To show that f,+, = g(f,, 71) for any n E N , let t be an ( n + 1)-step computation; then tk = f k for all k E doln t . So fn+l = t n + ~= g(tn,n) = g(fn,n), The existence of a function f with properties required by the Recursiou Theorem follows from Claims 3.1, 3.2, and 3.3. The uniqueness of f . Let h : N + A be such that ( a ' ) h. = a ; (h') h,+l = g ( h , , n ) for all n E N. We show that f,, = h , for a11 n E N, again using induction. Certainly fo = a = ho. If fn = h,, then f,+l = g(f,, n) = g(lt,, n ) = therefore, f = h , as claimed. • As a typical example of the use of the Recursion Theorem, we prove that t,he properties of t h e ordering of natural numbers by size established in the previous section uniquely characterize the ordered set (N.<). 3.4 Theorem Let (A, 4) be a nonempty linearly ordered set with the propertirs: ( a ) For eve?! p E A, there i s q E A such that g + p. (b) Evertj lzonempty subset of A has a +-least clement. (c) Every nonempty subset of A that has a n upper bound has a 4 - q r ~ a t c s t element. T h e n ( A , 4) is isomorphic t o ( N , <).
Proof. We construct t h e isomorphism f using the Recursion Theorem. Let a be t h e least element of A and let g(x, n ) be the least element of A greater than X (for any n ) . Then a f A and g is a function on A X N into A: notice that g(x. t z ) is defined for any X E A because of assumptions (a) and (1)) ill Theorem 3.4, and does not depend on n. The Recursion Theorem guarantees the existence of a function f : N -+ A such that
CHAPTER 3. NATURAL NUMBERS
50
(i) fo = a = the least element of A. (ii) f n + l = g(fn, n ) = the least eIenlent of A greater tha11 f,. It is obvious that f, 4 f n + I for each n E N. By inductio~l,we get /,, 4 f,,, whenever n < m (see Exercise 3 . 1 ) . Consequently, / is a one-to-one function. It remains t o show that the range of f is A. If not, A - ran f # 0; let y be the least element of A - ran f . T h e set, B = { q E A I q < p) has an upper bound p, and is nonempty (otherwise. y would be the least element of A , but then p = fo). Let q be the greatest element of B [it exists by assumption (c) in Theorem 3.41. Since q + p: we haw. q = f, for some m E N. However, it is now easily seen that p is the least element of A greater than q. Therefore, p = f,+l by the recursive conditioti (ii). Consequently, p E ran f , a contradiction. 0 In some recursive definitions, the value of f n + l depends not only oil f;,. tjut also on f k for other k < n . A typical example is the Fibonacci sequence:
+
Here fo = 1, f = 1, and f n + l = f, /,- 1 for n > 0. The following theorem formalizes this apparently more general recursive construction. 3.5 Theorem For any set S and any fvnction y : Seq(S) unique sequence f : N -+ S such that
Notice that, in particular, f o = g(f Fibonacci sequence, we let g(t) =
t,-l
+ tn-z
3
S there exists a
0) = g ( ( ) ) = y(0). To obtain thy
if t is a finite sequence of length 0 or 1; if t is a finite sequenceof length n > 1.
The idea is to use the Recursion Theorem t o define the sequericc Proof. (F,InfN)=(f r n j n ~ N ) . So let us define F0 = Fn+1=
0; Fn U
{(TL,
g ( F n ) ) ) for all n E N
The existence of the sequence (Fn I n E N ) follows from the Recursion Theorem with A = Seq(S), a = (), and G : A X N 4 A defined by
t U { ( n ,g(t))) if t is a sequence of length n;
0
otherwise.
It is easy t o prove by induction that each F, belongs to Sn and that F,, C: Fn+l for all n E N. Therefore (see Exercise 3.1), (F, I n N ) is a ~orrlpntikjl~
3. THE RECURSION THEOREM
51
system of functions. Let f = UnE F,; then clearly f : N -+S and / I n = F,l for all n E N; we conclude that f n = F,+ l ( n ) = g(Fn) = g(f n), as needed.
r
0 Further examples on the use of this version of the Recursion Theorem can be found in Chapter 4. At present, we state yet another, "parametric." version that allows us t o use recursion to define functions of two variables. 3.6 Theorem Let a : P -+ A and g : P X A X N -+ A be junctions. There exists a unique function f : P X N -,A such that (a) f (p, 0) = a(p) for all p E P ; (b) f ( y , n 1) = g(p, f (p, n ) , n ) for all n E N and y E P.
+
T h e reader may prefer the notation fPgo in place of f (p, O), etc
Proof. This is just a "parametric" version of the proof of the Recursion Theorem. Define an m-step computation t o be a function t : P X ( m+ 1) + A such that, for all y E P,
<
for all k such that 0 k < m. Then follow the steps in the proof of the Recursion Theorem, always carrying p along. Alternatively, one can deduce the parametric version directly from the Recursion Theorem (see Exercise 3 . 4 ) . We now have all the machinery needed to define addition of natural numbers. as well as other arithmetic operations. We do so in the next Section. Exercises
3.1 Let f be an infinite sequence of elements of A, where A is ordered by 4. Assume that fn < fn+r for aH n E N. Prove that n < m implies f , if , for all n, m E N. [Hint: Use induction on m in the form of Exercise 2.11, with k = n + l.] 3 . 2 Let ( A , 4 ) be a linearly ordered set and p,q E A. We say that (1 is a successor of p if p + q and there is no r E A such that p 4 r 4 g. Note that each p E A can have at most one successor. Assume that (A. +) is nonempty and has the following properties: (a) Every p E A has a successor. (b) Every nonempty subset of A has a +-least element. (c) If p E A is not the <-least element of A, then y is a successor of some q E A. Prove that (A, +) is isomorphic t o ( N I<). Show t h a t the conclusinn need not hold if one of the conditions (a)-(c) is omitted. 3.3 Give tt direct proof of Theorem 3.5 in a way analogous t o the proof of the Recursion Theorem.
CHAPTER 3. NAY'URAL NUlLlBERS
52
3.4 Derive the "parametric" version of the Recursion 'I'heorern (Theorem 3.6)
-
from the Recursion Theorem. (Hint: Define F : N A~ ljy rrrorsion:
where G : A' X N -, A' is defined by G ( r ,n )(p) = g(p. :r(p).7 1 ) for X E A ~ n , E N . Then set f ( p , n ) = F,(p).] 3.5 Prove the following versioil of the Recursion Theorem: Let g be R functiorl on a subset of A X N into A , a E A. The11 there>is ;i urlique sequence f of elemcrits of A such that (a) fo = a ; (b) f , + = g ( f,,, n) for all ? L E N such that (11 + l ) E tlom f'; (c) f is either i i r r irlfi11it.t) sr~c~uence or f is ;L finite st1clui3rlc:oof lvi1gtl1 k + 1 and g(fk,k ) is undefined. [Hint: Let 2 = A U ( 8 ) whilre ii # A. Define ij : 2 X N -4 as follows:
~ ( . , n )=
{
-
n
if defined; otherwise.
Use the Recursion Theorem t o get the corresponding infinite spqur.ilcci f . If T1= n for sorne 1 E N. roi~sick?~+ 7 t l for tllr least s ~ i ( l-.l] ~ 3.6 Prove: If X C N , then tllert? is ii one-to-tsrie (fii1it.t~or infinit(>)secluellc.rl / such that ran f = X. [ H i l ~ e Use : Exercise 3.5.1
4.
Arithmetic of Natural Numbers
As an applicatiorl of the Recursioil 'Lheoretn we now show how to defiile aclcfitiori of ~ i a t u r a numbers, l and we use Induction t o prove basic properties of addition. In similar fashion, orre can introtlu<:c:other arithmetic o p ~ r a t i o n s(see Exc~rcisc~s). 4.1 Theorem There is a unique function + : N X N ( a ) f(rrt.0) = m for all m E N: (b) +(m, n l ) = + ( m , 1 2 ) l f o ~ .i ~ l m, l IE E N .
+
Prooj.
+
N such that
+
In the parametric versiorl of the Recursiori I ' h c ~ o ~ elet ~ n,4 N . ( ~ ( 1 )=) 1) for a11 p E N i~nrly(j)..r. 1 1 ) = 3. + l for all 71, 2.. 11. E N .
=
F'
I : 7,L.-:
Notice t h a t letting n = 0 in ( b ) leads t o +(m, 0 + 1) = + ( m 0. ) + 1 : t ~ z t + (rn,0) = 7n by ( a ) and 0 + 1 = S ( 0 ) = 1 by the definition of the 11unltjc.r 1 Thus we have + (7n, 1) = m + l = S(TIL) and, as we claimed rarlier. the successor of 7n E N is really the surn of I I L zi11~1l , justifying our ilotation for it. Of co\irstb. we write 711 + I E instead of +(??L,? L ) i l l t,hc! sequel. T l ~ etiefining properties of adclition can now be rest.atecl :is
4. ARITHAIETIC OF NATURAL NUMBERS
Properties of recursively defined functions are typically established by illduction. As a n example, we prove the commutative law of addition. 4.4 Theorem Addition is commutative; i.e.,
f07.
rill n z , 7 1 E N ,
Let us say that n commutes if ( 4 . 5 ) holds for all m E N. ?l:v prove Proo!. that every n E N commutes, by induction on 7 1 . To show that 0 commutes, it suffices t o show that 0 + In = rn for a11 In [because then 0 + m = m + D by 4-21. Clearly, 0 + D = 0 , and if 0 + ?rt = m.then 0 + ( m 1) = ( D m) 1 = m + 1 , so the clairn follo~vsby induction (on 7 1 2 ) . Let us now assume that n commutes, and let, us show tliat 1~ -11 ~ : ~ n ~ i ~ l i t ( ~ s . We prove, by induction on m, that
+
+
+
If m = 0, the11 (4.6) holds, as we have already sllowri. Thus let us a~sulrlethat (4.6) hoIds for m, and let us prove that
We derive (4.7) as follows:
( m + l ) + (71,
+ 1) = ((m + l) + n) + 1 = ( n + ( m+ 1))+ 1 = ( @ + m ) + l )+ 1 = ( ( m+ n ) + I ) + 1 =(m+(n+2))+1
+1 = ( n+ 1) + (m + 1)
=
( ( n+ 1) + m )
(by 4.31
[since r~ commutes] (by 4.31 (since 71 commutes] [by4.3] [since (4.6) liolcls for m ]
[by 4.31.
Detailed development of the arithmetic: of natural numbers is beyond the scope of this book. Readers who are so inclined may continue in i t by working the exercises a t the end of this section, ancl thus (:onvincc?tl~elliselvtsthat axiomatic set theory ha? the power t o establish rigorously all of the usual laws of arithmetic:. From there one can proceed t o define such notions
CHAPTER 3. NATURAL NUMBERS Having established the natural numbers and their arithmetic, one can pruceed to define rigorously, first, the set Z of integers, then the set Q of rational numbers, and derive their corresponding arithmetic laws. These constructions are we11 known from algebra, and most readers probably encountered tlicnr there. For completeness, we describe them in some detail in Section 1 of Chapter 10, which can be studied a t this point. In any case, we view the arithmetic of integers and rationals as rigorously established from now on. We conclude this section with a remark about axiomatic arithmetic of natural numbers. The theory of arithmetic of natural numbers can be developccl axiomatically. The accepted system of axioms is due to Giuseppe Peano. The undefined notions of Peano arithmetic are the constant D, the unary operatiolr S, and the binary operations and .. The axioms of Peano arithmetic are the following: ( P I ) If S ( n ) = S ( m ) , then n = m . (P21 S ( n ) # D. (P3) n + O = n. (P4) n + S ( m ) = S ( n + m). (PS) n - 0 = 0. (P6) n S(m) = (n . m) + n. (P7) If n # 0, then n = S(k) for some k. (P8) The Induction Schema. Let A be an arithmetical property (i.e.. a property expressible in terms of +, -. S, D ) . If 0 has the property A iind if' A(k) implies A(S(k)) for every k , then every number has the property A . It is not difficult to verify that natural numbers and arithmetic operations, x i we defined them, satisfy all of the Peano axioms (Exercise 4.8). h.lost of' t , h ~ needed facts have been proved already.
+
Exercises
4.1 Prove the associative law of addition: (k
+ m) + n = k + (m + n)
for all k, m. n E N
4.2 I f m , n , k ~N, t h e n m < n i f a n d o n l y i f m + k < n + k . 4.3 If m, n E N then m . I7~ if and only if there exists k E N such that n = m + k . This k is unique. so we can denote i t n - m , the d i f l e ~ ~ l ( : r of n and m. This is how subtraction of natural numbers is defined. 4.4 There is a unique function . (multiplication) from N X N to N such that m . D = D for all m E
IV;
4.5 Prove that multiplication is commutative, associative, and distributive over addition. 4.6 If m , n , k E N a n d k > 0, then m < n if and only if m . k < n - k . 4.7 Define exponentiation of natural numbers as follows: m0 = 1 for all m E N (in particular, 0' = 1); mn+' = m n - m for all rn. n E N (in particular, On
=
D for n > 0)
5. OPERATIONS AND STRUCTURES
55
Prove the usual laws of exponents. 4.8 Verify the axioms of Peano arithmetic. The needed results can be found in the text and exercises. 4.9 For each finite sequence (kj ( 0 5 i < n ) of natural numbers define C ( k i 0 < i < n) (more usually denoted ki or C::. ki) SO that
I
C0 = 0; C ( k o }= ko; C ( k o , .. . , k,) =
5.
C{ko,. . . . kn-
1)
-I-k,
for n
> 1.
Operations and Structures
The functions +, ., etc., are usually referred to as operations. The aim of this section is to define the general concept of operation and to study its properties. Each of the aforementioned operations assigns to a pair of objects (numbers, sets) a third object of the same kind (their sum, difference, union, etc.). The order may make a difference, e.g., 7 - 2 and 2 - 7 are two different objects. We therefore submit the following definition. 5.1 Definition A binary operation on S is a function mapping a subset of S 2 into S.
Nonletter symbols such as +, X , *, A, etc., are often used to denote operations. The value (result) of the operation * at (X,y) is then denoted X * y rather than *(X,y). There are also operations, such as square root or derivative, which are applied to one object rather than to a pair of objects. We introduce the following definitions. 5.2 Definition A unary operation on S is a function mapping a subset of S into S. A ternary operation on S is a function on a subset of S 3 into S.
c
5.3 Definition Let f be a binary operation on S and A S, A is closed under the operation f if for every X,y C A such that f ( X ,y) is defined, f (X, Y) E A. One can give similar definitions in the case of unary or ternary operations.
5.4 Example (a) Let + be the operation of addition on the set of all real numbers. Then + is defined for all real numbers a and b. The set of all real numbers, as well as the set of all rational numbers and the set of all integers, are closed under +, The set of even natural numbers is closed under +, but the set of odd natural numbers is not.
CHAPTER 3. NA'l'URA L NUMBERS
56
(b) Let t be the operation of division on the set of all real numbers. + is not defined for (a,b) where b = 0. The set of all rational numbers is closccl under +, but the set of all intcgers is not. (c) Let S be a set; define binary operations Us and ns on S as follows; (i) If X,y E S and 3: U y E S. then 3: u.7 y = 3: U y. (ii) If X,y E S and x n y E S, then X nsy = .TI I y. If we take S = P ( A ) for some A, Us and ns are defined for every pair (2, Y) E S2. When developing a particular branch of mathematics, we are usually interested in sets of objects together with some relations between them and operations on them. For example, in number theory or analysis, we study the set of all (natural or real) numbers together with operations of addition and multiplication, relation of less than, etc, In geometry, we study the set of all points antl lines with relations of incidence ancl between and operation of intersectio~~. etc. We introduce the notion of structure to describe this situation abstractly. In general, a structure consists of a set A and of several relations a r ~ dopcrations on A. For example, we consider structures with two biniiry reli-itions arid two operations, say a unary operation and a binary operation. Let A be a set,. let R1 and Rz be binary relatioris in A, and let f be a unary operation ant.! g ;I binary operation on A. We ~ n a k euscl of the five-t;uple (A. R I .R 2 .f.g ) t,o clc~~iotc* the structure thus defined.
5.5 Example ( a ) Every ordered set is a struc:turc: with one binary relatioii. (b) (A, UA,n ~ A, ) is a structurtl with two binary operatiorls nrld orle binary relation. (c) Let R be the set of all real ~lurnbers. (R,+, -. X , t) is t i strur.t,urt. with four binary operations.
c
We now proceed to give a i general definition of a structure. In Chapter 2. we introduced unary ( l-ar y) , binary (2-ar y ) , and ternary (3-ar y ) relations. The rekson we did not talk about rl-itry relations for arbitrary T I is silrlplv that !v(. c.ould rlot then handle arbitrary liatural nurnbers. This obstitcle is [low ~.ert~ovt.d. and the present section gives precise definitiorls of n - t ~ ~ p l en-ary s ? relat,io~lsancl operations, and n-fold (:artesian products, for all natural numbers 71. We start with the definition of ;in ordered n-tuplt. Recall that ortlerecl pair ( a o , a l ) has been defined in Section 1 of Chapter 2 zts a set that uniciuely determines its two coorclinates, U() and u1: i.e., (no,al) = (bo,bl)
it'anclanly if
a0
= ba and u1 = b I .
We called a0 the first coordinnt,e of (uO,u1),ancl a1 its second coortlinntc. 111 analogy, an 71-tuple (ao,a l l . . . . U,,- l ) should be a set that uniquely tlcter~~lirlrs its n coorclinates, ao, u1, . . . . U , , - , ; that is, we want
(*l (IQ.
.
. . . ~ ~ - =1 ()b O , .. . D,,- l ) if ?
i i l l ~ lo111yif
uI:= L, for all L = U. . . .
, 11 -
l.
5, OPERATIONS AND STRUCTURES
57
But we already introduced a notion that satisfies (*): it is the sequence of length n , (ao,a l l . . . , un-l). The statement that (ao,. - . , a n - l ) = (bo,. . .
,bn-1)
if and only if a,
=
b, for all i = 0,. . . , n - 1
is just a reformulation of equality of functions, as in Lemma 3.2 in Chapter 2. We therefore define n-tuples as sequences of length n. For each i, D 5 i < n: ai is called the (i + 1)st coordinate of ( a o , a l , .. . , a n - l ) . SO a0 is the first coordinate, a1 is the second coordinate, . , . , an- l is the n-th coordinate. IReally, a more consistent terminology would be zeroth, first, . . . , (n - 1)st coordirlate. but the custom is to talk about the first and second coordinates of points in the plane, etc.] We notice that the only Q-tupleis the empty sequence () = 0, having no coordinates. l-t,uples are sequences of the form (ao>[i.e.,sets of the form ( (Q, n o ) )1; it usually causes no confusion if one does not disting~iishbetween a l-tuple ( a o ) and an element UO. If (A, 1 0 5 i < n) is a finite sequence of sets, then the n-fold cartesian product A,, as defined in Section 3 of Chapter 2, is just the set. of all n-tuples a = ( a o , a l , .. . ,a,- l) such that a* E Aol a1 E A I , . . . , an-l C An-1. A, = A n is the set of all n-tuples If A, = A for all i, D 5 i < n, then with all coordinates from A. We note that AO = {()) and A' can be identified with A. An n-an, relation R in A is a subset of A n . We then write R(au,a l , . . . . a, - l ) instead of ( a o , a l , .. . ,a,-l) E R. Similarly, an n - u q operation F o n A is a function on a subset of A n into A; we write F ( a o ,a , , . . . , a,- 1 ) inst,ead of F ( ( a o ,a l , . . . , a n - l ) ) . Finally, we generalize the notation already introduced in the case of pairs and triples. If P ( s o , s l , . . . , X,- 1) is a property with paramcters xo, X I , . . . , X,- 1, we write
no,,,,
no5i,n
{ ( a o , . . . , a n - l ) ( a 0 E Ao,. .. ,an-l E
A,,-1
and P(a0 , . . . , a n - l ) )
to denote the set
n
AI 1 for sorneao,... ,a,-l, a = (a0 , . . . . a n - l ) a n d P ( a o, . . . .a,,-l)).
~ < z < R
We note that l-ary relations need not be distinguished from subset.^ of A, and l-ary operations can be identified with functions on a subset of A into A. 0ary relations (0 and { ( ) ) do not have much use, but nonempty D-ary operstioils are quite useful. They are objects of the form ((0, a ) ) where a C A. We <:all them constants and in the sequel identify them with elements of A [i.e..we do a ) ) and a]. not distinguish between ((0, The exercises at the end of this section contain further information about these and related concepts. The problem with this approach is that the ordered pair from Chapter 2. (aO,u1) = {{ao),{ao,a1 )), is generally a different set from the just-defined 2-tuple (ao,a l = {(D, ao),( l , a l l ) . Consequently, we have two definitions of
>
CHAPTER 3. NATURAL NUMBERS
58
no<,<2
cartesian product, A. X A l and A,, two definitions of binary relations and operations, etc. However, there is a canonicaI one-to-one correspondence between ordered pairs and 2-tuples that preserves first and second coordinates. Define S((aola l ) ) = (ao,a l ) ; then S is a one-to-one mapping on A. X A I onto Ai and X is a first (second, respectively) coordinate of ( a o , a l ) if and only if X is a first (second, respectively) coordinate of (ao,al}. For almost a11 practical purposes, it makes no difference which definition one uses. Therefore. we do not distinguish between ordered pairs and 2-tuples from now on. Similar remarks hold about the relationships between triples and 3-tuples, etc. In fact. it is possible to define n-tuples for any natural number n along the lines of Chapter 2. The idea is to proceed recursively and let
( a o , a ,... ~
(ao) = ao; , a n ) = ( ( a o , a ,~. . . ,a,-I),an)
for a l l n e N, n 2 1 .
However, making this precise involves certain technical problems. We do not .yet have a Recursion Theorem sufficiently general to cover this type of definition (but see Exercise 4,2 in Chapter 6). As we have no need for an alternative definition of n-tuples, we refer the interested reader to Exercise 5.17. Wt. use notations ( a o l . . , a,-l) and (ao,.- . , a n - l ) interchangeably from now on. We continue this section by generalizing the notion of a structure. First. a type T is an ordered pair ((TO, . . . T . ~ -1 ), (fO,- . . f n - l ) ) of finite sequences of natural numbers, where r, > D for a11 i 5 m - 1, i 2 Q. A stm~cturcof type r is a triple = ( A , ( & , . . . , R r n - l ) , { F ~ , .. .,F,-l))
.
.
where R, is an ri-ary relation on A for each i . Im - 1 and Fj is an f,-ary operation on A for each j 5 n - 1 ; in addition, we require F, # 8 if f, = 0. Note that if f, = 0, F, is a 0-ary operation on A. Following our earlier remarks. Fj is a constant, i.e., just a distinguished element of A. For example, Tl = ( N . (<), (0, +, .)) is a structure of type ((2),(D, 2,2)); it carries one binary relation. one constant, and two binary operations. '3 = (R, {), (0, l , +, -, X , +)) is ?L structure of type (0,( 0 , 0 , 2 , 2 , 2 , 2 ) ) ,etc. We often present the structure as a (l + m + n)-tuple, for example (N,< , D , +,.), when it is understood which symbols represent relations and which operations. We call A the universe of the structure 2l. We now introduce a notion that is of crucial importance in the theory of structures. (In the special case of ordered sets, compare with Definition 5.17 it) Chapter 2.)
5.6 Definition An isomorphism between structures 2l and 2l' = (A'. (R;, . . . , RA-]), (F& , FA-l))of the same type r is a one-to-one mapping h on A onto A' such that (a) R i ( a o l . , a,, - 1 ) if and only if R:(h(ao), . . . , h(a,, - I ) ) holds for ;ill U". . . . . a,,-1 E A and i < m - 1. holds for all a*. . . . , a , - E (b) h(Fj ( a o , .. . , af, - 1 ) ) = F;(h(ao), . . . , h(a A and all j n - 1 provided that either side is defined. a
<
59
5. OPERATIONS AND STRUCTURES
For example, let (A, R1, R2,f , g) and (A , R;, R;, f',g') be structures with two binary relations and one unary and one binary operation. Then h is an isomorphism of (A, R1, R2,f , g ) and (A', R',, R;, f ' , g ) if a11 of the following requirements hold: (a) h is a one-to-one function on A onto A'. (b) For all a,b r A, aRlb if and only if h(a)Rih(b). (c) For all a , b r A, aR2b if and only if h(a) Rh h(b). (d) For all a r A, f ( a ) is defined if and only if f t ( h ( a ) )is defined and h(f (a)) = f ' ( W ) ). (e) For all a , b E A, g(a, b) is defined if and only if g' (h(a),h(b)) is defined and h(9(a,b)) = gf(h(a),h(b)). The structures are called isomorphic if there is an isomorphism between them. f
f
5.7 E x a m p l e Let A be the set of all real numbers, < A be the usual orderillg of real numbers, and + be the operation of addition on A. Let A' be the set of all positive real numbers, be the usual ordering of positive real numbers, and X be the operation of multiplication on A'. We show that the structures (A, < A , +) and (A', < A I , X ) are isomorphic. To that purpose, let h be the function h(x) = e x
for
X
E
A.
We prove that h is an isomorphism of (A, < A , i -) and (A', < A I , X ) . We have to prove: (a) h is a one-to-one function on A onto A'. But clearly h is a function, don1 h = A, and ran h = A'. If x1 # s2, then ex' # e r 2 , so h is one-to-one. (In this example, we assume knowledge of some facts from elementary calculus.) (b) Let ~ 1 ~ Ex A; 2 then X I SAx2 if and only if h ( z l ) S A / h(x2). But the x function e is increasing, so sl 5 x2 if and only if er' 5 eZ2is indeed true. (c) Let X 1,x2 E A; then X 1 x2 is defined if and only if h(xl ) X h(x2) is defined and h ( z l + X*) = h ( x l ) X h(s2). First, notice that both + on A and X on A' are defined for all ordered pairs. Now, h(xl + x 2 ) = eqtx2 = erl X e T 2 = X h(x2). U
+
The significance of establishing an isomorphism between two structures lies in the fact that the isomorphic structures have exactly the same properties as far as the relations and operations on the structures are concerned. Therefore. if the properties of these relations and operations are our only interest, it does not make any difference which of the isomorphic structures we study. 5.8 E x a m p l e Let (A, R) and (B,S) be isomorphic (R and S are binary relations). R is an ordering of A if and only if S is an ordering of B. A has a least element in R if and only if B has a least element in S.
Proof. Let h be an isomorphism of (A, R) and (B, S). Assume that R is a n ordering of A. We prove that S is an ordering of B. Hence, let b l , b z , b3 E B and b l S b 2 , b2Sb3. Since h is onto B, there exist al, a2, a3 E A such that bl = h ( a l ) ,
CHAPTER 3. NATURAL NUMBERS
60
b2 = h(a2), and b2 = h(a3). Because alRa2 holds if and only if h ( a l )Sh(ct2) holds, i.e., if and only if b l S b 2 holds, we conclude that a l Ra2 and similarly aaRa3. But R is transitive in A, so alRa3. But the11 h(ar)Sh(as),i.e., bl Sb:,. We leave the proof of reflexivity and antisymmetry to the reader. Similarly. ontb proves that if S is an ordering of B , then R is an ordering of A . Now let .4 have a least element. We claim that B has a least element. Let U be the least element of A, i.e, a& holds for all z E A. Then h(a) is the least element of B. as the following consideration shows. If y E B. then y = h ( a ) for some :r E '4 (h is onto B). But, for this X , aR.z holds. Correspondingly, h(u)Sh(x) holcls; and we see that h(a)Sy holds for all y E B. Thus h(a) is the least element o f
B. An isomorphism between a structure U and itself is called an uutomorphisni of U. The identity mapping on the universe of U is trivially an automorphisrn of 2f. The reader can easily verify that the structure (N,<) has no other automorphisms. On the other hand, the structure ( 2 , <), where Z is the set of all integers, hczsnontrivial automorpliisrns. In fact, they itre precisely the f u ~ ~ r t i o ~ l s fh, /L E Z , where fh(x) = X + h. S o n ~ cfurther properties of aut.onlorpllisn1 arc1 listed in Exercise 5.12. Consider a fixed structure U = (A. ( R o , .. . , R,-1). (Fo... . .F,- l ) ) . A sot, B A is called closed if the result; of applying any opc?ration to elenlents o f B is a g a i n i n B , i . e . , i f f o r a l l j ~ n - l ; l ~ i d a l ,l .a. .~, u f , - l E B, F,(uO, . . . . u / , - ~ )E B provided that it is defined. 111 particular, all constants of U belong to B. Ltlt C A; the closure of C, to be denoted C, is the least closed set containing id1 elements of C:
c
c
C=n
{
c ~A ( C c B and B is closed)
Notice that A is a closed set co~ltainingC , so the system whose ilrt,ers~c.tioii defines C is nonempty. It is trivial to check that C is rlosed; bp definition. tla.11. C is indeed the least closed set containing C. 5.9 Example (a) Every set B C_ A is closed if Zl has no operations. (b) Let R be the set of all real numbers and let C = {D). The set of a11 natural numbers N is the closure of C in the structure (R,f ) where f is the successor function defined by
f (X) = X + 1 for all real numbers
X.
( c ) Let C = (0.1); the set of all integers Z is t.he closure of C in the s t r u c t \ ~ r ~ ~ ( R , - ) or in (R,+, -, X ) . (d) The set of all rationaIs Q is the closure of 0 in (R,0.1, +, -, X , G).
+,
The notion of closure is important in algebra, logic, and other areas o f rruithematics. The next theorem shows how to construct the closure of a set "from below."
5. OPERATIONS AND STRUCTURES
61
5.10 Theorem Let U = (A, (&, . . . , (Fo,. . . , Fn-l ) ) be a structure and let C C_ A. If the sequence (Ci I i E N }is defined recursively b y
then C =
U ,:
C,.
Of course, the notation A o u .uA,-l is a shorthand for UoS2<., A,. Observe that C', C',+1 for all 2 , so the sequence (Cl 1 i E N ) is nondecreasing (see Exercise 3.1).
c
c
Let C = UEo C,. We have to prove that C C. This in turn Proof follows if we show that - C is closed, because C I> CO = C . So let j < n and ao,. . . , uf,_ l E C. From the definition of union we get that each (L,. Ifor 05r f, - l ] belongs to some Ci; let 2,. be the least i E N such that a,. E G,. The range of the finite sequence ( 2 , 1 0 7. 5 f j - 1) of natural numbers contains a greatest element 5 (this is a fact easily proved by induction: see Gi for Exercise 5.13). Since (C, I i C N ) is nondecreasing, we have a, C C,, all r f j - 1 , r > 0. We conclude that if Fj(ao,.. . ,a/;- l ) is defined, then it, C, so 6 is closed. belongs to Fj[C,f'] C_ It remains to prove the opposite inclusion C C C. Clearly, CO= C c p. If C, c then F~[c,I']C for each j 5 n - 1 because C is closed, and tlrerrfore also Ctcl C C. We conclude using the Induction Principle that G, C 5; for a11 i E N and, finally, 6 = ,U : Ci C, as required. 0
<
<
c
<
c,
c
c
We close this section with a theorem that is often used to prove tJhat all elements of a closure have some property P . It can be viewed as a gerieralizntioil of the Induction Principle, which is its special case fbr the structure ( N .S) (S is the successor operation) and C = {O), 5.11 Theorem Let P(x) be a property. Assume that ( a ) P ( a ) holds for all a C C. ( b ) For each j 5 n - 1, i f P(ao), . . . , P(aj, - I ) hold and F, ( u o, . . . . a ~- 1,) is defined, then P(Fj (ao,. . . , af, - I ) ) holds. Then P(x) holds jor all X E C.
Proof. The assumptions (a) and (b) postulate that the set B = P ( x ) ) is closed and C B,
c
(XE
A
I
0
Exercises
5.1 Which of the following sets are closed under operations of addition. subtraction, multiplication, and division of real numbers? (a) The set of all positive integers. (b) The set of all integers.
CHAPTER 3. NATURAL NUMBERS
5.2
5.3
5.4
5.5
(c) The set of all rational numbers. (d) The set of all negative rational numbers. (e) The empty set. Let * be a binary operation on A. (a) * is called commutative if, for all a, b E A, whenever a * b is defined. then b * a is also defined and a * b = b * a . (b) * is called associative if, for all a, b, c E A, (U * b) * c = U * (b * c) holds whenever the expression on one side of the equality sign is defined (the other expression then must also be defined). Which of the operations mentioned in this section are commutative or associative? Let * and A be operations in A. We say that * is distributive over A if a * ( b A c) = (a * b) A (a * C ) for all a , b, c C A for which the expression on one (and then also on the other) side is defined. For example, multiplication of real numbers is distributive over addition. but addition is not distributive over multiplication. Let A # 0, B = P ( A ) . Show that ( B , u a . n a ) and ( B , n s , u a ) are isomorphic structures. [Hint: Set h ( x ) = B - X ; notice that Ua in the first structure corresponds to ns in the second structure and vice versa.] Refer to Example 5.7 for notation. (a) There is a real number a E A such that a + a = a (namely. U = 0 ) . Prove from this that there is U' E A' such that a' X a' = U'. Fiud this a'. (b) For every a E A there is b E A such that a b = 0. Show that for every a' E A' there is b' E A' such that a' X 6' = 1. Find this b'. Let 2' and Z - be, respectively, the sets of all positive and negative integers. Show that (2'. <. +) is isomorphic to (2-. >. + ) (where < is the usual ordering of integers). Let R be a set all elements of which are n-tuples. Prove that R is n-ary relation in A for some A . For every n-ary operation F on A there is a unique (7% + 1)-ary relation R in A such that F ( a 0 , . . . , U , - 1 ) = a, if and only if R(a0, . . . . U,- 1 . U,) holds. Let B = A,; the projection of B onto its (i + 1 ) - s t coordinate set is the function ni : B -+ Ai defined by ni(a) = ai. Prove that n, is onto A, and that a = (no(a),. . . , T,- 1(a)}. More generally, (a) For any j : C B , let f, = T , 0 f , 0 5 i < n. Then f, : C -t A, and f(c) = ( f ~ ( c ) ,.-. ,fn-l(c))(b) Given f, : C + A,, 0 i < n, define f : C -+ B by f ( r ) = (fo(c),. - - f n - 1 ( ~ ) ) -Then f i = n, 0 f . Ai # 0 if and only if A, # 0 for all i 2 0, i < n. [Hint: Induction.] Let-B = Seq(A); the function length : B IV, and operations head. * (concatenation), and conv (converse) are defined on B as follows: tail, length((aol. . , , a,- l ) ) m; head((ao,. . . , a,,-1)) = uo (if m 2 1); tail({aol. . . , a,,- l } ) = ( a l ,. . . . am- 1 ) (if m 2 l ) ;
+
5.6
5.7 5.8
5.9
no,,,,
-+
<
5.10 5.11
no,,,,
1
-
-+
5. OPERATIONS A N D STRUCTURES
63
(ao,- . . , am-l) * (bo,. , . , b,-1) = (co,- . . , G,,+,- 1) where c, = a, for 0 i < m and ci = b,-, for m 5 i 5 nz + n - 1 ; conv((aO,. . . , am- 1)) = (bo,. . . , bm- l ) where b, = a ,-,- l for 0 ilm-l, Prove that for those a , b,c E B where the appropriate operations are defined length(a * 6 ) = length(a) length(b); length(tail(a)) = length(a) - 1; a = head(a) * tailla); head(a * b ) = head(a); tail(a * b) = tail(a) * b; a * (b * c) = (a * b) * c; conv(conv(a))= a; conv (a) = conv (t ail(a)) * head(a). 5.12 Let U, B, and C be structures of type 7 . Prove that (a) U is isomorphic to U. (b) If 2l is isomorphic to B then 93 is isomorphic to 2l. (c) If 2l is isomorphic to 93 and B is isomorphic to C, then U is isoinorphic to C. Prove further that (d) The identity function on the universe of U is an autornorphism of U. (e) If j and g are automorphisms of U, then fog is also an automorphism of U. ( f ) If f is an automorphism of U, then f - is also an autornorphism of U. 5.13 Let (ko,., . , k,-1) be a finite sequence of natural numbers of length n 2 1. Then its range {ko,. . . , k,-l) has a greatest element (in the standard ordering of natural numbers by size). [Hint: Use induction on the length of the sequence.] 5.14 Construct the sets CO,C l , C2, and C3 in Theorem 5.10 for (a) U = (R,S) and C = (0). (b) U = (R, +, -) and C = (0,l). 5.15 Let R A* be a binary reIation. Define a binary operation FRon A2 by
<
<
+
'
c
F R ( ( a l ,a'), (br, b z ) ) = (al,b2) if
a2 =
bl (and is undefined otherwise).
Show that the closure of R in (A', F R ) is a transitive relation. Show that if R is reflexive and symmetric, then its closure is an equivalence relation. 5.16 Let B = Seq(A) where A = {a, b, c,. . . , X ,y, t ) is the set of all letters in the English alphabet. Refer to Exercise 5.1 1. Identify sequences of Iength 1 with elements of A, so that A B . Now let F be a binary operation defined as follows:
c
if ?E, 7j E B and ij E A (i.e., it has length l ) , then F(?E,4)= ij
* 5 * ij
(F is undefined otherwise). Let C be the closure of A in the structure (B, F). Prove that c = conv(c)
64
CHAPTER 3. NATURAL NUMBERS
for all c E C. (The elements of C are called palindromes.) IHint: Usca Theorem 5.11 .] 5.17 Define n- tuples so that ( a > (ao) = ao. (b) ( a o , a l , .. . , a , ) = ((at),. . . ,a,,-l),a,) for all n > 1. [Hint: Say that a is a n n-tuple if there exist finite sequences f :incl ( a & *. - , a, - 1) of length n such that (i) f o = ao(ii) f z + l = ( f , , ~ , + ~for) all i < 71 - 1, i 2. 0. (iii) a = fn-1. Show t h a t for each n-tuple a there is a unique pair of finite sequences ,f'. (ao,. . . ,a,- l ) of length n such that ( i ) , (ii), and (iii) hold; we then writr a = ( a o , .. . ,a,,- l). Show that for every finite sequence (ao,.. . .a,,- 1 ) of length n there is a unique n-tuple a = ( a o , .. . , a,,- l ). The11 prove ( a ) and (b).]
Chapter 4
Finite, Countable, and Uncountable Sets I.
Cardinality of Sets
From the point of view of pure set theory, the most basic question about a set, is: How many elements does it have? It is a fundamental observation that we can define the statement "sets A and B have the sarrie number of elemer1t.s" without knowing anything about numbers. To see how that is done, consider the problem of determining whether the set of all patrons of some theater performance has the same number of elemeiits as the set of all seats. To find the answer, the ushers need not count the patrons or the seats. It is enough if they check t h a t each patron sits in one, and only one, seat, and each seat is occupied by one, and only one, theatergoer. 1.1 Definition Sets A and B are equzpotent (have the same cardinatity) if there is a one-to-one function f with domain A and range B. We rlenot,e this by IAI = IBI. .2 Example
{{{a)))) are
{@l{@)l and {{(8))1
{{{@))l.
let
f(@) = { { @ ) ) 'f ( i O ) ) =
( b ) {Q))and {Q),{Q)))are not equipotent. (c) The set of all positive real numbers is equipotent with the set of all negative real numbers; set f ( X ) = -X for all positive real numbers X 1.3 Theorem (a) A is equipotent to A. ( b ) If A is equipotent to B , then B is equipotent to A. (c) If A is equipotent to B and B is equipotent to C , then A is equipotent to
C.
66
CHAPTER 4. FINITE, COUNTABLE, AND UNCOUNTABLE SETS
Proof. ( a ) IdA is a one-to-one mapping of A onto A. (b) If f is a one-to-one mapping of A onto B, f is a one-to-one mapping of' B onto A. (c) If f is a one-to-one mapping of A onto B and g is a one-to-one mapping of B onto C, then g o f is a one-to-one mapping of A onto C.
-'
C Similarly, the following definition is very intuitive. 1.4 Definition The cardanalaty of A Is less than or equal to the cardinality of B (notation: (AI 5 (B1)if there is a one-to-one mapping of A into B.
<
Notice that IAl [B(means that (A( = 1C( for some subset C of B. We also write IA! < IBI to mean that IAl IBI and not \AI = IBI, i.e., that there is a one-to-one mapping of A onto a subset of B, but there is no one-to-one mapping of A onto B. Notice that this is not the same thing as saying that there exists a one-to-one mapping of A onto a proper subset of B; for example. there exists a one-to-one mapping of the set N onto its proper subset (Exercise 2.3 in Chapter 3), while of course IN1 = [NI. Theorem 1.3 shows that the property IAJ = IBj behaves like a n equivalence relation: it is reflexive, symmetric, and transitive. We show next that the (B)behaves like an ordering on the "equivalence classes" under property I AI equipotence.
<
1.5 Lemma (a) If IAl
Proof.
and IBI 5 )Cl, then IAl
Exercise 1.1.
< JCI. 0
We see that 5 is reflexive and transitive. It remains t o establish antisymmetry. Unlike the other properties, this is a major theorem.
If IX 1 < IYJ, then there is a one-to-one function f that maps X Proof. into Y; if IYJ < 1x1,then there is a one-to-one function g that maps Y into X . To show that JX) = ) Y )we have t o exhibit a one-to-one function which maps X onto Y . Let us apply first f and then g; the function g o f maps X into X and is one-to-one. Clearly, g [f [X]] C g[Y] C X; moreover, since f and g are one-toone, we have 1x1= Jg[f[X]]I and ]Y]= JgIY]I. Thus the theorem follows from LI the next Lemma 1.7. (Let A = X , B = glY], A I = g[f[X]).) 7
1. CA RDINALITY OF SETS 1.7 Lemma
If A1 C B C A and IA1] = IA], then IBI = IAl.
T h e proof can be followed with the help of this diagram:
Proof. Let f be a. one-to-one mapping of A onto A l . By recursion, we define two sequences of sets:
and
Bo,B l , . . . , Bn,.. . (In the diagram, then An's are the squares and the Bn's are the disks.) Let A. = A, B. = B , and for each n,
>
Since A. B. 2 A I , i t follows from (*),by induction, t h a t for each n, An 2 Anti. We let, for each n, C, = A , -Bn, and
(C is t h e shaded part of the diagram.) By (*), we have f [Cn] = C,+1,
Now we are ready t o define a one-to-one mapping g of A onto B: =
f (X) if X E C; i f x E D.
SO
68
CHAPTER 4. FINITE, COUNTABLE. AND UNCOUNTABLE SETS
Both g r C and g r D are one-to-one functions, and their ranges are disjoint,. T h u s g is a one-to-one function and maps A onto j[C]U D = B . 9 We now see t h a t 5 has all the attributes of a n ordering. A natural quest.ion t o ask is whether i t is linear, i.e., whether (e) IA( IBI or (B1 [ A [ holds for all A and B. I t is known t h a t t h e proof of (e) requires the Axiom of Choice. We returri to this subject in Chapter 8; in the meantime, we refrain from using (e) in aliv proofs. By now, we have established the basic properties of cardinalities, without. actually defining what cardinalities are. I n principle, it is possible t o c:ontiliut.~ the study of the properties (AI = IBI and IAl 5 1BJin this vein. without ever defining [A[;one can view [AI = [B1simply as shorthand for the property "A is equipotent t o B.'! etc. However, it is both conceptually and nottat,io~la1ly ustd'ul t o define [ A ( ,"the number of elemellts of the set A," as a n actual object o f ' s ~ t theory, i.e., a set. We therefore rnake the following assumption.
<
<
1.8 Assumption There are sets called cardinal numbers (or cardinals ) W it,h the property t h a t for every set X there is a unique cardinal (tohe cardir~al number. of X , the cardinality of X ) and sets X and Y are equipotent if and oiily if 1x1 is equal t o (YI.
1x1
I n effect, we are assuming existence of a unique "representative" for c.ac.11 class of mutuaIly equipotent sets. The Assumption is harmless in the sense tliat we only use it for convenience, and could formulate and prove all our theort1~1ls without it. T h e Assumption can actually be proved with the help of tilt. A x i o r ~ ~ of Choice, and we do t h a t in Chapter X. Moreover, for certain c,lwstls of sets. cardinal numbers can be defined, ar~c-lthe Assumption proved, even without, tllc, Axiom of Choice. T h e most important case is t h a t of finite sets. We study finite sets and their cardinal numbers in detail in the l ~ e x tsectjon. Exercises
1.1 Prove Lemma 1.5. 1.2 Prove: (a) If IAl < IBI and IB( ] C \ ,then (AI < ICI. (b) If iAl 5 (B(and IBI < ICI, then / A ( < ICI. 1.3 If A C B, then (AI J B ( . 1.4 Prove: (a) ( A X BJ= IB X Al. (b) \ ( A X B) X Cl = tA X ( B X C ) ( . (c) [ A ( 5 [ A X B1 if B # 8. 1.5 Show t h a t ISJ5 J P ( S ) J . [Hint: /S1 = ) { { a )) a E S)J.J 1.6 Show t h a t )AI (A'( for any A and any S # Q). [Hint: Consider constant functions.] 1.7 If S C_ T, then IASJ < IATl; in particular, JAnI 5 lAml if n m. IHint: Consider functions t h a t have a fixed constant value on T - S.]
<
<
<
<
2. FINITE SETS
69
1.8 IT1 5 1ST1 if IS1 2 2. [Hint: Pick u , v E S, U # u , and, for each t E T , consider f f : T -+S such that f t ( t )= U , j t ( x ) = U otherwise.) 1.9 If IAl 5 IBI and if A is nonempty then there exists a mapping f of B onto A .
It is somewhat peculiar t h a t the proof of a fundamental, general result. like the Cantor-Bernstein Theorem, should require the use of such specific sets as natural numbers. In fact, it does not. The following sequence of exercises leads to an alternative proof, and in the process, acquaints the reader with the fixed-point property of monotone functions, an important result in itself. Let F be a function on P(A) into PIA). A set X g A is called a fixed point of F if F ( X ) = X. The function F is called monotone if X 2 Y g A implies
F ( X ) c F(Y).
1.10 Let F : P(A) -, P ( A ) be monotone. Then F has a fixed point. [Hint: Let T = { X E A I F(X) C X}. Notice that T # 0. Let X = n T and prove that X E T, F(X) t T. Therefore, F ( X ) C X is impossible.] 1.11 Use Exercise 1.10 t o give an alternative proof of the Cantor-Bernstein Theorem. [Hint: Prove Lemma 1.7 as Follows: let F : P(A) -+ P(A) be defined by F ( X ) = (A - B) U f [X]. Show that F is monotone. Let C be a fixed point of F, i.e., C = (A - B) U f [C],and let D = A - C. Define g as in the original proof and show that it is one-to-one and onto B.] 1 . l 2 Prove that 5? in Exercise 1.10 is the least fixed point of F, i.e., if F ( X ) = X for some X C A, then X C X .
The remaining exercises show that the two proofs are not that different, after all.
P ( A ) -. P(A) is continuous if F ( & X') = U I E NF(Xz) holds for any nondecreasing sequence of subsets of A . ((X, 1 i E N ) is non&creasing if X, C_ Xg holds whenever i < j . ) A function F
:
1.13 Prove that F used in Exercise 1.11 is continuous. [Hint: See Exer(:ise 3.12 in Chapter 2.1 1.14 Prove that if ?T is the least fixed point of a monotone continuous function F : P ( A ) -+ P(A), then X = U I E N X' where we define recursively
X.
=
0,
=
F(Xi).
The reader should compare the proof of Lemma 1.7 with the construction of the least fixed point for F(X) = (A - B) U f [X] in Exercise 1.14.
2.
Finite Sets
Finite sets can be defined as those sets whose size is a natural number.
2.1 Definition A set S is finite if it is equipotent to some natural number n E N. We then define 1st = n and say that S has n elements. A set is infinite if it is not finite.
70
CHAPTER 4 . FINITE, COUNTABLE, AND UNCOUNTABLE SETS
By our definition, cardinal numbers of finite sets are the natural numbers. Obviously, natural numbers are themselves finite sets, and )nJ= n for all n E N. However, we have t o verify that the cardinal number of a finite set is unique. This follows from Lemma 2.2 2.2 Lemma I j n E N , then there is no one-to-one mapping o j n onto a proper subset X C n.
Proof. B y i n d u c t i o n o n n . F o r n = 0, theassertion is triviallytrue. Assuming that it is true for n, let us prove it for n 1. If the assertion is false for n + 1, then there is a one-to-one mapping f of n + 1 onto some X c n + 1. There are two possible cases: Either n f X or n 4 X . If n $ X , then X C 71. and f 1 n maps n onto a proper subset X - { f ( n ) of ) n, a contradiction. If n E X , then n = f ( k ) for some k 5 n. We consider the function g on 71 defined as follows:
+
The function g is one-to-one and maps n onto X contradiction.
-
{ n ) ,a proper subset of n. a U
2.3 Corollary (a) If n # m, then there is no one-to-one mapping of n onto m. (b) If IS)= n and JSI= m, then n = m. ( c ) N is infinite.
Proof. (a) If n # m, then either n C m or m C n, by Exercise 2 . 7 in Chapter 3, so there is no one-to-one mapping of n onto m. (b) Immediate from (a). (c) By Exercise 2.3 in Chapter 3, the successor function is a one- to-one mapping of N onto its proper subset N - (0). L
Another noteworthy observation is that if m , n E N and m < n (in the usual ordering of natural numbers by size defined in Chapter 3 ) , then m C n. so m = JmJ< 172-1 = n, where < is the ordering of cardinal numbers defined in the preceding section. Hence there is no need to distinguish between the two orderings, and we denote both by <. The rest of the section studies properties of finite sets and their cardinals in more detail. 2.4 Theorem
If X is a finite set and Y C X , then Y is finite. Momover.
2. FINITE SETS
71
Proof. We may assume that X = {xo, . . . , X, - l ), where (so,. . . , xn - l ) is a one-to-one sequence, and that Y is not empty. To show that Y is finite, we construct a one- to-one finite sequence whose range is Y. We use the Recursion Theorem in the version from Exercise 3.5 in Chapter 3. Let ko = the least k such that xk E Y; k,+l = the least k such that k > k,, k
< n , and xk
E Y (if such k exists).
We leave it to the reader t o verify t h a t this definition fits the pattern of Exercise 3.5 in Chapter 3. [Hint: A = n = (0, l , . . . , n - 1 ) ; a = min{k E n ] xk E Y ) ; g ( t , i ) = min{k E n / k > t and xk E Y ) if such k exists; undefined otherwise.] This defines a sequence (ko,., . , km-l). When we let p, = xk, for all i < m, then Y = {yi [ i < m). The reader should verify that m < n (by induction, k, i whenever defined, so especially m - 1 < km-l 5 n - 1).
>
2.5 Theorem If X is a finite set and f is Moreover, [f[Xj[<_
1x1.
U
fur~cti07j, then f[X] is finitt,.
Proof. Let X = {so,. . , , X,- 1 ). Again, we use recursion to construct a finite one-to-one sequence whose range is flX). Actually, here we use the version with f ( n + 1) = g(f n). We ask the reader to supply the details; the construction goes as follows:
-
k* = 0, ki+l the least k > ki such that k < n and f ( x k ) # f ( x ~for ) all j 5 i (if it exists), and y, = f ( x k i ) . Then f [X] = {po,.. . , ym-l} for some m
< n. 0
As a consequence, if (ai 1 i < n) is any finite sequence (with or without repetition), then the set ( a i 1 i < n ) is finite. All the constructions made possible by the Axioms of Comprehension w h e ~ l applied t o finite sets yield finite sets. We now show that if X is finite, then F ( X ) is finite, and if X is a finite collection of finite sets, then U X is finite. Hence addition of the Axiom of Infinity is necessary in order to obtain infinite sets. 2.6 L e m m a If X and Y are finite, then X U Y is finite. Moreover, IYI, and i f X and Y are disjoint, then J X U Y J= +IYI.
1x1+
1x1
\XU Y ( 5
1 f X = { x 0 , . . . , X , - 1 ) a n d i f Y ={yo , . . . , y , - ~ ) , w h e r e ( z o . . . . . Proof. xT1-1 ) and (yo, . . . , Y , ~ - are one-to-one finite sequences, consider the finite sequence z = ( x O , . . , xn- 1, yo, . . . , pm- 1 ) of length rt + m. (Precisely, define z = ( ~ , I O ~ i < n + m by)
72
CHAPTER 4. FINITE, COUNTABLE, AND UNCOUNTABLE SETS
+
Clearly, z maps n m onto X U Y. so X U Y is finite and [XU Y I 5 n + m bv Theorem 2.5. If X and Y are disjoint, z is one-to-one and ( X U Y I = n + m.
2.7 Theorem
If S is finite and if every X
E S is
f i n ~ t e ,then
U S is finite.
Proof.
We proceed by induction on the number of elements of S. Tht. statement is true if IS1 = 0. Thus assunze t h a t it is true for all S with IS1 = 7 1 . and let S = {Xo,. . . , X,-1, X,) be a set with n 1 elements, each X, E S being a finite set. By the induction hypothesis,; :U : X, is finite. and we hirvt~
+
us
=
( i X*)
X,,
which is, therefore, finite by Lemma 2.6. 2.8 Theorem If X i s finite, then P ( X ) i s finite.
By induction on / X ) . If J X J= 0, i.e., X = fl, then P ( X ) = (m) is finite. Assume t h a t P ( X ) is finite whenever 1x1 = n , and let Y be a set with n + 1 elements: Y = { y o,... .yn). Let X = { y o , . . . , ? j l l - l ) . We note t h a t P ( Y ) = P ( X ) U U, where U = {U ( .U C Y and y, E [ L ) . Wc, further note t h a t 1 U J = J P ( X )I because there is a one-to-one mapping of U onto P ( X ) : f ( u > = U - { ~ n )for all U E U . Hence P ( Y ) is a union of two finite sets anti. consecluently, finite. E
Proof
The final theorem of the section shows t h a t infinite sets really have more elements t h a n finite sets. 2.9 Theorem If X is infinite, then ]XI> n for all n
N.
Proof. I t suffices t o show t h a t 1x1> n for all n E N. This can be dorle by induction. Certainly 0 < (XI. Assume t h a t '2. n: then there is a one-to-one function f : n -+ X. Since X is infinite, there exists X E ( X - 1.iii1 j'). D~fiilo 9 = f U { ( n X)); , g is a one-to-one function on n + 1 into X . We1 co~icludethat 1x1> n + 1. C
1x1
To conclude this section, we briefly discuss another appr0ac.h t o finiteness. T h e following definition of finite sets does not use natural numbers: A set X is finite if and only if there exists R relation 4 such t h a t (a) + is a linear ordering of X . (b) Every nonempty subset of X has a least and a greatest element in 4 . Note t h a t this notion of finiteness agrees with the one we defined using finitr sequences: If X = {so,. . . , X,- l ), then so 4 . - . 4 X,- l describes R lincitr ordering of X with the foregoing properties. O n the other hand. if (X. + ) satisfies ( a ) and (b), we construc:t, by rec~lrsion,a sequence (h. f l . . . ) . ;ts ,
2. FINITE SETS
73
in Theorem 3.4 in Chapter 3. As in that theorem, the sequence exhausts a11 elements of X ,but the construction must come t o a halt after a finite rlunlber of steps. Otherwise, the infinite set Ifo, f l , f 2 , , . . ) has no greatest element in
(X, 4. We mention another definition of finiteness that does not involve natural numbers. We say that X is finite if every nonempty family of subsets of X has a c-maximal element; i.e., if 0 # U C P ( X ) . then there exists z E U such that for no y E U , z C y. Exercise 2.6 shows equivalence of this definition with Definition 2.1. Finally, let us consider yet another possible approach t o finiteness. It follows from Lemma 2.2 that if X is a finite set, then there is no one-to-one mapping of X onto its proper subset. On the other hand, infinite sets. such as t11e set of all natural numbers N, have one-to-one mappings onto a proper subset [e.g.? f ( n ) = n + l]. One might attempt to define finite sets as those sets which are not equipotent t o any of their proper subsets. However, it is impossible to prove equivalence of this definition with Definition 2.1 without using the Axiom of Choice (see Exercise 1.9 in Chapter 8).
Exercises
2.1 If S =
2.6
2.7
2.8
n- l
C i = o IxilIf X and Y are finite, then X X Y is finite, and / X X Y (= 1x1.( Y / . If X is finite, then )P(X)t= 2Ix1. If X and Y are finite, then X Y has lxllY elements. If = n > k = IYI,thenthenumber ofone-to-one functions f : Y + X is n - ( n - l ) . - . . . ( n - k + l ) . X is finite if and only if every nonempty system of subsets of X h ; ~ a E-maximal element. [Hint; If X is finite, 1 x1 = 11, for some 1 1 . If U C P ( X ) , Iet m be the greatest number in {IY/I Y E U). If Y E U and IY1 = m , then Y is maximal. On the other hand, if X is infinite. let U = {Y g X / Y is finite).) Use Lemma 2.6 and Exercises 2.2 and 2.4 to give easy proofs of commutativity and associativity for addition and multiplication of natural numbers, distributivity of multiplication over addition, and the usual arithmetic properties of exponentiation. [Hint: To prove, e.g., the commutativity of multiplication, pick X and Y such that ( X ( = ~ r a ,( Y J= 72. By Exercise 2.2, m . n = [X X YI, n . m = IY X XI. But X X Y and Y X X are equipotent.] If A, B are finite and X A X B, then = CoEA ku where k, = IX n ( { a ) B)I.
l US1
2.2 2.3 2.4 2.5
{Xo,. . . , Xn-l) and the elements of S are mutually disjoint, the11
1x1
c
1x1
74
CHAPTER 4. FINITE?COUNTABLE, AND UNCOUNTABLE SETS
3. Countable Sets The Axiom of Infinity provides us with an example of an infinite set - the set N of all natural numbers. In this section, we investigate the cardinality of N ; that is, we are interested in sets that are equipotent to the set N. 3.1 Definition A set S is countable if JSI= [NI.A set S is at most c o u n t a b l ~ if IS] IN).
<
Thus a set S is countable if there is a one-to-one mapping of N onto S, that. is, if S is the range of an infinite one-to-one sequence. 3.2 Theorem A n infinite subset of a countable set is countable. Let A be a countable set, and let B C A be infinite. There is an Proof. infinite one-to-one sequence (a,),"==,, whose range is A. We let bo = ak,,, where ko is the least k such that an, E B. Having constructed b,, we let bn+l = ar;,,,, . where kn+l is the least k such that ak E B and ak # b, for every i n. Such k exists since B is infinite. The existence of the sequence (b,,),",o follows easily from the Recursion Theorem stated as Theorem 3.5 in Chapter 3. It is easily seen that B = {bn I n E N ) and that (bn)r=o is one-to-one. Thus B is countable. U
<
If a set S is a t most countable then i t is equipotent to a subset of a countable set; by Theorem 3.2, this is either finite or countable. Thus we have Corollarv 3.3. 3.3 Corollary A set i s at most countable if and only i f it i s either finite or countable.
The range of an infinite one-to-one sequence is countable. If is an infinite sequence which is not one-to-one, then the set may be finite (e.g., this happens if it is a constant sequence). However, if the range is infinite, then it is countable.
(~~)r..~
3.4 Theorem The range of a n infinite sequence is at most muntable. i.e., either finite or countable. ( I n other words, the image of a countable sct under any mapping is at most countable.)
By recursion, we construct a sequence (b,) (with either finite or Proof. infinite domain) which is one-to-one and has the same range as (a,):==,. We let bo = a,, and, having constructed b,, we let b,+ l = ak,,,, , where k,+ 1 is the least k such that ak # b, for all i n. (If no such k exists, then we consider the finite sequence (bi / i . In ) . ) The sequence (bi) thus constructed is one-to-one 0 and its range is {a,),",,.
<
One should realize that not all properties of size carry over from finite sets to the infinite case. For instance, a countable set S can be decomposed into two
3. COUNTABLE SETS
75
disjoint parts, A and B , such that IAl = IBI = IS/;that is inconceivable if S is finite (unless S = 0). Namely, consider the set E = (2k I k E N } of all even numbers, and the set 0 = (2k 1 I k E N } of all odd numbers. Both E and 0 are infinite, hence countable; thus we have IN1 = [El= 101while N = E U 0 and E n 0 = Q). We can do even better. Let p, denote the nth prime number (i.e., p0 = 2, p1 = 3, etc.). Let
+
s o = { S k I k E N ) , SI = {3k I k E N ) , . . . ,
S,
-
lk
E N ) , ...
.
The sets Sn ( n E N ) are mutually disjoint countable subsets of N . Thus we have N 2 S,, where (SnI= IN1 and the Sri's are mutually disjoint. The following two theorems show that simple operations applied to countable sets yield countable sets. 3.5 Theorem The union of two countable sets is a countable set.
Proof. Let A = { a , 1 n E N } and B = {b, I n E N ) be countable. We construct a sequence (c~):=~ as follows: cak = ak and czk+l = bk
for all k E N .
Then A U B = {c,1 n E N ) and since it is infinite, it is countable.
3.6 Corollary The union of a Jinzte system of countable sets is countable.
Proof.
Corollary 3.6 can be proved by induction (on the size of the system). U
One might be tempted to conclude that the union of a countable system of countable sets in countable. However, this can be proved only if one uses the Axiom of Choice (see Theorem 1.7 in Chapter 8). Without the Axiom of Choice, one cannot even prove the following "evident" theorem: If S = { A , I ~z E N ) and lA,l = 2 for each n, then A, is countable! The difficulty is in choosing for each n E N a unique sequence enumerating A,. If such a choice can be made, the result holds, as is shown by Theorem 3.9. But first we need another important result. 3.7 Theorem If A and B are countable, then A X B is countable.
Proof. I t suffices to show that IN X NI = ]NI,i.e., to construct either a one-to-one mapping of N X N onto N or a one-to-one sequence with range N X N . [See Exercise 3.l(b).] (a) Consider the function
We leave it to the reader to verify that f is one-to-one and that the range o f f is N .
76
CHAPTER 4. FINITE, COUNTABLE, AND U N C O U N T A B L E S E T S
(b) Here is another proof: Construct a sequence of elements of N manner prescribed by the diagram.
X
N in the
We have yet another proof:
3.8 Corollary The cartesian product of a finite number of countable .sets zs countable. Consequently, N m i s countable, for every m > 0. Proof.
13
Corollary 3.8 can be proved by induction.
3.9 Theorem Let ( A , I n E N ) be a countable system of at most countable sets, and let ( a , J n E N ) be a system of enumerations of A,; i,e., for each T L E N , a , = (a,(k) I k E N )is a n infinite sequence, and A , = {a,(k) I k E N ). Then An is at most countable.
Up=o
Proof. Define f : N X N -+ An by f ( n ,k ) = a n ( k ) . f maps N 00 onto Un=-, A n , SO the latter is at most countable by Theorems 3.4 and 3.7.
As a corollary of this result we can now prove
X
N
3. COUNTABLE SETS
77
3.10 T h e o r e m If A i s countable, then the set Seq(A) of all finite sequences of elements of A is countable.
It is enough to prove the theorem for A = N. As Seq(N) = Proof. N n , the theorem follows from Theorem 3.9, if we can produce a sequence (an I n L. 1) of enumerations of N n . We do that by recursion. Let g be a one-to-one mapping of N onto N X N. Define recursively
a1 ( 2 ) = ( i )for aH i E N; an+l ( i )= (bo,. . . , bn-l, i2)where g ( i ) = ( i l , iz) and (boy. . . , b,- 1) = a n ( i l ) , for all i E N .
+
The idea is to let an+ l ( i )be the ( n l)-tuple resulting from the concatenatiorl of the (il)t h n-tuple (in the previously constructed enumeration of n-tuples, a,) with i2. An easy proof by induction shows that a , is onto N n , for all N n is countable. Since N O = {()l, N n is also n 2 1, and therefore countable. 3.11 Corollary The set of all finite subsets of a countable set is countable.
Proof. The function F defined by F ( ( a o. . . . . = { a o. . . . . U , , - L ) maps the countable set Seq(A) onto the set of all finite subsets of A. The conclusion follows from Theorem 3.4. Other useful results about countable sets are the following. 3.12 T h e o r e m The set of all integers are countable.
Proof.
Z and the set of all rational numbers Q
Z is countable because it is the union of two countable sets:
Q is countable because the function f : Z p/q maps a countable set onto Q.
X
( Z - (0))
+Q
defined by f ( p . q )
=
3.13 T h e o r e m A n equivalence relation o n a countable set has at most courstably m a n y equivalence classes.
Proof. Let E be an equivalence relation on a countable set A. The fuilction F defined by F ( a ) = [aIE maps the countable set A onto the set A / E . By Theorem 3.4, A / E is at most countable. E 3.14 T h e o r e m Let U be a structure with the universe A , and let C 2 A bc at. most countable. T h e n the closuw of C , is also at most countable.
c.
CHAPTER 4. FINITE, COUNTABLE, AND UNCOUNTABLE SETS
78
Proof. Theorem 5.10 in Chapter 3 shows that C = C,, where CO= C -1 and C,+l = C, U F ~ [ cU ~ . . ]U Fn- (C, 1. It therefore suffices to produce a system of enumerstions of (Ci / i E N). Let (c(k) I k E N) be an enumeration of C, and let g be a mapping of N onto the countable set ( n + 1) X N X N f " X . X We define a system of enumerations (at I i E N } recursively as follows: f*A
for all k E N; ~ ~ (=kc(k) ) aa+l(k) =
p
..
. , a,(%
at (q>
i f 0 < p j n - I; i f p = n,
where g(k) = (p,q, (r:, . . . l rf)-'}, . . . l { 0T ~ -. .~. , ~f:-~'-l})+ The definition of ai+l is designed so as to make it transparent that if ai enumerates C,, a,+l enumerates Ci+1 (with many repetitions). By induction, U , enumerates C, for each i E N, as required. E We conclude this section with the definition of the cardinal number of countable sets. 3.15 Definition IAl = N for all countable sets A
We use the symbol No (aleph-naught) to denote the cardinal number of countable sets (i.e., the set of natural numbers, when it is used as a cardinal number). Here is a reformulation of some of the results of this section in terms of tflie new notation. (a) No > n for all n E N ; if No > )E for some cardinal number n, then n = No or K = n for some n E N (this is Corollary 3.3). (b) If ) A ] = No, 1B1 = No, then IAU Bj = No, J A X B1 = No (Tkieorenis 3.5 and 3.7). (c) If [A[= N o , then I Seq(A)(= No (Theorem 3.10). 3.16
Exercises
3.1 Let IAl( = ( A z f , lBlt = 1B21-Prove: (a) If A1 n A2 = 0, B1 n B2 = 0, then (A1 U A21 = IBI U Bzl. (b) ( A 1 X A21 = (B1X B2l. (c) I Ses(A1) 1 = I Seq(A2)I. 3.2 The union of a finite set and a countable set is countable. 3.3 If A # 0 is finite and B is countable, then A X B is countable. 3.4 If A # 0 is finite, then Seq(A) is countable. 3.5 Let A be countable. The set [AIn = { S C_ A I IS1 = n ) is countable for all n E N, n # 0.
4. LINEAR ORDERINGS
79
3.6 A sequence (s~)?&of natural numbers is eventually constant if there is no E N , s E N such that S, = S for all n no. Show that the set of eventually constant sequences of natural numbers is countable. 3.7 A sequence (S,)F=~ of natural numbers is (eventually) periodic if there are no,p E N, p 1 , such that for all n 2 no, S,+, = S,. Show that the set of all periodic sequences of natural numbers is countable. 3.8 A sequence (sn):=o of natural numbers is called a n arithmetic progression if there is d E N such that s,+l = S, d for all n E N. Prove that the set of all arithmetic progressions is countable. 3.9 For every s = (so,. . . , Sn- 1) E S e q ( N - ( Q } ) , let f (S) = p': . . - p ,S,,- ,- 1 where p, is the ith prime number. Show that f is one-to-one and use this fact to give another proof of I Seq(N)I = No. 3.10 Let [ S , C ) be a linearly ordered set and let (A, 1 n E N) be an infinite =:,- An is at most countable. sequence of finite subsets of S. Then U [Hint: For every n E N consider the unique enumeration ( a , ( k ) j k < ]A,,1) of A, in the increasing order.] 3.11 Any partition of an a t most countable set has a set of representatives.
>
>
+
4.
Linear Orderings
In the preceding section we proved countability of various familiar sets, such as the set of all integers Z and the set of all rational numbers Q. The important point we want to make is that we cannot distinguish among the sets N : Z ? and Q solely on the basis of their cardinality; nevertheless, the three sets "look" quite different (visualize them as subsets of the real line!). In order to capture this difference, we have to consider the way they are ordered. Then it becomes apparent that the ordering of N by size is quite different from the usual ordering of Z (for example, N has a least element and Z does not), and both are quite different from the usual ordering of Q (for example, between any two distinct rational numbers lie infinitely many rationals, while between any two distinct, integers lie only finitely many integers). Linear orderings are a n important tool for deeper study of properties of sets; therefore, we devote this section to the theory of linearly ordered sets, using countable sets for illustration. 4.1 Definition Linearly ordered sets (A, <) and (B, 4)are similar (have the same order type) if they are isomorphic, i.e., if there is a one-to-one mapping f on A onto B such that for all a l , a2 E A, a1 < a2 holds if and only if f ( a l )4 f ( a 2 ) holds. (See Definition 5.17 in Chapter 2 and the subsequent Lemma.)
Similar ordered sets "look alike"; their orderings have the same properties [see Example 5.8 in Chapter 31. I t follows that [N, <) and ( 2 , <) are not similar; likewise, (2,<) and ( Q , <) are not similar, and neither are (N,<) and (Q, <). (Here, < is the usual ordering of numbers.) I t is trivial to show that the property of similarity behaves like an equivalence relation:
80
CHAPTER 4. FINITE, COUNTABLE, AND UNCOUNTABLE SETS
(a) (A, <) is similar t o (A, <). (b) If (A, <) is similar to ( B , +), then (B, 4)is similar to ( A , <). is similar to (A2, <2), and (Az, < 2 ) is similar t o (A31 thc'11 ( C ) If (A1, is similar t o (A3,
Proof. We prove that every nonempty finite subset B of a linearly ordered set (A, <) has a least element by iriduction on the number of elements of B. If B has 1 element, the claim is clearly true. Assume t h a t it is true for all n-elemetit sets and let B have n + l elements. Then B = {b) U B' where B' has n elemerits and b B'. By the inductive assunlption, B' h a s a least element b'. If 6' < I), then b' is the least element of B; otherwise, b is the least element of B. In eithvr case, B has a least element. and ( A 2 .<2) are linearly ordered sets and /A11 = J Ay.! 4.3 Theorem If (A1, is finite, then ( A l , and (A2. <2) are similar. Proof. We proceed by induction on n = [All = [ A z ( . If ? L = 0, tile11 A1 = A2 = 8 and ilql, < i ) , (Az. <2) are clearly isonlorpl~ic. Assume t h a t t.licl claim is true for all linear orderings of n-element sets. Let IA l l = IA2)= 71 + 1. We proved that < l and
I t is easy to check that f is an isomorphism between (A1, < l ) and ( A 2 .c 2 ) . C We conclude t h a t for finite sets, order types correspond to cardinal numbers. As the examples at the beginning of this section show, linear orderings of infinite sets are much more interesting. We next look at, some ways of producing linear orderings that will be useful later.
4 . L I M A R ORDERINGS
81
4.4 Lemma i f ( A , <) i s a linear ordering, then ( A . < - l ) is also a linear or-
dering. Proof.
Left to the reader (see Exercise 5.3 in Chapter 2 ) .
For example, the inverse of the ordering ( N ,<) is the ordering ( N .< l ) : . . - 4 < - l 3 < - l 2 < - l 1 < - l 0; notice that it is similar to the ordering of negative integers by size:
and that it is not a well-ordering. 4.5 Lemma Let ( A l , and (A2,< 2 ) be linearly ordered sets and A l n A a = g. T h e relation < o n A = A 1 U A2 defined by
o r a , b ~A 2 a n d a o r a E A I , b~ A2
<2
b
i s a linear ordering. Proof.
This is Exercise 5.6 in Chapter 2, so again we leave it to the reader.
c! The set A is ordered by putting all elements of A I before all elements of A 2 . We say that the linearly ordered set ( A , <) is the sum.of the linearly orderecl sets ( A l , < I ) and ( A 2 ,<2), Notice that the order type of the sum does not clepend on the particular orderings ( A l , and ( A z ,< 2 ) , only on their types (see Exercise 4.1). As an example, the linearly ordered set ( 2 ,<) of all integers is similar t,o the sun1 of the linearly ordered sets (N,< - l ) and ( N ,<) (< denotes the usual ordering of numbers by size). Next, let us consider ways to order cartesian product. 4.6 Lemma Let (Al, < l ) and ( A z ,< 2 ) be linearly ordered sets. The relation
on A
=
<
A1 X Az defined by
( a l la2) < (bl , b2) if and only if
a1
< 1 bl or (a1 = bl and a2
<2
b2)
is a linear ordering. Proof. l?ansitiwity: If ( a l . a 2 ) < (bl, b2) and (bl, b2) < (c1,c 2 ) ,WP have either a1 < l bl or ( a l = bl and a2 <2 62). In the first case a l < l 6 1 and bl cl gives a1 < I c l . In the st!cond case. either bl < l cl and a1
82
CHAPTER 4. FINITE, COUNTABLE, AND UNCOUNTABLE SETS
(a) a1 <1 b l [so @ l , a2) < ( h , b2)](b) b l < l a1 [so (bl, b2) < (a1, az)]. (c) a1 = bi and a2 < 2 b2 [so ( a l , az) < (bl, h ) ] . (d) a1 = b l and b2
I i E I) be an indexed system
sets, where I C N. The relation
4
on
ntEIA, defined by
of linearly ordered
f + 9 i j and only 2 j ififf(f, g) = {iE I I f, # gi) # 0 and f*,)
g,,,
where io is the least element of diff(f,g) (in the usual ordering < of natural numbers)
is a linear ordering of
n,,,A, (it is called its lexicographic ordering).
Proof. hnsitivity: Assume that f 4 g and g 4 h and let io [ j o ,respectively] be the least element of diff (f, g) [diff(g, h), respectively]. If io < jo,we have fi,, < g*, and g,,, = hi,,, so fa,, < h,,, and io is the least element of diff(f , h). We conclude that f + h. The cases io = jo and io > jo are similar. Asymmetry: f + g and g + f is impossible because it would mean that fio < gl,, and gi,, < fi,) for io = the least element of diff(f,g) = diff(g. f ) . Linearity: If diff ( f ,g) = 0, we have f = g. Otherwise, if io is the least element of diff f,g), either f,,, < g,,, or f,,, > g,,, holds and, consequently, either f+9orf+9. a In particular, if (A,, < i ) = (A, <) for all i E I = N , + is the lexicographic ordering of the set A N of all infinite sequences of elements of A. One can also choose to compare second coordinates before comparing t.hct first coordinates and so define the antilexicographic ordenrly 4 o f A1 x A2: (al,a2) 4 ( b l , b2)
if and orlly if
a2
<2
b2 or (a2 = b2 and a l
hl).
The proof that 4 is a linear ordering is entirely analogous to the lexicographic case. The two orderings are generally quite different; compare, for example. the lexicographic and antilexicographic products of A1 = N = { 0 , 1 , 2 , .. . ) ancl A2 = {0,1) (both ordered by size):
4. LINEAR ORDERINGS ordered lexicographically: (0, l )
(131)
(291)
(371)
ordered antilexicographically: (0, l ) +l l ) +(2, l ) +3 , l )
.
The first ordering is similar to ( N , <) and the second is not [it is the sum of two copies of (N,<)l. The previous results show that there is a rich variety of types of linear orderings on countable sets. I t is thus rather surprising to learn that there is a universal linear ordering of countable sets, i.e., such that every countable linearly ordered set is similar to one of its subsets. The rest of this section is devoted to the proof of this important result. 4.8 Definition An ordered set [X, <) is dense if it has a t least two elements and if for all a , b E X, a < b implies that there exists X E X such that a < X < b.
Let us call the least and the greatest elements of a linearly ordered set (if they exist) the endpoints of the set. The most important example of a countable dense linearly ordered set is the set Q of all rational numbers, ordered by size. The ordering is dense because, if r, S are rational numbers and r < S , then X = ( r + s)/2 is also a rational number, and r < X < S . Moreover, (Q, <) has no endpoints (if r E Q then r + l , r - 1 E Q and r - 1 < r < r 1). Other examples of countable dense linearly ordered sets are in Exercises 4.6 and 4.7. However, we prove that all countable linearly ordered sets without endpoints have the same order type.
+
4.9 Theorem Let (P,4) and (Q, <) be countable dense linearly ordered sets
without endpoints. Then (P,4)and (Q, <) are similar. Proof. Let (p, I n E N) be a one-to-one sequence such that P = {p, I n E N ) , and let (g, ] n E N) be a one-to-one sequence such that Q = { g , 1 n E N ) . A function h on a subset of P into Q is called a partial isomorphism from P to Q if p 4 p' if and only if h(p) < h@') holds for all p, p' E dom h. We need the following claim: If h is a partial isomorphism from P to Q sut-11 that dom h is finite, and if p E P and q E Q, then there is a partial isomorphisrll h p , , _> h such that p E dom hp,,, and q E ran h.,,,
84
CHAPTER 4. FINITE, COUNTABLE, AND UNCOUNTABLE SETS
.
Proof of the claim. Let h = ((pi,, qi,). . - . (pi,, g*,)} where p,, 4 p12 4 . . - 4 p,,., and thus also qi, < g,, < - - . < qi, . If p 4 dom h, we have either p 4 Pil or p,, 4 p 4 p,,,+l for some 1 I e 5 k , or p,, 4 p. Take the least natural number n such that qn is in the same relationship to g,, , . . . .g,, as p is to p,, , . . . , p,, ; more precisely, qn is such that: if P 4 P,, , then 9, < G , ; if P,, + P 4 P,,.,,, then < qn < gi,+,; and if p,, 4 p, then < qnThe possibility of doing this is guaranteed by the fact that (Q, <) is a dense linear ordering without endpoints. It is clear that h' = h U { ( p ,g,)) is a partial isomorphism. If g E ran h', then we are done. If q 4 ran h', then by the same argument as before (with the roles of P and Q reversed), there is p,, E P sucll that h' U {(P,, g)) is a partial isomorphism, and we take the least such m, ant1 fl let hPTq= h' U {(p,, g ) ) . The claini is now proved. We next construct a sequence of compatible partial isomorpk~isn~s by rt.c.rlrsion:
where (hn ) p , . ,qv. is the extension of h, (as provided by the claim) such that. Pn E dom(hn)p.,,q,,, Qn E ran(hn)p,,,q,,. Let h = UnENh n It is trivial t o veritji that h : P -,Q is an isomorphism between (P, 4 ) and ( Q , c ) . 0 4.10 Theorem Every countable linearly ordered set can be mapped isornarphically into any countable dense linearly ordered set without endpoints.
Theorem 4.10 requires a one-sided version of the previous proof'. Proof. Let (P, 4 ) be a countable linearly ordered set and let (Q, <) be a countat~le dense linearly ordered set without endpoints. For every partial isomorphism 11 from the ordered set ( P , 4 ) into Q and for every p E P, we define a partial h such that p E clom h,. Then we use recursion again. D isomorphism h,
>
Exercises
4.1 Assume t h a t ( A l , < l ) is similar to (B,, 4 ,) and (A2.<2) is similar to (B23 4 2 ) .
(a) The sum of (A1, < l ) and (A2, < 2 ) is similar t o the sum of (Bl.4 [ ) and ( B z ,42), assuming that A r n A2 = 0 = B1n B2. (b) The lexicographic product of ( A l l < l ) and (A2, < 2 ) is sirriilar t)o t,l~cl lexicographic product of ( B I ,1 )~ and ( B 2 , 4 2 ) . 4.2 Give an example of linear orderings ( A l , < l ) and (A2,< 2 ) such thiit. the sum of ( A l , < r ) and ( A 2 ,< 2 ) does not have the same order tvpe as the sum of (A2, <2) and (A1, ("addition of order types is not commutative"). Do the same thing for lexicographic product. 4.3 Prove that the sum and the lexicographic product of two well-orderings are well-orderings.
4. LINEAR ORDERINGS
85
4.4 If (Ai 1 i E N ) is an infinite sequence of linearly ordered sets of rlatural numbers and (All 2 for all i E N , then the lexicographic ordering of A, is not a well-ordering. 4.5 Let ((Ai, <*) I i E I ) be an indexed system of mutually disjoint linearly A, defined by: a 4 b if and N . The relation 4 on ordered sets, I only if either a , b E A, and a <, b for some i E 1 or a E A,, b E A, and i < j (in the usual ordering of natural numbers) is a linear ordering. If all <, are well-orderings, so is 4. 4 . 6 Let (2.<) be the set of all integers with the usual linear ordering. Let 4 be the lexicographic ordering of Z" as defined in Theorem 4.7. Fir~ally, let FS C_ Z N be the set of all eventually constant elements of Z N : i.e., ( a i 1 i E N ) F S if and only if there exist no E N . U E Z such no (compare with Exercise 3.6). Prove that FS that a, = a for all i is countable and (FS, 4 ~ F S is~a )dense linearly ordered set without. endpoints. 4.7 Let 4 be the lexicographic ordering of N" (where N is assumed to be ordered in the usual way) and let P G N" be the set of all evr~ltrlally periodic, but not eventually constant, sequences of natural numbers (see Exercises 3.6 and 3.7 for definitions of these concepts). Show that (P,i np2) is a countable dense linearly ordered set without endpoints. 4.8 Let (A, <) be linearly ordered. Define 4 on Seq(A) by: (ao.. . . . a ,,,+l } i ( b o , . . . , b , - , ) if and only if there is k < 7% such that a , = b, for all i < k and either ak < bk or ak is undefined (i.e., k = nz < n ) . Prove t,llilt 4 is a linear ordering. If (A, <) is well-ordered, (Seq(A). 4) is also ~vrllordered. (In this ordering, if a finite sequence extends a shorter seclutbnrtt. the shorter one comes before the longer one.) 4.9 Let (A, <) be linearly ordered. Define 4 on Seq(A) by: ( a o ,. - . , a,*- 1 ) 4 (ho, . . . , b,- l ) if and only if there is k < m such that a, = b, for all i < k and either ak < bk or bk is undefined (i.e., k = n < m ) . Prove that. iis a linear ordering. If (AI 2 2, it is not a well-ordering. (In this ordering, if a finite sequence extends a shorter one, the longer one comes before? thth shorter one.) If A = N and < is the usual ordering of natural iiumb(trs, 4 is called the Brouwer-KIeene ordering of Seq(N); it is n cl~usvl i ~ l c w r . ordering with no least element and a greatest element (). 4.10 Let (A, <) be a linearly ordered set without endpoints, A # Q). A closcjd internal [a,b] is defined for a, b E A by [a,b] = { s E A I (I < : I . h} Assume that each closed interval [ a ,bj, a, b E A, has a finite number of elements. Then (A, <) is similar to the set Z of all integers ill the us11a1 ordering. 4.11 Let (A, <) be a dense linearly ordered set. Show that for all (I,6 E A , a < b, the closed interval [a,b ] , as defined in Exercise 4.20, 11asirifinitdy many elements. 4.12 Show that all countable dense linearly ordered sets with bath erlrlpoirlts are similar. 4.13 Let (Q,<) be the set of all rational numbers in the usual ortlrring. Fii~tl sut,sets of Q similar to
Hi,"
>
>
<
86
CHAPTER 4, FINITE, COUNTABLE, AND UNCOUNTABLE SETS (a) the sum of two copies of (N,c ) ; (b) the sum of IN, <) and (N,< - l ) ; (c) the lexicographic product of (N,<) with ( N , <)
Complete Linear Orderings We have seen in the preceding section that the usual ordering < of the set Q of all rational numbers is universal among countable linear orderings (Theorem 4.10). However, when arithmetic operations on Q are considered, it becomes apparent that something is missing. For example, there is no rational number z such that x 2 = 2 (Exercise 5.1). Another example of this phenomenon appears when one considers decimal representations of rational numbers. Every rational number has a decimal expansion that is either finite (e.g., 1/4 = 0.25) or infinite but periodic from some place on (e.g,, 1/6 = 0.1666 . - . ) (see Section 10.1). Although it is possible to write down decimal expansions O.araaas. - . where ( U ~ ) Pis ~an, ~arbitrary sequence of integers between 0 and 9, unless the sequence is finite or eventually periodic, there is no rational number X such that s = O.alazas . . . (As a specific example, consider 0.1010010001 - . - .) It is r:lcar from these considerations that the ordered set ( Q , <) has gaps. The notion of a gap can be formulated in terms involving only the linear ordering.
5.1 Definition Let (P,<) be a linearly ordered set. A gap is a pair (A. B) of sets such that (a) A and B are nonempty disjoint subsets of P and A U B = P. (b) If a E A and b E B, then a < b. (c) A does not have a greatest element and B does not have a least element. For example, let B = { X E Q I X > 0 and x 2 > 2) and A = Q - B = { X E Q 1 X 1. 0 or ( X > 0 and x 2 < 2)). It is not difficult to check that (A, B ) is ii gap in Q (Exercise 5.2). Similarly, an infinite decimal expansion which is not eventually periodic gives rise to a gap (Exercise 5.3). At this point, we ask the reader to recall the notions of upper and lower bound, and supremum and infimum, that were introduced in Chapter 2. We call a nonempty subset of a linearly ordered set P bounded if it has bot.h lower and upper bounds. A set is bounded from above (from below) if it has an upper (lower) bound. Let (A, B ) be a gap in a linearly ordered set. The set A is bounded frorn above because any b B is its upper bound. We claim that A does not have a supremum. For if c were a supremum of A then either c would be the greatest, element of A or the least element of B, as one can easily verify. On the other hand, let S be a norlempty set, bounded from above. Let A = (X I X 5 B = {X( X >
S
s
for some s C S ) , for every s E S).
1. WELL-ORDERED SETS
105
the initial segment of W given by a. (Note that if a is the least element of W , then W ( a ]is empty.) By Lemma 1,2, each initial segment of a well-ordered set is of the form W [ a ]for some a E W. Of course, W[a] is also well-ordered by < (more precisely, by < nW[a12); we usually do not mention the well-ordering relation explicitly when it is understood from the context. 1.3 Theorem If ( W l ,< and (W2,c2) are well-ordered sets, then exactly one of the following holds: (a) either W, and W2 are isomorphic, or (b) W r is isomorphic to an initial segment of W z , or (c) W;! is isomorphic to an initial segment of Wl. In each case, the isomorphism is unique.
This theorem provides the method of comparison for well-orderings mentioned above: we say that Wl has smaller order type than W2if W1 is isonlorphic to W2[a]for some a E W2. Before we prove Theorem 1.3, we prove the following lemma and state some of its corollaries. A function f on a linearly ordered set (L, <) into L is increasing if zl < xz implies f (xl) < f (xz). Note that an increasing function is one-to-one, and is an isomorphism of (L, <) and (ran f,<). 1.4 Lemma If (W,<) is a well-ordered set and i f f : W function, then f (X) X for all X E W.
>
-+
W is an increasing
If the set X = { X E W I f ( X ) < X ) is nonempty, i t has a least Proof. element a. But then f (a) < a, and f (f (a)) < f (a) because f is increasing. This means that f ( a ) E X , which is a contradiction because a is least in X . 0 1.5 Corollary (a) No well-ordered set is isomorphic to an initial segment of itself. (b) Each well-ordered set has only one automorphism, the identity. (c) If Wl and W2 are isomorphic well-ordered sets, then the isomorphism betzueen W, and W2 is unique.
Pro0 f. (a) Assume that f is an isomorphism between W and W [ a ]for some a E W. Then f (a) E W(a] and therefore f (a) < a, contrary to the lemma, as f is an increasing function. (b) Let f be an automorphism of W. Both f and f - l are increasing functions and so for all X E W, f ( x ) 2 X and f-'(X) 2 X , therefore X 2 f ( z ) . It follows that f (X)= X for all X E W. (c) Let f and g be isomorphisms between Wl and W2. Then f o g-' is an automol.phism of W2 and hence is the identity map. It follows that f = g. 1.6 Proof of Theorem 1.3 Let Wl and W2 be well-ordered sets. It is a consequence of Lemma 1.4 that the three cases (a), (h), and (c) are mutually
88
CHAPTER 4. FINITE, COUNTABLE, AND UNCOUNTABLE SETS
is dense in C ) there is p E P such that c d p 4 d. It is readily seen that SUP* Sc +* p 4* SUP* Sd and hence h(c) +* h(d). From this we conclude that h is an isomorphism using Lemma 5.18 from Chapter 2. Finally, if :I: E P. t.hen X = sup Sz = sup* Sz and so h(x) = X . U To prove the existence of a completion, we introduce the notion of a Dedekind cut. 5.4 Definition A cut is a pair (A, B ) of sets such that (a) A and B are disjoint nonempty subsets of P and A U B = P. (b) If a E A and b E B, then a < b.
We recall that a cut is a gap if, in addition, A does not have a greatest, element and B does not have a least element. Notice that since P is dense, it is not possible that both A has a. greatest element and B has a least element. Thus the remaining possibilities tire that either B has a least element and -4 does not have a greatest element, or A has a greatest element and B does not have a least element. In either case, the supremum of A exists: In the first case, the supremum is the least element of B, and in the other case, the supremum is the greatest element of A. Hence, we consider only the first case and disregard the cuts where A has the greatest element.
5.5 Definition A cut (A, B) is a Dedelclnd cut if A does not have a greatest element. We have two types of Dedekind cuts (A, B): (a) Those where B = { X E P I X 2 p) for some p E P; we denote (A. B) = [p]. (b) Gaps. Now we consider the set C of all Dedekind cuts (A, B) in (P,<) and order C :is follows: (A, B ) < ( A , B ) if and only if A C_ A'. We leave i t to the reader to verify that (C, <) is a linearly ordered set. If p,q E P are such that p < g, then we have b] 4 [q]. Thus the linearly ordered set (P', d ) , where P' = {b]I p E P ) , is isomorphic to (P,<). We intend to show that (C, 4) is a completion of ( P , 4). Since (P, <) and (P', 4 ) are isomorphic, it follows that (P,<) has a completion. It suffices to prove (C') P' is dense in (C, +), (d') C does not have endpoints, and, of course, (e) (C, +) is complete. To show that P' is dense in C, let c, d E C be such that c 4 d; i.e., c = (A, B ) , d (A', B'), and A C A'. Let p E P be such that p E A' and p A. Moreover, we can assume that p is not the least element of B. Then (A, B) 4 [ p ] 4 (A', B') and hence P is dense in C. [This also shows that (C, 4 ) is a densely ordered set .) f
-
f
5. COMPLETE LINEAR ORDERINGS
89
Similarly, if (A, B ) E C, then there is p E B that is not the least element of B , and we have (A, B) 4 Ip]. Hence C does not have a greatest element, For analogous reasons, i t does not have a least element. To show that C is complete, let S be a nonempty subset of C, bounded from above. Therefore, there is (Ao, Bo) E C such that A A. whenever (A, B) C S. To find the supremum of S , let A S = ~ { A ( ( A , B ) E S }and
B~=P-A~=~{BI(A,B)ES}.
It is easy to verify that (As, Bs) is a cut. (Note that Bs is nonempty because B. C Bs.) In fact, (As, Bs) is a Dedekind cut: As does not have a greatest element since none of the A's does. Since As 2 A for each (A, B) E S, (As, Bs) is an-upper - bound of S; let us show that (As , Bs) is the least upper bound of S. If (A, B) is any upper bound of S, then A g 2 forall (A, B) E S, and, so, As = U{A I (A, B ) E S} A: hence ( A s ,Bs) 4 ( A , B). Thus (A s , Bs) is the supremum of S.
c
Theorem 5.3 is now proved. In particular, the ordered set ( Q , <) of rationals has a unique completion (up to isomorphism); this is the ordered set of real numbers. As the ordering of real numbers coincides with < on Q , it is customary to use the symbol < (rather than 4)for it as well. 5.6 Definition The completion of (Q, <) is denoted (R,<); the elements of R are the real numbers.
We get immediately the following characterization of ( R , <).
5.7 Theorem ( R , <) is the unique (up to isomorphism) cumplete linearly ordered set without endpoints that has a countable subset dense in it. Proof, Let (C, 4 ) be a complete linearly ordered set without endpoints, and let P be a countable subset of C dense in C. Then (P,4)is isomorphic to (Q,<) and by the uniqueness of completion (Theorem 5.3),( C , 4 ) is isomorphic to the completion of ( Q , <), i.e., t o (R,<). 0
I t remains to define algebraic operations on R and show that they satisfy the usual laws of algebra and that on rational numbers they agree with their previous definitions. A s this topic is of more interest to algebra and real analysis than set theory, we do not pursue it here. The reader interested in Iearning how the arithmetic of real numbers can be rigorously established can a t this point read Section 10.2. Exercises
5.1 Prove that there is no X E Q for which x2 = 2. [Hint: Write z = p / q where p, q c Z are relatively prime, and use = 2q2 to show that 2 has to divide both p and
CHAPTER 4. FINITE, COUNTABLE, AND UNCOUNTABLE SETS
90
5.2 Show that ( A , B ) , where A = { X E Q ( X < 0 or (X > 0 and x 2 < 2 ) ) . B = {X E Q ( X > 0 and '2 > 2 ) , is a gap in ( Q , < ) . [Hint: ?b provc that A does not have ;L greatest element. given X > 0.:c2 < 2 . find i i ratiorial E > 0 such t h a t (X: E ) <~ 2. I t suffices to takt. E < .c suc,ll t l l a t X% 3xe < 2.1 5.3 Let 0.ala2as. . . be an infinite. but not periodic, decimal expansiori. L(tt A = { X E Q 1 X < O . U ~ ( L ~ . .forsolnek .(J~ t N B = (:I: E Q X 2 O.alaa... a k for all k f N - ( 0 ) ) . Show t h a t ( A . B ) is ;I gap i l l
+
(OH,
(Q1
4.
5.4 Show that a dense linearly ordered set (P,<) is complete if' and only if' every nonempty S L P bounded from below has an infimum. 5.5 Let D be dense in (P,<), arid let E be dense in (D, <). Show that E is dense in ( P , < ) . 5.6 Let F be the set of all rational numbers that have a decimal expalision with only a finite number of nonzero digits. Show that F is dense in Q. 5.7 Let D (the dyadic ratio7luls) be the set of a11 numbers r n / 2 " whcbrth ),L is an integer and n is a riatural riumber. Show that D is derlse in Q. 5.8 Prove that the set R - Q of all irrational rlunibers is dense irr R. [ H ~ J J ~ : Given a < b , let z = ( a + b ) / 2 if it is irrational. r = ( o + b ) / & ot,l~clr\vist~. Use Exercise 5.1.]
6.
Uncountable Sets
All infinite sets whose cardinalities wc have determined up t o this point t,url~cbtl out t o be countable. Naturally, a question arises whether perhaps all infinite sc?t.s are countable. If it were so, this book might end with t h e preceding sec*tion. It was a great discovery of Georg Carltor that unr:ountable sets. in fact. exist,. 'l'liis discovery provided an impetus for the dcvelopme~ltof set theory and bcc:;~niclii source of its depth and richness. 6.1 Theorem The set R of all real numbers is uncountable
Prooj.
(R,<) is a dense linear ordering without endpoints. If R were
countable, ( R , <) would be isomorphic to ( Q , <) by Theorem 4.9. But this is 0 not possible because ( R , <) is complete and (Q,<) is not. The proof above relies on the theory of linear orderings developed in Sectiorl 4. Cantor's original proof used his famous "diagonalization argument." 1% givtb it below.
Cantor's Proof of Theorem 6.1.
Assume that R is countable, i.e.. R is t.hv range of some infinite sequence (r,, Let a.(n).al a,'n'a 'n) , - . . bp tllr det:i~ilitl expansion of r,. (We assume that a decimal expansion does [lot contain orilv t htl digit 9 from some place on, so each real number has a ur~iquedecimal r?xp;~risiori. See Sectior, 10.1.) Let b, = 1 if' U!:) = 0 , b, = 0 otherwise: and let I- btx tliv
6. UNCOUNTABLE SETS
(31
real number whose decimal expansion is O.bl b2b3 . . . We have bn r # r,, for all n = 1,2,3,. . . , a contradiction.
#
a!:). hence
The combinatorial heart of the diagonal argument (quite similar t o Russell's Paradox, which is of later origin) becomes even clearer in the next tlleorem. 6.2 Theorem The set of all sets of natural num.be7.s is uncountable; in fact,
l P ( N ) l> ( N I Proof. The function f : N -+ P ( N ) defined by f ( 7 1 ) = { n ) is on(?-to-one, so ( N1 < I P ( N )I . We prove that for every sequence (S, 1 n C N ) of stibsets of N there is some S N such that S # S, for all n E N . This shows that t.herr! is no mapping of N onto P ( N ) ,and hence (NI < I P ( N ) J . We define the set S N as follows: S = {n E N I n 4 S,). The 1111ulber n is used to distinguish S from S,: If n E S,, then lz. @ S, and if 12 $ S,,. then n.C S. In either case, S # S,,, as required. •
c
Detailed study of uncountable sets is the subject of Chapter 5 (and subsequent chapters). Here we only prove that the set 2" = {U, 1)" of all infinite sequences of U's and l's is also uncountable, and, in fact, has the same cardinality as P ( N ) and R. 6.3 Theorem lP(N)I = 1 2 ~=1 (RI.
Proof. For each S (0, l ) , as follows:
C N define the characteristic
function of S ! x.9 : N
-+
It is easy t o check t h a t the correspondence between sets and their characteristic functions is a one-to-one mapping of P ( N ) onto {0,1}". To complete the proof, we show t h a t IR( < IP(N)I and also 12N( < [RI and use the Cantor-Bernstein Theorem. (a) We have constructed real numbers as cuts in the set Q of all rational numbers. The function that assigns t o each real number 7- = (A. B) the set A C Q is a one-to-one mapping of R into P ( Q ) . Therefore (RI IP(Q)(. As JQI = ( N I , we have / P ( & ) [= IP(lV)I (Exercis(->6 . 3 ) . Hence \RI 5 IP(N)I, ( h ) To prove 12" 1 5 IRI we use the decimal representation of real nurnbers. Tlie function that assigns to each infinite sequence (a,):==, of 0's and l's the unique real number whose decimal expansiori is O.aaalae- - . is a one-to-oric mapping of 2" into R. Therefore we have (2"I 5 (RI. U
<
We introduced No as a notation for the cardinal of N . Due to Theorem G.3, the cardinal number of R is usually denoted 2"". The set R of all real
92
CHAPTER 4. FINITE, COUNTABLE, AND UNCOUNTABLE SETS
numbers is also referred to as "the continuum"; for this reason, 2"" is called the cardinality of the continuum. In this notation, Theorem 6.2 says t h a t No < 2'". Exercises
-
6.1 Use the diagonal argument t o show t h a t N N is uncountable. [Hint: Consider (a, I n E N ) where a , ( a n x I k E N ) . Define d E N N by dn = arm l.] 6.2 Show t h a t IN"] = 2'". [Hint: 2N c N N 2 P ( N x N ) . ] 6.3 Show t h a t 1Al = \ B (implies IP(A)I = ) P ( B ) I .
+
Chapter 5
Cardinal Numbers 1.
Cardinal Arithmetic
Cardinal numbers have been introduced in Chapter 4. The present chapter is devoted to the study of their general properties, with the particular emphasis on the cardinality of the continuum, 2"). In this section we set out to define arithmetic operations (addition, multiplication, and exponentiation) on cardinal numbers and investigate the properties of these operations. To define the sum K.+X of two cardinals, we use the analogy with finite sets: If a set A has a elements, a set B has b elements, and if A and B are disjoint, then A U B has a b elements.
+
1.1 Definition
K
+ X = IA U B Jwhere JAl = K, \B1= X, and
A n B = 0.
In order to make this definition legitimate, we have to show that K + X does not depend on the choice of the sets A and B. This is the content of the following lemma.
1.2 Lemma If A , B , A', B' are such that tAJ= (A'I, IBI 0 = A' n B', then ( A U B1 = (A' U B'(.
=
J B ' Jand , AnB
=
Let f and g be, respectively, a one-to-one mapping of A onto A' Proot and of B onto B'. Then f U g is a one-to-one mapping of A U B onto A' U B'. D Not only does addition of cardinals coincide with the ordinary addition of numbers in case of finite cardinals, but many of the usual laws of addition remain valid. For example, addition of cardinal numbers is commutative and associative: (a) K . + X = X + K . . (b) K + ( X p ) = (K X) p. These laws follow directly from the definition. Similarly, the following inequalities are easily established:
+
+ +
CHAPTER 5. CARDINAL NUMBERS
94
5 K -k X. (d) If K , 5 6 2 and XI I X2, then (C)
K.
+ X1 IK Z + X2.
~1
However, not all laws of addition of numbers hold also for addition of cardinals. In particular, strict inequalities in formulas are rare in case of infinite cardin;tls and, as is discussed later (K6nig's Theorem), those that hold are quite c l i f i c : ~ i l t to establish. As an example, take the simple fact that if n # 0, then 71.+ 72 > 7 1 . If K is infinite, then this is no longer true: We have seen that No No = N o ( s c ~ c h 3.16(b) in Chapter 41, and the Axiorn of Choice implies that ti + K = K for evrlry infinite K . The multiplication of cardinals is again motivated by prop~rt~ics of r n t ~ l t i p l i cation of numbers. If A and B are sets of a and b elements. respectively, t h c ~ the procluct A X B has a . b elements.
+
1.3 Definition
K
- X = \A X
B / , where IA] =
K
and IBI
=
X.
The legitimacy of this definition follows from Lemma 1.4
1.4 Lemma IA' X B'(.
A'
If A, B , A', B'
Proof. Let j : A B' as follows:
-t
are such that ( A (= (A'(,( B (= ( B f ( ,then ( AX B1 =
A', g : B
B' be mappings. We define h : A
-+
X
B
-+
X
'1
=
(f( ( ' ) ) g ( ' ) ) '
Clearly, if f and g are oneto-one arid onto, so is h. Again, multiplication has some expected properties; in particular, it. is cornmutative and associative. Moreover, the distributive law holds. (e) K . X = X -K . ( f ) K . (X p ) = ( K - X) . p. (g) K . ( X + p ) = K . X + K . P . The last property is a consequence of the equality
that holds for any sets A, B, and C . We also have: (h) K 5 K . X if X > 0. (i) If ~ 1 5 K:! and X1 5 XZ, then K I X 1 5 ~ 2 X2.. To make the analogy between multiplication of cardinals and multiplicatior~of numbers even more complete, let, us prove (j) K + K = ~ . K . +
If / A / = K , then 2 . r; is the cardinal of (0, I ) X A. We note that Prooj, (0, l ) X A = ((0) X A ) U ( { l ) X A), that I{O) X AI = / { l ) X AI = K , and that the two summands are disjoint. Hence 2 . K = K + K . 0 As a consequence of (j), we have (k) K + K 5 K . K , whenever K ) 2 .
1. CARDINAL ARITHMETIC
95
As in the case of addition, multiplication of infinite cardinals has some properties different from those valid for finite numbers. For example, No . No = NI, [see 3.16(b) in Chapter 41. (And the Axiom of Choice implies that K. - K = K. for all infinite cardinals.) To define exponentiation of cardinal numbers, we note that if A and B are finite sets, with a and b elements respectively, then a b is the number of all functions from B to A.
1.5 Definition
K.'
=
The definition of
K.'
1 ~ ~ where 1 , [A[ = K. and
(B1 = X.
does not depend on the choice of A and B.
1.6 Lemma If IA] = IA'I and (B1= (B'I, then IA
B
1-I
1.
Let f : A -+ A' and g : B -+ B' be one-to-one and onto. Let F : A B -+ A ~ be~ defined ' as follows: If k E A ~ let, F ( k ) = h, whew h E A ' ~ ' is such that h(g(b))= f(k(b)) for all b E B, i.e.. h = f o k o $ - l . Proof.
Then F is one-to-one and maps A B onto A' B'. It is easily seen from the definition of exponentiation that if X > 0. (1) K. <_ (m) X < n " f h : > 1. (n) If K I 5 nz and XI Xz, then tcfl 5 K.;'. We also have (0) K . . K. = K ~ . To prove (o), it suffices to have a one-to-one correspondence between A X A. t.11~ set of all pairs (a, b) with a , b E A, and the set of all functions from { O , 1 ) into A. This correspondence has been established in Section 5 of Chapter 3. The next theorem gives further properties of exponentiation.
<
1.7 Theorem (a,) K . ~ + P=
(bj (C)
=
(K."W
(K
X)'
FP.
K."?
= K.' .
X'
Proof. Let K. = 1 K 1, X = I LI,p = I M I . To show ( a ) , assume that L and h4 [. are disjoint. We construct a one-to-one mapping F of K L X K" onto K
CHAPTER 5. CARDINAL NUMBERS
96
If ( f , g ) E K' X K ~ welet , F ( f , g ) = f U g . We note that f ~g is a function. LUM , and every h E K M is equal to F( f , g ) for some in fact a member of K M X K (namely, f = h L , g = h I M). It is easily seen t h a t F is ( j , g )E one-to-one. To prove (b), we look for a one-to-one map F of K L x onto (K L ) I I f . A typical element of M is a function f : L X M --+ K. We let F assign to f ' the function g : M + K L defined as follows: for all m E M, g(m) = t~ E K L where h(1) = j ( l , m) (for all 1 E L). We leave it to the reader to verify that. F is one-to-one and onto. For a proof of (c) we need a one-to-one mapping F of X onto ( K X L ) ~ For . each ( f i , f i ) E x L M , let F ( f l ,f i ) = g : M -+ (K X L ) where g(m)= ( f l ( m ) f, 2 ( m ) )for , all m E M. It is routine t o check t h a t F is one-to-one and onto. We now have a collection of useful general properties of cardinals, but the only specific cardinal numbers encountered so far are natural numbers, No, ancl 2N0. We show next that, in fact, there exist many other cardinalities. The furldamental result in this direction is the general Cantor's Theorem (see Theorenl 6.2 in Chapter 4 for a special case). 1.8 C a n t o r ' s T h e o r e m
I X I < IP ( X ) / ,for every set
X.
Proof.
The proof is a straightforward generalization of the proof of Theorem 6.2 in Chapter 4. Its heart is a n abstract form of the diagonalization argument. First, the function f : X -, P ( X ) defined by f ( x ) = { X )is clearly one-toone, and so 5 J P ( X ) 1 .It remains t o be proven that there is no mapping of X onto P ( X ) . So let f be a mapping of X into P ( X ) . Consider the set S = { X E X ( X 4 f (X)). We claim that S is not in the range of f . Supposr that S = j ( z ) for some z E X . By definition of S , 2 E S if and only if z f ( z ) : so we have z E S if and only if z # S, a contradiction. This shows that j is not C onto P ( X ) ,and the proof of < [ P ( X ) is ( complete.
1x1
1x1
The first half of Theorem 6.3 of Chapter 4 also generalizes to arbitrary sets. 1.9 Theorem / P ( X ) I= 21xl, for every set X .
Proof. Replace N by X in the proof of l P ( N ) I = 2INI in Theorern 6.3 in Chapter 4. C l Cantor's Theorem can now be restated in terms of cardinal numbers xs follows: K < 2 K for every cardinal number K. We conclude this section with a n observation t h a t for any set of cardinal numbers, there exists a cardinal number greater than all of them.
1. CARDINAL ARITHMETIC
97
1.10 Corollary For any system of sets S there is a set Y such that JY1> J XI
holds for all X
C
S.
Proof. Let Y = P ( U S). By Cantor's Theorem, 1Y 1 > ( US ( ,and clearly ( U S 1 2 1x1 for all X E S (because X 2 U S i f X E S). U
Exercises
1 .l Prove properties (a)-(n) of cardinal arithmetic stated in the text of this section, 1.2 Show that K* = 1 and K' = K for all K . 1.3 Show that Z K = 1 for all K and 0" = 0 for all K > 0. 1.4 Prove that 5 2K'K. 1.5 If ( A ( 5 (B1 and if A # 0, then there is a mapping of B onto A. We later show, with the help of the Axiom of Choice, that the converse is also true: If there is a mapping of B onto A, then IA( 5 ( B ( . 1.6 If there is a mapping of B onto A, then 2IAI 21sl. [Hint: Given g mapping B onto A, let f (X) = g- '[X], for all X A.] 1.7 Use Cantor's Theorem to show that the "set of all sets" does not exist. 1.8 Let X be a set and let f be a one-to-one mapping of X into itself such that f \ X ] C X . Then X is infinite.
<
Call a set X Dedekznd infinite if there is a one-to-one mapping of X onto its proper subset. A Dedekznd finite set is a set that is not Dedekind infinite. The remaining exercises investigate properties of Dedekind finite and Dedekind infinite sets. 1.9 Every countable set is Dedekind infinite. 1.10 if X contains a countable subset, then X is Dedekind infinite.
1.11 If X is Dedekind infinite, then it contains a countable subset. [Hint:Let X E X - f ( X ] ; define xo = X , X I = f (xo), . . . , X n + l = f (X*), . . . . The set (X, I n E N ) is countable.]
Thus Dedekind infinite sets are exactly those that have a countable subset. Later, using the Axiom of Choice, we show that every infinite set has a countable subset; thus Dedekind infinite = infinite. 1 . l 2 If A and B are Dedekind finite, then A U B is Dedekind finite. [Hint: Use Exercise 1.11 .l 1 .13 If A and B are Dedekind finite, then A X B is Dedekind finite. [Hint: Use Exercise 1.11 .l 1.14 If A is infinite, then P(P(A)) is Dedekind infinite. [Hint: For each n E N , let S, = ( X C A 1 = n). The set (S, 1 n E N ) is a countable subset of P(P(A)).l
1x1
CHAPTER 5 . CARDINAL N UA4BER.S
98
2.
The Cardinality of the Continuum
We are by now well acquainted with the properties of the cardinal number No. the cardinality of countable sets. We summarize them here for rrferenc:e. using the cortcepts of cardinal arithmetic: introduccd it1 the precetling sectiol~. (a) K. < No if arid orily if K. E N. (b) n + N o = No H0 = No ( n E N). ( c ) n . Eto = N o , Eto = No ( n E N.rz > 0 ) . (d) N;: = No (71 E N , 72 > 0 ) . In the present section we study t h e second rnost important infinite cardilinl number, the cardiriality of the c:oritinuum. 2"0. 'To begirl with. wc recall that 2"" is indeed the cardinality of t l ~ t set l R of all real riumbers.
+
2.1 Theorem
Proof.
1RJ= 2 N 0
This is the second half of Theorem 6 . 3 in Chapter 4 .
l1
T h e next theorem summarizes arithmetic properties of'the carciint~lityof' tlio continuum.
.
2.2 Theorem (a) n + 2N~1= N o + 2'0 = 2'1' + 2'" = 2'0 (, E N ) . (b) . 2"~' = N o - 2"" = 2N" 2H" = 2H" (, E N , > 0). (c) (2'0)" = (2"")"0 = NO = = 2'11 (71 E N , n > 0)
NOHI
Proof. (a) This follows from the obvious sequence of inequalities
by the Cantor-Bernstein Theorem. (b) Similarly, we have
(c:)
We have both 2N"
and
< ( 2K1 )
2No < - nNo < -
5 (2"")""
N N I I< 0 -
2
= 2% = 2
(2N")'"
p,, = ~ L I , C1
It is interesting t o notice that Theorem 2 . 2 , although a n e~asyc o r o l l ; ~ ~o .f, ~ laws of cardinal arithmetic and the Cantor-Bernstein Theorem, h{zs some rather unexpected consequences. For example, 2N0 . 2'" = 2'~' means that ( R X RI = IR1; however, the set R X R of all pairs of real numbers is in a one-t,o-one correspondence with the set of all poirlts in the plane (via a cartesian coordiri:ite system). Thus we see that there exists a one-to-one mapping of a straight line
2. THE CARDINALITY OF THE CONTINUUM
99
R onto a plane R X R (and similarly, onto a three-dimensional space R X R X R, etc.). These results (due to Cantor) astonished his contetnporaries; they seem rather counterintuitive, and the reader may find i t helpful to actually construct such a mapping (see Exercise 2.7). The next theorem shows that, several important sets have the cardinality of the continuum. 2.3 Theorem (a) The set of a16 (b) The set o j all (c) The set of all (dj The set of all
points in the n-dimensional space Rn has cardinality 2'" complex numbers has cardinality 2"". infinite sequences of natural numbers has cardinality 2'". injinite sequences of real numbers has cardinalitlj 2"".
Prooj. (a) [Rnl = ( 2 ' ( 1 ) ~by definition of cardinal exponentiation; ( 2 ' 0 ) ~ = 2'0 by Theorem 2.2(c). (b) Complex numbers are represented by pairs of reals (see Exercise 2.6 in Chapter 10),so the cardinality of the set of all complex numbers is /RxRI = (2H(1)2= 2Ho (c) T h e set of all infinite sequences of natural numbers is N N and lNNl = = 2"". (d) pN1= (2No)W" = 2'".
~2'
0 T h e next theorem helps t o establish further results of this sort. 2.4 Theorem I f A is a countable subset of B and ( B (= 2N". then [B-AI = 2N".
(Here we remark t h a t using the Axiom of Choice, we are able to show in general t h a t if [ A [< IB(,then IB - AI = IBI.)
Proof.
We can assume without loss of generality t h a t B = R
Let P = dom A:
P=
{X
E R
I (x,y) E
A for some
X
R.
g).
Since IAl = No, we have IPI 5 No. Thus there is xo E R such t h a t xo # P. Consequently, the set X = ( s o )X R is disjoint from A, so X (R X R) - A . Clearly, 1x1 = JRI=2'0, and we have J ( R x R ) - A1 2 2'1'. U
c
CHAPTER 5. CARDINAL NUMBERS 2.5 (a) (b) (c)
Theorem The set of all in-atzonal numbers has cardinality 2N0. The set o j all infinite sets o j natural numbers has cardinality 2N0. The set o j all one-to-one mappings oJ N onto N has cardinality 2 N o .
Prooj. (a) The set of all rationals Q is countable, hence the set R - Q of all irrational numbers has cardinality 2N0by Theorem 2.4. (b) The set of all subsets of N, P ( N ) , has cardinality 2"", and the set of all finite subsets of N is countable (see Corollary 3.11 in Chapter 4 ) , hence the set of all infinite subsets of N has the cardinality of the continuum. (c) Let P be the set of all one-to-one mappings of N onto N; as P 2 N N . clearly IPI 5 2'". Let E and 0 , respectively, be the sets of all even and N odd natural numbers. If X C E is infinite, define a mapping jx : N as follows:
-
jx(2k) = the kth element of X (k E N); jx(2k
+ 1) = the kth element of N
-
X (k
C
N).
Notice that N - X 3 0 is infinite, so jx is a one-to-one mapping of IV onto N. Moreover, it is easy to show that XI # Xzimplies jx, # f x , . We thus have a one-to-one correspondence between infinite subsets of E and certain elements of P. Since there are 2'(' infinite subsets of E by Theorem 2.5(b), we get [ P ] 2"" as needed.
>
0 Yet other similar results are provided by the next theorem. The definitions and basic properties of open sets and continuous functions can be found in Section 3 , Chapter 10. 2.6 Theorem
(a) The set o j all continuous junctions on R to R has ca~dznality2"". (b) The set o j all open sets o j reals has cardinality 2N0. Prooj. (a) We use the fact (proved in Theorem 3.1 1 , Chapter 10) that every continuous function on R is determined by its values on a dense set, in particular 63. its values a t rational arguments: If f and g are two continuous functioris on R, and if j ( q ) = g ( q ) for every rational number g , then $ = g . 'Tlius let C be the set of all continuous real-valued functions on R. Let F be ;I defined by F ( j ) = f I Q. By the fact above. F' mapping of C into is one-to-one, so (C( 1 ~ = (2N'1)N" ~ 1 = 2'(l. On the other hand. clearly jCI 2 2'" (consider the constant functions). (b) Every open set is a union of a system of open intervals with rational endpoints (see Lemma 3.14 in Chapter 10). There are N o open intervals with rational endpoints (each such interval is determined by an ordered pair of
<
2. THE CARDINALITY OF THE CONTINUUM
10 1
rationals), and hence 2"" such systems. This shows that there are at most 2Nnopen sets. On the other hand, if a , b E R, a # b, then (a, m) # (b,m), so there are at least 2"" open sets. The results of this section demonstrate the importance of the cardinal number 2N". It should not be surprising that the problem of determining the magnitude of 2N0is of fundamental significance. We know that 2N0is greater than No, but how much greater? Cantor conjectured that 2"" is the next cardinal number after No. This is the famous Continuum Hypothesis. The Continuum Hypothesis
that
K. <
There is no uncountable cardinal number
K
such
2No.
In words, the Continuum Hypothesis asserts that every set of real numbers is either finite or countable, or else it is equipotent to the set of all real numbers. There are no cardinalities in between. In 1900, David Hilbert included the Continuum Problem in his famous list of open problems in mathematics (as Problem 1). It is still not fully resolved today. In 1939, Kurt Gtidel showed that the Continuum Hypothesis is consistent with the axioms of set theory. T h a t is, using the axioms of Zermelo-Fraenkel set theory (including the Axiom of Choice), one cannot refute the Continuum Hypothesis. In 1963, Paul Cohen proved t h a t the Continuum Hypothesis is independent of the axioms. This means that one cannot prove the Continuum Hypothesis from the axioms. We discuss these questions in more detail in Chapter 15. We conclude this section with a n example of a set that has cardinality greater than the continuum. 2.7 Lemma The set of all real-valued functions on real numbers has cardznalzty 22H" > 2N11.
Proof.
The cardinal number of RR is (2"0)2*" = 2 ~ 0 . 2-~ 22'O. ~'
0
Exercises
2. Z Prove that the set of all finite sets of reals has cardinality 2N".We remark here that the set of all countable sets of reals also has cardinality 2N0, but the proof of this requires the Axiom of Choice. 2.2 A real number X is algebmic if it is a solution of some equation
where a*, . . . ,a, are integers. If X is not algebraic, it is called transcendental. Show that the set of all algebraic numbers is countable and hence the set of all transcendental numbers has cardinality ZNn. 2.3 If a linearly ordered set P has a countable dense subset, then IPI 5 2N0. 2.4 The set of all closed subsets of reals has cardinality 2"".
102
C H A P T E R 5 . C A R D I N A L NUMBERS
2.5 Show that, for n > 0, n . 22N"= No . z2'" = 2N" . 22*" = 22'" . 22'" 22Nc~ 0'2 n - 22'" Nt) = p0 p0 (2 1 - ( 1 (2 1 2.6 The cardinality of the set of all discontinuous functions is 2"". [Hint: Using Exercise 2.5, show that I R- C ~ )= 2'*" whenever JCl 2N0.J 2.7 Construct a one-to-one mapping of R X R onto R. [Hint: If a. b E (0. l ? have decimal expansions 0.alu;?a3 . . - and 0. bl b2b3 - - . rimp the ord(tret1 pair ( a ,b) onto 0.albla2b2asb3. E [0,l]. Make adjustments to avoid sequences where the digit 9 appears from some place onward.]
<
Chapter 6
Ordinal Numbers 1.
Well-Ordered Sets
When we introduced natural numbers, we were motivated by the need t o formalize the process of "counting": the natural numbers s t a r t with 0 and are generated by successively increasing the number by one unit: 0, 1, 2, 3. . . . , and so on. We defined the operation of successor by S(z) = s U ( X ) and introduced natural numbers as elements of the smallest set containing 0 and closed under S . It. is desirable t o be able to continue the process of counting beyorid natural numbers. T h e idea is t h a t we can imagine a n infinite number w t h a t corrles "after" all natural numbers and then continue the counting process illto the transfinite: W , w + 1, (W + 1) + 1, and so on. In the present chapter we formalize the process of tibansfinitecounting, and introduce ordinal numbers as a generalization of natural numbers. A s is usriat in any meaningful generalization, the resulting concept shares many feat~tresof natural numbers. Most important, t h e theorems o n induction and recursion arr generalized t o theorems o n transfinite induction and transfinite recursion. As a starting point, we use the fact t h a t each natural number is idcntifieci with the set of all smaller natural numbers: n = {m E N 1 nz < 7 1 ) . (See Exercise 2.6 in Chapter 3.) By analogy, we let w , the least transfirlit~ilunlbcr. t o he the set N of all natural numbers: W = N = {0,1.2,.. . ). I t is easy t o continue the process after this "limit" step is made: 'l'hc opcration of successor can be used to produce numbers following w in the same way we used it t o produce numbers following 0: S ( w ) = W U ( W )= { 0 , 2 , 2 , . . . , W ) , S ( S ( w ) ) = S ( w ) U ( S ( w ) ) = { 0 , 1 , 2 , . . . ,U, S ( w ) ) , etc.
We use the suggestive notation S ( W )=
W
+ 1,
S ( S ( w ) )=
(W
+ 1 ) + 1 = w + 2,
etc.
CHAPTER 6. ORDINAL NUMBERS
104
In this fashion, we can generate greater and greater "numbers" : w, w + 1. + 2 , ... , W + n , . . . , for all n E N. A number following all w + n can again conceived of as a set of all smaller numbers:
The reader may wish to introduce still greater numbers; e.g.,
The sets we generate behave very much like natural numbers, in this respect: they are linearly ordered by E , and every nonempty subset has a least element. We called linear orderings with this property well-orderings (see Definition 2.3 in Chapter 3). As the concept of well-ordering is, along with cardinality, one of the most important ideas in abstract set theory, let us recall the definition: 1.1 Definition A set W is well-ordered by the relation < if (a) (W, <) is a linearly ordered set. (b) Every nonempty subset of W has a least element. The sets generated above are all examples of sets well-ordered by the relatiorl E . Later in this chapter we show that all well-orderings can be represented by such sets and we introduce ordinal numbers to serve as order types of wellordered sets. More examples of well-ordered sets can be found in the exercises at the end of this section. The fundamental property of well-orderings is that they can be comparetl by their "lengths." The precise meaning of this is given in Theorem 1.3. Let (L, <) be a linearly ordered set. A set S C L is called an initial segment of L if S is a proper subset of L (i.e., S # L) and if for every a E S, all X < a are also elements of S. (For instance, both the set of all negative reals and the set of all nonpositive reals are initial segments of the set of all real numbers.) 1.2 Lemma If (W, <) is a well-ordered set and if S is an initial segment of (W, <), then there exists a E W such that S = { X E W ) X < a } . Proof. Let X = W - S be the complement of S. As S is a proper subset of W, X is nonempty, and so i t has a least element in the well-ordering <. Let u be the least element of X. If X < a, then X cannot belong to X , as a is its least member, so X belongs to S. If X a , then X cannot be in S because otherwise a would also be in S as S is an initial segment. Thus S = { X C W I X < U). l3
>
If a is an element of a well-ordered set (W,<), we call the set
1. WELL-ORDEItED SETS
105
the initial segment of W given b y a. (Note that if a is the least element of W , then W [ a ]is empty.) By Lemma 1,2, each initial segment of a well-ordered set is of the form W [ a ]for some a E W. Of course, W[a] is also well-ordered by < (more precisely, by < nw[aI2); we usually do not mention the well-ordering relation explicitly when it is understood from the context. 1.3 Theorem If (W1,< l ) and (W2,cz)are well-ordered sets, then exactly one of the following holds: (a) either W1 and W2 are isomorphic, or (b) W1 is isomorphic to a n initial segment of W z , or (c) W2 is i s o m o ~ p h i ct o a n initial segment of W1. I n each case, the isomorphism is unique.
This theorein provides the met hod of comparison for well-orderings mentioned above: we say that Wl has smaller order type than W2 if W1 is isomorphic to W 2 [ a ]for some a E W2. Before we prove Theorem 1.3, we prove the following lemma and state some of its corollaries. A function f on a linearly ordered set (L, <) into L is increasing if x l < x2 implies f (xi) < f (x2). Note that an increasing function is one-to-one, and is an isomorphism of (L, <) and (ran f , <). 1.4 Lemma If (W, <) i s a well-ordered set and if f : W function, then f (X) X for all X E W.
>
-t
W i s a n increasing
If the set X = { X E W I f (X) < X ) is nonempty, it has a least Proof. element a . But then f ( a ) < a , and f (f ( a ) ) < f ( a ) because f is increasing. This means that f ( a ) E X , which is a contradiction because a is least in X . 0 1.5 Corollary (a) N o well-ordered set is isomorphic t o a n initial segment of itself. (b) Each well-ordered set has only one automorphism, the identity. (c) If W] and W2 are isomorphic well-ordered sets, then the isomorphism between W1 and W2 i s unique.
Proof. (a) Assume that f is an isomorphism between W and W[a] for some a E W. Then f ( a ) C W [ a ]and therefore f ( a ) < a , contrary to the lemma, as f is an increasing function. (b) Let f be an automorphism of W. Both f and f - are increasing functions and so for all X E W, f (X) 2. X and f -'(X) 2 X, therefore X 2 f (X). It follows that f (X)= X for ali X E W. (c) Let f and g be isomorphisms between Wl and W2. Then f o g-l is an automorphism of W2 and hence is the identity map. It follows that f = g. 1.6 Proof of Theorem 1.3 Let Wl and W2 be well-ordered sets. It is a consequence of Lemma 1.4 that the three cases (a), (b), and (c) are mutually
CHAPTER 6. ORDINAL NUMBERS
106
exclusive: For example, if Wl were isomorphic to W2[t12jfor some a2 E W2 arlcl a t the same time W2 were isomorphic to Wl(al] for some a1 E W1, then the colnposition of the two isomorphisms would be an isomorphism of a well-orderc~l set onto its own initial segment. Also, the uniqueness of the isomorphism in each ccwefollows from Corolltlrj7 1.5; thus, we only have t o show that one of the three cases (a). (b). ar~rl( ( . I idways holds. We define a set of pairs f G W1 X W2 and show that either f or f is an isomorphism attesting to (a), (b), or (c). Let
'
f = {(X,y) E W1 x W2 I Wl[x] is isomorphic to W2[y])
First, it follows from Corollary 1.5 (a) that f is a one-to-one function: If Wl [ X ) is isomorphic both to W2[y] and to Ur2[y'],then y = g' because otherwise Ur21y] would be an initial segment of W2(yf](or vice versa) while they are isomorphic:. and that is impossible. Hence (X,y) f f and (X,y') E f imply y = y'. A similar argument shows that ( X ,y ) E f ancl (X', y) E f imply z = a'. Second, X < X' implies f (X) < f (X'): If h is the isoniorphisnl between Wl [X'] and Wz[f (X')],then the restriction h 1 Wl [X] is an isomorphism between \VI[ X ] and W2[h(x)],so f ( X ) = h(z) and f (X)< f (X'). Hence f is an isomorphism between its domain, a subset of W1, slid its range, a subset of Wz. If the domain of f is W1 and the range of f is W2. the11 W1 is isornorphic to W2. We show now that if the domain of f is not all of M,', then it is its initial segment, and tl-te range o f f is all of W2. (This is t:r~ought o compiete the proof as the remaining case is obtained by interchar~gingthe role of W1 and W2.) So assume that dom f # W1. We note that the set S = clon1f is an initial segment of W1: If X E S and z < X , let h be the isomorphisrn between WI [X] and W2[f(X)];then h W1 It] is an isomorphism between HTl [z] and W2l h ( z ) ! , so z E S. To show that the set T = ran f = W2, we assume otherwise artd, by a similar argument as above, show that T is an initial segment o f W 2 But then dom f = Wl [a] for some a W1, and ran f = Wz[b] for some b E W2. I n other words, f is an isomorphistn between Wl [U]and W2[bj. This means. by the clefinition of f , that (a, b) C f , so n E dom f = Wl[nj, that is, a < a . ;i contradiction. III:
r
Exercises
1.1 Give an example of a linearly ordered set (L, <) and an initial segnter~t S of L which is not of the form {XI X < a ) , for any a E L. 1.2 W + 1 is not isomorphic to w (in the well-ordering by E ). 1.3 There exist 2*" well-orderings of the set of all natural numbers. 1.4 For every infinite subset A of N , ( A ,<) is isomorphic to ( N ,<).
and (W2,< 2 ) be disjoint well-ordered sets, each isomorphic: 1.5 Let (W1, to ( N , <). Show that the sum of the two linearly ordered sets (as defined in Lemma 4.5 in Chapter 4) is a well-ordering, and is isomorpl~icto the ordinal number w + w = { 0 , 1 , 2 , .. . , W , W + 1, w + 2 , . . . ). 1.6 Show that the lexicographic product (N X N, <) (see Lemma 4.6 in Chapter 4) is isomorphic to w - w .
2. ORDINAL NUMBERS
107
1.7 Let (W, <) be a well-ordered set, and let a 4 W. Extend < t o W'' = W U { a ) by making a greater than all X E W . Then W has smaller order type than W'. 1.8 The sets W = N X { O , 1 ) and W' = ( 0 , l ) X IV,ordered lexicographically are nonisomorphic well-ordered sets. (See the remark following Theorem 4.7 in Chapter 4 . )
2.
Ordinal Numbers
In Chapter 3 we introduced the natural numbers to represent both the cardinality and the order type of finite sets and used them t o prove theorems on induction and recursion. We now generalize this definition by introducing ordinal numbers. Ordinal numbers continue the procedure of generating larger numbers into transfinite. As was the case with natural numbers. ordinal numbers are defined in such a way that each is well-ordered by the E relation. h'loreovrtr, t . 1 ~ collection of all ordinal numbers (which as we shall see is not a set) is itself wellordered by E , and contains the natural numbers as an initial segment. And nlost significantly, ordinal numbers are representatives for all well-ordered sets: every well-ordered set is isomorphic t o an ordinal number. Thus ordinal numbers can be viewed as order types of well-ordered sets. 2.1 Definition A set T is transitive if every element of T is a subset of T.
In other words, a transitive set has the property that U E v E 7' implies U E T. For more on transitive sets we refer the reader t,o the exercises in this section. and to Chapter 14. 2.2 Definition A set cw is an ordinal number if (a) a is transitive. (b) a is well-ordered by E,.
It is a standard practice t o use lowercase Greek letters t o denote ordinal numbers. Also, the term ordinal is often used for ordinal number. For every natural number m, if k E 1 E m (i.e., k < l < m): the11 k E 1 1 1 . Hence every natural number is a transitive set. Also, every natural ~ ~ u r n b eisr well-ordered by the E relation (because every n E N is a subset of N a r l r l N is well-ordered by E ). Thus 2.3 Theorem Every natural number is a n ordinal.
T h e set N of all natural numbers is easily seen to be transitive, and is also well-ordered by E . Thus N is a n ordinal number. 2.4 Definition w = N
We have just given the set N a new name W
CHAPTER 6. ORDINAL NUMBERS
108
2.5 Lemma If a is an ordinal number, then S(a) is also an ordinal number
Proof. S(&) = aU {a) is a transitive set. Moreover, c r {~a )is well-ordered by E , a being its greatest element, and cr C a U { a ) being the initial segment given by a. So S ( a ) is an ordinal number. Cl We denote the successor of a by cu + 1 a
+ 1 = S(&) = a U {a)
An ordinal number a is called a successor ordinal if a = P + 1 for some p. Otherwise, it is called a limit ordinal. For all ordinals a and P, we define a < P if and only if a E P, thus extending the definition of ordering of natural numbers from Chapter 3. The next theorem shows that < does indeed have all the properties of a linear ordering, in fact, of a well-ordering. 2.6 Theorem Let a,Q, and y be ordinal numbers. (a) If a < G , and < y, then cr < y. (b) a < 0 and G , < a cannot both hold. ( C ) Either a < P or cy = 0 or P < a holds. (d) Every nonempty set of ordinal numbers has a <-least element. Conscquently, every set of ordinal numbers is well-ordered b y <. (e) For every set of ordinal numDe~*s X , there is an ordinal number cr f X . (In other words, "the set of all ordinal numbers" does not exist.)
The proof uses the following lemmas. 2.7 Lemma If a is an ordinal number, then a
4 cr.
None of the axioms of set theory we have considered excludes the existence of sets X such that X E X . However, the sets which arise in mathematical practice do not have this peculiar property.
Proof. such that
X
If a E a then the linearly ordered set (a,E,) has an element E X , contrary to asymmetry of E,.
X
= cr
0
2.8 Lemma Every element of an ordinal number is an ordinal number.
Let a be an ordinal and let X E a. First we prove that rc is Prooj transitive. Let U and v be such that u E v E X ; we wish to show that U E :K. Since a is transitive and X E a, we have v E cr and therefore, also u E cu. Thus U , v, and X are all elements of cr and u E v E X . Since E , linearly orders a,WE: conclude that U E X . Second, we prove that E, is a well-ordering of X. But by transitivity of Q we have X C a and therefore, the relation E, is a restriction of the relation E,,. C1 Since E, is a well-ordering, so is E,.
2. ORDINAL NUMBERS 2.9 Lemma I f a and
0 are ordinal numbers such that a C 0, then cr E 0.
Proof. Let a C 0. Then 0 - a is a nonempty subset of 0, and hence has a least element y in the ordering € 0 . Notice that y C a: If not, then any 6 E y - cr would be an element of - cr smaller than y (by transitivity of 0). The proof is complete if we show that cr c y (and so cr = 7 E P). Let 6 E a; we show that 6 E y. If not, then either y E 6 or y = 6 (both 6 and y belong to 0, which is linearly ordered by E ). But this implies that y E cr, since a is transitive. That contradicts the choice of y E 0 - a.
Proof of Theorem 2.6 (a) If a E and p E y, then cr E y because y is transitive. and 0 E a. By transitivity, a E a , contradicting (b) Assume that cr E Lemma 2.7. (c) If a and ,B are ordinals, a n 0 is also an ordinai (check properties (a) and (b) from the definition) and a n 0 C a , a n 0 C 0. If cr n 0 = a , then a C 0 and so a E P or a = P by Lemma 2.9. Similarly, cr n ,G = implies p E a or p = cr. The only remaining case, a n 0 C cr and cr n P C P, cannot occur, because then CY n p E o and cr n C p, leading to cr n E cr n p and a contradiction with Lemma 2.7. (d) Let A be a nonernpty set of ordinals. Take a E A, and consider the set c r n A. I f c r n A = 0, cr is the least element of A. If c r n A # 0, a n A cr has a least element 0 in the ordering E,. Then 0 is the least element of A in the ordering <. (e) Let X be a set of ordinal numbers. Since all elements of X are transitive sets, U X is also a transitive set (see Exercise 2.5). It immediately follows from part (d) of this theorem that E well-orders U X ; consequently, X is an ordinal number. Now let a = S(U X ) ; cr is an ordinal number and a 4 X . [Otherwise, we get a C U X and, by Lemma 2.9, either a = U X or cr E U X . In both cases, a S ( U X ) = a , contradicting Lemma 2.7.1
c
U
0 The ordinal number U X used in the proof of (e) is called the suprem?imof X and is denoted sup X. This is justified by observing that X is the least, ordinal greater than or equal to all elements of X: (a) If cr E X , then a G U X ,so cv 5 U X . (b) If a 5 y for all a E X, then a y for all a E X and so U X y. i.e.,
U
U ~ < Y .
c
If the set X has a greatest element 0 in the ordering <, then sup X = D. Otherwise, sup X > y for all y E X (and it is the least such ordinal). We see that every set of ordinals has a supremum (in the ordering <). The last theorem of this section restates the fact that ordinals are a generalization of the natural numbers:
CHAPTER 6. ORDINAL NUMBERS
110
2.10 Theorem The natuml numbers are exactly the finite ordinal nurnbe7.s.
Proof. We already know (from Theorem 2.3) t h a t every natural number is an ordinal, and of course, every natural number is a finite set. So we o ~ l l y have t o prove t h a t all ordinals t h a t are not natural numbers are infinite sets. If a is an ordinal and a $ N , then by Theorem 2.6(b) it must be the case t.11;~t (L 2 W (because a W ) , so a w because (L is transitive. So (L 11as ill1 i r ~ f i l ~ i t t ~ subset and hence is infinite. Every ordinal is a well-ordered set, under the well-ordering E . If ck a11d 3 are distinct ordinals, then they are not isomorphic 'as well-ordered sets because one is a n initial segment of the other. We also prove that a n y well-ordered set is isomorphic t o an ordinal number. This, however, requires an introduc:tion of another axiom, a n d is done in the next section. One final comment. Lemma 2.8 establishes t h a t each ordinal number (E has the property t h a t a =
{ p 1 0 is an ordinal and
< a).
If we view a as a set of ordinals, then if a is a successor, say + 1. the11 it has a greatest element, namely 0. If a is a limit ordinal, then i t does not f-tilvcl i F greatest element, and a = SUP{^ I 0 < a). Note also t h a t by definition, 0 is a limit ordinal, and s u p 0 = 0. Exercises
c
2.1 A set X is transitive if and only if X P ( X ) . 2.2 A set X is transitive if and only if U X C X . 2.3 Are the following sets transitive'! (a) {07{0h{{@}l}Y 2.4 Which of the following statements are true? (a) If X and Y are transitive, then X U Y is transitive.
(b) If X and Y are transitive, then X n Y is transitive. (c) If X E Y and Y is transitive, then X is transitive. (d) If X E Y and Y is transitive, then X is transitive. (e) If Y is transitive and S P ( Y ) , then Y U S is transitive. If every X E S is transitive, then US is transitive. An ordinal a is a natural number if and only if every nonempty subset of a has a greatest element. If a set of ordinals X does not have a greatest element, then sup X is a limit ordinal, If X is a nonempty set of ordinals, then n X is a n ordirtal. hlorc~over. r) X is the least. element of X .
c
2.5 2.6
2.7 2.8
3. THE AXIOM OF REPLACEMENT
3.
111
The Axiom of Replacement
As we indicated a t the end of Section 2, well-ordered sets can be represented by ordinal numbers. The following theorem states precisely what we mean by "representation." 3.1 Theorem Every well-ordered set is isomorphic to a unique ordinal number.
We give a proof of this theorem below. Although the reader may find the proof entirely acceptable, it nevertheless has a deficiency: i t uses an assumption which, however plausible, does not follow from the axioms that we have introduced so far. For that reason it is necessary t.o introduce an additional axiom. Let (W, <) be a well-ordered set. Let A be the set of all those Proof. a E W for which W[a) is isomorphic t o some ordinal number. As no two distinct ordinals can be isomorphic (one is an initial segntent of the 0 t h ) . t,his ordinal number is uniquely determined, and we denot,e it by 0,. Now suppose t h a t there exists a set S such that S = {a, I a E A } . The set S is well-ordered by E as it is a set of ordinals. It is also transitive, because if y E o, E S, let cp be the isomorphism between W [a]and a, and let c = v- (7); it is easy t o see that cp 1 c is an isomorphism between W [ c ]and y and so ..I. E S . Therefore, S is an ordinal number, S = cr. A similar argument shows that a E A, b < a imply b E A: let (F be the isomorphism of W [a] and a,. Then cp W[b] is an isomorphism of IY (b] nnd an initial segment I of a,. By Lemma 1.2, there exists L? < 0, such t,hilt. I = iy E cr, ( y < Q) = Q; i.e., Q = a*. This shows t,l~ath E A and a* < a,. We conclude that either A = W or A = W [ c ]for some c E IV (Lemma 1.2 again). We now define a function f : A -,S = a by f ( a ) = a,. Froin the definition of S and the fact t h a t b < a implies c r b < a, it is obvious that f is an isomorphism of (A, <) and a. If A = W[c], we would thus have c E A , a contradiction. Therefore A = W, and f is an isomorphism of (W, <) and the ordinal cr. This would complete the proof of Theorem 3.1, if we were justified t,n i11;ikr. the assumption t h a t the set S exists. The elements of S are certainly well specified, but there is no reason why the existence of such a set should forirlally follow from the axioms that were considered so far. To further illustrate the problem involved, consider t,wo other c~x~irnpl(~s: To construct a sequence
'
we might define
CHAPTER 6. ORDINAL NUMBERS
112
following the general pattern of recursive definitions. The difficulty here is that to apply the Recursion Theorem we need a set A, given in advance, such that g : N X A + A defined by g(n, X) = {X) can be used to compute the ( n t 1)st term of the sequence from its nth term. But it is not obvious how to prove frorn our axioms that any set A such that
exists. It seems as if the definition of A itself required recursion. Let us consider another example. In Chapter 3, we have postulated existence ofw; from it, the s e t s w + l = wu{w), w + 2 = ( w + l ) u { w + l ) , etc., can easily be obtained by repeated use of operations union and unordered pair. In the introductory remarks in Section 1, we "defined" w + w as the union of w and the set of all w + n for all n E w, and passed over the question of existence of this set. Although intuitively it is hardly any more questionable than the existence of w, the existence of w + w cannot be proved from the axioms we accepted so far. We know that, to each n E W , there corresponds a unique set w n; as yet. we do not have any axiom that would allow us to collect all these W + n into one set. The next axiom schema removes this shortcoming.
+
Let P(x,y ) be a property such that for every X there is a unique y for which P ( x , y) holds. For every set A, there is a set L? such that, for every X E A, there is y E B for which P ( x , y ) holds. The Axiom Schema of Replacement
We hope that the following comments provide additional motivation for the schema. 3.2 Let F be the operation defined by the property P; that is, let F ( x ) denote the unique y for which P ( x , y ) . (See Section 2 of Chapter 1.) The corresponding Axiom of Replacement can then be stated as follows:
For every set A there is a set B such that for all
X
A, F ( z ) E B
Of course, B may also contain elements not of the form F ( x ) for any X E A: however, an application of the Axiom Schema of Comprehension shows that {S,E
B I y = F ( x ) for some
X
i y E B ( P ( x , y) holds for some X E A ) = { y I P ( x , y ) holds for some X E A)
E A) =
exists. We call this set the image of A by F and denote it { F ( x ) I simply F[A].
X
E A ) or
3.3 Further intuitive justification for the Axiom Schema of Replacement [:an be given by comparing it with the Axiom Schema of Comprehension. The latter allows us to go through elements of a given set A, check for each X E A whether or not it has the property P ( x ) , and collect those X which do into a set. In
3. T H E AXIOM OF REPLACEMENT
113
an entirely analogous way, the Axiom Schema of Replacement allows us to go through elements of A, take for each X E A the corresponding unique y having the property P ( x , p), and collect all such y into a set. It is intuitively obvious that the set F [ A ] is "no larger than" the set A. In contrast, all known examples of 'Lparadoxicalsets" are "large," on the order of the "set of all sets." 3.4 Let F be again the operation defined by P, as in (3.2). The Axiom of Replacement then implies that the operation F on elements of a given set A can be represented, "replaced," by a function, i.e., a set of ordered pairs. Precisely:
For every set A, there is a function f such that dom f = A and f ( X ) = F ( x ) for all X E A. We simply let f = {(X, y) E A X B I P(x, y)), where B is the set provided by the Axiom of Replacement. We use notation F 1 A for this uniquely determined function f . Notice that r a n ( F [ A) = F[A]. We can now complete the pmof of Theorem 3.1 : We have concluded earlier that in order to prove the theorem, we only have to guarantee the existence of the set S = {a, I a E W}, where for each a E W , a, is the unique ordinal number isomorphic to W [a]. Let P ( x , y) be the property: Either or X
W and y is the unique ordinal isomorphic to WIx],
X
E
#
W and y = 0.
Applying the Axiom of Replacement [with P ( x , y ) as above] we conclude that (for A = W) there exists a set B such that for all a C W there is a E B for which P ( a , cr) holds. Then we let
S = {a E B I P ( a , a ) holds for some a E W)
= F[WJ
where F is the operation defined by P. Using Theorem 3.1, we have: 3.5 Definition If W is a well-ordered set, then the order type of W is the unique ordinal number isomorphic to W.
And what about the examples mentioned above? It turns out that we need a more general Recursion Theorem. Compare the following theorem with the Recursion Theorem proved in Chapter 3: 3.6 The Recursion Theorem Let G be a n operation. For any set a them is a unique irrfinite sequence (a, I n E N ) such that (a) a0 = a . (b) a,+l = G(a,, n) for all n E N .
CHAPTER 6 . ORDINAL NUMBERS
114
With this theorem, the existence of the sequence (0, {g), { (0)). . . . ) and of W + W follows (see Exercise 3.2). We prove this Recursion Theorem, a s well as the more general Transfinite Recursiori Theorem, in the next section. Exercises
3.1 Let P ( x , y) be a property such that for every X there is at most one p for which P ( x , y) holds. Then for every set A there is a set B such that,. for all X E A, if P ( x , y) holds for some y , then P(x,y ) holds for sornc !j E U . 3.2 Use Theorem 3.6 to prove the existence of (a) The set (0, (01, ({0)), {{(@))), .. .). (h) The set { N , P ( N ) , P ( P ( N ) ) ,. . . ) . (c) T h e s e t w + w - w u { w . w + 1 , ( w + 1 ) + 1 . . . . ). 3.3 Use Theorem 3.6 to define
3.4 (a) Every X E V, is finite. (b) V, is transitive. (c) V, is an inductive set. The elements of V, are called hereditarily finite sets. 3.5 (a) If X E V, and y E V,, then {X,y) E V,. (b) If X E V,, then U X E 11, and P ( X ) E V,. (c) If A E V, and f is a function on A such that f ( X ) E V, for each X E A , then f [ X J E V,. (d) If X is a finite subset of V,, then X E V,.
4.
Transfinite Induction and Recursion
The Induction Principle and the Recursion Theorem are the main tools for proving theorems about natural numbers and for constructing functions urit,h donlain N. We used them both extensively in the previous chapters. In this section, we show how these results generalize to ordinal numbers.
Let P ( x ) be a property (possibl?j 4.1 The 'lkansfinite Induction Principle with parameters). Assume that, for all ordinal numbers a: (4.2)
If P ( P ) holds for all 0 < a , then P(a).
Then P ( a ) holds for all ordinals a . Proof, Suppose that some ordinal number y fails to have property P, and let S be the set of all ordinal nurnbers D y that do not have property P. The
<
4. TRANSFINITE INDUCTION AND RECURSION set S has a least element a. Since every (4.2) that. P(a) holds, a contradiction.
p<
CL
115
11~a.sproperty P. i t follo~lsby
n
U
It is sometimes convenient to use the Transfinite Induction Principle in ;L form which resembles more closely the usual formulation of the Inductioll Priiiciple for N. We do it by treating successor and limit ordi~lalsseparately. 4.3 The Second Version of the Transfinite Induction Principle Let P(x) be a property. Assume that ( a ) P(0) holds. (b) P(a) implies P(a 1) for all ordinals a. ( c ) For all limit ordinals a # 0, i f P(0) holds for all fl < a , then P(o) holds. T h e n P(a) holds for all ordinals a.
+
Proof. It suffices to show that the assumptioris (a). (b), and (c) irrlpty (4.2). So let a be an ordinal such that P(@) for all 4 < 0.If a = 0, then P(u) holds by (a). If a is a successor, i.e., if there is 4 < ck such that, a = 3 + 1. n.cb know that P(0)holds, so P(a)holds by (b). If 0 # 0 is limit, w r 1i;tvr P(o) by (C>
We proceed to generalize the Recursion Theorem. Functions whose domain is an ordinal a are called transfinite sequences of length a. 4.4 Theorem Let R be an ordinal number, A a set, and S = U n, A" the set of all transfinite sequences of elements o f A oflength less than R . Let g : S -+ A be a function. T h e n there exi,qt.s a unique function f : R -+ A such that f(o)= g(f to) for all o < 0 .
The reader might try to prove this theorem in a way entirely analogous to the proof of the Recursion Theorem in Chapter 3. We do not go into the details since this theorem follows from the subsequent more general Transfinitcl Recursion Theorem. If I9 is an ordinal and f is a transfinite sequence of length 6,we use the notation f = (aa I o < 19). Theorem 4.4 states that if g is a function on the set of all transfinite scquurlces of elements of A of length less than fl with values in A, then there is a trt~nsfinit,~ sequence ( a , I 0 < 0 ) such that for all a < 0 , a , = g ( ( a EI < a ) ) .
Let G bc an operation.. th.en thc property P stated in (4.6) defines a n operation F such that F ( a ) = G ( F 1 (1) for all ordinals a .
4.5 The Transfinite Recursion Theorem
Proof. domt = a
We call t a computation of length o based o n G if t is a functiol~.
+ 1 and for all D 5 o,t ( p ) = G(t I p).
CHAPTER 6. ORDINAL NUMBERS
116
Let P(x,y) be the property (4.6) X
is an ordinal number and y = t(x) for some computation t of length
X
based on G , or X is not an ordinal number and y = 0. We prove first that P defines an operation. We have to show that for each X there is a unique y such that P ( x , y). This is obvious if X is not an ordinal. To prove it for ordinals, it suffices to show. by transfinite induction: For every ordinal a there is a unique computation o f length cr. We make the inductive assumption that for all 0 < a there is a unique computation of length P, and endeavor to prove the existence and uniqueness of a computation of length cr. Existence: According to the Axiom Schema of Replacement applied to the property "y is a computation of length X" and the set a,there is a set
T = {t I t is a computation of length p for some
p
Moreover, the inductive assumption implies that for every ,O < a there is a unique t E T such that the length o f t is P. T is a system of functions; let i = UT. Finally, let r = j U {(cr. G($))). W prove that r is a computation of length a. 4.7 Claim r is afunction and d o m r = a
+ 1.
We see immediately that doml = UtETdom t = UBEo(,f?+ 1) = a ; consequently, d o m r = domf U { D ) = cr + 1. Next, since a 4 domt, i t is enough to prove that 5 is a function. This follows from the fact that T is a compatible system of functions. Indeed, let t l and t2 E T be arbitrary, and let dom t l = ol, dom t 2 = 02. Assume that, e.g., pl 02;then pl P2, and it suffices to show that t (7) = t2(y) for all y < ol. We do that by transfinite induction. So assume that y < P1 and t l (b) = tz(b) for all b < y. Then t l [ y = t 2 t y, and we have t l ( y ) = G ( t l r y ) = G ( t 2 [ y ) = t2(7). We conclude that t l ( 7 ) = t 2 ( y ) for all y < P1. This completes the proof of Claim 4.7.
<
4.8 Claim 7 ( P ) = G ( r 1 P) for all
c
P < a.
This is clear if 0 = a, as r(a)= G(f) = G ( T r a ) . If P < a , pick t E T such that 0 E dom t. We have r(0) = t(@)= G ( t r 0 ) = G ( r 1 P) because t is a computation, and t G 7. Claims 4.7 and 4.8 together prove that 7 is a computation of length a .
Uniqueness: Let D be another computation of length a; we prove T = U . As 0 are functions and d o m r = a + 1 = dom 0, it suffices to prove by transfinite induction that r ( y ) = ~ ( yfor ) all y < cr. r and
4. TRANSFINITE IND UCTlON AND RECURSION Assume that r ( 6 ) = 4 6 ) for all ~ ( y ) The . assertion follows.
117
S < y. Then r ( y ) = G(r 1 y) = G ( o 17)=
This concludes the proof that the property P defines an operation F. Notice that for any computation t , F 1 domt = t. This is because for any ,O E dom t . to = t r ( p 1 ) is obviously a computation of length B, and so, by the definition of F , F ( @ )= to(P) = t ( B ) . To prove that F ( a ) = G ( F a) for all a , let t be the unique computation of length a; we have F ( o ) = t(a)= G(t / a )= G ( F 1 a ) .
+
We need again a parametric version of the Transfinite Recursion Theorem. If F ( z , X ) is an operation in two variables, we write F , ( X ) in place of F ( r , X). Notice that, for any fixed z , F , is an operation in one variable. If F is defined by Q ( z ,X , y ) , the notations F,[A] and F , 1 A thus have meaning:
F z [ A ]= { y I Q ( z , x , y) for some a: E A } ; F , 1 A = { ( X , y) 1 Q ( z , X ,v ) for some X E A ) We can now state a parametric version of Theorem 4.5.
Let G 4.9 The Transfinite Recursion T h e o r e m , P a r a m e t r i c Version be a n operatzon. The property Q stated in (4.10) defines a n operation F such that F ( z , a ) = G ( z ,F e r a ) for all ordinals o and all sets z . Proof. Call t a computation of length a based o n G and z if t is a function, domt = a + 1, and, for all @ S o ,t ( @ )= G ( z , t [ P ) . Let Q ( z , X , y ) be the property
X
X
is an ordinal number and y = t ( x ) for some computation t of length based on G and z , or is not an ordinal number and y = 0.
X
Then carry z as a parameter through the rest of the proof in an obvious way.
It is sometimes necessary to distinguish between successor ordinals and limit ordinals in our constructions. It is convenient to reformulate the Transfinite Recursion Theorem with this distinction in mind. 4.11 T h e o r e m Let G 1 , G z , and Gg be operations, and let G be the operation defined in (4.12) below. T h e n the property P stated in (4.6) (based o n G ) defines a n operation F such that
F ( 0 ) = G1(0), F ( a + 1) = GZ(F(a)) for all a , F ( a ) = G 3 ( F a ) for all limzt a
# 0.
CHAPTER 6. ORDINAL NUMBERS
118
Proof.
Define an operation G by
4.12) G(x) = y if and only if either (a) X = 0 and y = G1(0),or (b) X is a function, dom X = a 1 for an ordinal a and y = G z(x(a)). or (c) X is a function, domx = cr for a limit ordinal a # 0 and y = G ; < ( r ) ,or (d) X is none of the above and y = 0.
+
L
Let P be the property stated in (4.6) in the proof of the Transfinite Recursiorl Theorem (based on G ) . The operation F defined by P then satisfies F ( a ) = G ( F 1 a ) for all cr. Using our definition of G , it is easy to verik that F has the required properties. 0
A parametric version of Theorem 4.11 is straightforward and we leave i t to the reader. We conclude this section with the proofs of Theorems 3.6 and 4.4.
Proof of Theorem 3.6. Let G be an operation. We want to find. for t.verv set a , a sequence (a, I n E W ) such that a g = a and u,+1 = G(a,. 71) for ill1 n E N. By the parametric version of the Transfinite Recursion Theorem 4.11. thew is an operation F such that F ( 0 ) = a and F ( n + l ) = G ( F ( n ) ,rt) for all 7 1 E N. Now we apply the Axiom of Replacement: There exists a sequence ( U , I r~E U ) that is equal to F r W . and the Theorem follows. C Proof of Theorem
4.4.
Define an operation G by G(t) =
g(t)
0
if t G S ; otherwise.
The Transfinite Recursion Theorem provides an operation F such that F(cr) = I ., - ., G ( F 1 a ) holds for all ordinals a . Let f = F 1 R . Exercises
4.1 Prove a more general Transfinite Recursion Theorem (Double Recursion Theorem): Let G be an operation in two variables. Then there is an operation F such that F(p,a ) = G ( F 1 (P X a)) for all ordinals 3 and a. [Hint: Computations are functions on (P + 1) X ( a + l).] 4.2 Using the Recursion Theorem 4.9 show that there is a binary operation F such that (a) F ( x , 1) = 0 for all X. (b) F ( x , n + 1) = 0 if and only if there exist y and z such that X = (y. z ) and F ( y , n ) = 0. We say that X is an n-tuple (where n E W , n > 0) if F ( : c , 7.1) = 0. Prove that this definition of n-tuples coincides with the one given in Exert-ist. 5.17 in Chapter 3.
5. ORDINAL ARITHMETIC
5.
Ordinal Arithmetic
We now use the Transfinite Recursion Theorem of the preceding section to define addition, multiplication, and exponentiation of ordinal numbers. These definitions are straightforward generalizations of the corresponding definitions for natural numbers. 5.1 Definition - Addition of O r d i n a l Numbers (a) P + O = p . ( b ) p (a 1) = (D a ) 1 for all a. (C) D 0 = sup{D y I y < a ) for all limit a # 0.
+ + +
+
For all ordinats
3,
+ +
If we let (Y = 0 in (b), we have the equality p + 1 = p + 1 ; the left-hand side denotes the sum of ordinal numbers p and 1 while the right-hand side is the successor of B. To see how Definition 5.1 conforms with the formal version of the Transfinite Recursion Theorem, let us consider operations G I , Gz, and G 3 where G l ( z , x ) = z, G z ( z , x ) = X + 1 , and G 3 ( z , x ) = sup(ranx) if z is a function (and G3(2, X) = 0 otherwise). The parametric form of Theorem 4.11 then provides an operation F such that for all z (5.2) F(z,O) = Gl(z,O) = z. F ( z , a + 1) = G ~ ( zF,z ( a ) )= F(z,a ) 1 for all a . F ( Z . a ) = G3(1,F~ a ) = s u P ( r a n ( ~I 0 ~ ) )= SUP({F(Z? Y) I T < (1) 1 for limit a # 0.
+
If p and a are ordinals, then we write P + a instead of F(P,a ) and see that t,he conditions (5.2) are exactly the clauses (a), (b), and (c) from Definition 5.1. In the subsequent applications of the Transfinite Recursion Theorern. we use the abbreviated form as in Definition 5.1, without explicitly formulating the defining operations G l , Gz, and Gg to conform with Theorem 4.11. One consequence of (5.1) is that for all P ,
etc. Also, we have (if a =
and similarly, (U
+
B = U):
W) +U
= sup{(w + U )
+n I n
In contrast to these examples, let us consider the sum m + W for m < W . We have m + w = sup{m + n ( n < W ) = w,because, if m is a natural number,
CHAPTER 6. ORDINAL NUMBERS
120
+ n is also a natural number. We see that m + o # + m; the addition of ordinals is not commutative. One should also notice that, while 1 # 2, we have 1 + w = 2 + U . Thus cancellations on the right in equations and inequalities are nt
W
not allowed. However, addition of ordinal numbers is associative and allows left cancellations. (Lemma 5.4.) In Chapter 4 we defined the sum of linearly ordered sets. We now prove that the ordinal sum defined in 5.1 agrees with the earlier more general definition. 5.3 Theorem Let ( W 1 <, 1 ) and (W2,< 2 ) be well-ordered sets, isomorphic to ordinals a: l and cr2, respectively, and let (W,<) be the sum of (W1, < and (W2,< 2 ) . Then (W,<) is isomorphic to the ordinal a1 + n2.
Proof. We assume that Wl and W2 are disjoint, that W = Wl U W 2 ,and that each element in W 1precedes in < each element of W 2 ,while < agrees with < l and with < z on both W , and W2. We prove the theorem by induction on (32.
I f a 2 = O then W2= 8 , W = W 1 ,and a1 + 0 2 = a l . a2 = 0 + 1 then W2 has a greatest element a , and W [ a ]is isomorphic to 0; the isomorphism extends to an isomorphisr~ibetween W and a1 (1.2 =
If
a1
+
(01
+
+ P) + 1.
Let a2 be a limit ordinal. For each P < a2 there is an isomorphism J;) of n1+ 0 onto W [ a a ]where a0 E W,; moreover, jo is unique, ag is the 4"" element of W2,and if p < 7 then jo C f,. Let j = U D < 0 2 f g . As a1 nz = U g < o l ( a+ l P), it follows that f is an isomorphism of nl a2 onto W . 13
+
+
5.4 Lemma (a) If al,cr2, and p are ordinals, then a1 < a2 if and only ij P a , < 0 nz. (b) For all ordinals a:l, a : 2 , and 8, 0 a1 = 0 a2 ij and only zjal = a2. ( C ) (0 P) + y = a: (0 7)for all ordinals a , 0 , and y.
+ +
+
+
+
+
+
Proof. (a) We first use transfinite induction on a2 to show that a1 < 012 ilnplies 0 a , < 0 (1.2. Let us then assume that a2 is an ordinal greater than a l and that a1 < 8 implies 0 + a1 < 0 6 for all 6 < a2. If (1.2 is a successor ordinal, then a:? = 6 1 where 6 (1.1. By the inductive assumption in case 6 > a l ,and triviallyincase6 = a ] ,weobtain B+nl D+6 < ( 0 + 6 ) + 1 = 0 (6 1) = 0 a2. If a2 is a limit ordinal, then a1 + 1 < (1.2 and we have < ( p ( 1 . 1 ) 1 = 0 -1- (a1 i. 1 ) 5 sup{P 5 ( 5 < a 2 )= D + a2 B To prove the converse we assume that P + a1 < 0 + a2. If a2 < a l , the implication already proved would show D + a2 < 0+ a l . Since a2 = a1 is also impossible (it implies P + 0 2 = B a l ) ,we conclude from linearity of < that n1 < (1.2. (b) This follows immediately from (a): If a l # az, then either a1 < a2 o r (1.2 < (1.1 and, consequently, either D al < 0 a2 or 0 a2 < 3 + G ] . If a1 = c r q , then P + a1 = B + a2 holds trivially.
+
+
+
+ + +
+
+
>
+
<
+
+
+
+
+
+
5. ORDINAL ARITHMETIC
121
+ +
(c) We proceed by transfinite induction on y. If y = 0, then ( a P) 0 = a + 0 = a + (P + 0). Let us assume that the equality holds for y, and let us prove it for y + 1:
(we have used the inductive assumption in the second step, and the second clause from the definition of addition in the other steps). Finally, let y be a limit ordinal, 7 # 0. Then ( a 0 ) + y = sup{(a P) 6 1 b < y ) = sup(& + (P 6) 1 6 < 7). We observe that sup{P 6 1 6 < y} = p + 7 (the third clause in the definition of addition) and that P y is 5 p 6 for some 6 < y and so a limit ordinal (if C < p + 7 then + 1 5 ( p 6) 1 = P + (6 1 ) < p + y because y is limit). It remains to notice that sup{a + (p 6) 1 6 < y) = sup{a + I < B y ) (because ~+y=sup{/3+6~6
+
+
<
+ +
+
+ + + +
+
+
< <
+
The following lemma shows that it is possible to define subtraction of ordinal numbers:
<
5.5 L e m m a If o 5 j3 then there is a unique ordinal number 4 such that a + =
PProof. AS a is an initial segment of the well-ordered set P (or a = P), Theorem 5.3 implies that P = a +c where C is the order type of the set P - a = U { v I a 5 U < p). By Lemma 5.4(b), the ordinal is unique.
<
Next, we give a definition of ordinal multiplication: 5.6 Definition - Multiplication of O r d i n a l N u m b e r s
For all ordinals
P, (a) 0 . 0 = 0. (6) 0 . ( a l ) = - a+ ,i3 for all a. (C) 0 . a = SUP{/?-y17 < a )for ail limit a
+
# 0.
5.7 E x a m p l e s (a) P . 1 = p . ( O + l ) = p . O + p = O + P = p . (b) p . 2 = ~ . ( 1 + 1 ) = p . l + ~ = P + P ; i n p a r t i c u l a r , w . 2 = ~ + ~ . ( C ) P . 3 = P . ( 2 + 1 ) = p . 2 + / 3 = P + D + P , etc. (d) P W = sup{P . n 1 n E w) = sup(P, 0,P P P , . . . ). (e) I . cr = a for all a , but this requires an inductive proof 1.0=0; l.(cr+l)=l-a:+l=a:+l; 1 - a = sup{l - y ( y < a} = sup{y ( y < a ) = a if a is limit, a # 0. (f) 2 W = sup(2 n n E w } = w. Since W . 2 = W + W # W , we conclude that multiplication of ordinals is generally not commutative. m
+
+ +
CHAPTER 6 . ORDINAL NUMBERS
122
For more properties of ordinal products, see Exercises 5.1, 5.2 and 5.7. Ordinal multiplication as defined in 5.6 also agrees with the general definition of products of linearly ordered sets as defined in Chapter 4:
5.8 Theorem Let a and P be ordinal numbers. Both the lexicographic and the antilexicographic orderings of the p7.0duct a X P are well-orderings. The 0 7 d ~ type of the antilexicographic ordering of a X P zs a - 0,while the lexicog7~aphic. ordering of a X 0 has order type 0 . a . Proof. Let 4 denote the antilexicographic ordering of a: X 0. We define at1 isomorphism between (a X P , 4)and a . 0 as follows: for < a and < 0. Irt. f ( < , q ) = a . 7 ) + 5 . T h e r a n g c o f f isthcsct { a , q + ( j q < @ a n d << c l ) = c I . . ~ and f is an isomorphism (we leave the details -- proved by induction to t,litl reader; see also Exercises 5.1, 5.2, 5.7, and 5.8). D
<
5.9 Definition - Exponentiation of Ordinal Numbers
p = 1. (b) Pa+l = Drn . P for all cr. (c) P" = sup{PYI y < a:) for all limit a # 0 .
For all
0:
(a)
5.10 Example = 4 - 4, P3 = P2 . p = 0. p . 0, etc. (a) P' = P, (b) 0" = sup{Pn I n E U);in particular, lW= 1, 2'*)= W , 3W= W , . . . , nW = w for any n G W , n W" = sup{w J n E w) > W . It should be pointed out that ordinal arithmetic differs substantially fro111 arithmetic of cardinals. Thus, for instance, 2" = o and w" are countable ordinals, while 2*0 = H*" is uncountable. One can use arithmetic operations to generate larger and larger ordinals:
The process can easily be continued. We define E = sup(w. d i j , One can then form E + 1, E w , E", E € , E € ' , etc.
+
Exercises
5.1 Prove the associative law (a . P) y = a . (0 . -y). 5.2 Prove the distributive law a - (0+ y ) = cr . P + a . y. 5.3 Simplify: (a) (U l) + W . (b) w + W2. ( C ) ( W + 1) . W 2 .
+
, UJ&-
,
.
.
)
5. ORDINAL ARITHMETIC
123
5.4 For every ordinal a , there is a unique limit ordinal ,8 and a unique natural number n such that a = + n. [Hint:p = sup(? 5 a 1 y is limit).] 5.5 Let a 5 p. The equation f + o = ,8 may have 0, 1, or infinitely many solutions. 5.6 Find the least a > w such that + a = cr for a11 < a. 5.7 (a) If a l , c r 2 , and p are ordinals and # 0, then a ] < a 2 if and only if
<
0
m
crl
< p.a2.
(b) For all ordinals 01
a a , and
01,
P#
0,
a, =
0. a2
if and only if'
= a2.
5.8 Let a , P , and y be ordinals, and let a < 0. Then (a) a + r < B + r , (h> a . 7 5 0 . ~ and 5 cannot be replaced by < in either inequality. 5.9 Show that the following rules do not hold for all ordinals a , P, and 7: (a) I f a : + 7 = B + y, then & = p . (b) If y > 0 and a y = p 7, then a = p. ( C) ( P + y ) . a = P - c r + 7 - a . 5.10 An ordinal a is a limit ordinal if and only if a: = w . B for some 0. 5.11 Find a set A of rational numbers such that ( A , IQ) is isomorphic to ( a , <) where (a) c r = w + l , (b) a: = w - 2 , (C) a = W 3, (d) a = W " , (e) cr = E. [Hint: {n - l/nr I nr, n E N - (0)) is isomorphic to w 2 , etc.] 5.12 Show that (w 2)2 # w 2 22. 5.13 (a) = aD.al (b) (&)7 = & D . Y , 5.14 (a) If cr P then a y 5 ,P. y (b) If a > 1 and if /3 < 7, then aS < a . 5.15 Find the least such that (a) w + c = c. (b) w . = f , t # 0. +
v
m
<
<
c
(C)
WC = f .
co = 0, c,+1
+<,, <
=w = sup{<, I n E U).] 5.16 (Characterization of Ordinal Exponentiation) Let a and P be ordinals. For f : 0 -+ a , let s(f) = {t< p I f(<) # 0) Let S(B,a)= {f I f : L? a: and s ( j ) is finite). Define 4 on S(P, a ) M follows: f ig if and only if there is Jo < /3 such that f (co) < g(JO)and f (<) = g(<) for all > Co. Show that (S@, a),4 ) is isomorphic to (CID, <).
[Hintfor part (a): Let
+
<
5.3 Simplik: (a) (W 1) W. (b) w w2. ( C ) (W l ) . w2.
+ + + +
CHAPTER 6. ORDINAL NUMBERS
124
The Normal Form
6.
Using exponentiation, one can represent ordina1 numbers in a way similar to decimal expansion of integers. Ordinal numbers can be expressed uniquely in normal form, to be made precise in the theorem below. We apply the normal form to prove an interesting result about so-called Goodstein sequences of integers. First we observe that the ordinal functions a + B , a + B ,and a 0 are continuous in the second variable: If 7 is a limit ordinal and B = sup,,, P,, then
This follows directly from Definitions 5.1, 5.6, and 5.9. As a consequence, wv have: 6.2 Lemma (a) If 0 < a < 7 then there is a greatest ordinal (b) If 1 < a 5 7 then there is a greatest ordinal
0 such that a . B 5 y . 17 such that a 0 5 7.
+ > + +
Proof. Since a . (7 1) 7 I > 7, there exists a d such that a d > y. Similarly, because a'+' 2 y 1 > 7, there is a d with ad > 7.The least 6 such that a 6 > y (or that ad > y ) must be a successor ordinal because of (6.I),say 6 = p + 1. Then B is greatest such that a - B 5 7 (respectively, crP < 7 ) . Cl The following lemma is the analogue of division of integers: 6.3 Lemma If y is an arbitrary ordinal and 2f a # 0 , then there exists a unique ordinal B and a unique p < a such that 7 = a . B p.
+
<
Proof. Let B be the greatest ordinal such that a . B y (if a > y then B = O ) , and let p be the unique p (by Lemma 5.5) such that a B + p = y . The ordinal p is less than a, because ot,herwise we would have a (D+ 1 ) = ct - B -tcr 5 cr - 0+ p = 7, contrary to the rnaxirnality of B. To prove uniqueness, let y = a .Bl pl = a p2 with p1, p2 < a. Assume that P1 < B2. T h e n B l + 1 5 andweha~ea./3~+(a+= p ~a). ( p l + 1 ) + p 2 5 a p2 = a Bl P I , and by Lemma 5.4(a), p1 a p2 >_ a , a contradiction. E Thus D1 = B2, and p1 = p2 follows by Lemma 5.5. +
+
+
+
+
+ > +
The normal form is analogous to decimal expansion of integers, with thc* base for exponentiation being the ordinal W : 6.4 Theorem Every ordinal cr > 0 can be expressed uniquely as
where
pl >
> . - . > B,, and kl > 0 , k2 > 0 ,
...
, k,
> 0 are finite.
6. THE NORMAL FORM We remark that it is possible to have a = w a , see Exercise 6. l . We first prove the existence of the normal form, by induction on Proof. a. The ordinal a = 1 can be expressed as l = w0 1. Now let cr > O be arbitrary. By Lemma 6.2(b) there exists a greatest P such that W O 5 a (if a < w then 0 = 0). Then by Lemma 6.3 there exist unique d andpsuch t h a t p < w D a n d a = w 4 . d + p . A s w a 5 a , w e h a v e 6 > O a n d p < a. We claim that 6 is finite. If 6 were infinite, then cw wa . d > - W@ W = WO+', contradicting the maximality of P. Thus let = P and kl = d . If p = 0 then a = wal . kl is in normal form. If p > 0 then by the induction hypothesis, . kz + . . . + . kn p= m
>
a
for some > . . - > lc, and finite k 2 , ., . , kn > 0, As p < w o l , we have wa2 5 p < wal and so pl > h. It follows that cw = wal .kl + w a 2 . k 2 + - .- + d r a v k n is expressed in normal form. To prove uniqueness, we first observe that if < 7 , then wo - k < d r for W'. From this it easily every finite k: this is because wa k < w @. w = do+' follows that if a = - w A kn is in normal form and y > PI! then - kl r a <w . We prove the uniqueness of normal form by induction on a. For a = 1, the expansion 1 = w0 . 1 is clearly unique. So let a = wP1 kl - . - L J ~ * I k, = wY1. e1 - - - wrTr8 - em. The preceding observation implies that Pl = 71. If we let 6 = wD1 = ~ 7 1p, = W h- k2 + . . - f W @. ~kn and a = w'2 . ez + . . . + .ern1 we have cr = 6 . k l + p = 6 - t l +a,and since p < d and a < 6, Lemma 6.3 implies that k l = el and p = a. By the induction hypothesis, the normal form for p is unique, and so m = n , = 72, . .. , = m,k2 = t2,. . . , kn = ln.If follows I7 that the normal form expansion for Q. is unique.
+ +
<
m
+
+ +
o2
on
We use the normal form to prove an interesting result on Goodstein sequences. Let us first recall that for every natural number a 2, every natural number m can be written in base a, i.e., as a sum of powers of a:
>
with bl > > b, and O < k, < a, i = l,... , n . For instance, the number 324 can be written as 44 43 4 in base 4 and 7 2 6 7 . 4 2 in base 7. A weak Goodstein sequence starting at m > 0 is a sequence mo, m1, m2,. . . of natural numbers defined as follows: First, let m0 = m , and write in base 2:
+ +
To obtain
ml,
+
+
increase the base by 1 (from 2 to 3) and then subtract 1:
CHAPTER 6. ORDINAL NUAIBERS
126
In general, to obtain m k + t from m k (as long as mk # O), write m k in base k + 2, increase the base by 1 (to k + 3 ) and subtract 1. For example, the weak Goodstein sequence starting a t m = 21 is as follows: m0
= 21 = 24
m1
= S4
+ 22 + 1
+ 32 = 90
ma=44+42-1 ~ 4 ~ + 4 - 3 + 3 = 2 7 1 m g ~ +55 . ~ 3+2=642 = 64 -16 . 3
m4
= 74
+ 1 = 1315
+ 7 . 3 = 2422
ms=84+8.2+7=4119
+ 9 . 2 + 6 = 6585
m7
= g4
m8
= 104
+ 10.2+5
= 10025
etc. Even though weak Goodstein sequences increase rapidly a t first, we have
6.5 Theorem For each m > 0 , the weak Goodstein sequence starting at Tn eventually terminates with m, = 0 jor some n.
Proof. We use the normal form for ordinals. Let m > 0 and m o , m l .m:!:. . . be the weak Goodstein sequence starting a t m. Its a t h term is written in base a -t2:
Consider the ordinal Q,
= cJb'
. kI +
'
. . + LJb., . k ,
obtailied by replacing base a + 2 by LJ in (6.6). I t is easily seen that a0 > (11 > (12 > . . - > Qa > - . . is a decreasing sequence of ordinals, necessarily finite. Therefore, there exists some T L such that a , = 0 . But clearly ~ n , (1, for ever?. U = 0 , 1 , 2 , .. . ,?L. Hence m, = 0 .
<
-
Remark For the weak Goodstein sequence m o , m l , rn2, . . . starting at 7n = 2 1 displayed above, the correspolldiilg ordinal numbers are u 2+ 1, u4 C d 2 . L J ~ + L J . ~ ~ - ~ , L J ~ + L J . ~ + ~ , W ~ + W . ~ + ~ , W ~ + L J -. ~ . . ,. ~ J ~ + L J . ~ +
We shall now outline an even stronger result. A number 7n is written in yv~-r base U 2 2 if it is first written in base U , then so are the exponents and t h e exponents of exponents, etc. For instance, the number 324 written iu pure base 3 is 33+' + 33+ 1 The Goodstein sequence startillg at m > 0 is a sequence 7~10, v1 1, ~ n 2. .. . obtained as follows: Let m0 = n2 and write m0 in pure base 2. To define m l . replace each 2 by 3, and then subtract 1. In general. to get ?nk+l . write r n k ill
6, THE NORMAL FORM
127
+
+
pure base k 2 , replace each k + 2 by k 3 , and subtract 1. For example, the Goodstein sequence starting a t m = 21 is as follows: m0
= 21 = 222 + 2 ' +
m1
= 33J+ 33
mz=4 m 3
44
7.6
I X
1012
+44- l =444+43.3+42.3+4.3+3-
1 . 3 1~ 0 ' ~ ~
= 5 5 5 + 5 3 . 3 + 5 2 - 3 + 5 . 3 + 2 - 1 . 9 ~102184
r n 4 = 6 6 C + 6 3 e 3 + 6 2 ~ 3 + 6 * 3 + 1 - 2 . 610 x 36 305 etc. Goodstein sequences initially grow even more rapidly than weak Goodstein sequences. But still:
6.7 Theorem For each m > 0 , the Goodstein sequence starting at nz. eventually teminates with m, = 0 for some n. Again, we define a (finite) sequence of ordinals cro > a , > . . > a, > . . as follows: When m, is written in pure base a 2 , wc get n, by replacing each a 2 by w . For instance, in the example above, the ordinals are www +"JW + l , wWW +wW, wWY+ w 3 - 3 + w 2 . 3 + w . 3 + 3 , wWW +w3.3+w2.3+~.3+2, w Y w + u3- 3 + w 2 - 3 W . 3 1 , etc. The ordinals a, are in normal form. and again, it can be shown t h a t they form a (finite) decreasing sequence. Therefore a, = 0 for some n, and since m, <_ a, for all a.,we have m, = 0.
Proof.
+
+
+
+
Exercises
6.1 Show that w E = E. G .2 Find the first few terms of the Goodstein sequence starting at ?n= 28.
This page intentionally left blank
Chapter 7
Alephs 1.
Initial Ordinals
In Chapter 5 we started the investigation of cardinalities of infinite sets. Although we proved several results involving the concept of (XI, the cardinality of a set X , we have not defined itself, except in the case when X is finite or countable. In the present chapter we consider the question of finding "representatives" of cardinalities. Natural numbers play this role satisfactorily for finite sets. We generalized the concept of natural number and showed that the resulting ordinal numbers have many properties of natural numbers, in particular, inductive proofs and recursive constructions on them are possible. However, ordinal numbers do not represent cardinalities, instead, they represent types of wellorderings. Since any infinite set can be well-ordered in many different ways (if at all) (see Exercise 1.1), there are many ordinal numbers of the same cardinality; w, w 1, w 2 , . . . , W W , . . . , w . U, w w + 1, . . . are all countable ordinal numbers; that is, Iwl = lw l ]= Iw + wl = - = No. An explanation of the good behavior of ordinal numbers of finite cardinalities - that is, natural numbers - is provided by Theorem 4.3 in Chapter 4: All linear orderings of a finite set are isomorphic, and they are well-orderings. So for any finite set X , there is precisely one ordinal number n such that InJ= IX I. We have called this n the cardinal number of X. In spite of these difficulties, it is now rather easy to get representatives for cardinalities of infinite (well-orderable) sets; we simply take the least ordinal number of any given cardinality as the representative of that cardinality.
1x1
+
+
+
+
1.1 Definition An ordinal number equipotent to any P < a.
0
is called an initial ordinal if it is not
1.2 Example Every natural number is an initial ordinal. w is an initial ordinal, because w is not equipotent to any natural number. W + 1 is not initial, because IwJ= IW 11. Similarly, none of w 2, W 3, w + w, w - w, w W , . . . , is initial.
+
+
+
C H A P T E R 7. ALEPHS
130
1.3 Theorem Each well-orderable set X is equipotent to a unique initial or&nal number. By Theorem 3.1 in Chapter 6, X is equipotent to some ordinal Proof. cr. Let a0 be the least ordinal equipotent to X. The11 cl.0 is an initial ordinal. because laol= 1/31 for some /3 < cko would imply 1x1 = 1/31, a coiitradiction. If cro # 01 are initial ordinals, they cannot be equipot.ent, because JaoJ = ]trlI and, say, CLO < a1, would violate the fact that a1 is initial. This proves tlw uniqueness. 1.4 Definition If X is a well-orderable set, then the cardinal number of X . denoted is the unique initial ordinal equipotent to X . In particular. / X1 = for any countable set X and = 71 for any finite set of n elerrients, in agreemelit with our previous definitions.
1x1,
1x1
According to Theorem 1.3, the cardinal numbers of well-orderable sets are precisely the initial ordinal numbers. A natural question is whether there art. any other initial ordinals besides the natural numbers and W . The next tlic~,tern shows that there are arbitrarily large initial ordinals. Actually. we prove something more general. Let A be any set; A may not be well-orderable itst~lf. but it certainly has some well-orderable subsets; for example, all finitc s l i b s ~ t s of A are well-orderable.
1.5 Definition For any A, let h ( A ) be the least ordinal number which is llot vquipotent to any subset of A . h ( A ) is called the Hartogs number o f A . By definition, h ( A ) is the least ordinal cr such that
(01
/Al.
1.6 Lemma For an3 A , h ( A ) is an initial ordinal number.
Proof. Assume that 1/31 = Ih(A)( for some /3 < h ( A ) . Then 0 is equipoterit to a subset of A, and 0 is equipotent to h ( A ) . We corlclude that h ( A ) is D equipotent to a subset of A , i.e., h ( A ) < h ( A ) ,a contradiction. So far, we have evaded the main difficulty: How do we know that the Hartogs would cnrlsist llumber of A exists? If all infinite ordilials were cou~ltable, of all ordinals!
1.7 Lemma The Hartogs number oj A exists for all A .
P ~ o o f . By Theorem 3.1 in Chapter 6, for every well-ordered set (W:R ) where W C A, there is a unique ordinal a such that (a.<) is isomorphic to (W,R ) . The Axiom Schema of Replacement implies that there exists a set H such that, for every well-ordering R E P ( A X A ) , the ordinal a isomorphic to it is in H. We claim that H contains all ordinals equipotent to a subset of A. Illdeed, if f is a one-to-one function mapping cr into A , we set W = ran f i~lid R = { ( f ( P )j,( 7 ) ) I D < y < a ) . R 2 A X A is then a well-ordering isomorphir
1. INITIAL ORDINALS
to a (by the isomorphism f ) . These considerations show that h ( A ) = {a E H [ a is an ordinal equipotent to a subset of A), and justify the existence of h(A) by the Axiom Schema of Comprehension. The preceding developments allow us to define a "scale" of larger and larger initial ordinal numbers by transfinite recursion.
1.8 Definition W0
= W;
for all a;
w,+l
=
h(w,)
W,
=
sup{wg I p
< a ) if a is a limit ordinal, 0 # 0.
The remark following Definition 1.5 shows that so Iw,l < lwgl whenever a < 0.
(>
Iw, 1,
for each cr, and
1.9 Theorem (a) W, is an infinite initial ordinal number for each a. (6) If R is an infinite initial ordinal number, th,en Q = W, for some a . Proof. (a) The proof is by induction on cr. The only nontrivial case is when a is a limit ordinal. Suppose that (U,( = ] y [ for some y < W,; then there is p < a such that y wg (by the definition of supremum). But this implies Iw,l = Iyl 5 lwal twQl and yields a contradiction. (b) First, an easy induction show that a 5 W, for all a. Therefore, for every infinite initial ordinal R, there is an ordinal a such that R < W, (for example, a = R 1 ) . Thus it suffices to prove the following claim: For every infinite initial ordinal St < W,, there is some y < a such that iZ = W,.. The proof proceeds by induction on cr. The claim is trivially true for 0 = 0. If a = /3 l , R < W, = h(wg) implies that IRI 5 Iwlo1, so either R = and we can let y = 0,or R < wg and existence of y < 0 < cr follows from the inductive assumption. If a is a limit ordinal, il < W, = S U ~ { W 1 ~0 < cr.) implies that R < wg for some P < cr. The inductive assumption again guarantees the existence of some y < such that R = W,. D
< <
+
+
T h e conclusion of this section is that every well-orderable set is equipotent to a unique initial ordinal number and that infinite initial ordinal numbers form a transfinite sequence W, with a ranging over all ordinal numbers. Infinite initial ordinals are, by definition, the cardinalities of infinite well-orderable sets. It is customary to call these cardinal numbers alephs; thus we define
for each a.
CHAPTER 7. ALEPHS
132
The cardinal number of a well-orderable set is thus either a natural number or a n aleph. In particular, (NI= No in agreement with our previous usage. The reader should also note that the ordering of cardinal numbers by size defined in Chapter 4 agrees with the ordering of natural numbers and alephs as ordinals by < (i.e., E ): If = N, and IYI = No, then 1x1 < IYI holds if and only if N, < No (i.e., W, E W@) and a similar equivalence holds if one or both of (XI and ( YI are natural numbers. In Chapter 5 we defined addition, multiplication, and exponentiation of cardinal numbers; these operations agree with the corresponding ordinal addition. multiplication, and exponentiation, as defined in Chapter 6, as long as the ordinals involved are natural numbers but they may differ for infinite ordinals. For example, WO WO # WO if + stands for the ordinal addition, but wo WO = WO if stands for the cardinal addition. The addition of cardinal numbers is commutative, but the addition of ordinal numbers is not. To avoid confusion, we employ the convention of using the W-symbolism when the ordinal operations are involved, and the aleph-symbolism for the cardinal operations. Thus wo wo and 2"" indicate ordinal addition and exponentiation (WO+ WO = SUP{W n 1 n < WO)> WO, 2W0 = sup(2* ( n < wo) = W O ) , while No + No and 2"" are cardinal operations (NO No = No, 2N"is uncountable).
1x1
+
+
+ +
+
+
Exercises
1.1 If X is an infinite well-orderable set, then X has xlonisomorphic wellorderings. 1.2 If a and 0 are at most countable ordinals then a + 8, a . P , and rrY art! at most countable. [Hint: Use the representation of ordinal operatiorls from Theorems 5.3 and 5.8 and Exercise 5.16 in Chapter 6. Anotlier possibility is a proof by transfinite induction.] 1.3 For any set A, there is a mapping of P ( A X A ) onto h(A). [Hint:Define f(R)= the ordinal isomorphic to R, if R C A X A is a well-ordering of' its field; f (R) = 0 otherwise.] 1.4 J A J< (AI h(A) for all A. 1.5 [h(A)I < IP(P(A X A))( for all A. [Hint: Prove that JP(h(A))I 5 [ P ( P ( A X A))( by assigning to each X E P(h(A)) the set of all wellorderings R C A X A for which the ordinal isomorphic to R belongs to
+
x.1
1.6 Let h'(A) be the least ordinal a such that there exists no function with domain A and range a. Prove: (a) If a 2 h*(A), then there is no function with domain A and range n. (b) h*(A) is an initial ordinal. (C) h(A) h*(A). (d) If A is well-orderable, then h(A) = h*(A). (e) h*(A) exists for all A. \Hint for part (e): Show that a E h'(A) if and only if a = 0 or a = the ordinal isomorphic to R, where R is a well-ordering of some partition of A into equivalence classes.)
<
2. ADDITION AND MULTIPLICATION OF ALEPHS
2.
133
Addition and Multiplication of Alephs
Let us recall the definitions of cardinal addition and multiplication: Let K and X be cardinal numbers. We have defined K A as the cardinality of the set X U Y, where 1x1= K , IYI = A, and X and Y are disjoint:
+
As we have shown, this definition does not depend on the choice of X and Y. The product K A has been defined as the cardinality of the cartesian product X X Y , where X and Y are any two sets of respective cardinalities K and A: m
and again, this definition is independent of the choice of X and Y. We verified that addition and multiplication of cardinal numbers satisfy various arithmetic laws such as commutativity, associativity, and distributivity:
Also, if K and A are finite cardinals (i.e., natural numbers), then the operations A and K . A coincide with the ordinary arithmetic operations. The arithmetic of infinite numbers differs substantially from the arithmetic of finite numbers and in fact, the rules for addition and multiplication of alephs are very simple. For instance: K
+
for every natural number n. (If we add n elements to a countable set, the result is a countable set.) We even have
No +No
= No,
since, for example, we can view the set of all natural numbers as the union of two disjoint countable sets: the set of even numbers and the set of odd numbers. We also recall that No. No = H(,. (The set of all pairs of natural numbers is countable.) We now prove a general theorem that determines completely the result of addition and multiplication of alephs.
CHAPTER 7. ALEPHS
134
2.1 Theorem H, . H, = N,, for every a. Before proving the theorem, let us look at its consequences for addition ancl multiplication of cardinal numbers. 2.2 Corollary For every a and
P such that N,, . No
cr 5 8, we have
= No.
A iso, n . N, = N,, for every positive natural number n. Proof. If a t -. <, then on the one hand, we have H p = 1 N o 5 N a NLj. and on the other hand, Theorem 2.1 gives N, . N o 5 N o . N o = No. Thus by thcb Cantor-Bernstein Theorem N, N p = No. The equality 71 N, = N, is proved similarly. G 2.3 Corollary For every cr and
P such that a <_ P, we have
for ail natural numbers n.
<
+
+
Proof. If a 5 P , then N o Na N o 5 No N o = 2 . Ha = No, and assertion follows. The second part is proved similarly.
t.1~
3
Proof of Theorem 2.1. We prove the theorem by transfinite induc.tion. For every a, we construct a certain well-ordering 4 of the set W, x sl,, . and sllou.. using the induction hypothesis No N p 5 N o for all P < cr. that the orclertype of the well-ordered set (W, x W,,,4) is at most W,. Then it follows that N, . N, 5 N,, and since N, . N, N a , we have N, - H, = H,. We construct the well-ordering 4 of U, X W, uniformly for a11 W,; that is. we define a property 4 of pairs of ordinals and show that iwell-orclers LJ(, X d ( r for every W,. We let ( a l ,a2)4 (P1,P2) if and only if: either max(al, 02) < max{P1 Pz), or max{crl, al) = max{iol, P:!) and t r l PI, or max{al, 02) = mm{Pl, h ) , & I = Pi and < 02We riow show that 4 is a well-ordering (of any set of pairs of ordinals). be such First, we show that 4 is transitive. Let a1, az, PI, 02, +y1! that (crl,cr2) 4 (P1,02) and (P1,02) 4 (71,7 2 ) . It follows frorn the definition of 4 that max{al,cr2) 5 max{Pl, P 2 ) 5 m a x { y l , x ) ; hence nlax{cr~,cr2) 5 m a x { ~+y2). ~ , If rnax{al, a 2 ) < nlax{yl, y2) then (aI 02)4 (71,y l ) . l'hus assume that rnax{al, Q.L) = max{B1,P2) = 1~1ax{y~, ~ 2 ) .Then we have c l 1 5
>
!
2. ADDITION AND MULTIPLICATION OF ALEPHS
135
P1 < 71, and so a, 5 yl .
If al < y l , then ( a l ,a2) + (71,Y ~ )otherwise, ; we have a1 = p1 = 71. In this last case, max{al,a2) = max(P1, P2) = max{yl, 72): < 7 2 , and it follows again that and a1 = = 7 1 , so, necessarily, a 2 < (a142) + ( 7 1 , ~ ~ ) . Next we verify that for any a l , a ~PI, , PZ1either (al,a z j + (P1,P2) or (P1,,02)+ (a1,a2)or ( a l , a2)= (Pr,P2) (and that these three cases are mutually exclusive). This follows directly from the definition: Given ( a l ,oz) allcl (P1, P2),we first compare the ordinals max{al, a;!) and max{P1,P 2 ) , then the ordinals a l and and last the ordinals a 2 and P2. We now show that + is a well-ordering. Let X be a nonempty set of pairs of ordinals; we find the +-least element of X . Let S be the least maximum of the pairs in X; i.e., let S be the least element of the set {mm{&,P ) 1 ( a ,P) E X ) . Further, let Y = { ( a ,p) E X I max(a,fi) = 6 ) . The set Y is a nonempty subset of X , and for every (a,P) E Y we llave maxi&,P ) = S; moreover, S < max{ar, 8') for any (of,0') E X - Y , ancl hence (CL. P) + (a',P') whenever (a,P) Y and (a',P') E X - Y. Thereforc. the least element of Y, if it exists, is also the least element of X . Now let c 1 0 1)t. the least ordinal in the set { a ( (a,P) E Y for sonhe 01, and let
The set Z is a nonempty subset of Y, and we have ( a ,0)+ (Q', $') wllenever (a,p) E Z and (a',P') E Y - 2. Finally, let Do be the least ordinal in the set {P 1 ( a o , P ) E 2 ) . Clearly, (ao,PO)is the least element of Z, and it follows that (ao,0")is the least element of X. Having shown that + is a well-ordering of W,, X W, for every 0, we use t.llis well-ordering to prove, by transfinite induction on a, that Iw, X w,l < H,,, that is, N, , N, 5 HQ. We have already proved that No . No = No, and so our assertion is trur fix a = 0. So let a > 0, and let us assume that No- No No for all /3 < 0 . We prove that Iw, X w,I N,. If suffices to show that the order-type of the well-ord(brc.~i set (W, X W,,, 4)is at most W,. If the order-type of W, X U , were greater thall U,, then there would exist ( a l , a2)E w, X W, such that the cardinality of the set X = ((51t52) E W a X W a I (51152) + { ~ I , w ) )
<
<
is a t least HQ. Thus it suffices to prove that for any ( a l ,a z ) E w, X U,, we have 1x1 < H,. Let = max{al, a2) 1. Then P E W,, and, for every c2) E X . ivt5 have r n a ~ ( < ~ , C 5 ~m)a x ( a l , a 2 ) < ,G', SO E P a i d E 0.In other wor(ls. X 2 9 X ,3. Let y < a be such that 101 5 N,. Then 5 x p I = I0I.IPI 5 N,.N,.. alhd by the induction hypothesis, N, . H, < N,. Hence, / X I 5 N,, and so 1x1 < N,, as claimed.
+
c2
1x1
(c1,
CHAPTER 7. ALEPHS
136
Now it follows that Iw, X w,l 5 N,. Thus we have proved, by transfinite induction on a, that N, . H, 5 H, for all a. Since N, 5 N, . N,, we have H, . H, = H,, and the proof of Theorem 2.1 is complete. C l Exercises
+
2.1 Give a direct proof of N, N, = H, by expressing W , as a disjoint uniori of two sets of cardinality N,. 2.2 Give a direct proof of n - N, = N, by constructing a one-to-one mapping of U, onto n X w, (where n is a positive natural number). 2.3 Show that (a) NE = H,, for all positive natural numbers n. (b) IIN,ln] = Ha, where [N,In is the set of all n-element subsets of H,,. for all n > 0. is the set of all finite subsets of N,. ( C ) IINa]<"JJ = H,, where [Hint: Use Theorem 2.1 and induction; for (c), proceed as in the proof of Theorem 3.10 in Chapter 4, and use No . N, = N,.] 2.4 If a and @ are ordinals and Ial < Ny and _< N,, then la PI N,, la+PI N,, [aa[j H, (where a + p, a - 0, and a D are ordinal operations). 2.5 If X is the image of w, by some function f , then 5 N,. [Hint: Construct a one-to-one mapping g of X into w, by letting g(x) = the least element of the inverse image of {X) by f .] 2.6 If X is a subset of w, such that < N,, then Iw, - XI = N,.
+ <
<
1x1
1x1
Chapter 8
The Axiom of Choice 1. The Axiom of Choice and its Equivalents In the preceding chapter, we left unanswered a fundamental question: Which sets can be well-ordered? Interestingly, the question was posed in this form only at a later stage in the development of set theory. Cantor considered it quite obvious that every set can be well-ordered. Here is a fairly intuitive "proof" of this "fact." In order to well-order a set A, it suffices to construct a one-to-one mapping of some ordinal X onto A. We proceed by transfinite recursion. Let a be any set not in A. Define some element of A
f ( l )=
if A # 0, otherwise,
some element of A - {f(O)) if A - {j(O)) # 0, otherwise,
etc. Generally, some element of A - ran(f
ra)
if A - ran(f otherwise.
r a) # 0,
Intuitively, f lists the elements of A, one by one, as long as they are available; when A is exhausted, f has value a, We first notice that A does get exhausted at some stage X < h(A), the Bartogs number of A. The reason is that, for a < 0, if f(p) # a , then f ( p ) E A - ran(f P ) , f ( a ) ran(f r P), and thus fIa) # f(P). If f ( 4 # a were to hold for all a < h(A), f would be a one-to-one mapping of h(A) into A , contradicting the definition of h(A) as the least ordinal which cannot be mapped into A by a one-to-one function. Let X be the least a < h(A) such that f(a)= a. The considerations of the previous paragraph immediately show that f 1 X is one-to-one. The "proof"
r
CHAPTER 8. THE AXIOM OF CHOICE
138
is complete if we show that ran(f t X) = A. Clearly ran(f t X) g A: if' ran(f 1 X) c A, A - ran(f 1 X) # 8 and f (X) # U , contradicting our definitior~ of X. Our use of quotation marks indicates that something is wrong with this argument, but it may not be obvious precisely what it is. However, if orie t,ries to justify this transfinite recursion by the Recursion Theorem, say in the form stated in Theorem 4.5 in Chapter 6, one discovers a need for a function G s11r-11 that f can be defined by f (0) = G(f 1 a ) . Such a function G should haw. t.ht. following properties: ra) if A - r a n ( j Fa) # 8 , otherwise.
G(f [ a ) A - r a n ( f G(f 1 a ) = a
If A were well-orderable, some such G could easily be defined; e.g. G(x) =
the +-least element of A a otherwise,
-
ranx if x is a function and A
-
ran X # fl.
where 4 is some well-ordering of A. But in the absence of well-orderings on .4. no property which could be used to define such a function G is obvious. More specifically, let S be a system of sets. A function g defined on S is called a choice function for S if g ( X ) E X for all nonempty X E S . If we now assume that there is a choice function g for P ( A ) , we are able to fill the gap in the previous proof by defining G(x) =
is a function and A - r a n x otherwise.
g ( A - ranx) if
X
# B,
We proved the difficult half of the following theorem, essentially due to Ernst Zermelo: 1.1 Theorem A set A can be well-ordered if and only if the set P ( A ) of all subsets of A has a choice function.
Proof. The proof of the second half is easy. If a choice function g on P ( A ) by
+ well-orders A, we defirlc?
the least element of x in the well-ordering
+
if x # B, if :X = 8.
The problem of well-ordering the set A is now reduced to an equivaltmt. question, that of finding a choice function for P ( A ) . We first notice Theorenl 1.2.
1. T H E AXIOM OF CHOICE AND ITS EQUIVALENTS
1.2 Theorem Every finite system of sets has a choice function. Proceed by induction. Let us assume that every system with n Proof. elements has a choice function, and let IS1 = n + I . Fix X E S ; the set S - {X) has n elements, and, consequently, a choice function gx. If X = a, g = g x U { ( X ,0)) is a choice function for S. If X # 8, gx = g x U {(X. X ) ) is a choice function for S (for any X E X ) . The reader might be well advised to analyze the reasons why this proof cannot be generalized to show that every countable system of sets has a choice function. Also, while it is easy to find a choice function for P ( N ) or P(Q) (why?), no such function for P ( R ) suggests itself. Although choice functions for infinite systems of sets of real numbers have been tacitly used by analysts at least since the end of the nineteenth century, it took some time to realize that the assumption of their existence is not entirely trivial. The following axiom was formulated by Zermelo in 1904. Axiom of Choice
There exists a choice function for every system of sets.
Sixty years later, in 1963, Paul Cohen showed that the Axiom of Choice cannot be proved from the axioms of Zermelo-Fraenkel set theory (more on this in Chapter 15). The Axiom of Choice is thus a new principle of set formation; it differs from the other set forming principles in that it is not eflectiz~e.That is, the Axiom of Choice asserts that certain sets (the choice functions) exist without describing those sets as collections of objects having a particular property. Because of this, and because of some of its counterintuitive consequences (see Section 2), some mathematicians raised objections to its use. We next examine several equivalent formulatioils of the Axiom of Choice and some of the consequences it has in set theory and in mathematics. Following that, at the end of Section 2, we resume discussion of its justification. To keep track of the use of the Axiom of Choice in this chapter, we denote the theorems whose proofs depend on it, and exercises in which it has to be used, by an asterisk. In later chapters, we use the Axiom of Choice without explicitly pointing it out each time. 1.3 Theorem The following statements are equivalent: (a) (The Axiom of Choice) There exists a choice function for eve? system of sets. (b) Every partition has a set of representatives. (c) If ( X , i E I ) is an indexed system of nonempty sets, then them is n function f such that f ( i ) E X, for all i E I .
I
We remind the reader that a partition of a set A is a system of mutually disjoint nonempty sets whose union equals A. We call X C A a set of representatives for a partition S of A if, for every C E S, X n C has a unique element. (See Section 4 of Chapter 2 for these definitions.)
CHAPTER 8. THE AXIOM O F CHOICE
140
The statement (c) in Theorem 1.3 can be equivalently formulated as follows (compare with Exercise 5.10 in Chapter 3 ) : (d) If X , # 0 for all i E I, then Xi # 0.
nisr
Proof,
(a) implies (b). Let f be a choice function for the partition S: then X = ran f is a set of representatives for S. Notice that for any C E S, f ( C ) E X n C , but f ( D ) # X n C for D # C [because f ( D ) E D and D n C = 8). So X n C = { f ( C ) )for any C E S. (b) implies (c). Let C, = { i ) x X,. Since i # i' implies C , n C,( = fl. S = { C , I i E I) is a partition. If f is a set of representatives for S, f is a set of ordered pairs, and for each i E I, there is a unique rt: such that ( i , ~E) f n C,. But this means that f is a function on I and f ( 2 ) E X , for all i E I. (c) implies (a). Let S be a system of sets; set I = S - ( 0 1 , Xc = C for all C E I. Then ( X c ( C E I) is an indexed system of nonempty sets. If fE Xc, f is a choice function for S if 0 # S. If 0 E S , then f U ((@,a)) is a choice function for S.
n,,,
Theorem 1.3 gives several equivalent formulations of the Axiom of Choice. Many other statements are known to be equivalent to the Axiom, and we use some in the applications. The most frequently used equivaler~tis the WellOrdering Theorem, which states that every set can be well-ordered (its equivalence with the Axiom follows from Theorem 1.1). Another frequently used version (Zorn's Lemma) is given in Theorem 1.13. But first we present some consequences of the Axiom of Choice. 1.4 Theorem* Every infinite set has a countable subset.
Proof. Let A be an infinite set. A can be well-ordered, or equivalently. arranged in a transfinite one-to-one sequence (a, I cr < 0) whose length fl is a n infinite ordinal. The range C = {a, I cr < U ) of the initial segment ( a , 1 cr < w ) of this sequence is a countable subset of A. n 1.5 Theorem* that JSI= H,.
For every infinite set S there exists a unique aleph H,
such
Proof. As S can be well-ordered, it is equipotent to some infinite ordinal, D and hence to a unique initial ordinal number W,. Thus in the set theory with the Axiom of Choice, we can define for any set X its cardinal number [ X (as the initial ordinal equipotent to X . Sets X and Y are is the same ordinal as IY I (i.e., = IY I). Also, equipotent if and only if the ordering < of cardinal numbers by size agrees with the ordering of ordinals by E: 1x1 < (Y(if and only if ( X (E IY(. These considerations rigorously justify Assumption 1.7 in Chapter 4: There are sets called cardinals with the property that for every set X there and sets X and Y are equipotent if and only if IX I is is a unique cardinal equal to ( YI.
1x1
1x1,
1x1
1. THE AXIOM OF CHOICE AND ITS EQUIVALENTS
141
As E is a linear ordering (actually a well-ordering) on any set of ordinal numbers, we have the following theorem. 1.6 T h e o r e m * For any sets A and B either IAl 5 IBt or JBII[AI
1.7 T h e o r e m * The union of a countable collection of countable sets is countable. Proof. (Compare with Theorem 3.9 in Chapter 4.) Let S be a countable set whose every element is countable, and let A = US. We show that A is count able. As S is countable, there is a one- bone sequence (A, I n E N ) such that S = { A , I n E N ) . For each n E N, the set A, is countable, so there exists a sequence whose range is A,. By the Axiom of Choice, we can choose one such sequence for each n . [That is: For each n, let Sn be the set of all sequences whose range is A,. Let F be a choice function on {S, I n E N ) , and let S, = F(S,) for each n.] Having chosen one S, = (a,(k) I k E N) for each n, we obtain a mapping f of N X N onto A by letting f ( n ,k) = a,(k). Since N x N is countable and A is its image under f , A is also countable. 1.8 Corollary* The set of all real numbers is not the union of countabl3 many countable sets.
P~voof. The set R is uncountable. 1.9 Corollary* The ordinal w l is not the supremum of a countable set of countable ordinals.
Proof.
If { a , 1 n E N } is a set of countable ordinals, then its supremum
is a countable set, and hence a
< wl.
1.10 T h e o r e m * 2Nu > - NI.
Proof.
This follows from Theorem 1.5 and the fact that 2'"
> No.
D
As a result, the Continuum Hypothesis can be reformulated as a conjecture that
zNo= N,, the least uncountable cardinal number.
142
C H A P T E R 8. T H E A X I O M O F CHOICE
1 .l1 Theorem* If f is a function and A is a set, then ( f [ A ] (. I/Al. For each b E f [ A ] ,let Xb = j - ' ( { b ) ) . Note that X b # 0 aiirl Proof. X b , n Xbr = 8 if bl # b l . Take g E n b c f l a l X b ;then g : f [ A ] + A and bl 6 2 implies g ( b l ) E Xb,, g(bl) E X b z . so g ( b l ) # g(b2). We corlclutie thitt is i i one-to-one mapping of f [ A ]into A, and consequer~tlyl f [ A }l < j Al. J
+
L ,-:
1-12Theorem* I f [ S )5 N, and, for all A E S . IAl 5 N,,. then
1Us15
h',,.
Proof. Tliis is a generalization of Theorenl 1.7. We usurne that S # V) at~cl a11 A E S art? nonempty, writct S = { A u I v < N,,) and for e w h v < H,,. c.tloosci a trilnsfinite sequence Au = (u,,(K) I K < N n ) such that A, = {a,(r;) I ti < N,,). (Cf. Exercise 1.9 in Chapter 4.) \YP define a rnapping f on N,, X N,, onto US by f (U,K ) = a , ( ~ ) .B y Theorem 1.11.
a
(the last step is Theorem 2.1 in Chapter 7 ) .
We rlext derive Zorn's Leinrrl:~,iitlother important versiorl of' the Axiorli of Choice.
1.13 Theorem The following statements are epuivc~lent: ( a ) ( T h e A x i o m of C/ioice) T/~err:e:rists a choice f u ~ ~ c t i o~ ~O tT C-I ) P T ? / .sy,st~rr/.of' sets. (6) ( T h e Well- Ordering Principle) Every set can be well-ordered. (c) (Zorn's Lemma) I f e ~ ~ e chain r y in a partially ordered set has an upper 1101lr~d. then the pa?-tiallg ordered set has a maximal element. Wt. remind the reader that :t cllain is 11 lil~earlyorderecl suhset of' 2 t p;trtiallj. orclerecl set (see Section 5 of Chapter 2 for this definition, as well as tllcl t1t:finitiolis of "ordere(l set," "upper bouilcl." ..maximal r!l~rrlent.'' iirltl otllctr. r.t)lic.cq)t s relattlrl to orderirlgs). 1.1: Proof. Equivalence of' (a) tincl (L>) follows immediately from Tht?ol.ibrr~ tllerefore, it is enough to show that (;I) inlplies (c) arld (c) irrlplies ( i t ) . ( a ) implies (c). Let ( A , 4)be n (partially) ordered set in which every chai11 11% ;in upper bound. Out sttategy is to search for a rnaximal element of ( A . $ 1 by corlstructing a 4-incre<.rsingtrnllsfinite sequtlnct?of elerrlents of A. We fix some b 4 A arid a clloice function g for P ( A ) , and define (a,, ] ( 1 <: h ( A ) ) by transfinite recursion. Given (a€ I ( < c r ) , we c:orlsicler two c.ils~s.If' for all [ < a and A , = { a E A j a€ 4 a holds for all [ < c r ) # 8. we let b# ( L ,, = g(A,,); otlletwise we let u, = b. IVi. leave to the reader thr. easy t:wk of justifyirlg this clefillitioll t)!. 'l'lltlort~~~l 4.4 in Chapter 7. We note that, U,,= b for some u < l ~ ( ~ 4ottlerwise. ); ((it I c: h ( A ) ) would be a one-to-one rnappillg of h ( A ) into A. Let X be the [east c) for
<
1. THE AXIOM OF CHOICE AND ITS EQUIVALENTS
143
which a, = b. Then the set C = {a, I < X ) is a chain in (A, 4) and so it. has an upper bound c E A. If c 4 a for some a E A. we have (L E AA # V) and ax = g ( A A )# b, a contradiction. So c is a maximal element of A. (It is r ? x yt.o see that, in fact, X = ,O 1 and c = ao.) ( c ) implies (a). It suffices if we show that every system of nonempty sets S has some choice function. Let F be the system of all functions f for which S and f ( X ) E X holds for any X E S. The set F is ordered by dom f inclusion C. Moreover, if F. is a linearly ordered subset of (F,C) (i.e., either f C g or g C f holds for any f , g E Fa),fa = U F. is a function. (See Theorern 3.12 in Chapter 2 . ) It is easy to check that fo E F and fo is an upper bound on F* in (F,C_), The assumptions of Zorn's Lemma being satisfied, we conclude that (F. has a maximal element 7. The proof is complete if we show that dom f = S. If not, select some X E S - dom7 and X E X . Clearly. f = ~ L { J( X , x ) }E F a l ~ d j 3 7, coritradicting the maxirnality of 7. D
+
c
c)
We conclude this section with a theotem needed in Chapter 11 1.14 Theorem* If ( A , 4 ) is a linear ordering such th.at for all X E A, then 1 AI 5 N,.
E
A 1y
*
;:z}l
< N,
Proof. Exactly as in the proof of Zorn's Lemnla, we construct an ii~cre~zsing sequence (at 1 < X ) of elements of A for which AA = (n, i.e., there is no a E A such that a, 4 n holds for all E A. As ( A , *) is linearly ordered, this ine;iils that, for every a E A , a 4 a( fot some E A . (We say that the sequellce (a, I C < X ) is cofinal in ( A , $).) We have A = UCcA{y E A I y 4 g},1 { y E A ( y + a ( } I < N, by assurnpt.io~,. and X N, (otherwise, w, < X and (at ( < U,) is a one-to-one nlapping of' N, into {y E A ( y a W T ) contradicting , the assumption). By Theorern 1.12. [AI H r . CI
<
<
<
<
<
Exercises 1.1 Prove: If a set A can be linearly ordered, then every system of fii1it.e subsets of A has a choice function. (It does not follow from the ZerrrleloF'raenkel axioms that every set can be linearly ordered.) 1.2 If A can be well-ordered, then P ( A ) can be linearly ordered. I H i l ~ t :Let < be a well-ordering of A ; for X, Y C A define X 4 Y if and only if thc <-least element of X A Y belongs to X.] 1.3* Let ( A , 5 ) be an ordered set in which every chain lias an upytArbouilrl. Then for every a E A, there is a 5-maximal element X of A such that a
5 X.
1.4 Prove t,hat Zorn's Lemma is equivalent to the statement: For all ( A , S ) , the set of all chains of ( A , 5 ) has an C-maximal element. 1.5 Prove that Zorn's Lemma is equivalent to the statement: If .4 is n syst.cn1 of sets such that, for each B C_ A which is linearly ordered by G , B E A. then A has an C-maximal element.
U
CHAPTER 8. THE AXIOM OF CHOICE
144
1.6 A system of sets A has finite character if X E A if and only if every finite subset of X belongs to A. Prove that Zorn's Lemma is equivalent t o the following (Tukey's Lemma): Every system of sets of finite character has an c-maximal element. [Hint: Use Exercise 1.5.1 1.7* Let E be a binary relation on a set A. Show that there exists a function f : A -, A such t h a t for all X E A, (X,f ( x ) ) E E if and only if there is some y E A such t h a t (X,y) E E. 1.8* Prove that every uncountable set has a subset of cardinality t.t l . 1.9* Every infinite set is equipotent t o some of its proper subsets. Equi~illently, Dedekind finite sets are precisely the finite sets. 1.10* Let (A, <) be a linearly ordered set. A sequence (a, 1 n E U ) of elements of A is d e c ~ u s i n gif a,+l < a, for all n E U . Prove that (A, <) is a wellordering if and only if there exists no infinite decreasirlg sequence in A. 1.11* Prove the following distributive laws (see Exercise 3.13 in Chapter 2).
ET S E S
f f S T LET
LET sES
feST t f T
1.12* Prove that for every ordering .;:on A, there is a linear ordering 5 on A such that a 4 b implies a 5 b for all a,b E A (i.e., every partial ordering can be extended to a linear ordering). 1.13* (Principle of Dependent Choices) If R is a binary relation on M $ U such that for each X E M there is y E M for which xRy, then there is a holds for all n E W . sequence (X, I n E W ) such that 1.14 Assuming only the Principle of Dependent Choices, prove t h a t every countable system of sets has a choice function (the Axiom of Countable Choice). 1 . l 5 If every set is equipotent t o an ordinal number, then the Axiom of Choice holds. 1 . l 6 If for any sets A and B either lA( 5 ( B (or ( B [5 IAl?then the Axiorn of' Choice holds. [Hint: Compare A and B = h ( A ) . ] 1.17* If B is an infinite set and A is a subset of B such t h a t [AI < I B \ . the11
\ B - AI = 181.
2.
The Use of the Axiom of Choice in Mathematics
In this section we present several examples of the use of the Axiom of Choice in mathematics. We have chosen examples which illustrate the variety of roles the Axiom of Choice plays in mathematics, but which do not require extensive background outside of set theory. Other applications of the Axiom of Choice ran be found in the exercises and in most textbooks of general topology, abstract
2. THE USE OF THE AXIOM OF CHOICE IN MATHEMATICS
145
algebra, and functional analysis. T h e examples are followed by some discussion of importance and soundness of the Axiom of Choice. 2.1 E x a m p l e . C l o s u r e Points. According t o the usual definition, a sequence of real numbers (X, I n E N ) converges t o a E R if for every positive real number e there exists n, E N such that [X, - a [ < c holds for all natural numbers n 2 n,. (In this example, 1x1 denotes the absolute value of X, and not the cardinality of X.) Let A be a set of real numbers. Advanced calculus textbooks commonly characterize closure points of A in either (or both) of the following ways: (a) a E R is a closure point of A if and only if there exists a sequence (X, I n E N ) with values in A, which converges t o a. (b) a E R is a closure point of A if and only if for every positive real number e there exists X E A such t h a t jx - a1 < c. I t is then necessary to prove that (a) and (b) are equivalent. (U) implies (b). Given e > 0, there is n, E N such that jx, - a1 < E for all n 2 n,. In particular, lxnt - a1 < e and X,= E A. (b) implies (a). The usual proof proceeds as follows: Let X, = {X E A I ( X - a [ < l / n ) . By (b), X , # 8 for all n E N . Let (X, I n E N )be a sequence such that X, E X, for all n E N . Then each X, E A and (X, I n E N ) converges t o a. T h e question usually passed over is, What reasons do we have to assume t h a t any such sequence (X, I n E N ) exists? Notice that we do not give any property P ( x , y) such that P ( n , y) holds if and only if y = X, (for all n E N ) . Such a property can be exhibited in special cases (e.g., if A is open - see Exercise 2.1); however, it has been shown that the equivalence of (a) and (b) for all A c R cannot be proved from the axioms of Zermelo-Fraenkel set theory alone. Of course, if we do assume the Axiom of Choice, the fact t h a t X, # Q1 for all n E N immediately implies that X, # 0.
nnEN
2.2 E x a m p l e . C o n t i n u i t y o f a Function. The standard definition of continuity of a real-valued function of a real variable is as follows: (a) j : R -+ R is continuous at a E R if and only if for every c > 0 there is 6 > 0 such t h a t 1j ( x ) - f ( a)( < E for all X such that [X - a1 < 6. Continuity is also characterized by the following property: (b) f : R -+ R is continuous at a E R if and only if for every sequence (X, ( n E N ) which converges t o a, t h e sequence (f (X,) I n E N ) converges to f (a). I t is easy to see t h a t (a) implies (b): If (X, 1 71 E N ) converges t o a and if E > 0 is given, then first we find 6 > 0 as in (a), and because (X, 1 n N ) converges, there exists ns such t h a t [X, - a1 < 6 whenever n 716- Clearly, If (X,) - f ( a ) [ < e for all such n. If we assume t h e Axiom of Choice, then (b) also implies (a), and hence (a) and (b) are two equivalent definitions of continuity. Suppose that (a) fails, then there exists E > 0 such that for each 6 > 0 there exists an X such that Is - a1 < d
>
CHAPTER 8. THE AXIOM OF CHOICE
146
but (f(X)- f ( a ) ( 2 E . In particular, for each k = 1 , 2 , 3 , .. . , we can choose some X, such that lxk - a1 < l / k and I f (xk) - f (a)l E. T h e sequence (xk I k E N ) converges t o U , but the sequence ( f (:ck) I k E N )does not converge to f ( U ) . so (b) fails as well. U
>
As in Example 2.1, it can be shown that the equivalence of (a) arid ( l ) ) cannot be proved from the axioms of Zermelo-Fraenkel set theory alone. 2.3 Example. Basis of a Vector Space. We assume that the rc:ader is farrliliar with the notion of a vector space over a field (e.g., over the held of rcbiil numbers). Definitions and basic algebraic properties of vector spaces (:an h(. found in any text on linear algebra. A set A of vectors is linearly zndependent if no finite linen^. corribin:tt,ior~ a , v l + - . +a;u, of elements v1, . . . , U, of A with nonzero coefficients CL 1 . . . . . ( l , , from the field is equal to the zero vector. A basis of a vector space V is t t maxinlal (in the ordering by inclusion) linearly independent subset of V. One of the fundamental facts about vector spaces is m
2.4 Theorem* Every vector space has a basis.
T h e theorem is a straightforward application of Zorn 's Lenlnla: Proof. If C is a C -chain of independent subsets of the given vector space. then the union of C is also an independent set. Consequently, a maximal independent r3 set exists. lFor more details, see the special case in the next example.) This theorern cannot be proved in Zermelo-Fraenkel set theory alone, without using the Axiom of Choice. 2.5 Example. Hamel Basis. Consider the set of all real numbers as a vect,or space over the field of rational numbers. By Theorem 2.4 this vector space hiis a basis, called a Hamel basis for R. In other words, a set X C R is a Hamel basis for R if every s E R can l)e expressed in a unique way as
for some rnutually distirict X I ,. . . ,X,, E X and sorrle nonzero l.ittioli;il riull~hers r l , . . . , r,. Below we give a detailed argument that a set X with the List mentioned property exists. A set of real numbers X is called dependent if there are mutually tlistinc:t, X I , .. . ,X, E X and r l , .. . ,rn E Q such t h a t
and a t least one of the coefficients r l , . . . ,rn is not zero. A set which is not dependent is called independent. Let A be the system of all independent sets of real numbers. We use Zorn's Le111mat o show that A llas a maximal elemerit
2. THE USE OF THE AXIOM OF CHOICE IN MATHEMATICS
147
in the ordering by C; the argument is then conlpleted by showing that any &-maximal independent set is a Hamel basis. To verify the assumptions of Zorn's Lemma. corlsider A. A linearls ordered by C. Let X. = U A o ; X. is an upper bound of A. in (A. 2 ) if X. E -4. i.e., if X. is independent. But this is true: Suppose there were X I , . . . ,X, E X. and r l , . . . , r, E Q, not all zero. such that rl . . X I + . . . + r, - 2, = 0. The11 there would be X I , . . . ,X, E A. such that xl E X l , .. . .X, E X,. Since A. is linearIy ordered by C,the finite subset { X 1 , .. . , X n ) of A. would have a 2-greatest element, say X,. But then 2 1 , .. . ,X, E X,, so X, would not be independent. By Zorn's Lemma, we conclude that (A, C) has a maximal element X. It remains t o be shown that X is a Hamel basis. Suppose that X E R cannot be expressed as rl sl + - .. + r, - X, for any r1,... , r n E Q a n d x l , . . . ,X, E X . T h e n a : f X (otherwisex = l q s ) , so X U { X ) 3 X , and X U { X ) is dependent (remember t h a t X is a maximal independent set). Thus there are X I , .. . ,X, E X U { X ) and s l , . . . .S, E Q, not all zero. such t h a t sl 2 1 + - . + S, . X, = 0. Since X is independent. X E {:rl.. . . , X, ): say. X = sZ, and the corresponding coefficient S, # 0. But then
c
where X I , .. . , X,-l, xi+l, . . . , X, E X , and the coefficients are rational ~lumbers. This contradicts the assumption on X. Suppose now that some a: E R can be expressed in two ways: .I: = 7.1 . X I + m . . + rn . X n = S L + + ~k - YC;,where ~ 1 , ... ,X,. y l . . . . . ! l k E X . r l , . . . ,r,, s l , . . . , s k E Q - (0). Then +
If { X I , . . . ,X,) # {yl,. . . ,yk) (say, XI 4 { y l , . . . ,PI;)), (2.6) can be written as a combination of distinct elements from X with a t least one nonzero coefficient (namely, rl ), contradicting independence of X . We can conclude that 72 = k and X 1 = y t l , . , -, X, = y,,, for some one-to-one mapping (il, . . . , in) between indices 1 , 2 , .. . , 71. We can thus write (2.6) in the form ( r l - s , , ) . z 1 + . . . + ( r , - s , , , ) . ~ , , = 0. Since zl , . . . , X, are mutually distinct elements of X , we conclude tllat, 7.1 - szl= 0, . . . . T, - ~ i , ,= 0 , i.e., t h a t rl = s i , , . . . ,r, = St,, . These arguments show that each X E R has a unique expression in thp desired form and so X is a Hamel basis. The existence of a Hamel basis cannot be proved in Zermelo-Fraenkel theory alone; it is necessary to use the Axiom of Choice.
sc3t
2.7 Example. Additive Functions. A function f : R -. R is called additive if f ( z + v ) = f ( z )+ f (y) for all X, y E R. An example of an additive function if provided by f a where f,(z) = a X for all X E R, arltl U E R is fixed.
CHAPTER 8. THE AXIOM OF CHOICE
148
Easy computations indicate t h a t any additive function looks much like f, for some a E R. More precisely, let f be additive, and let us set f ( l ) = a. We then have f(2) = f ( l )
+ f (l)= a
2,
f (3) = f (2) + f (1)= a - 3,
+
and, by induction, f ( b ) = a . b for all b E N - (0). Since f (0) f (0) = f (0+ 0) = f (o), we get f (0) = 0. Next, f (-6) + f(b) = f (0) = 0, so f ( - b ) = -f ( h ) a (-6) for b E N. To compute f ( l / n ) , notice that
-
+
n times
consequently, f ( l / n ) = a . l / n . Continuing along these lines, we can easily prow. t h a t f (X)= a X for all rational numbers X. It is now natural t o conjecture that f (X) = a - X holds for all real numbers x; in other words, t h a t every additive function is of t h e form fa for some U E R . It turns out that this corljecturc cannot be disproved in Zermelo-Fraenkel set theory, but t h a t it is false if we assume t h e Axiom of Choice. We prove the following theorem. 2.8 Theorem* There exists a n additive junction f : R for all a E R.
Proof.
+R
such that f #
fa
Let X be a Hamel basis for R. Choose a fixed Z E X . Define rt
0
i f x = r , . x l + v v . + r i . ~ l + ~ ~ a .n d+x ~, =nT~, ~ n otherwise.
It is easy t o check that f is additive. Notice also t h a t 0 4 X and X is infinite (actually, IX I = 2*0). We have f (Z) = l (because Z = 1 - Z is the basis representation of Z), while f ( 5 )= 0 for any Z E X , 5 # Z (because Z does not occur in t h e basis representation uf 2 = 1 Z of 2). If f = fa were to hold for some a E R , we would have f (T)= 1 = a Z, showing a # 0, and, on t h e other G hand, f(Z) = 0 = a . E , showinga = O . 2.9 Example. Hahn-Banach Theorem. A function f defined on a vector space V over t h e field R of real numbers and with values in R is called a l'i.rw(~,r. functional on V if
holds for aH U,v E V and a , 0 E R. A function p defined on V and with values in R is called a sublinearfunctionul on V if p(u + v) 5 p(u) + p(v) for all U,U E V and p ( & . U) = o p(u) for all
U E
V and
Q
2 0.
2. THE USE OF THE AXIOM OF CHOICE IN MATHEMATICS
149
The following theorem, due to Hans Hahn and Stefan Banach, is one of the cornerstones of functional analysis. 2.10 Theorem* Let p be a sublinear functional on the vector space V , an.d let fo be a linear functional defined on a subspace V. of V such that fo(v) 5 p(v) for all v E Vo. Then them is a linear functional f defined on V such that f 3 fil and f ( v ) 5 p(v) for all v E V .
Proof. Let F be the set of all linear functionals g defined on some subspacc W of V and such that fo G g
g(v) 5 p(v) for all v E W.
and
c).
We obtain the desired linear functional f as a maximal element of (F, To verify the assumptions of Zorn's Lemma, consider a nonempty F. c F linearly ordered by C. If go = U Fo, go is an &-upper bound on F. provided go E F. Clearly, go is a function with values in R and go fo. Since the union of a set of subspaces of V linearly ordered by C is a subspace of V, dorn go = UgEfi, dom g is a subspace of V . To show that go is linear consider U , v E dom go and a , P E R. Then there are g , g' E F. such that U E dom g and v E dorn g'. Since F. is linearly ordered by G , we have either g C g' or g' C g. In the first case. U ,V ,cr.u+/?Uv E domg' and go(cr-U+@-v) = gf(cr-?~+ijl.?)) = a . g ' ( ~ ~ ) + / 3 . g= '(t~) cr . go(u)+ @ go(v);the second case is analogous. Finally, go(u)= g(u) < p(u) for any U E domgo and g E Fo such t h a t u E domg. This completes the check of g0 E F. By Zorn's Lemma, ( F ,2 ) has a maximal element f. It remains t o be shown that dorn f = V. We prove that dorn f C V implies that f is not maximal. Fix u E V - dorn f ; let W be the subspace of V spanned by dom f and U . Sincc2 every W E W can be uniquely expressed as W = X + a . U for some X E dorn f and cr E R?the function f , defined by
>
m
is a linear functional on W and c E R can be chosen so that
fc
3 f . The proof is complete if we show that
for all X E dom f and cr E R. The properties of f immediately guarantee (2.11) for a = 0. So we have t,o choose c so as t o satisfy two requirements: (a) For all a > 0 and a: E dom f , f ( x ) + cr c 5 p(x ~ t . U ) . ( b ) For all a > 0 and y E d o m f , f ( y ) + ( - ~ ) ~ Ir p: ( y + ( - & ) . U ) . Equivalently,
+
+
CHAPTER 8. THE AXIOM O F CHOICE
150 and then (2.12)
f(
Y )
should hold for all
-P(:Y-U)
X ,y E
c P ( . I + ' L )
do111f and cr
f ( V ) + f ( t ) = f (v
> 0. But, for all
U,
-
t E clom f .
+ t ) < y(v + t ) Ip(fJ- U ) + 142 + 4
If A = sup{ f (v) - p(v - U ) 1 .o E tlorrl f ) kind B = inf {p(t+ ZL)- f ' ( t ) I t (10111 f ) . 5 B. By choosing c such that A 5 c 5 B, we (:an make the iclerlt,itj-
we have A
--
(2.12) hold.
-_I
2.13 Example. The Measure Problem. An irrlportant problenl irl analysis is t o extend the riotion of le~lgtliof a11 interval to Inore complic-atetlsets c)f r.c.;ll ilurnbers. Ideally, one would like to have a function p defirled on P ( R ) . with
values ill 10. X ) U {m). and having the followirlg prupert,ies: 0 ) p([a, 61) = 6 - a for arly a. 6 E R. a < b.
ii) If
is a collect.ion of nlutually disjoint subsets of R. then
(This property is ci~lIedcountable additivity or a-additivity of p . ) iii) If a E R, A G R, and A + u = { x + u ( trur~slutior~ i~~uu~'ia7tce of p ) . Several additional propertks of 2.3):
)L
15
E A ) , the11 p ( A + u ) = p ( A )
follow irrlnlediately frortl U).-iii) (Exrrc:isv
iv) If A n B = QI then p(A U B) = p(A) + p(B) (finite additivity).
However, the Axiom of Choic-e implies that nlentioned properties exists:
110
function p witeh t,lw iil)o\.c.-
2. THE USE OF THE AXIOM OF CHOICE IN A4ATHEMATICS 2.14 Theorem* There is no function p : P(R) properties 0) -v).
Proof.
We define an equivalence relation
sxy
if and only if X
4
151
[O.m) U { m ) uith the
= on R by:
- y is a rational number,
and use the Axiom of Choice to obtain a set of representatives X for z . It is easy to see that
R=
(2.15)
U { X+ r I r is rational);
moreover, if q and 1. are two distinct rationals, then X + q and X + r are ciis+joillt.. We note that p ( X ) > 0: if p ( X ) = 0, then p ( X + (I) = 0 for every q E Q, i ~ l l d
a contradiction. By countable additivity, there is a closed interval la, b] such t h a t p ( X n [a.b ] ) > 0. Let Y = X n [a, b). Then
and the left-hand side is the union of infinitely Inany mutually dis.joint set,s Y + g , each of measure p ( Y + q ) = p ( Y ) > 0. Thus the left-hand sid(. of (2.16) has measure m, contrary to the fact t h a t p ( [ a ,b + 1))= 6 + 1 - U. T h e theorem ure just established demonstrates that some of the requirelne~lts on p have t o be relaxed. For the purposes of rnat.henlat,ical arlalysis it is nlost fruitful t o give up t h e condition that p is defined for all subsets of R, arlcrl recluir~ only t h a t the domain of p is closed under suitable set. operations. 2.17 Definition Let S be a nonempty set. A collection 6 P ( S ) is algebra of subsets of S if (a) @ E 6 and S E 6. (b) If X E 6 then S - X E 6. ( c ) If X , E 6 for all n, then U:=o X n E B and Xx, E 6.
ii G -
nr=q
2.18 Definition A a-additive measure on a a-algebra 6 of subsets of S is a function p : G -. 10, m ) U {m) such t h a t
ii) If { X n ) r = o is
(Z
collection of mutually disjoint sets from 6 , then
CHAPTER 8. THE AXIOM OF CHOICE The elements of 6 are called p-measurable sets. T h e reader can find some simple examples of a-algebras and a-additive measures in the exercises. In particular, P ( S ) is the largest U-algebra of subsets of S; we refer t o a measure defined on P ( S ) as a measure o n S. The theorem we proved in this example can now be reformulated as follows. 2.19 Corollary Let p be any a-additive measure on a U -algebra 6 of subsets
of R such that
(0) [a,bj E G and p ( [ a ,bj)
=
b - a , for all a , b E R, a
< b.
(iii) If A E G then A + a E G and p ( A + a ) = p(A), for all
U
E
R.
Then there exist sets o j real numbers which are nut p-measurable. In real analysis, one constructs a particular a-algebra 1137 of Lebesgue measurable sets, and a U-additive measure p on 1137, the Lebesgue measure, satisfying properties (0) and (iii) of the Corollary. So existence of Lebesgue nonmeasurable sets is a consequence of the Axiom of Choice. Robert Solovay showed that the Axiom of Choice is necessary t o prove this result. Other ways of weakening the properties 0)-iv) of p have been considered. and led to very interesting questions in set theory. For example. it is possiblo t o give up the requirement iii) (translation-invariance) and ask sinlply wllet,ht:r there exist any a-additive measures p on R such that p ( [ a ,b]) = b - a for all a , b E R, a < b. This question has deep connections with the theory of large cardinals, and we return t o it in Chapter 13. Another interesting possibility is t o give u p ii) (countable additivity) and require only iv) (finite additivity). Nontrivial finitely additive measures do exist (assuming the Axiom of Choice) and we study them further in Chapter 11. Let us now resume our discussion of various aspects of the Axiom of Choice. First of all, there are many fundamental and intuitively very acceptable results concerning countable sets and topological and measure-theoretic properties of the real line, whose proofs depend on the Axiom of Choice. We have seen two such results in Examples 2.1 and 2.2. It is hard to image how one could study even advanced calculus without being able to prove them, yet it is known that they cannot be proved in Zermelo-Fraenkel set theory. This surely constitutes some justification for the Axiom of Choice. However, closer investigation of the proofs in Examples 2.1 and 2.2 reveals that only a very limited form of tlio Axiom is needed; indeed, all of the results can still be proved if one assumes only the Axiom of Countable Choice. Axiom of Countable Choice
There exists a choice function for every countable
system of sets. It might well be that the Axiom of Countable Choice is intuitively justified, but the full Axiom of Choice is not. Such a feeling might be strengthened by realizing that the full Axiom of Choice has some counterintuitive corlsc?quent.tls.
2. THE USE OF THE AXIOM OF CHOICE IN MATHEMATICS
153
such as the existence of nonlinear additive functions from Example 2.7, or the existence of Lebesgue nonmeasurable sets from Example 2.13. Incidentally, none of these consequences follows from the Axiom of Countable Choice. In our opinion, it is applications such as Example 2.9 which mostly account for the universal acceptance of the Axiom of Choice. The Hahn-Banach Theorem, Tichonov's Theorem (A topological product of any system of compact topological spaces is compact.) and Maximal Ideal Theorem (Every ideal in a ring can be extended t o a maximal ideal.) are just a few examples of theorems of sweeping generality whose proofs require the Axiom of Choice in almost full strength; some of them are even equivalent t o it. Even though i t is true that we d o not need such general results for applications to objects of more immediate mathematical concern, such as real and complex numbers and functions, the irreplaceable role of the Axiom of Choice is t o simplify general topological and algebraic considerations which otherwise would be bogged down in irrelevant set-theoretic detail. For this pragmatic reason, we expect that the Axiom of Choice will always keep its place in set theory. Exercises
2.1 Without using the Axiom of Choice, prove t.hat the two defirlitions of closure points are equivalent if A is an open set. [Hint: X, is open, so X, n Q # 8,and Q can be well-ordered.] 2.2 Prove that every continuous additive function f is equal t o fa for some a E R. 2.3 Assume t h a t p has properties 0)-ii). Prove properties iv) and v). Also prove: vi) p(A U B) = p(A) 2.4*
2.5
2.6
2.7 2.8
+ p ( B ) - p(A n B).
vii) ~ ( U zAn) o 5 Cz=op(An). Let 6 = {X C_ S I 5 N o o r I S -XI 5 No). Prove that 6 is a a-algebra. Let C be any collection of subsets of S. Let G = C I and T is a a-algebra of subsets of S). Prove that G is a a-algebra (it is called the a-algebra generated by C). Fix a E S and define p on P ( S ) by: p ( A ) = 1 if a E A, p ( A ) = 0 if a 4 A. Show that p is a a-additive measure on S. For A 5 S let p(A) = 0 if A = 0, p(A) = m otherwise. Show that p is a a-additive measure on S. For A S let p(A) = IAl if A is finite, p(A) = m if A is infinite. p is a D -additive measure on S; it is called the counting measure on S.
1x1
n{l) c
c
This page intentionally left blank
Chapter 9
Arithmetic of Cardinal Numbers 1.
Infinite Sums and Products of Cardinal Numbers
In Chapter 5 we introduced arithmetic operations on cardinal numbers. It is reasonable t o generalize these operations and define sums and products of infinitely many cardinal numbers. For instance, it is natural t o expect that
1
+ 1+
=
No
NI, times
or, more generally, X times
was defined .Sthe cardinality of The sum of two cardinal numbers I C ~and A I U A2, where A1 and A2 are disjoint sets such that A1 1 = ~1 and I A2 1 = K Z . Thus we generalize the notion of sum as follows. 1.1 Definition Let ( A i 1 i E I) be a system of mutually disjoint sets. and Ict /All = K, for all i E I. We define the sum of ( K , I i E I) by
T h e definition of C l En,I uses particular sets A, (i E I ) .In the finite case. when I = {l,2) and n1 tcz = (Ar U Azl, we have shown t h a t the choice of A1 and A2 is irrelevant. We have proved that if A', , A; is another pair of disjoint sets such that [ A \ I = n l , IAil = n2, then JA; U Ail = / A 1 LJ A2J.
+
CHAPTER 9. ARITHME;TIC OF CARDINAL NUMBERS
156
In general, one needs the Axiom of Choice in order to prove the corresponding lemma for infiuite sums. Without the Axiom of Choice, we cannot exclude t,he following possibility: There may exist two systems ( A , ( n E N ) , (A; [ n E N ) of mutually disjoint sets such that each A, and each A; has two elements, but , :U A , is not equipotent to U%oA;! For this reason, and because many subsequent considerations depend heavily on the Axiom of Choice, we use the Axiom of Choice from now on without explicitly saying so each time.
1.2 Lemma If (A, I i E I) and ( A : I z E I) are systems of mutually disjo2n.t sets such that /All = IA:l for all i E I , then /UicfA,/ = Ail. Proof, Then f =
For each i E I, choose a one-to-one mapping fz of A , onto A : .
UXEI is a oneto-one mapping of Uie,A, onto UiEfA:. D This lemma makes the definition of C,,, legitimate. Since infinite unions fL
of sets satisfy the associative law (Exercise 3.10 in Chapter 2), it follows that the infinite sums of cardinals are also associative (see Exercise 1.l ) . The operation C has other reasonable properties, like: If K., X, for all i E I, then CisrK' < C,,I X, (Exercise 1.2).However, if K, < Xi for all i E I, it does not necessarily follow that C, K, < C, X, (Exercise 1.3). If the summands are all equal, then the following holds, as in the finite case: If K~ = K for a11 z E X, then
<
= K+&+... 1E X
= K .
X.
X times
(Verify! See Exercise 1.4.) It is not very difficult to evaluate infinite sums. For example, consider
It is easy to see that this sum is equal to No. In fact, this follows from a general theorem. 1.3 Theorem Let X be an infinite cardinal, let numbers, and let n = SUP{K, ) 0 < X). Then
C n,
=X
H
= X-sup{n,
K,
l
(0
n
< X) be nonzero cardinc~l
< X}.
-<X
On the one hand, K, 5 K for each a < X, and so C,,x K, < C,,, n = n . X. On the other hand, we notice that X = C,,* l n,. nQ is an upper bound o f t he nags and We also have n 5 C,,* n,: the sum K. is the least upper bound. Now since both K and X are 5 C,,,,, K,, it follows C,,, K,. The conclusion that n . X, which is the greater of the two. is also -of Theorem 1.3 is now a consequence of the Cantor-Bernstein Theorem.
Proof.
<
<
-
1. INFINITE SUMS AND PRODUCTS OF CARDINAL NUMBERS 1.4 Corollary If then
(i E I ) are cardinal numbers, and if 111
C
K,
157
< supin, 1 i E I ) ,
= sup n, % EI
if l
(In particular, the assumption is satisfied
.if all
the
K, 'S
are mutual19 distinct.)
The product of two cardinals n1 and n2 has been defined as the cardinality of the cartesian product AI X A2, where AI and Aa are arbitrary sets such that IAI J = n1 and ] A 2( = n2. This is generalized as follows. 1.5 Definition Let ( A i I i E I ) be a family of sets such that [Ail = K , for all i E I. We define the product of (tCi J i E I) by
We use the same symbol for the product of cardinals (the left-hand side) as for the cartesian product of the indexed family ( K , 1 i E I). It is always clear from has. the context which meaning the symbol
n
Again, the definition of n , , , n, does not actually depend on the particular sets A i . 1.6 Lemma If ( A i ] i E I ) and (AI i E I , then I f l i e r Ail = I n , E IAil.
Ii
E I ) are such that IA,I = IA:l for all
Proof. For each i E I , choose a one-to-one mapping f, of A, onto A:. Let f be the function on Ai defined as follows: If X = (xi I i E I ) E H,, I A,. let f ( X ) = (fz(x,) I i E I). Then f is a one-to-one mapping of A, onto A:.
niE1
ni,,
n,€,
The infinite products have many properties of finite products of natural numbers. For instance, if a t least one K, is 0 , then na = 0. The products also satisfy the associative law (Exercise 1.7); another simple property is that if n, < X, for all i E I, then ni 5 hi (Exercise 1.8). If all the factors K , are equal t o K , then we have, as in the finite case,
H,,,
nlEI n,,l
(Verify! See Exercise 1.10.) The following rules, involving exponentiation, also generalize from the finite to the infinite case (see Exercises 1.11 and 1.12):
158
CHAPTER 9. AR.iTHMETIC O F CARDINAL ,VUhJBERS
Infinite products are more difficult t o evaluate than infinite sums. In s o ~ n c ~ special c u r s , for instance when evaluating the product H,,, tiI, of iill iur:reilsirlg sequence (K.,* 1 CT < A) of c:ardinals, some simple rules <:anbc provcrl. \Ibre c , o l l s i r I t l r only the following very special cnsr:
,
n
11.
1 . 2 . 3 . , . . . n. . . .
(n E N ) .
First, we note that 00
;X]
Conversely, we have
:m
m
W
ancl so ure conclude that
We now prove an important theore~n.wllicll can be used t o derive various illequalities in cardinal arithmetic. 1.7 Kiinig's Theorem K , < A, f 0 7 . ( ~ 1 1i E I , then
If
K,
u n d X, (i f I ) uw curdznal n u m t ~ ~ rc~n.d s . rf
Pmof. First, let US show that K , 5 H,,, A,. Let ( A , I i E I) iulcl ( B , 1 i f I ) be such that [ A , ]= ti, HI^ IB, I = X, for all i E I iilld t,llv -4, 'S iiro mutually disjoint. Wt. may further assume that A, C B, for all i E I. \Ir(. fillcl ii one-to-one mapping f of UIElAI into H,,, B,. We choose d, E B , - Ai tor each i E I, and define a function j' ;is follon-S: For each r E U I E Ai, I let i, ire the unique i E I such t h a t .C E A , . L(.t f ( x ) = ( ( L ~ 1 i E I ) , where
-
If .r # yl let f ( L ) = U arld f (y) = b aud let us show that u # 6. If i:,. == 1 , = r . t,hen U, = X while b, = y. If i , # i, = i, then U, (1, gL A while 6, = y E A . 111 (!ither ~ : a s e!(X) , # f (g) and hence f is one-to-one. Now let us show that 1, n, < H, X,. Let B, ( i E I ) be such that. \ B ,I = h , . for all i E I. If the product X, were equal t o the sun1 C2 ti,. w r ct~ulcl find i~lutuallyclisjoiilt subsets X, of t h e cartesin11 product, B, such t l ~ i i t [X,[= ti, for all i and
n,
n,,,
1. INFINITESUMSANDPRODUCTSOFCARDINALNUMBERS
159
Figure 1 We show that this is impossible. For each i f I, let ( 1-8)
A, =
I
{U, a
E
X,)
(s~e Figurr 1).
For every i E I, we have Ai C B,, since IA,i 5 [X,/ = 6, < X, = IB, 1. Hrrict. there exists b, f B, such that b, 4 A i . Let b = ( h , ( i E I ) . Now W(, C i L l l c'iisily show that 1) is not a member of any X, (i E I): For any 1 € I , h, $! A , . atit1 so Bi, a contradiction. by (1 .g), b gl X,. Hence UiErX iis not the whole set We use Konig's Theorem in Section 3; at present, let us just mention that the theorem (and its proof) are generalizations of Cantor's Theorem which st.atvs that 2h > K for a11 K . If we express K as the infinite s u r n K =
l+l+.--
(K,
times)
(K
times).
and 2" as the infinite product 2" = 2 . 2 . 2 . . - .
we can apply Konig's Theorem (since l
< 2) and obtain
Exercises
1 .l If J , (i E I) are mutually disjoint sets and J = are cardinals, then
(assoczutzvity of C). 1.2 If K, X, for all i E I, then
CiE,ni 5 xi,,X,.
UIEI .I2. and
if
6 ,(
j E .I)
CHAPTER 9. ARITHMETIC OF CARDINAL NUMBERS
160
1.3 Find some cardinals K,, X, (n E N) such that C r = o nn = C= :o XXn (X times) = X . K. 1.4 Prove that K + n + 1 . 5 Prove the distributive law:
1.6 l U,,, A21 5 C.EI 1A'I. 1 . 7 If .l, ( i E I ) are mutually disjoint sets and J = are cardinals. then
K,
Ui,,
< X, for all
J , , and if
Kj
r ~ but ,
( jE J )
n)
( associativlty of . 1 . 8 If K, . IX, for all i E I , then
1.9 Find some cardinals K,, X, ( n E N ) such t h a t K, < An for all n, but K n = n r = o An. 1.10 Prove that K . . K.. . - - (X times) = l .l1 Prove the formula K,)= ~ n,,,(n:). {Hint: Generalize the proof = (rcX)p,given in Theorem 1.7 of Chapt,er 5.1 of the special case 1.12 Prove the formula n ( . * - ) = ,&.E l 4 .
nzo
(ntE1 l€
I
[Hint: Generalize the proof of the special case K" 61' = l i A + p given i r i Theorem 1.7(a) of Chapter 5.1 1.13 Prove that if 1 < n, 5 X, for all i E I, then E,,,n, 5 X,. [Answer: 2 H 1.l 1 . 1 4 Evaluate the cardinality of U. 1.15 Justify existence of the function f in the proof of L e m n ~ a1.2 in detail by the axioms of set theory.
no,,a<wl
2.
n,,,
Regular and Singular Cardinals
Let (a,1 v < 6) be a transfinite sequence of ordinaI numbers of length g . Wt. say that the sequence is increasing if a , < cr, whenever v < p < 29. If O' is ii limit ordinal number and if ( U , ( v < 6) is an increasing sequence of ordinals. we define U = lim a , = sup{a, I v < 6) v49
and call CY the limit of the increasing sequence.
2.1 Definition An irkfinite cardinal n is called szngular if there exists an illcreasing transfinite sequence (cl, I U < 6')of ordinals U , < K whose length 6 is
2. REGULAR AND SINGULAR CARDINALS a limit ordinal less than K , and singular is called regular.
A subset X
K
a,. An infinite cardinal that is not
= limv,s
c K is bounded if sup X <
16 1
K,
and unbounded if sup X
= K.
2.2 Theorem Let K be a regular cardinal. (a) If X C K is such that 1x1 < K , then X is bounded. Hence every unbounded subset of K has cardinality K . (b) IjA < K and f : X 4 n, then f h] is bounded. Proof. (a) This is clear if X has a greatest element. Thus assume that the order type of X is a limit ordinal, and let ( a , I v < 19) be an increasing enumeration < K , we have I9 < K , and because K is a regular cardinal. of X . As 161 = it follows that sup X = lim,,e a, < K. (b) As I f [All 5 X < K , this follows from (a).
1x1
An example of a singular cardinal is the cardinal N,; we have
N,
=
lim N,,
n-U
where w < N, and N, < N, for each n. Similarly, the cardinals NW+,, H,.,, NW, are singular: = lim NW+,,
NW, = lim
Na.
a - W1
On the other hand, No is a regular cardinal. T h e following lemma gives a different characterization of singular cardinals.
2.3 Lemma An infinite cardinal K is singular i f and only i f it is the sum of less than a smaller cardinals: n = K ~ where , 1 1 1 < K and n, < K for al! i € I.
xiEr
Proof. If K is singular, then there exists an increasing transfinite sequence such t h a t K = lim,,s a,, where t9 < K, and a, < K for all v < 6. Since every ordinal is the set of all smaller ordinals, we can reformulate this as follows:
If we let A, = a, - U<, a t , then (A, I v < 6) is a sequence of fewer than 5 10, 1 < K , and since the K sets of cardinality K, = lAvl = - U(<,
CHAPTER 9. ARITHMETIC OF CARDINAL NUMBERS
162
AV's are mutually disjoint, this shows that n = CvCB KY as required. Thus thtl condition in Lemma 2 . 3 is necessary. To show that the condition is sufficient. let us assume t h a t h: = C,,,, K,,. where X is a cardinal less than K , and for all a < X, K , are cardinals srnallt?r than K . By Theorem 1 . 3 , K = A . sup,,A K,, and since X < K . we necessaril). have h: = supaCAK a . Thus the range of the transfinite sequence ( K , , I cr < X) has supremunl K , and since K, < h: for all a < X, we can find (by tr:~nsfinitf. recursion) R subsequeilce which is i11c:reasingand has limit K . Clearly. thr, length of the subsequence is a limit ordi~lal,B < X , (2nd it follows thilt h: is sitlguliir. (El
A n infinite cardinal N, is c:nllecl a successor. curdinal if its i~lclexc.r is il successor ordinal, i.e., if N, = N3+] for some B. If K = Ns then we call N,.,,] the successor of K and denote it K * . If ul is a limit ordinal, then H, is c a l l ~ da limit cardinal. If a > 0 is a limit ordinal, then N, is the limit of the sequei1t.c. ( N o 1 0 < cl). 2.4 Theorem Every successor car.dina1 N a t l
Proof. cardinals:
as a regular cardinal.
Otherwise, H,,+] would be the sum of a smaller number of smaller
where III < N,+l, alld all i E I, and we have
h:,
< Nu+, for all i
This is a contradiction, and hence
E I. Then 111
< N,,
and
h-,
< N,,
ti)r
is regular.
By Theorem 2.4, every singular cardinal is a limit cardinal. Let us look 11ow at limit cardinals.
2.5 Lemma There urn urbztrur.il?/ lurge szngulur. curdznuls. Proof.
Let N, be an arbitrary (:arttinal. Consider the sequence
Then and hence N,+,
is a singular cardinal greater than N,.
All uncountable limit cardinals we have seen so far were singular. A question naturally arises whether there are any uncountable regular limit cardintils. Suppose that N,, is such a cardinal. Since cr is a limit ordinal, we have
N,, = lirrl No, 3+a
2. REGULAR AND SINGULAR CARDINALS
163
i.e., N, is the limit of an increasing sequence of length cr. Since N, is regular, we necessarily have (Y 2 N,, which together with n < N, gives
Already this property suggests that t4, has t o be very large. Although the condition (*) seems t o be strong, it is not a s strong as it looks. 2.6 Lemma There are arbitrarily large singular curdinals N, such thut N, = n .
Proof. sequence:
Let N, be an arbitrary cardinal. Let us consider the following
012
'W,, = WUWY
for all n f N ,and let a = lim,,, Q,. N) has limit N,. But then we have
It is clear that the sequence (N,,,, ]
71
E
N, = lim No,, = lim an+,= a. n-+w
n-+w
Since N, is the limit of a sequence of smaller cardinals of length
W,
it is singular.
An uncountable cardinal number N, that is both a limit cardinal and regular is called inaccessible (it is often called weakly inaccessible to distinguisll this kind from cardinals t h a t are defined by a stronger property - see Section 3). It is impossible t o prove that inaccessible cardinaIs exist using only the axioms of Zermelo-Fraenkel set theory with Choice. 2.7 Definition If a is a limit ordinal, then the cofinality of Q , cf ( a ) ,is the least ordinal number 6 such that a is the limit of an increasing sequence of ordinals
of length 6.
<
(Note that c f ( a ) is a limit ordinal and c f ( c ~ ) a . ) Thus N, is singular if cf(w,) < w, arld is regular if cf(w,) = W,. Let a be a limit ordinal which is not a cardinal number. If we let K. = 101. there exists a one-to-one mapping of K onto a , or, in other words. a one-to-one sequence (a, I u < K ) of length K such that {o, [ u < K ) = a. NOW WP call find (by transfinite recursion) a subsequence which is increasing and has limit a . Since the length of the subsequence is a t most K , and since K = [ N I is less than cr (because a is not a cardinal), we conclude that c f ( c ~< ) a. Thus we have proved the following:
CHAPTER 9. ARITHMETIC OF CARDINAL NUMBERS
164
2.8 Lemma If a limit ordhal a is not a cardinal, then cf(a) < a .
D
As a corollary, we have, for all limit ordinals a, cf ( a )= a if and only if a is a regular cardinal. 2.9 Lemma For every limit ordinal a , cf(cf(a)) = cf(a).
Proof. Let 29 = cf(a). Clearly, 29 is a limit ordinal, and cf(29) 5 B. We have to show that cf(6) is not smaller than 6. If y = cf(29) < 6, then there exists an increasing sequence of ordinals (P( I < y ) such that lime,, ut = . I ? . Sirice B = cf(a), there exists an increasing sequence of ordinals (a, I u < 0 ) such that limUdda, = a. Then the sequence (a,, I < y) has length 7 arid lim<,, a,, = a. But y < B, and we reached a contradiction, since B is supposetl to be the least length of a n increasing sequence with limit a . 0
<
2.10 CoroIIary For eve y limit ordinal a, cf(a) is a regular cardinal.
0
Exercises
2.1 cf(N,) = cf(H,+,) = W . 2.2 cf(N,,) = w l , cf(N,,) = W * . 2.3 Let a be the cardinal number defined in the proof of Lemma 2.6. Show that cf(a) = W . 2.4 Show that cf(a) is the least y such that a is the union of 7 sets of cardinality less than 101. 2.5 Let H, be a limit cardinal, a > 0. Show that there is an increasing sequence of alephs of length cf(N,) with limit H,. 2.6 Let K be a limit cardinal, and let X < K be a regular infinite cardinal. ) cardinals Show that there is an increasing sequence (a, [ u < c f ( ~ )of such that lim,,,f(K) a, = K and cf (au) = X for all U.
3.
Exponentiation of Cardinals
While addition and multiplication of cardinals are simple (due to the fact that H, + No = H, No = the greater of the two), the evaluation of cardinal expanentiation is rather complicated. Here, we do not give a complete set of rules (in fact, in a sense, the general problem of evaluation of K' is still open), but, prove only the basic properties of the operation K'. It turns out that there is a difference between regular and singular cardinals. First, we investigate the operation 2"e.. By Cantor's Theorem, 2"- > N,; ill other words, m
Let us recall that Cantor's Continuum Hypothesis is the conjecture that 2N" = N 1 . A generalization of this conjecture is the Generalized Continuum Hypothesis : 2 ~= " N , for all a. As we show, the Generalized Continuum Hypothesis greatly simplifies the cardinal exponentiation; in fact, the operation K' can then be evaluated by very simple rules. The Generalized Continuum Hypothesis can be neither proved nor refuted from the axioms of set theory. (See the discussion of this subject in Chapter 15.) Without assuming the Generalized Continuum Hypothesis, there is not much one can prove about 2N- except (3.1) and the trivial property: (3.2)
2N,v < 2'li
whenever a 5
P.
The following fact is a consequence of Kijnig's Theorem.
3.3 L e m m a For every a, ~ f ( 2 ~ d>. )El,. Thus 2Nncannot be N,, since ~ f ( 2 ~ = w )No, but the lemma does not prevent 2N11 from being N,, . Similarly, 2" cannot be either N,, or N, or etc.
Proof. Let I9 = ~ f ( 2 ~ "I9) ;is acardinal. T h u ~ 2 ~is 1the ~ limit o f a n increasing sequence of length 6, and it follows (see the proof of Lemma 2.3 for details) that = K" ,
r
where each K , is a cardinal smaller than 2N1p. By Konig's Theorem (where we let X, = 2'- for all u < IY), we have
and hence 2Naj < (2'1.)'.
a contradiction.
Now if
29
were less than or equal to H,, we would get
0
The inequalities (3.1), (3.2), and (3.4) are the only properties that can be proved for the operation 2N- if the cardinal N, is regular. If N, is singular, then various additional rules restraining the behavior of 2N1.are known. We prove one such theorem here (Theorem 3.5); in Chapter 11 we prove Silver's Theorem (Theorem 4.1): If H, is a singular cardinal of cofinality cf(N,) 2 N1,and if 2Nt = for all < a, then 2Na = Nocl
CHAPTER 9. ARITHMETIC OF CARDINAL NUMBERS
166
3.5 Theorem Let fi, be a singular cardinal. Let us assume that the 71al7ir o] 2 ' ~ is the same f o ~all 5 < C), say 2N' = No. Then 2N17 = N N .
For
Note that it is implicit in the theorem t h a t No is greater tlian N,. instance, if we know t h a t 2N11= Nu+:, for a11 n < w , the11 2'u =
Since N, is singular, there exists, by Lenlnia 2.3, a colle(:tion (ti, 1 Pr.ooJ i E I } of cardinals such that ti, < N, for all i E I, arid (I1 = H, is n cardilia1 less than H,, and N,, = C,,I K , . By the assumption, we have 2"* = N , j for i E I, irnd iilso 2H7 = NLj. SO
U
1% riow approach the p r o b l ~ r ~o fi evalui~tir~g N,,". whored N,, aiitl N., i~1.t. arbitrary infinite cardinals. First. we make the following obsckrvat,iori.
Proof.
Clearly,
2Hli
5 NE". Since N, 5
W l i e ~ trying i t o evaluate
2 H 1 1 ,
we also have
NE" for a > 8. we find the folluwi~lguseful.
>
3.7 Lemma Let a B and let S be the set of all subsrlts X / X i = N,j. Then IS/ = N?'.
C
SIL(:II. tlrt~t
<
We first show that H": IS]. Let S' be the set of' all subsrts X c i ~ Xo U, such t h a t [ X1 = No. Since No . N, = N,! WC have ISf(= 15'1. Sow (.very furlction f : wg + w, is a rriernber of the set S' and hctncr: W,"" S'. r ~ ~ ~ t ? r e fN:o;' r e ,5 IS/. Coliversely. if X E S, the11 thrzrc exists a functiorl f on wa such that X is tlitb rilnge of f . \Ve pick ont? f for ear:h X E S arid let f = F ( X ) . Clearly. if X # 1' i nf =F c =F . V . I 1' C 1 = I . L SO !I S , Thus F is a orlc5-to-one nlapping of' S into U:". atld t,lirtrefot.rl l.Yi (. N,, .
Pruof.
,
1%'~: are riow in a positio~it o (:\rtilu;rtr N": for regular c.arrlill;ils N,,. u l l d ( ~t itssurriptiorl of the Generalized Coiitfinuum Hypothesis.
1
~
3.8 Theorem Let us assume thc Generalized Continuum H?lpotlle,sis. If N,, 1,s
a r~gularca~mdir~al, then
1
3. EXPONENTIATION OF CARDINALS
a>
l67
Proof. If a , then N : ~ = 2'0 = Nofl by Lemma 3.6. So let i j < o and let S = {X C w, I = No}. By Lemma 3.7. IS[ = N:". By Theorrnl 2.2(a), every X E S is a bounded subset of LJ,. Thus. let. B = UbCY.,, P(6) h t ~ the collection of all bounded subsets of W,. W(>will show t h a t IB1 5 N,,; as S C B, it then follows that NE1' = N,. Since L( = U6
1x1
However, for every cardinal N, < N,, we have 2'7 = N,+I for every 6 < U,, and we get
< N,
and so 2161 < N,
WC?provtb a similar (but a little more conlp1ic:itecl) fornlula for sirlgular N,, . but first we need a generalization of Lemma 3.3. 3.9 Lemma For e v e m cardinal
K
> 1 and evew ( 1 . c : f ( ~)~>- N,,
P r ~ o f . Exact,ly like the proof of Lemma 3.3, exrc?pt that 2N1vis rep1ac:cd by tcH,.. U 3.10 Theorem Let us assume the Generalized Continuum Hppothesis. If N, 2.9
a singular cardinal, then
Proof. If p > a,then N," = 2'li = N4+1 . If No < cf(N,), then every = H, subset X G U,,such that ]XI = No is s bounded subset, and we get by exact.1~the same argument as in the case of regular N, . Thus let us assume t h a t cf(N,) 5 No 5 N,. On the o i ~ ehand, we have N
NE"
O n the other hand, cf (H:") > Hg by Lemma 3.9. and since No _> cf(N,). wr have cf'(N;") # cf (N,), and therefore # N,. Thus ~lrccssarilyH:.' = N,,+ I .
NE"
0
If we do not assume the Generalized Contirluum Hypothesis. tlw sit.uatior1 becolnes much Inore complicated. We only prove the foilowing theorem.
CHAPTER 9. ARITHMETIC OF CARDINAL NUMBERS
l68
3.11 Hausdorff ' S Formula
For every cu and e u e q fl,
N
Proof, I f ~ > ( 1 + 1 , t h e n ~ $ ~ , = 2 ~ f i , N ~ " = 2 ~ ~ j , a< nN dd N < 2, +F ~~ ; N .l hence the formula holds. Thus let us assume that 5 a. Since < N,,+, and Na+ 5 H aNlf + , , it suffices t o show that < N,"H . N a + l . Each function f : W O -+ w,+l is bounded; i.e., there is y < w,+l such that f ( S ) < for all < wa (this is because W,+ l is regular and < W ,, + l ). He~icc.
NZ"
N:;,
-,
Now every y < 1 U T < W , . + I yW" I
has cardinality 171 5 N o , and we have (by Exercise 1 . G } C,<,.+, J y l u o . Thus
wa+l
N
This theorem enables us to evaluate some simple cases of N," (see Exercise 3.5). An infinite cardinal N , is a strong limit cardinal if 2'1j < N , for all 0 < 0 . Clearly, a strong limit cardinal is a limit cardinal, since if N , = N,+l, then 2'7 2 N,. Not every limit cardinal is necessarily a strong limit cardinal: If 2Nu is greater than N u , then H , is a counterexample. However, if we assume the Generalized Continuum Hypothesis, then every limit cardinal is a strong limit cardinal. 3.12 Theorem If N , is a strong limit cardinal and if cardinals such that K < N , a71d X < N,, then K' < N,.
(K - X)K.'
= p.A< N,.
K
and X are infinztr
C!
An uncountable cardinal number K is strongly inaccessible if it is regular and a strong limit cardinal. (Thus every strongly inaccessible cardinal is weakly inaccessible, and, if we assume the Generalized Continuum Hypothesis, every weakly inaccessible cardinal is strongly inaccessible.) T h e reason why such cardinal numbers are called inaccessible is t h a t they cannot be obtained hy thv usual set-theoretic operations from smaller cardinals:
3. EXPONENTIATION OF CARDINALS
3.13 Theorem Let K be a strongly inaccessible cardinal. ( a ) If X has cardinalzty < K , then P ( X ) has cardinality < K . (b) If each X E S has cardinalzty < K and IS1 < K , then U S has cardinalzty
< K.
(c)
If 1x1 < K
and f : X
--+ K ,
then sup f [ X ]< K.
Pro0f. (a) K is a strong limit cardinal. (b) Let X = ] S )and p = sup(lX1 ) X E S). Then (by Theorem 2 . 2 ( ~ )p) < K because K is regular, and I U S( 5 X - p < K. (c) By Theorem 2.2(b). D Exercises
>
3.1 If 2Nff N,, then N": = 2 ' ~ . 3.2 Verify this generalization of Exercise 3.1: If there is y < a such that N:" 2 N,, say N": = Na, then N,' H = Ns. 3.3 Let a be a limit ordinal and let No < cf(N,). Show that if N": 5 N, for all < a , then = N,. [Hint: If X W, is such that = No. then X C W E for some < a.] 3.4 If N, is strongly inaccessible and 4 < a , then = N,. [Hint:Use Exercise 3.3.1 3.5 If n < W , then N": = N, -2N13.[Hint: Apply Hausdorff's formula t~ times.] 3.6 Prove that Nn = [Hint: Let Ai (i < W) be mutually disjoint infinite subsets of W. Then
<
c
NE"
1x1
NE"
nn,,
NE[).
The other direction is easy.] 3.7 Prove that
N;' [Hint: N': 2N1)= (H,,,
=
(C,,, H,) .
< H?
=
N,
Nlr
. 2N1
n
~ n ) ~ (' n n < , ~ n ) ~ ' )N. = . .l
p
n<w
N:'
=
R,<w(N..
This page intentionally left blank
Chapter 10
Sets of Real Numbers 1.
Integers and Rational Numbers
In Chapter 3 we defined natural numbers and their ordering and indicat.ed how arithmetic operations on natural numbers can be defined. T h e next logical step in the development of foundations for mathematics is to define integers arid rational numbers. The guiding idea in both cases is t,o make an arithmet,ic, operation that is only partially defined on natural nuinbers (subtraotion ill the case of integers, division in the case of rationals) into a total operation, rtritl belongs more properly in the realm of algebra than set theory. We thus limit ourselves to outlining the main ideas, and leave out nlnlost all proofs. Those can be found in most textbooks on abstract algebra. Better still, the reader may work out some or all of them as exercises. In Exercise 4 . 3 of Chapter 3 we defined subtractio~zfor those pairs (71. r n ) of natural numbers where n 2 m. In this case, n - m is the unique natural number k for which 7.1 = m k . If n < m, no such natural number k exists. and 7 1 - t n is undefined. If n - m is to be defined, it has to b r ~a 'bnew" ohjcct; for thcb timtx being, we represent it simply by the ordered pair ( t 1 , n l ) . However, intuitivrly Familiar properties of integer arithmetic suggest that different ordered pairs n1a-y have t o represent the same integer: for example. ( 2 . 5 ) iiritl (6.9) both rcpl.c>si~nt - 3 12 - 5 = G - 9 = -31. In general, ( n l , m l ) and (n2,7n2)represent the same integer if and only if nl - m1 = n2 - m2. At this point, this makes sense only intuitively, but it can be rewritten in the form
+
which involves only addition of natural numbers (which h,= bee11 previously defined). These remarks motivate the following definitions and results. Let Z' = N X N. Define a relation on 2' by ( U , b) z (c, d} if ancl only if a + d = b + c. T h e relation FS is an equivalence relation ori 2'. (This has tro be checked, of course.) Let Z = Z'/ == be the set of all equivalence classes of 2' rriodulo =. PVC call Z the set of all integers; its elenltlnts are intqqor:~.
172
CHAPTER 10. SETS OF REAL NUMBERS
One immediate consequence of the definition is of set- t heoretic: interest . 1.1 Theorem The set of all integers Z is countable.
Theorem 3.13 in Chapter 4. Theorem 3.12 in Chapter 4 gives Proof. another proof. Next, define a relation < on Z as follows: [(a, b ] < [(c, d)] if and only if a + d < b + c. [Recall that intuitively (a, b) represents a - b and ( c , d) represents c - d, so a - b < c - d should mean a + d < b + C.] One can prove that < is well defined (i.e., truth or falsity of [(a,b)] < [(c,cl) l does not depend on the choice of representatives (a, b) and (c, d) but only on their respective equivalence classes) and that it is a linear ordering. Finally, we observe that for each integer [(a,b)] either a 2 b, in which (:me (a, b) == ( U - b, 0) (here - stands for subtraction of natural numbers, which is defined in this case), or a < b, in which case ( U , b) = (0, b - a ) . It follows that each integer contains a unique pair of the form (n. O), n E N or (0.71). n E N - (0). So [(n,O)] are the positive integers and [(O, n)] are the negative ones. The mapping F : N -+ Z defined by F ( n ) = [(71:0)] is one-to-one and order-preserving [i.e., 7n < n implies that F ( n t ) < F ( n ) ] . We identify each integer of the form [(n,0)] with the corresponding natural number n, and denotcl each integer of the form [(0, n)] by -71. So, e.g., - 3 = [(O,3)] = [(2.S)] = [(F. g)]. as expected. The rest of the theory is now straightforward. One can prove that (2.<) has no endpoirlts and that {X E Z I a < X < b), a , b E 2. U < b. llas a finit.? number of elements. Also, every nonempty set of integers bounded from above* has a greatest element, and every nonempty set of integers bounded from below has a least element. One can define addition and multiplication of integers by [(a, b)J + [(c, 41 = [(a + c, b + 41, [(a?b)] . [(c, d)] = [(ac+ bd, ad + bc)] and prove that these operations satisfy the usual laws of algebra (commutativity, associativity, and distributivity of multiplication over addition), and that for those integers that are natural numbers, addition and multiplication of integers agree with addition and multiplication of natural numbers. We define subtraction by
where -[(c, d)J = [(d,c)] is the opposite of [(c,d)]. Notice that - [(n.O)] = [(O, 71)] = -n and -[(O, n ) ]= [(71,0)] = n, in agreement with our previous notation. Absolute value of an integer (1, [a1, is defined by
1.
INTEGERS AND RATIONAL NUMBERS
173
Additional properties of these concepts can be proved as needed. Addition, subtraction, and multiplication are defined for all pairs of integers. There remains the inconvenience that division cannot always be performed. We say that an integer a is divisible by an integer b if there is a unique integer X such that a = b - X ; this unique X is then called the quotient of a and b. We would like to extend the system of integers so that any a is divisible by any b and all useful arithmetic laws remain vdid in the extended system. Sow. if 0 . X = 0 is to be valid in the extended system for all X , we see immediately that no number a can be divisible by 0; the equation a = 0 . X has either none or many solutions, depending on whether a # 0 or (L = 0. The best we can hope for is an extension in which for all a and all b # 0 there is a unique X such that a = b.x. Let Q' = Z X [Z - {O}) = {(a,b) E Z2 I b # 0). We call Q' the set of fractions over Z and write a/b in place of ( a ,b) for (a, b) E Q'. We define an equivalence x on the set Q' by c a - % if and only if a . d = b - c. b d Let Q = Q'/ E be the set of equivalence classes of Q modulo NU, Elements of Q are caHed rational numbers; the rational number represented by u/b is denoted Wbj. [Later, the brackets are habitually dropped and we do not distinguish between a rational number and the (many) fractions that represent it,.] There is an obvious one-to-one mapping i of the set Z of integers into the rationals: [Later, we identify integers with the corresponding rationals.) We now define addition and multiplication of rationals:
a -c
[:l [fl l,[ =
m
In order to lay the foundations satisfactorily, one should prove the following: (a) Addition and multiplication of the rationals are well defined (i.e.. indeperident of the choice of representatives). ( b ) For integers, the new definitions agree with the old ones: i.e., i(a + b ) = i ( a )+ i(b) and i[a . b) = i ( a ) . i(b) for all a , b E Z. (c) Addition and multiplication of rationals satisfy the usual laws of algebra. (d) If A E Q, B E Q, and B # [0/1], then the equation A = B . X has a unique solution .Y E Q. Thus division of rational numbers is defined, as long as the divisor is not zero; we denote this operation t: X = A + B. Finally, we extend the ordering of integers to the rationals. First, notice that each rational can be represented by a fraction a/b where the denominator b is greater than 0:
[i]
=
I<]
and either
b
> 0 or
-
b > 0.
CHAPTER 10. SETS O F REAL NUMBERS
174
We now define the natural ordering of rationals: If b > 0 a n d d > 0, let
[X] < l[:
if and only if
U
.d
< b . c.
tVe again leave t o the reader tllc proof t h a t the defirlitiori dots not (lepc~licl on the choice of representatives r u long as h > 0 arid d > 0. t h a t < is rc>aIIy;i linear ordering, t h a t for a, b E Z, cr < b if and only if [ a / l ] < [b/l].arltl t.h;~tt usual algebraic laws (such as: if U < h then a c < b c, etc.) hold. LVP hnvc. again
+
+
1.2 Theorem T h e set of ratior~alsQ is countable.
P7.oof. Tlleorern 3.13 in Cllapter 4. Theorem 3.12 in Cliapter 4 gi\.c>s another proof. i . T l l t next result llas Ixen referrecl t o in Cllapter 4 . 1 . 3 Theorem ( Q , <) i s U dense li~lea7.lyorde7-ed set and has fact, for' every 7 - f Q there exists 7 1 E N such that r < n.
710
e7~dpoi7~ts. Irr
Prnoj. Q is infinite; sirice u / b - 1 < a / b < u / b + 1. Q has t i o er1dpc1itits. If r 5 0 we (:an take n = 1. If 7' > 0, we write r = u / b where U > o. I) > 0. { L . b E N . ;tnd take 71 = U + 1. I t renlains t o show t h a t (Q. < ) is dense. Let 7*, .C be rtitionals suc:li t h a t I . < s: ilssurne t1l;it .I+ = u / b ancl .S = c./tl. where h > 0 :~n(ld > 0. Now wt. Itht
\$'c coliclude this sectiori with il few remarks or1 decinlal (or, in gener;ll, 1);lscl p ) expansions of' ratiorials. Every intuger p > 1 car1 serve ;is ;ibase of ;L nuiilbol. system. T h c one generally used has p = 10. Allother useful (:;BC? is ZJ = 2. 1.4 Lemma Given a 7.ational I L . u T ~rL, ~there ~ ~ ' is a unique integer c r < e 1. W e call e the integer part of r , e =
<
+
IT].
P T D O ~ . Let
CL/^ 5 CL
7'
=
U/b.
1) > O .
ASSUIII~? that
(L
> O . 1 < b.
<
SO ( I
fi
s,uc:fr tfrrrt
< (1 . I)
illl~I
tlliit S = { X E Z I X 7') Z 1 1 : ~ill1 U I ) E ) { ' ~ . t~ouird(1 in 2. If U < 0, then O is ar1 upper bound on S. 'Therefor(>.S !I;LS iI gr.catest elerr~enttl. Tlitln e 5 I . < c. + l , i t r ~ r lclearly r is tlic>u~iiclueil1tclfic:r ivitll this property. LJ I.
=
E
Z. I t
I ~ O Wfollows
7
The expansions of integers in base p are sufficiently well krlown. By Lrri111iiL 1.4, r- = [[rl-t- q where Ir] is ;ul iiitvger and q f Q, 0 . I < 1. c.olic*erit,r:itc. c ~ r lt h e expansion of g. Curistruct i i sequence of' digits 0. 1, . . . . p - l by recursion. ns follo\vs:
2. REAL NUMBERS
175
+
<
Find a ] E ( 0 , . . . , p - 1) such that a l / p q < ( a l l ) / p (let a1 = [ q . p ] ) . Then find a2 E ( 0 , . . . , p - I ) such t h a t al/p+a2/p2 < q < a l / p + (a2 l ) l p 2 (let a2 = [(q - a1lp) p 2 ] ) . In general, find ak E ( 0 , . . . , p - 1) such t h a t
+
We call the sequence (a, I i E N ) the expansion u f q in base p. When p = 10, it is customary t o write q = 0 . a l a 2 a s . . . One can show (a) There is no i such t h a t aj = p - I for all j i . (b) There exist n E N and I > 0 such t h a t a,+! = a, for all n >_ 720 (the expansion is euentually periodic, with period l ) . Moreover, if q = u / h , then we can find a period 1 such t h a t 1 5 [ b ( .Coiiversely. each sequence (ai ] i E N ) with the properties (a) and ( b ) is a11 expansion of some rational number q (0 5 q < 1).
>
Exercises
1.1 Prove some of the claims made in the text of this section. 1.2 If r E Q, r > 0, then there exists n E N - (0) such t h a t l / n < r. [Hint: Use the fact mentioned in Theorem 1.3.)
2.
Real Numbers
T h e set R of all real numbers and its natural linear ordering < have ber-trl defined in Chapter 4, Section 5, as the completion of the rationals. In particular. every nonempty set of real numbers bounded from above has a suprernum, and every nonernpty set of real numbers bounded from below has an infimum. H t ~ r e we consider the algebraic operations on real numbers. We first prove a useful lemma. 2.1 Lemma For every X E R and n E N - ( 0 ) there exist r, S E Q such that r < X _< s and s - r l/n.
<
Proof. Fix some TQ,so E Q such that T O < x < so, and some k E N - (0) such t h a t k > n.(so - r o ) . Consider the increasing finite sequence of ratioliiil numbers (7.1)t=owhere r , = TO i / n . Let j be the greatest i for i\~tiichr, < .lh: notice t h a t j < k. Now we have rj < X rj+l and rj+l - rj = 1 / 7 1 . It s ~ f f i ~ t ~ s t o take r = r j , s -- r j + l .
+
<
We now define the operation of addition on real numbers.
CHAPTER 20. SETS O F REAL NUMBERS
176
2 . 2 D e f i n i t i o n L e t x , y ~R. W e l e t x + y = i n f { r + s ( r , s ~ Q, r S r . y < + S ) . (The symbol on t h e right-hand side refers to the addition of rational numbers.)
+
We note that the infimum exists, because the nonempty set in question is bounded below (by p + q , for any p, q E Q, p < X , q < y). It is also clear that if both X and y are rational numbers, then I: y, under the new definition. is equal t o X + y, where is the previously defined addition of rationals.
+
+
2.3 L e m m a Let X , g, z E R.
(zv) There exists a unique the opposite of X . (U)
If X < y then
X
W
E R such that
X
+ W = 0.
We denote
W
= -X,
+ z < y + z.
Proof. We use some simple properties of suprema and infima (see Exercise 2.1). (i), (ii), and (iii) follow immediately from the corresponding properties o f rational numbers. (iv) We recall that X = inf(s E Q I X < s) = sup{r E Q 1 r < X ) . This suggests letting W = inf{ -r 1 E Q, r < X). We have X W = inf {S- 1. I r , s E Q, X S , r < X). A S r < X , r < S imply S - 7. > 0, it follows X + W 2 0. Assume X + W > 0; by density of Q in R and Exercise 1.2, thew exists n E N - (0) such that l / n < X + W . But, Lemma 2.1 guarantees existence of some r, S E Q , r < X S, such that s - r < l / n . Therefore X + W 5 l / n , a contradiction. This shows the existence of a W with thtl property X + W = 0. If also I: U = 0, we have (using (i), (ii). arid (iii)) w = w + O = O + w = (v+x)+w=u+(x+w) =u+O=u,sou)is uniquely determined.
+
<
<
+
(v) If X < y then X + z 5 y + z is immediate from the definition of additiori (and Exercise 2.1). If X + z = y + s, we have X = x + O = x + ( z + (-2)) = ( X + Z ) + ( - z ) = ( y + z ) + ( - z ) = y + ( z + ( - z ) ) = y+O = y , a contradiction.
G The operation of multiplication on positive real numbers can be defined in a n analogous way. We let R+ = {X E R 1 X > 0). 2.4 D e f i n i t i o n L e t x , y E R'. W e 1 e t x . y = inE{r.s I r , s E Q , r 5
13.
y
<
.S}.
2, REAL NUMBERS
2.5 Lemma Let X ,y, z E R+.
(X') There exists a unique the reciprocal of X . (xi 1 ) I f x
W E
< y then X - z < y
R+ such that s . w = 1. We denote
l/r,
z.
Pmof.
W
W =
Entirely analogous t o the proof of Lernma 2.3. For = i n f { l / r j r E Q , 0 < r < X).
(X')
we let
It is now a straightforward matter to extend multiplication t o all real nunlbers. We first define the absolute value of X f R:
and notice that for -X).
X
2.6 Definition For
# 0, 1x1 E R+ (if X < 0 then
X ,y E
0 =X
+ (-X)
0 + (-X)
=
R we let
We leave t o the reader the straightforward but tedious exercise of showing that this definition agrees with the definition of multiplication of rationals from Section 1, and proving the following lemma. 2.7 Lemma Let
( X ) For each X.Ul=
X,
X E
l.
y, z
E R.
R, i f
X
# 0, then there exists a unique
W
E
R such th)at
CHAPTER 10. SETS OF REAL NUMBERS
178
As before, we denote the unique U?in ( X ) by 1 /X. We also defi~iediviszo71 Inr a norlzero real number X : y + 3: = y . (l/a). A structure U = ( A ,<, ., 0 , l ) where < is a linear ordering. ancl . arc3 binary operations, and 0, I are constants, such t h a t all the properties (i)-(\-) and (vi)-(xi) are satisfied is called an ordered field in algebra. T h e conterits ot Lemmas 2.3 and 2.5 can thus be summarized by saying t h a t the real numt>ers (with the usual ordering and arithmetic operations as defined above.) ;Ire i i l l ordered field. As the ordering of the real rlurrlbers is c:onlplete, they ;tro i l cmplete ordered field.
+,
+
2.8 Theorem The structure 93 = ( R ,<, +, ., 0 , l ) zs a complete clridered field.
It is possible t o prove t h a t the c:ornplete ordered field is unique, i.c*.. if' U = ( A , <, +, ., 0 , l ) is also a complete ordered field, then U a n d 93 are isomory,hic:. As t h e proof requires a fairly heavy dose of algebra, we do not present it her(. (but see Exercise 2.5). We conclude this section by mentioning a well-known fact about real numbers t h a t we use in Section 6 of Chapter 4. We leave the proof as an exercise. 2.9 Theorem (Expansion of real numbers in base p ) Let p 2 2 be a nutural number. For every real nun~ber0 <_ U < 1 there is a ur~iqupS P ~ U C T L ( - Pof ,.et11 numbers (an):= l such that (a) 0 5 a , < p , for eachn = 1.2 , . . . (b) There zs no no such that a , = p - 1 for all n > no. ( c ) u ~ / p . - . a,/pn 5 a < a l l y + . . . ( U , l ) / p n , for each 7 1 1. The real number a is rational 2j and only i f is eventually periodic.
+ +
+
+
>
Exercises
<
2.1 Let A 5 B 2 R, A # 8 . If B is bounded from below then inf B inf' A . If B is bounded from above then sup A 5 sup B. If, in addition, for every b E B there exists a E A such t h a t a 5 b . then inf B = inf A. Similarly for sup. 2.2 For every X E R+ and 71 C N - (0) there exist r , s E Q such t,l~iit 0 < r < X S and I < s/7, 1 1 / n . [Hint:Fix ro E Q such t,hat O < ro < X , and k E N - (0) such t h a t kro > n. U S E ! Lcrnma 2.1 t o fiml r, S E Q , T O < r < X 2 S , SO t h a t S - r < l / k , and estimate s / r . ] 2.3 Prove t h a t if a , b E R, b # 0, then the equation a = b - X has a unique solution. 2.4 Prove t h a t for every a E R+ there exists a unique X r R+ such t h a t X ' X = U. 2.5 Prove t h a t every complete ordered field is isomorphic t o R. [Hints:
<
< +
3. TOPOLOGY OF THEREALLINE
179
1. Given a complete ordered field U = (A, <, +, . , 0 , 1 ) consider the closure C of { O , l ) under + and -. Let C be the structure obtained by restricting the ordering and the operations of U to C. Show that C is (uniquely) isomorphic to D = (Q, <, +, *, 0 , l ) . This is the
"algebraic" part of the proof. 2. Show that for every a E A there is c E C such that u 5 c. (If ~ i o t . then S = { a E A I a > c for all c E C} is nonempty arld bounded below, but does not have an infimum, because a E S implies a - 1 E S.) 3 . Use 2. to show that C is dense in (A, <). 4. Use 3. to show that the isomorphism between L31 and C extends uniquely to an isomorphism between % and a.]
2.6 (Complex Numbers) Let C = R mult,iplication on C as follows:
Show that 2.7.
3.
+ and . satisfy (i)-(iv)
X
R, and let us define addition and
of Lemma 2 . 3 and (vi)-(X) of Lenlnia
Topology of the Real Line
The main result of the preceding section is a characterization of the real number system as a complete ordered field. This is the usual departure point for the study of topological properties of the real line. We give some basic definitions and theorems of the subject in this section. Our objectives are to justify the claim that set theory, as we developed i t so far, provides a satisfactory foundation for analysis, t o show some consequences of the previously proved results, and to provide a convenient reference for some concepts and theorems used t:lsewhere. Readers who studied advanced calculus should be familiar with most of this material. In any case, one can skip this section and refer to it only as needed. We begin with a definition of a familiar concept. 3.1 Definition Let (P,<) be a linear ordering, a, b E P, a < b. A (bounded) open interval with endpoints a and b is the set (a, b) = {X E P ] a < a < b ) . A (bounded) closed interval with endpoints a and b is the set [a,b] = { X E P ) a 5 X .Ib } . We also define unbounded open and closed i n t e n ~ a l s( [ I . . ,X). (- m,a ) , [ U ,m ) , ( - m , a ] , and (-m, m) = P in the usual way: for cxnmple. (a,m) = {s E P I a < X ) . We employ the standard notation ( a ,b) for open intervals, even though it clashes with our notation for ordered pairs; the meaning should always be clear from the context.
CHAPTER 10. SETS OF REAL NUMBERS The next theorem is a consequence of the existence of a countable dens^ subset in R. 3.2 Theorem Every system of mutually disjoint open intervals i n R i s at m o s t countable.
Proof. Let P = Q n U S where S is a system of nlutually disjoint open intervals in R and Q is the set of all rational numbers. As Q is dense in R, each open interval in S contains a t least one element of P. As elements of S are mutually disjoint, each element of P is contained in a unique open interval from S. The function assigning t o each element of P the unique open interval from S t o which it belongs is a mapping of an a t most countable set P onto .5'. Hence S is a t most countable, by Theorem 3.4 in Chapter 4. This is a convenient place to mention a famous problem in set theory dating from the beginning of the century: Let (P,<) be a complete linearly orderrrl set without endpoints where every system of mutually disjoint open int~rvalsis a t most countable. Is (P,<) isomorphic t o the real line? This is the Susli7r's Problem. Like the Continuum Hypothesis, it remained unsolved for decades. With the help of models of set theory, it has been established that it, like the Continuum Hypothesis, can be neither proved nor refuted from the axioms of Zermelo-Fraenkel set theory. The reader can find out more about these n ~ a t t e r s in Chapter 15. 3.3 Definition A system of sets S lias the finite intersection property if ever), nonempty finite subsystem of S has a nonempty intersection.
An example of a system of sets with the finite intersectiol~property is t,llcl range of a nonincreasing sequence of nonempty sets, i.e., S = {A, 1 7 1 E N ) where A, # 0,A, A,) for all n , n' E N . n 5 n,'. The next theorem is a consequence of the completeness of the real line.
>
3.4 T h e o r e m Any nonempty system of closed and bounded intervals in R with the Jinite intersection property has nonempty intersection.
Proof. Let S be such a systern. Let A = { X E R I [X,y] E S for some y E R) be the set of all left endpoints of intervals in S. We show that A is bou~icletl from above. Fix [a,b] E S , then certainly X 5 b, because in the opposite c:;isrs we would have a < b < X < y and the intervals [X,y ] , [a,b] would be disjoint. violating the finite intersection property for the 2-element subset { [X.y]. [U.h]) of S. So b is an upper bound on A. By the completeness property, A has :L supremum a. If [a,b] is a n interval in S, we have a 5 C because a E A. ancl E 5 b because b is an upper bound on A. So a E [a,b] for any [a,b] E S arid 0S is not empty. Completeness of the real line is also the reason why the notion of limit works ,as well as it does. We recall some very familiar definitions from calculus.
3. TOPOLOGY O F THE REAL LINE
181
3.5 D e f i n i t i o n Let ( a , I n E N ) be a n infinite sequence of real numbers. (a) (a,) is nondecreasing if a, 5 an) holds for all n , n' E N such t h a t n. < n'. I t is increasing if a, < a,, holds whenever n < n'. (b) (a,) is bounded from above if its range { a , I n E N ) is bounded from above. Similarly for bounded from belour. (a,) is bounded if it is bounded from above a n d below. (C) {a,) has a limit a [convergesto a] if for every E > 0, E E R , there is no E N such that, for all n no, la, - a1 < E . (d) (a,) is a Cauchy sequence if for every E > 0. E E R, there is no E N such t h a t , for all m,n > no, [ a , -a,[ < E .
>
Here are a few fundamental theorems about convergence. Note t h e crucial role the completeness of R plays in the proofs. 3.6 Theorem Every nondecreasing sequence of real numbers boundedfrom abone has U limit.
Proof. If { a , I n E N ) is bounded from above, then i t has a suprernunl G. We prove t h a t (a, I n E N ) converges t o E. Let. E > 0 be given. Since G - E < a, and a is t h e least upper bound on { a , 1 n E N ) , there is no such t h a t E - E< a,,,. We now have, for all n > no. 7i- E < U,,, < a , 7i < at^, C l i.e., [ U , - a1 < E . This shows t h a t (a,) converges to E.
<
3.7 D e f i n i t i o n Let ( k , ] n. E N ) be a n increasing sequence of natural numbers. T h e sequence (ak,, 1 n f N ) is called a subsequence of ( a , I n E N ) . (Note t h a t it is just the composition of (a,) o (k,).)
3.8 Theorem Every bounded sequence of real numbers has a convergent subsequence.
Proof. Assume t h a t (a, I n E N ) is bounded. Let b, = inf(ak I k > n ) a n d observe t h a t the sequence {b, I n E N) is nondecreasing and bounded from above (every upper bound on (a,) is an upper bound on (b,)). By Theorem 3.6, (b,) has a limit 7i. We construct a subsequence of (a,) with limit 7i by recursion: ko = 0, i.e., ak,, = ao; given k,, let kn+ l b e the least k E N such t h a t k > ko and
We have to show t h a t such k's exist. But iZ = sup{b, of Theorem 3.6) and so there is i > k, such that
In
E N ) (see the proof
3. TOPOLOGY O F THE REAL LINE
183
Proof. S u p p o s e t h a t f # g ; then f ( a ) # g(a) f o r s o m e u E R. Let E = If ( a ) - y(a)1/2. As both f and g a r e assumed continuous a t a, there exist S,, h2 > 0 such t h a t if [ X - a1 < J1, then I f ( x ) - f ( a ) / < E . and if 1.r - a1 < S2. then ( g ( s ) - g(a)l < E . T h e density of D guarantees existence of .r C D such t h a t lx - UI < min{6r,62). But now If(a) - g(u)I ] f ( u ) - f(s)l I f ( . r ) g(x)l + )g(x)- g(u)I < E + 0 E = l f ( a ) - g ( o , ) ) :a cont.radiction. [We have used G the fact t h a t f (X) = g(x) for X E D in replacing the ri1icldlt1 tern1 l)y 0.1
<
+
+
Similarly, open and closed sets are relatively siiriple and well-behavcvl. I t is obvious from t h e definition t h a t a set is open if and orlly if it is a uuiori of a system of open intervals. Consequently, the uriiori of any systetli of opeli sets is open and t h e intersection of any system of closed sets is closerl. O p e r ~ intervals are the simplest examples of open sets; closed intervals, finite sets. alid { l / n ( n E N - ( 0 ) ) U (0) are typical examples of closed sets. Other exarliples can be obtained by using t h e next lemma.
3.12 Lemma T h e intersection of a finite system of open sets is open. Tiso union o j a finite system of closed sets i s closed, Proof. Let AI and A2 be open sets. If a E A I r'l A2, tl.ierc? are 6 , . h2 > 0 such t h a t ) X - a[ < S1 implies X E A I and 12: - a1 < d2 implies :r E A2. Put, 6 = rnin{dl. b 2 } ; then 6 > 0 and [ X- a J< d implies .c E Al A2. Tliis ;irgurncblit proves t h a t t h e intersection of two open sets is open. The claini for ally f i n i t c h system of open sets is proved by induction, and t l ~ cclaim for clost1d sets 11)~ taking rc?lat,ivccornplernents in R and using the DV hlorgan Laws (S[>"c:t,iorl4 i l l Chapter 1) .
3.13 Theorem T h e following statements about f ( a ) f is co7~tinuous. (b) f - ' [ A ] is open wheneoer A G R is open. ( c ) f [ B ]is closed whenctrer B G R is closed.
:
R -, R are eqili7!nlen,t:
( a ) implies (b). Assume t h a t f is continuous. If (L E f - [ A ] . Proof. f (a) E A and so there is E > 0 such t h a t ly - f(a)l < E implies y E A (\,Pc.~LIIs~' A is open). By definition of continuity there is 6 > 0 such that. IT - trl < n' implies 1 f ( X ) - f ( a ) l < E . S O we found 6 > 0 sr~clithat ( X- a / < S i1ny1ic.s f ( x ) E A , i.e.! z E f [A],showing f-'[A] open. (b) iniplies (a). Let a E R and E > 0. T h e assumption (b) implies that f [(f ( a ) - E , !(a) E )] is open (and contains a). T h u s there is 6 > 0 sucli that Is-U/ < d i m p l i e s s ~f - ' [ ( f ( a ) - ~ , f ( a ) + ~ ) ] , i . ef (. ,z ) E ( f ( a ) - E . f ( u ) + ~ ) . establishing the continuity of f at a. The equivalence of (b) and (c) follows immediately from Exerciscb 3.G(b) i l l Chapter 2. L C '
-' +
In t h e rest of this section we prove some results about open aiid closecl sets. T h e following lemma is used in Chapter 5 t o determine the cardinalitv of tlie systern of a11 open sets.
CHAPTER 10. SETS OF REAL NUMBERS
184
3.14 Lemma Every open set is a union of a system of open intervals ~ 1 2 t h rational endpoints.
Proof. Let A be open and let S be the system of all open intervals with rational end points included in A. Clearly, U S G A; if a E A, then (a - 6, a + 6 ) c A for some 6 > 0 and we use the density of Q in R t o find r l , rz E Q such that a - 6 < rl < a < r2 < a + b . Now a E ( r l , r 2 ) C A , so a E U S , showing that US=A, Q We have defined closed sets as colnplements of open sets. It is useful t,o h a w a characterization of them in terms of the behavior of their points. 3.15 Definition a E R is an accumz~lationpoint of A C R if for every (i > U there is X E A , X # a, such that Jx- a[< 6 . a E R is an isolated poznt of A R if a E A and there is 6 > 0 such that Ix - a1 < 6 , X # a, implies X 4 A.
c
It is easy t o see that every element of A is either a n isolated point or ali accumulation point, and that there may be accumulation points of A that clo )lot belong t o A. 3.16 Lemma A to A.
R is closed if and only if all accumulation points of A belong
Proof. Assume that A is closed, i.e., R - A is open. If a E R - A , then there is 6 > 0 such that Ix - a / < 6 implies X E R- A, so a is not a n accumulation point of A. Thus all accumulation points of A belong to A. Conversely, assume that all accumulatiorl points of A belong t o A. If a E R - A, then it is not a n accumulation point of A, so there is 6 > 0 such tliat there is no X E A, X # a, with Ix - a1 < 6. But then lx - a1 < 6 implies X E R - A, so R - A is open and A is closed. 0 As an application, we generalize Theorem 3.4. 3.17 Theorem Any nonempty systent of closed and bounded sets with the finitc intersection property has a nonempty intersection. Let S be such a system. We note that the intersection of any PmoJ nonempty finite subsystem T of S is ii nonempty closed and bounded set. So S= T I T is a nonempty finite subsystem of S) is again a nonempty system of closed and bounded sets with the finite interse~t~ion property, and the atldit,ion;~l property that the intersection of any nonempty finite subsystem of S actunllv belongs t o S . Since obviously S = 3, it suffices t o prove that S # B. The proof closely follows that of Theorem 3.4. Let A = {inf F 1 F E S). 13'~ show that A is bounded from above. Fix G E 3; then for any F E S we L~avc. F n G E 3 and inf F 5 inf(F n G ) 5 s u p ( F f~G) 5 sup G , so A is bounded by s u p G . By the completeness property, A has a supremunl a. The proof is concluded by showing 7i E F for ally F E 3.
{n
n
n
n
3. TOPOLOGY OF THE REAL LINE
185
Suppose that E 4 F. For any 6 > 0 there is H E S such that E - 8 < inf H 5 E, by definition of E. Since H n F E S and inf H 5 inf(H ll F ) , we have also E - 6 < inf(H n F) 5 E < iz + S. It now follows from the definition of infimum that there is X E H n F for which inf(H n F) < X < E + 8. We conclude that for each 6 > 0, there exists X E F such that X # Z (we assumed that a 4 F) and (X- El < 6. But this means that E is an accum~~lation point of F, and as F is closed, Si E F , a contradiction. Cl Closed sets contain all of their accumulation points; in addition, they may or may not contain isolated points. An example of a closed set without isolat.ed points is any closed interval. [O, l] U N is an example of a closed set with (infinitely many) isolated points. Nonempty closed sets without isolated points are particularly well-behaved; they are called perfect sets. We use them in the next section as a tool for analyzing the structure of closed sets. We conclude this section with a n example showing that, their good behavior notwithstanding, perfect sets may differ quite substantially from the familiar examples, the closed intervals. 3.18 Example Cantor Set. Let S = Seq((0, l}) = UnEN{O, l } be the srt of all finite sequences of 0's and l's. We construct a system of closed intervals D = (D, I s E S) as follows:
Do = [O. l];
-
1
D(o,o) - 101 g]'
2 1
D(o.1) =
8
2 7
D(l,o) = I-, 3 -1, 9
D(1,1)=
I ] , etc.
+
In general, if D(,,,,,.. ,S,,- 1 ) = [a, bl, we let D(,,,,.. . ,S. - 1 ,O) - [a,a $ ( b - a)] and D(s, , , , , , l = [a + - a), b] be the first and last thirds of [a, b]. The existence of the system D is justified by the Recursion Theorem, although it takes some thought t o see: we actually define recursively a sequence (D n f 71 E N ) , where each D" is a system of closed intervals indexed by (0, l } " : D:, = [O, l];having defined D n , we define D"+' so that for all (so, - . . , sn- 1 ) E (0, l } " .
i(b
CHAPTER 10. SETS OF REAL NUkIBERS
l86
+
a
[ a ,a 5 ( b - a ) ] and D"+' = [a + ( h - ( 1 ) . 61 rnlwrr. (S,,,. . 1 .l} 1% bl = D: ,,.,. , S , , , and then let D = UnE D" . NOW let Fn = U{Ds I S E {O. l)" ), SO that F. = [O, l], FI [O. U .1; I]. etc. Notice that each Fn is a union of a finite system of closer1 iiltervals. alltl hence closed (see Lemma 3.12). Tlle set F = F,, is tllus also closed: it is called the Cantor set. Df For any f E 10, 1 ) We let D j = We note that D, C F i L ~ l c I D j f 0 (Theorem 3.4). Moreover, the length of the interval D / is 1/3". so the infimum of the lengths of the intervals in the system (Df ( n C N ) is zero. It follows that D j contains a unique element: Df = {clf). Conversely. for any a E F there is a unique f E (0, 1 I N such that a = dj: let f = U { s C Seq((O.1)) I a E D,). (Compare these arguments with Exercise 3.13 i l l Cl~npter2.) d = (df / f E {O. 1 I N ) is thus a ulie-to-one mapping of {U. l j N tr~lto F, and we conclude that the Cantor set has the cardinality of the (:ontinuu~r~ 2N0 ~ n + l (so,
-
,S,*-
= ~$0)
,S,,-
-
1
i]
nnEN nnEN
We prove two out of many other interesting properties of the set F. (1) The Cantor set is a perfect set.
Proof. We know that F is closed and nonempty already. so i t rernains t o prove that it hiu no isolated points. Let a E F and 6 > 0; we have t o tint! X E F, X # a , such that JJ:- a1 < 6. Take 7 1 E N fbr which 1/3n < 6 . As we know, a = d j for some f E (0, 1IN: let X = d, where g is any sequ~rlcrof 0's and l's such that g n = f / 72 i\11(1 f # g. Then a # X , but (1.1 E Dlrr2= Dvl,, which has length 1/3" < 6, so ( X - a1 < 6, as required. L3 ( 2 ) The relative complement o f the Cantor set i n (0, l] is dense in [O. l ] .
P r o Let 0 5 a < b 5 1; we show that (a, 6) contains elements not in F. Take 72 E N such that & < < ( b - U). Let k be the least natural nunlhrr fi)r k (k+l) < b. The open interval which $ 2 U ; we then have a 5 F <7 ( ,-,+l 3k+2)
3n+l
'
3~+1
(the middle third of
is certainly disjoint from F. and we see that ( a , b) contains ele~nentslot
irl
F.
Exercises
3.1 Every system of mutually disjoint open sets is a t most countable. The statement is false for closed sets. 3.2 Let S be a nonempty system of closed and bourided intervals in R. If irlf{b - a I [ a ,61 E S ) = 0, then contains at most one elemcwt.
nS
3. TOPOLOGY OF THE REAL LINE
187
3.3 Let ( P ,<) be a dense linearly ordered set where every nonempty system
3.4 3.5 3.6 3.7
of closed and bounded intervals with the finite intersection propertj* has a nonempty intersection. Then P is complete. (This is a converse to Theorem 3 . 4 , ) ( u , ) ~ =converges ~ to a if and only if for every k E N , k > 0 . there is no E N such that, for all n no, [a, - a1 < l/k. If (a,)r=o converges to a and to h, then a = b (the limit is unique). 'l'tlis justifies the usual notation lirn,,, a, for the limit, if it exists. Every convergent sequence of reals is a Cauchy sequence. The number ii = sup{inf{ak 1 k n ) I n E N } used in the proof of Theorem 3.8 is called the lower limit of (a,) and is denoted linl inf,,, U,,. Similarly, 6 inf{sup{ak I k n } I n E N ) is called the upper Pmit of (a,) and denoted lim sup,,, a,. Prove that (a) 6 exists for every bounded sequence and h 5 6. (b) If c is a limit of some subsequence of (a,}. then ii 5 r 5 6. (c) (a,) converges if and only if B = 6 (and if it docs, Ti is its linrit). There is a sequence (a,) with the property that for any a E R, (a,) h a s a subsequence converging to a. [Hint: Consider an enumeration of all rational numbers.] Show that f : R -, R is continuous at U E R if and oilly if for every n E N there is k E N such that for all X E R, ] X - a ) < 1/k implies
>
>
-
3.8
3.9
>
< <
3.10 Let f : R -+ R be continuous, a,b, y R, a < b and / ( U ) y f(b). Prove that there exists X E R, a 5 X h, such that f ( X ) = 9 (the Intermediate Value Theorem). [Hint: Consider .X = sup{z E [a. b] I f ( 4 5 ?l)-/ 3.11 Let f' : R -+ R be continuous, a, b E R, a < b. Prove that the image of [a, b] under f is closed. 3.12 Let f : R -+ R be continuous, a, b E R , a < b. Prove that thcrr? is X E [a, b] such that f ( X ) f (z)for all z E [a, b]. Therefore, the image of [a,b] under f is bounded. 3.13 If A is open and (X,) converges to a E A, then there is 710 E IV such t,llkit for all n no, a,, E A. 3.14 a E R is a closure point of A R if for every d > 0 there is X E A such that ( X - a1 < 6. a E R is a closure point if and only if it is either an isolated point of A or an accumulation point of A. Let A be the set of all closure points of A. A is closed if and only if A = A . 3.15 Every open set is a union of a system of mutually disjoint open intervals. [Hint: If A is open and a f A, U { ( x , g ) I a E ( X , ? / ) C A ) is an open interval.] 3.16 Give an example of a decreasing sequence of closed sets with empty intersection. 3.17 Show that Exercises 3.11 and 3.12 remain true for any closed and bo~rndr?tl set C in place of the closed interval [a, b]. 3.18 We describe an alternative construction of the completion of (Q. <). A sequence of rational numbers is a Cauchy sequence if for each
<
>
>
c
CHAPTER 10. SETS O F REAL NUMBERS
188
rational p > 0 there is no such that la,, - a,,,J < p whenever n 2 n.0. Let C be the set of all Cauchy sequences of rational numbers. We define an equivalence relation on C by (a,):!, = (bn)r==oif and only if for each p > 0 there is no such that [a, - b,l < p whenever n 2 no. and ;I preordering of C by {an)?=, 4 {bn)F=oif and only if for each p > 0 there is no such that a, - b, < p whenever n >_ no. Prove that is an equivalence relation on C. (a) (b) + defines a linear ordering of C/ =. ( c ) The ordered set C/x is isomorphic to R.
4.
Sets of Real Numbers
The Continuum Hypothesis postulates that every set of real numbers is either at most countable or has the cardinality of the continuum. Although the Continuum Hypothesis cannot be proved in Zermelo-Fraenkel set theory, the situation is different when one restricts attention to sets that are simple in some sense. for example, topologically. In the present section we look a t several results of this type. 4.1 Theorem Every nonempty open set of reals has cardinality 2N0.
Proof. The function tarix, familiar from calculus, is a one-to-one mapping of the open interval (-.rr/2,~/2) onto the real line R; therefore, the interval ( - ~ / 2 ,n/2) has cardinality 2'". If (a,b ) and (c, d) are two open intervals. tllth function d-C f ( X ) = c -(X - a) b- a is a one-to-one mapping of (a, b) onto (c, cl); this shows that all open intervals have cardinality 2'". Finally, every nonempty open set is the union of a nonempty system of open intervals, and thus has cardinality at least 2N0.As a subset of R, it has cardinality at most 2N"as well.
+
4.2 Theorem Every perfect set has cardznalzty 2N0. We first prove a lemma showing that every perfect set contains two disjoint perfect subsets (the "splitting" lemma). Let P denote the system of all perfect subsets of R. 4.3 Lemma There are functions Go and G1 from P into P such that, for each F E P, Go(F) G F, G I ( F ) F and Go(F) n G l ( F ) = 8.
c
Proof. We show that for every perfect set F there are rational numbers r. S, r < S, such that F f~ ( - - m , r ]and F f l [ S , +m) are both perfect. Let a = inf F if F is bounded from below (notice that in this case u E F ) and a = -m otherwise. Similarly, let P = sup F if F is bounded from abovtb (p E F) and p = +m otherwise. There are two cases to consider:
4. SETS OF REAL
NUMBERS
189
(a) ( a ,P) C F . In this case, any r, s E Q such that a < r < S < P have the desired property: for example, if a E R, then F n ( - m , r ] = [a,rj; otherwise, F n ( - m , r] = (-m, r] (and we noted that all closed intervals are perfect sets). (b) (a,0) g F. Then there exists a E ( a ,P), a # F. The set F is closed (i.e., its complement is open), and thus there is 6 > 0 such that ( a - 6, a + S) n F = 0. We note that a < a - d (because a = inf F, there exists X E F, a < s < a ) and similarly a d < P. Any r, s E Q such that a - d < r < a < .s < u + 6 have the desired property: for example, F n ( - m , r] is clearly closed and nonempty ( F n (-m, r] = F n ( - m , a ] ) . If it had an isolated point h, we could find 6' 6 such that X E (b - h/, b + 6'), X # b, implies X # F n (-m, r] = F n ( - m , a+&). But for such X, X < b+df 5 r+d' < a+d' 5 u + d , so X E ( - m , a + 6 ) . We conclude X @ F, showing that b is an isolated point of F . This contradicts the assumption that F is perfect. To complete the proof of the lemma, we fix an enumeration ((r,, S,) I n E N) of the countable set of all ordered pairs of rationals, and define G o ( F ) = F n ( - 0 0 , r,], G 1( F ) = F n [S,, m), where (rn1 S , ) is the first (in our enumeration) pair of rationals for which r, < S, and both F n ( - m , r,] and F n [S,, m ) are perfect.
+
<
One more result is needed to prove Theorem 4.2. For any bounded nonenlpty set A C R, we define the diameter of A, diam(A), as sup A - inf A. The next lemma shows that each perfect set has a perfect subset of an arbitrarily small diameter. 4.4 Lemma There is a function H : P X ( N - (0)) --+ P such that. for each F E P and each n E N, n # 0,H(F,n) G F and diam(H(F, n)) l / n .
<
Proof. Let F E P, n E N , n > 0. There exists m E Z such that F n ( m / n , (m + l ) / n ) f ; 0. (Otherwise, F {m/tz ) m E Z),so all points of F would be isolated.) Take the least m 0 with this property, or. if none exists. the greatest m < 0 with this property. Let a = inf F n (m/n. (m + l ) / n ) and b = sup F n ( m / n , (m l ) / n ) . Clearly, m/n 5 a 5 b 5 ( n +~ 1)/)1. so b - a 5 l / n . We define H ( F , n ) = F n [a, b ] . Clearly, F is closed, F # 0. The definition of infimum implies that a is not an isolated point of H ( F , n). n ~ i d neither is b. An argument similar to the one used in ctue (b) in t.hc proof' of the previous lemma shows that no X E (a, b ) C (m/n, (m l ) / n ) is an isolated point of H ( F , n ) either, so H ( F , n) is perfect. D
c >
+
+
Proof of Theorem 4.2. This proof is now easy. It closely resembles the argument used in Section 3 to show that the Cantor set has the cardinality z N U . Let F be a perfect set; we construct a system of its perfect subsets (F5( S C S), where S = Seq((0, l ) ) ,as follows:
190
CHAPTER 10. SETS O F REAL NUMBERS
nnE
For any f E (0.1) we let Ff = F l r n Fj # 0 1)y Throrrrn 3.1T hforeov~r.Ff contains a uniqut~eltt~nent:If' X. y E Ff, / X - :(/l 5 rliiull( F! ) diam(Fjrn) 5 l / n for all n E N , 71 # 0, so X = g. Let df bc the unique eleiiiv~lt of Fj. The function ( d j I f E (0. l }N , is n o~le-to-onemapping of (0.1 illto (although not necessarily onto) F. showing that the cltrdirlality of F is at lr-tt~st 2*". Since F C R, it is also at most 2N". Theorem 4.2 now follows from tlirh Cantor-Bernstein theorem.
IN
Our next goal is to show that closed sets also behave in ac:cordance with thr> Conti~iuumHypothesis: They are either at ~ n o s countable t or have carclinalit y 2*". This is an immediate consequence of Theorern 4.2 tincl the following: 4.5 Theorem Every uncou~~table cmlosed set. c o ~ ~ t a i na sperfccf. s~rbset.
4.6 Corollary Every closed set of reals is either at nlost cou7ttatle ~{irdinality2'".
07.
/sus thc~
L5'e give t,wo proofs of Theoreni 4.5. The first proof is quite si~nple:howevr.~. it uses the fact that the urliori of ariy couritable system of at rnost cormtable sets is at most countable. The proof of this requires the Axiom of Choice (stit* Chapter 8 ) . Our second proof does not depend on the Axioru of Choice at ;ill: it also provides a deeper analysis of the structure of closecl sets, which is of iriclcpendent interest. It uses trilllsfi~iiterecursion (see Chapter 6).
Proof of IIYteo~.em :L5. Let A I>e a set of real numbers. We call (I E R it condensation point of A if for ('very d > 0 the set of all :r E -4 sui:li t l ~ i l t 1: - ul < 6 is ullcountablc (i.e., ilclit1it.r firiitc nor c:ountable). It, is obvious f r c ~ i i i the definition that any co~~der~satior~ poi~itof A is i t r l accur1~u1:~tion poilit of' .-l. l f ' ~dc?llotethe set of all coridensaticm points of A by A". (1) A C is a closed set. By Lemma 3.16 it suffices to show that every accumulation poitlt PmoJ of A" belorlgs to A C. If a is an ticcumulation point of A" and 6 > 0, the11 then: is X E A c such that Is - a1 < 6 . Let E = 6 - Is - a1 > 0. There t i ~ . t . uncountably rnany g E A such that ( y - sl < E ; but our choice of E guilraritclos t h ~ ift J y- s.1 < E , the11 ly - a ] 5 / g - z J+ ]X- a1 < E + )X - (LJ = 6, SO there ;ire 3 ~rncourltablynlarly y E A sut'h tt~iit13 - a1 < (5. ancl (L E A C . Now let F be a closet1 set; thc.11 Fc ol>servationis
CF
(sr~caLerii~riti3.16). O u r
~ic.~st
(2) C = F - F' is a t most c:ountt~ble.
If a E C , then a is not R condensation point of F. so thew is Proof. d > 0 such that F n ( a - 6, a + h ) is a t most countable. The density of t11c rationals in the real lirle guararltees the existerice of rational numbers r , S st11.11 that u - 6 < 7 - < a < S < (L + d . This shows that for ttac:h (L E C' thtt1.c. is
4. SETS OF REAL NUMBERS
191
an open interval ( r , S ) with rational endpoints such that a E F n (r.S) and F n (r, S ) is at most countable; in other words, C 2 U{ F n (r, S ) I r, S E Q. 7- < S, and F n ( r ,S) is a t most countable). But the set on the rjght side of the Irtst. inclusion is a union of an a t most countable system of a t rnost countable sets. so it, and consequently C, too, is at most count.nble. (Here we IISP the fiirt, dependent on the Axiom of Choice.) (3) If F is an uncountable set, then F" is perft?c.t,.
We know FC is closed [observation ( l ) ]a ~ l dilorlernpty [observat.ion Proof. (2)]. It, remains t o show that F' has no isolated poir1t.s. So ;fisunle that. ( I E FC is an isolated point of F'. Then there is 6 > 0 such that lx - a1 < 6. .X # ( I , implies X $ F C . But then, for this 6 , [ X - U ) < 6 ancl .T E k' imply .r ( I or C X E F - F , which is a n a t most countable set bj- observ:ition (2). Thus tIlt.rt1 are at most countably many X E F for which lx - U I < h, contradicting the assumption a E F C .
-
Observation (3) completes the proof of Theore1114.5. For an alternative proof of Theorem 4.5, at1c1 for its own sake, wt. lwgi11 a somewhat deeper analysis of the structure of closed sctts. In gerieral. c.losc(l srts differ from perfect sets in that they may have isolated points. We first provtb 4.7 Theorem E v e q closed set has at most countubl?j ntany isolated poir~ts.
Proof. If a is an isolated point of a closed set F, t,lic?n there is 6 > 0 such t h a t the only element of F in the open interval ( U - 6: u + 6) is a. Using thc density of the rationals in the reals, we can fincl r. .s C Q such that (L - h < r- < a < s < a + 6. so that a is the only element, of F ill t,he opcrl i n t c r ~ i l( v , S) with rational endpoi~lts.The function that assigns to ctilcll open interval with rational endpoints containing a unique element of F that element :LS R value, is a mapping of all a t rrlost countable set onto the set of ill1 isolntclrl points of F. The cot~clusionfollows from Theorem 3.4 in Chapter 4. 4.8 Definition Let A C R; the derived set of A. tlerioterl by A'. is tlw sct of' all accumulation poirlts of A.
c
It is easy t o see that a set A is closed if alld only if A' A ant1 pclrfrc.t if' and only if A # 0 and A' = A. Also, the derived set is itself a closet1 set. (Set. Exercise 4.4 for these results.) Theorem 4.7 says that for any closed F. tllo s ~ t F - F' of its isolated points is a t most countable. Tllese considerations suggest a strategy for a n investigation of the cardinality of closed sets. Let F # 8 be a closed set. If F = F', F is perfect and IF1 = 2N". Otherwise, F = (F- F') U F' ant1 ( F- F ' [ < No, so it remains to determint. the ct~rc1irialit.t. of the smaller closed set F'. If F' = 0 then F = F - F' h i ~ only s isolat.ed points and JFI 5 No.
f
CHAPTER 10. SETS OF REAL NUMBERS
192
If F' is perfect, then IF'J = ZNO, so J F J= 2N0. It is possible, however, that F' again has isolated points; for an example, consider F = N U { m- !I nj..71 E N , m 2 1 , n > l}. In that case we can repeat the procedure a n d examirlr F" = ( F ) ' . If F'' = 0 or F" is perfect, we get ( F ( N o or IF1 = ' L ~ o . But F" may again have isolated points, arld the analysis must continue. We (:an t l e t i ~ l r an infinite sequence (Fn I n E N ) by recursion:
<
All F, are closed sets; if for some n C N Fn = 0 or Fn is perfect, we get, I F1 < No or I F1 = 2N". Unfortunately, it may happen that all of the Fn's have isolated points. We could then define F, = Fn ; this is again a closed set, smaller than all Fn. If F, is perfect, we get IF1 2 IF,J = 2")' and if F, = g. F = U n E N ( F nis a countable union of countable sets, which car1 bi.
nnE
proved to be countable (without use of the Axiom of Choice). The problelri is that even F, may have isolated points! (Construction of w set F for which this happens is somewhat complicated, but see Exercises 4.5 and 4.6.) So wc3 have to consider further F,+ 1 = F:, F,+z = F:+l, . . . , and, if necessary. c.vt.11 F,+, = F,+, and so on. As the sets are getting smaller. there is goo(1 hope that the process will stop eventually, by reaching either a perfect set or V). This was the original motivation that led Georg Cantor to the development of his theory of transfinite or ordinal rlumbers.
on,,
4.9 Definition Let A be a set of reals. For every ordinal a we define. 1)). recursion on cr.
( A ( " ) ) '= the derived set of A("): A(") = A ( € ) ( a a limit ordinal). =
n
(
4.10 Theorem Let F be a closed set ofreals. There exists a n at most countablr ordinal 8 such that (a) For every a < 8 , the set F(") - F("+') is nonempty and at most countublr. (b) ~ t e + = l )~ ( 0 ) . ( c ) P = F ( ~is) either empty or perfect. (d) C = F - P is at most countuble.
In particular, the theorem gives a decomposition of each uncountable closcci set F into a perfect set P and an at most countable set C.
Proof. For each a, the set F ( ~is) closed. This is easjly seen by inductiorl. because the derived set of a closed set is closed, and the intersection of closecl - F ( ~is ~ the~set) of all isolatecl sets is a closed set. For each a,the set points of F ( ~ )and , so is at most countable by Theorem 4.7.
4. SETS O F REAL NUMBERS
193
As long as F(,+') # F ( ~ )the , transfinite sequence (F("))is decreasing (i.e., F(,) 2 when a < p). By the Axiom Schema of Replacement there exists an ordinal number 0 such that = [To see this, assume that F(,) # 8 for all a and let y be the Hartogs number of P ( F ) . The function g(a) = F(,) - F("+'),a < y, is a one-to-one mapping of 7 into P ( F ) , a contradiction.] Let €3 be the least such ordinal. [As a matter of fact, a consequence of the argument given below is that 8 is an ordinal less than w1.j Let P = We have ( ~ ( ~ 1 =) ' F(e),so the set P is either empty or perfect. Let C = F - P. Let (Jo,J1,. . . , J,, . . . ) be a sequence of all open intervals with rational endpoints. For each a f C, let o. be the unique o < €3 such that a E F(,) and let f (a) = the least n such that J,
n F ( ~ "=) { a ) .
Since a is an isolated point of F ( " ~ ~ there ) , is an interval J, that has only the point a in common with F("'.), so f is well defined. The function f maps C into w . We complete the proof by showing that f is one-to-one. Thus let a , b E C, and assume that f (a) = f ( b ) = n. Let us say that a, 5 ab Then F(,") C and b E J,, n F(Q'') C J,, n F(*.I), SO b = U . Hence f is one-to-one and so C is a t most countable. Every uncountable closed set has a perfect subset. This suggests that there may not be a simple example of an uncountable set of reals that does nut have a perfect subset. And indeed, it is necessary to use the Axiom of Choice to find such an example. 4.11 E x a m p l e An uncountable set without a perfect subset. Using the Axiom of Choice, we construct two disjoint sets X and Y , both of cardinality 2Nu,such that neither X nor Y have a perfect subset. We construct X and Y as the ranges of two o n e - b o n e sequences (X, I a < 2"9, (y, I a < 2'0) to be defined below. As proved in Chapter 5, the number of all closed sets of reals is 2Nu.Thus there are only 2'" perfect sets of reals, and we let (P, I a < 2'") be some one-to-one mapping of 2H0onto the set of all perfect. reals. To construct {X,) and {y,), we proceed by transfinite recursion (for a < 2N()). We choose xo and 30 to be two distinct elements of PO.Having constructed < a) (where a < ZN"),we observe that the set (X€ I E < a ) and (gE 1
has cardinality 2H[1(because IPal = ZHO and < 2'") and we choose distinct X,, y, E P, such that both X, and y, differ from all X € , yt < a). The sequences (X( 1 < 2H0)and (yc I c < 2H") are one-to-one and their ranges X and Y are disjoint sets of cardinality 2'(). Neither X nor Y have a perfect subset: if P is perfect, then P = P, for some o < 2Nu,and P n X # 0 # P n Y becausex, f P n X andy, E P n Y .
(c
CHAPTER 10. SETS OF REAL NUAlBERS
194 Exercises
4 . 1 Every perfect set is either an interval of the form [a,h ] . (-X. CL]. [h. +,X). or R = ( - m , + m ) , or it is the union of two disjoint perfect sets. 4.2 Assume that the union of any countable system of coilntable sets is courltable. Prove that every unc:o~~ntable closed set cont;~iiist,wo tlis,joirlt r i r i countable closed sets (a "splitting" lemma for closed sets). 4.3 Use Exercise 4.2 to give yet another proof of the fact t h a t t2ver.v ~ i n c o u ~ l t able closed set has cardinality ZN". 4.4 Let A' be t h e derived set of A R. T h e set A' is closed. A is clostd i t and only if A' A. A is perfect if and only if A' = A and A # Q). 4.5 'I'he set F = { l }U { l - 1/2"1 ( n l 2 l } U { l - 1/2"1 - 1/2'"+"2 1 711 2 1 , n.2 2 1 ) is closed, F' = { l ) U { l - l / Z n ' I n l 2 1). F" = { l ) , i i t i ~ l F"' = 0. 4.6 T h e set F = { l )U { l - 1/2"l - 1 / 2 " ~ + "-~ - . . - 1/2nl+n3+ . + n ~1 k. <: nl a n d n , 1 for a l l i s k ) 11% F, Fn = { l ) . 4.7 Tlle decomposition of a closed set F into P U C where P n C = 8. P is perfect, and C is a t most countable, is unique; i.e.. if F = P, U Cl whert. P1 is perfect and PI and Cl are disjoirlt and [C11 No, then Pi = P ail(1 Cl = C. [Hint: Show that P is the set of all condensation points of F.]
c
>
<
5.
Borel Sets
Tlieorem 4.1 and Corollary 4.6 show that open and closed sets are either at. most countable or have cardinality ZN". I t is tempting t o t r y t o prove siniilar results for niore conzplicated, but still relatively simple. sets. How rlo wcl 01)tain such sets'! One idea is to use set-theoretic operations, such ;is u~zioilsai~tl intersections. Now, a union or an ititersection of a finite system of open sets is again open, so no new sets can be obtained in this way, and the s;inie lloltls true for closed sets. T h e next logical step is to consider ulliorls arid intersc:c.t.iolls of couiltable systems. We define Dore1 sets as those sets of r e d s that call obtained from open and closed sets by means of repeatedly t.akirig c . o I I I I ~ ~ I ~ ) ~ ( ' unions and intersections. 5.1 Definition A set B C_ R is Borel if it belongs t o every systern ot' S P ( R ) with t h e following properties. (a) All open sets and all closed sets belong t o S. (b) If B, E S for each n E N , then U= :=, B, and Bn belong to S. B denotes t h e system of all Borel sets.
sets
n:Lo
In other words, B = ={S C P(R) I S has properties (a) and ( b ) ) is t,he sr~lnllestcollection of sets of reals that contains all open and closed sets mid is "closed" under couiltable unions and intersections. [Note that P ( R ) 1ia.s properties (a) and (b), so the intersection is defined.] It is possible to pt.ovtB seine properties of Borcl sets using this definition. We give one ex2tmple (sec& also Exercises 5.1, 5.2).
5. BOREL SETS T h e notion of a U-algebra of subsets of S is defined in 2.17, Chapter 8. 5.2 Theorem The collection B of all Borel sets i s a a-algebra of subsets of R.
Let C = {X C R 1 R - X E B). We show that C has properties Proof. (a) and (b) from Definition 5.1. If X is open, then R - X is closed, so R - X E B and X E C. Similarly, X closed implies X E C. This shows C has property (a). If B, E C for each n E N , then R - B, E B for each n E N. lienre CY) R - U= :=, Bn = n z o ( R- 8,) E B , and U,==, B, E C . This, and a similar argument for intersections, shows t h a t C has property (b). We conclude that B C C. Hence X E B implies X E C, whicli i ~ n p l i ~ s R - X EB . a Theorem 5.2 shows that the collection B of all Borel sets is the a-algebra generated by a11 open sets (or, by all closed sets); see Exercise 2.5 in Chapter 8. However, for a deeper study of Borel sets it is desirable t o have some more explicit characterization of them. Let us consider whicli sets are Borel o i l r e more. First, all open sets and all closed sets are Borel by clause (a). Wc! define
C: = { B C R / B is open);
R I B is closed).
H: = {B
c
Thus C ! C B , H: B. Next, countable unions and intersections of open w t s must be Borel by clause (b). Countable unions of open sets are open (set3 the remark following Theorem 3.11) and do not produce new sets; so we define
Clearly E : H; and it is possible t o prove that the inclusion is proper (see Exercise 5.31, so we obtain some new Borel sets in this way. Similarly. one defines 00
E: = (B
RIB=
U Bn where each B, E ny}
and proves that H: C C; 2 B. It is also true that rIy C II; and C: C E; (see Exercise 5.3). T h e process can be continued recursively. We define 00
C: = { B
c R I B = U B,
where each B, E H;}
n=O
and similarly m
n: = { B c R I B = r) B, n=O
where each B,, E
!$:l.
CHAPTER 10. SETS OF REAL NUMBERS
196
In general, for any k
> 0, m
= { B C RI B =
U B, whereeach B,
E H:}
n=O
and
m
H:+, = { B c R I B =
r) Bn where each B,
E E:).
n=O
c
B, for all k E N , k > 0. It can also be shown By induction, C: C B and (with the help of t h e Axiom of Choice; we omit the proof) t h a t
so t h e hierarchy produces new Borel sets at each level. One might hope t h a t all Borel sets a r e obtained in this fashion, t h a t is, t h a t B = C: = UE, H:, but it is not so. There are sequences (B, I n E N )such t h a t B, E C:,, CB M B, does not belong to any C: or II! B, or for each n E N, but (although it is of course Borel). We see again t h e need for a continuation of the construction of the hierarchy into "transfinite." 5.3 D e f i n i t i o n For all ordinals a < w lwe define collections of sets of reals Z !! and 11° as follows: (a) C[= { B 2 R ( B is open), II - { B C R 1 B is closed). b - = { B C_ R I B = Ur=oBnwhere each B, E IIO,), (b) E,+1 ,:l = { B R I B = nr'D=, Bn where each B, E CO,). : = { B C R I B = U z o B n where each B, E IIi for some 4 < a } , ( C) X II: = { B R I B = where each Bn E X i for some D < rr) ( t t ii limit ordinal). [In view of the next lemma, one could use clause (c) for successor ordinals i i S well, instead of (b).]
n;=fiB,
c
5.4 Lemma (2)
For every a < w l , C: C H:+, and IIL G
( i i ) For every a
E
+
(iii) I f B E X:,
< w 1, and II: 2 11:+1 then R - B E IT:; z f B E l:,
then R - B E C:.
(iv] i f B, E EL for each n E N , then UF.oBn E X:; n E N , then Bn E IT:.
nEo
zf B,,
E
II: for eacl~
5. BOREL SETS
Proof. (i) Trivial: in Definition 5.3(b), let Bn = B for all n. (v) Every open set is in C: (see Exercise 5.31, and similarly, every closed set is in II!. Hence 23; C C: and I l y IT;. Also, IT: C 23; and C: C IT:. by (i). Then we proceed to prove t h a t the assertion holds for every a < p, by induction on 0.
c
(ii) Follows from (v). (iii) By induction on a . (iv) This innocuous statement requires the Countable Axiom of Choice: For =:, B,, for some countable collection each n, if B, t C:, then B, = U {B,, I m E N ) of sets B, E U,, Il;. For each n, we choose one such collection (along with its enumeration (B,, I m E N ) ) , and then
is the union of the countable collection {B,,
1 (71. m ) E N
X
N ) and thus
in C:. T h e proof for IT: is similar. I t is clear (by induction on a ) t h a t each B: and each II: consists of Borel sets. W h a t is important, however, is that this hierarchy exhausts all Borel sets.
5.5 Theorem A set of reals is a Bowl set if and only i f it belongs to C: for some a < wl (if and only if it belongs to ll: for some a < wl).
Proof.
It suffices t o show that the set
is closed under countable unions and intersections. Thus let { B , ( n E N ) be a countable collection of sets in S. We show that, e.g., ,=, Bn is in S . For each n let an be the least cr such that B, E II,. T h e set ( a , ) n E N ) is an a t most countable set of ordinals less than wl and thus (by the Countable Axiom of Choice) its supremum a = sup{an ( n E N } = U { a n I n E N } is a n ordinal less than wl. By Lemma 5.4 we have Bn E IT! for each n. Consequently, C l Bn belongs t o C:, and hence t o S.
Vrn
We note, without proof, t h a t Theorem 4.5 can be extended to all Borel sets: Every uncountable Borel set contains a perfect subset. In particular, every uncountable Borel set has cardinality 2H0. In Chapter 8 we mentioned the (Ialgebra C of Lebesgue measurable sets of reals; of course, B g C, but there
CHAPTER 10. SETS OF REAL NUMBERS
198
exist also Lebesgue measurable sets which are not Borel. In fact, one cannot, prove in Zermelo-Raenkel set theory t h a t all urlcountable Lebesgue nleasural)lc. sets have cardinality 2N". T h e construction of the Bore1 hierarchy is just a special case of ;t very ge11c.ti-il procedure. We observe t h a t the infirlitary unions and intersec:tions usecl iri it are. in some sense, generalizations of' the usual operations U arld n. t,Vo t l ~ f i i i t ~ : F is a (countably) infinitary oycratio~~ on A if F is a function on H suI>sttt of' A N (the set of all infinite sequences of elellients o f A ) into A, aiid i1itrc1tluc.t. structures with infinitary operutions. This car] bt: done forrnallv bv exteii(liil:: the definition of type (see Section 5 in Chapter 3) by allowing fi E N u { N } i l t ~ t l postulating t h a t F, is an infinitary operation wherlever f;, = N . Tlle dc~finit,ioll of' 21 set B C A being cEosed reinailis ttsserltially uncllallgeil; if f, N wt. 1.tbcluil.tt F,((u, ( i E N ) ) E i? for ali ( a , ( i E N ) such t h a t u I E i? for ;ill i E N allcl F, is defined. T h e closure 5;; of it set C C A is still t h e I e t ~ c:los~rl ~t s ~ c,ontiii~ii~ig t ;ill tllements of C. 111this terminology, the systerrl of Borel sets is just the closltre of t.he systtltii of a11 open and closed sets under the irifirlitary uniorl allcl intersec-tion. 51oi.tb precisely, we let U = (A, F l , F 2 ) w11t:re A = P ( R ) , Fl({B, I -i E N ) ) = U'1 = I ) L?, and F2((B, i E N ) )= B,, i ~ Cd = { B 2 R I B is open or B is (.k)stbrl). Then B = C. Exercise 5.9 provides another example of lt structure with infil~itaryc)l,cbl.iltiorls arld a closed subset of somc! ~nathematicaliuterest. One important general questiorl about the closure is t h a t of its c:i.trdii~iilit.~-. \jre proved the simplest result prcwicling ari a1lswt.r iri 'rlic?orcni 3.14 o f Cliilptc~~' 4. There it uras slssurned tellat C' is ;it most countable, and all operat,ioils ii1.c' fi~iitary. We flaw generalize this, first. t.o i~rbit~rary C, ii11c1 tlit:~lt o strll(.tltrcss with infinitary operittions.
-
nEo
I
5.6 Theorem Let 2l = ( A , (Ru,. . . . R ,,,-l ) , ( F u , .. . . F,- 1)) 116 U s ~ T u ( : ~ (171(i u~.~J let C -4. Let T]; be the closurc: of C. If C is finite, then z s at .moat cot~~ltahlo; i f C is 27~finite,then /C1= /Cl.
c
-
,,: C, wlrt.i.cb Proof. We proved in Theoreril 5.10 in Chapter 3 t h a t C = U CO= C and C,+* = C,UF~[c("]u. - u F , ~ _ ~ [ c [ If" -C' ]is. finite, t h c ! ~~karll ~ C', is finite, and No. If C is illfinite and ICI = N,; then wr prove by in(lu(:tioil ! t h a t lCkl = N,, for a11 k: Assume that it is true for C,; the11 IC,'J ' I Ntr' = N,,. . I IFJIC,']I < No. tint1 IC,+ll = N,,. Hence (Cl = N u . N , , = K,,.
<
<
When 2l is a structure with ii~finitaryoperations, Theorem 5.6 (:an be gerlt~ralizedbut tlie cardinality estin~titeis different,. Let U t)e a st,rut:ture, iir~tl for simplicity Let us assume t h a t U llas just one operation F a n d t h a t F is i i i l infinitary operation (the result can of course be generalizecl t o structures with inore than one operation). IU llas: universe A , and F is a function on a subsilt o f A" into A, where A W is the set of all sequences (a, ] n E N ) with values in A .
5. BOREL SETS
199
5.7 Theorem Let 54 = ( A ,F ) be a structure where F is a function o n a .~ubset of A" into A , and let C C A. Let C be the closurc of C i n B, i.e.,
-
C=
n { xG A 1 F [ X w] G X
If C has at Least two elements, then Proof,
and C 2 X}.
5 ICIHil.
We construct an w 1-sequence
of subsets of A as follows:
C, = C; C,+l - C, U F [C:]; C, =
U Cc
if a is a limit ordinal.
€
Let C be the closure of C in U and let D = U,,l C,. By induction on a we have C, C_ C , so D C C. On the other hand, if ( a , I n E N) is a sequence in D , then (a, 1 n E N ) E C, for some a < w l : for each 71 let c, be the least such that a , E C(, and let a = sup{Cn I n E N ) . And i f (a, I n E N) E dom F then F ( ( a , I n E N))E C,+l; hence F[D Y] 2 D, so D C. Thus C = ,,U ,, C,. To estimate the size of C, we first prove, by inductioll on a, that /C, I 5 ]C(Ni1 . 2'1). This is certainly true for a = 0. If the estimate is true for a, then
<
>
If a is a limit ordinal, and if t h e estimate is true for all
c < a, then
(because lal _< No). Finally, we have
because NI . 2N" = 2'0. And if [Cl (CIN(l.
> 2, then
> 2'0, and we have
IC[N4l
<
U
Exercises
+
+
5.1 If X is Borel and a E R then X a = {X a 1 r E X ) is Borel. [Hint: Imitate the proof of Theorem 5.2.1 5 . 2 If X is Borel and f : R -, R is a continuous function, then f - [ X ] is Borel.
'
CHAPTER 10. SETS O F REAL NUMBERS
200
5.3 Prove t h a t every open interval is a union of a countable system of closed intervals a n d conclude from this fact t h a t C: C C!. Since II: 2 C!. cooclude that. E: C C: and IIY C E :. This yit:lrls i ~ n n l e d i a t ~ lt1l;it y II: c II: and E: C H:. 5.4 Prove by induction that C: II: C C:+, n no,+,for all n E N . 5.5 Prove t h a t for each U. both X : and II: are closed under filiite unions and intersections, i.e., if BI and Bz E C:. then BI LJ B2 E C: iiri(1 BI n Bz E C:, and sinlilarly for H:,. 6.6 Let f : R R be a continuous function. If B E C:, then f - l [ B ]E C::. Similarly for II;. 5.7 Sliow t h a t tht? cardinality of the set B of all Borel sets is 2N". 5.8 Prove t h a t the Borel sets tire the closure of the set of all open intervals with rational enclpoirits unc.lc:r irlfinitary unions and iritersectio~ls. 'I'his shows t h a t in a structure with infinitary operatioris, the closure of ;L r.ourltable set Inay be urico~intable. 5.9 Let F n be the set of a11 f'~~llctions from a subset of R into R. ' I ' l i t l infinitary operation l i ~ n(limit) is defined on F n as follows: lirri(( f; 1 i c: N ) )= where f ( X ) = linli,, f , ( x ) whenever the lirnit on the right sitle exists, and is undefined otherwise. T h e elements of the closure of the set of all continuous furlctions under lim are called Baire functions. Sliow t h a t Baire functions need not be continuous. I n particular, show t h a t the characteristic functions of integers and rationals are Baire functions. 5.10 For cr < wl , we define functions (on a subset of R into R) of class tt as follows: the functions of class 0 are all t h e c:ontinuous functions; f' is of. class t r if f = linl,,, f, wliere each f, is of some class < Q.. Show t1i;lt Btiire functions are exactly the functions of class cr for some u < w l . 5.11 S t ~ o wt h a t tile cartlinality of the set of all Baire f u ~ ~ c t i o is n s2*".
-
Chapter 11
Filters and Ultrafilters 1.
Filters and Ideals
This chapter studies deeper properties of sets in general. Unlike earlier c:llapt.ers. we d o not restrict ourselves to sets of natural or real numbers, and nctitller do we concert] ourselves only with cardinals and ordinals. Here we deal with abstract collections o f sets and investigate their properties. We iutroducc several (.onc~~pt,s t h a t have become of fundamental importance in app1ic:atioris of set. theory. It is interesting t h a t a deeper investigation of such concepts leads t o the theorv of large cardinals. Large cardinals are studied in Chapter 13. We start by introducing a filter of sets. Filters play an important role ill many mathematical disciplines. 1.1 Definition Let S be a nonempty set. A filter subsets of S that. satisfies the following conditions: (a) S E F and Q) $ F. (b) If X E F and Y E F, then X n Y E F. ( c ) If X E F and X C Y S , t h e n Y E F.
011
S is a colltsct,ion F of
A trivial example of a filter on S is t h e collect,ion F = { S ) that c:oilsist.s only of t h e set S itself. This trivial filter on S is the sniallest filt,er on S; it, is included in every filter on S. Let A be a nonempty subset of S, and let us consider the collectiol~
T h e collection F so defined satisfies Definition 1.1 and so is a filter on S. I t is called the principal filter on S genemted by A. If in this example, the set A has just one element U , i.e., A = { a ) where a E S, then t h e principal filter
202
CHAPTER 11. FILTERS AND ULTRAFILTERS
is maximal: There is no filter F' on S such t h a t F' 2 F (because if X E F' - F then a 4 X , but { a ) E F' as well, and so Q1 = X n { a ) E F'? a contradiction). We shall prove shortly t h a t there are maximal filters t h a t are not principal. To give an example of a nonprincipal filter, let S be an infinite set, arid let (1-2)
F = {X
c S I S - X is finite).
F is the filter of all cofinite subsets of S . ( X is a cofinite subset of S if S - X is finite.) F is a filter because t h e intersection of two cofinite subsets of S is a cofinite subset of S. F is not a principal filter because whenever A E F tfierl there is a proper subset X of A such t h a t X E F (let X be any cofinite subset of A , X # A ) . 1.3 D e f i n i t i o n Let S be a nonempty set. An ideal on S is a collection I of' subsets of S t h a t satisfies the following conditions: (a) @ EI and S 4 I. (b) If X E I and Y E I, then X U I.' r I. (c) If Y E I a n d X C Y, then X E I.
T h e trivial ideal on S is the ideal (8). A principal ideal is a n ideal of form
t,tlt~
where A G S. To see how filters and ideals are related, note t h a t if F is a filter on S then
is a n ideal, and vice versa, if I is :in ideal, then
is a filter. T h e two objects related by (1.5) and (1.6) are called dual to eiic.1~ other. T h e filter of cofinite subsets of S is the dual of the ideal of finite subsets of S. In Exercises 1.2 and 1.3 the reader can find other examples of nonprincipal ideals. We recall (Definition 3.3 in Chapter 10) t h a t a nonempty collection G has the finite intersection property if every nonempty finite subcollectior~{ X l .. . . . X,, ) of G has nonempty intersection X I n . . . n X, # 0. I t follows from (a) and (b) of Definition 1.1 t h a t every filter has the f i ~ l i t c ~ intersection property. In fact, if G is any subcollection of a filter F, then G litis the finite intersection property. Conversely, every set that has the finite intersection property is a subc:ollet,tion of some filter. 1.7 Lemma Let G
# 0 be a collection of subsets of S and let G have the finzte
intersection property. Then there is a filter F on S such that G
C F.
1 . FILTERS AND IDEALS
203
Proof. Let F be the collection of all subsets X of S with the property t h a t there is a finite subset {X1,. . . , X n ) of G such that
Clearly, S itself is in F, and Q) is not in F because G has t h e finite intersection property. T h e condition (c) of Definition 1.1 is certainly satisfied by F. As for the condition ( b ) , if X I) X I n n Xn for some X I , . . . ,X, E G , and if Y 2 Y1n...nl7,f o r s o m e Y l , . . . ,Y, E G, then X n Y 2 X l n . - , n X , n Y l n - . . n Y , , , . so X Y E F. Thus F is a filter. T h e filter F constructed in Lemma 1.7 is the smallest filter on S t h a t extends the collection G. We say t h a t G generates the filter F . See Exercise l .G. 1.8 Example Let S be a Euclidean space and let (L be a point in S. Consider the collection G of all open sets U in S such t h a t a E U. Then G has the finite intersection property and hence it generates a filter F on S. F is the neighborhood filter of a . 1.9 ExampIe. Density. Let A be a set of natural numbers. For each n E N , let A ( n ) = I A n ( 0 , ..., n - 1 ) 1 denote the number of elements of A t h a t smaller t h a n n. T h e limit
d(A) = lim -,A ( n ) n+m n if it exists, is called the density of A. For example, the set of all even numbers has density 1 / 2 (Exercise 1.7). Every finite set has density 0, and there exist infinite sets t h a t have density 0. (For example, the set {2n [ n E N ) of all powers of 2 - Exercise 1 . 8 . ) Let A and B be sets of natural numbers. If A B then A(n) B ( n ) for all n , and so if both A and B have density then d ( A ) < d ( B ) . In partic.ular, if d ( B ) = 0 then d ( A ) = 0. Also, for every n, ( A U B ) ( n ) A(n) B ( n ) , and if A a n d B are ~iisjoit~t then ( AU B ) ( n )= A(n) B ( n ) . Hence d ( A U B ) <_ d ( A )+ d ( B ) (providrti t,llat, t h e densities exist), and d ( A B ) = d ( A )+ d ( B ) if A and B are disjoint. If both d ( A ) and d ( B ) are zero, then d ( A U B ) = 0 . This gives us a n example of a n ideal on N, the ideal of sets of density 0:
<
+
<
+
Id(because d ( N ) = 1 ) . If A C B and B r Id then We have 8 E Id and N A E Id, and if both A E Idand B E Idthen A U B E I d . Thus Id is an ideal. As noted above, Idcontains all finite sets, but also some infinite sets. As a final remark, not every set A C N has density. It is not difficult to construct a set A such t h a t the limit lim,,, A(n)/71 does not exist (Exercise 1 .g).
CHA PY'ER 1 1. FILTERS AND ULTRA FILTERS
204
We. conclucle this section with the irltroduction of an important corlcept that has Inany applications in analysis; 1.10 Definition A measu7-c on a set S is a rc!al-valuer1 functioll jj.2 0efi11c~l olr P(S)that satisfies the following conditions: (a) m(0)= 0, m ( S ) > 0. ( h ) If A B then m(A) m ( B ) . (c) If A and B are disjoint then TU(AU B) = m(A) rn(B). (Notice that the density function oil P(N)satisfies the conclitions. except t,liat it is riot defined for all subsets of N . )
c
<
+
It follows that m(A) 2 0 for every A , and that m ( S - A) = Propertdy (c) is i:allecl finite additi~~itp, ancl clearly
7 4 s )
-
rrz(.4).
m(A1 U - - U A , ) = rn(Al) + . . . + m(&) for any disjoint finite collection (A 1 , . . . , A,,). At this point. we can only give two trivial examples of measures: 1.11 Example Let S be a firlite set, and let ?n(A) = IA( for -4 c S. 'l'liis is the co~r~lting measure on S.
1 .l2 Example Let S 11e a norlenlpty set and let a E S. Let m(A) = 1 if ancl m(A) = 0 if a 4 A. ( A trivial measure.)
U E
A.
III the next sectiurl, we construct nontrivial measures on N. Exercises
1.1 If S is S: fi~iitenoneirlpty set, then every filter on S is a principal filter. 1.2 Let S be an uncountable set, and let I be the collection of' all X c S such that 1 x1 < No. I is a rlonprincipal ideal on S. 1.3 Let S be an infinite set n11d let Z C_ S bc such that both Z ;uld S - Z ilrrb infinite. The collection I = {X S I X - Z is finite) is a rlorlprirlcipal ideal. 1.4 If a set A C_ S has more than one element, then the prinr:ipal filt,clr generated by A is not maximal. 1.5 If F is n nonempty set of filters on S , then n ( F [ F E F) is a filter or)
c
S. 1.6 The filter constructed in the proof of Lemma 1.7 is the smallest filter otl S that includes the collection G. 1.7 Let A be the set of all rlatural nurrlbers thilt are tlivisible 1 ) ~ . ;I gi1*(811 number p > 0. Show that d(A) = l / p . 1.8 Prove that the set (2'' 1 71 E N ) has density 0. l .g Collstxuct a set A of natural riurrlbers such that
linlsupA(n)/7i = 1 and liminf A ( r ~ ) / n= 0. n+cu
n-+m
2. ULTRAFILTERS
2.
Ultrafilters
2.1 Definition A filter U on S is an ultrafilter if for every X or S - X U.
C S, either X
E
U
The dual notion is a prime ideal:
2.2 Definition An ideal I on S is a prime ideal if for every X
C S. either
Xf IorS-XfI. 2.3 L e m m a A filter F on S is an ultrafilter if and only zf it is a maxi~naljilter on S .
Proof. If F is an ultrafilter, then it is a maximal filter, because if F' > F is a filter on S, then there is X C S in F'- F. But because F is an ultrafilter, S - X is in F , and hence in F', and we have 0 = X n (S - X ) E F', a contradi~t~ion. Conversely, let F be a filter but not an ultrafilter. There is some X S such that neither X nor S - X is in F. Let G = F U{X). We claim that U h ~ s the finite intersection property. I f X 1 , . . . , X , a r e i n F , t h e n Y = X I n . . . n X , E F , ; s n d Y n X # 0 because otherwise (S - X ) 2 Y and so S - X E F, contrary to the assumption. Hencc X l n.. . n X , n X # 0,which means that G = F U { X ) has the finite int,ers~c.tio~l property. Thus there is a filter F' on S such that F' F U { X ) . But t.hat ~ ~ l c a that ns F is not a maximal filter. S
c
>
We have seen earlier that there are principal filters that are rnaximal. 111 other words, there exist principal ultrafilters. Do there exist nonprincipal ultrafilters? Let S be an infinite set, and let F be the filter of cofir~itesubsets of S. If U is an ultrafilter and U extends F, then U cannot be principal. Thus. to fillcl a nonprincipal ultrafilter it is enough to find an ultrafilter that extellds the filter of cofinite sets. of' Conversely, if U is a nonprincipal ultrafilter, then it ext.ends t,lle filtc>~. cofinite sets because every X E U is infinite (Exercise 2.2). The next theorem states that any filter can be extended to an ultrafilt,<~r. However, the proof uses the Axiom of Choice, and i l l fact it is kriown that t11c. theorem cannot be proved in Zermelo-Fkaenkel set theory alone. 2.4 T h e o r e m Evemj filter on a set S can be extended to an ultrafittcr 07,. S .
The proof uses Zorn's Lemma. In order to apply it we need the following fact: 2.5 L e m m a If C is a set of filters on S and if for e7ier-y Fl. F - E C , cithcr Fl c F2 07. F2 c FI, then the union of C is also a jilter on S .
C H A P T E R 11. FILTERS A N D U L T R A FILTJ7R.S
206
Proof. This is a matter of simple verification of (a), (b), and (c) in Dt?finition 1.1, 0 Proof of Theorern 2.4. Let F. be a filter on S ; we find a filter F 2 F. t.liat is maximal. Let P be the set of all filters F on S such t h a t F 2 Fo, and let us consider the partially ordered set (P,C ) . By Lemma 2.5, every chain C in P 11;s iLt1 upper bound, namely UC . Thus Zorn's Lemma is applicable, and (P,C ) l l w i~ maximal element U . Clearly, U is a maximal filter on S and U 2 F,. By Lemma 2,3. U is a n ultrafilter. i A
There is a natural relation between ultrafilters and measures. Let us call n nietsure m on S two-valued if it only takes values 0 ancl 1: for rvery A S. either m ( A ) = 0 or m ( A ) = 1.
c
2.6 Theorem (a)
If m is a two-valued measure on S then U
=
{A
c
m ( A ) = 1) is an ultrafilter. (6) If U is an ultrafilter on S , then the function m on P ( S ) defined by m ( A ) if A E U and rn(A) = 0 2j A 4 U is a two-valued measure 071 S .
S = 1
Proof. Compare t h e definitions of ultrafilter and measure. Note t h a t if' ,4 and B are disjoint, then a t most one of them is in an ultrafilter (or has irlth;wul.tb 1). C1 We now present one of many applications of ultrafilters. This particular zipplication is a generalizatiori of the concept of limits of sequences of real numbers. 2.7 Definition Let U be an ultrafilter on N, arid let (a,):., be ;I l~oun(lt~cl sequence of real numbers. We say t h a t a real number a is the U-limit of' t . 1 1 ~ sequence, U = lim a,,
[r
if for every E
> 0 , { n I la,
- a1
<E)
E U.
First we observe t h a t if a U-lirnit exists then it is unique. For, assume that u < b and both U a n d b are U-limits of (a,)?==,. Let E = (b - a ) / 2 . Then t h t ~ sets { n ( (a, - a ( < E ) and {n ( lan - b( < E ) are disjoint and so cannot both 110 in U, If U is a principal ultrafilter, U = {A I no E A) for some no. then limr: a,, =: U,,,,for any sequence ( a n ) ~ ~ =This o . is because for every E > 0, { r 1~ Iw, - r ~ , , , , l < E ) 2 {no) U. If (a,):!, is convergent and lirn,,, a, = U , then for every nonprincipal illtrafilter U . limll a, = U. This is because for every E > 0, tilere exists sorrlr k such t h a t ( n I \a,, - a1 < E ) ( 7 1 1 ?L 2 k ) , and { n I n > k ) E U . T h e important property of ultrafilter lirnits is t h a t the U-limit exists for every bounded sequence:
>
2.
ULTRAFILTERS
207
2.8 Theorem Let U be an ultrafilter on N and let quence of real numbers. Then limu a, exists.
be a bounded se-
Proof. As is bounded, there exist numbers a. and b such that a < a, < b for all n. For every X E [a, b], let
A, whenever X 5 y. Therefore A, Clearly, A, = 0, Ab = N , and Ax Al,f U. and if A, E U a n d x 5 y , then A, E U. Now let C
= SUP{X(
A,
4
U,
U);
we claim that C = limu a,. As for every E > 0, A,-, 4 U while A,-+, E U . i d because A,+, = A,- E2 U {n I c - E < a, < c + E ) , it follows that ( 7 1 ( [a, - cl < E) E U. As an application of ultrafilter limits, we construct a nontrivial measure on
N. 2.9 Theorem There exists a measure m on N such that m(A) = d ( A ) for every set A that has density.
Proof.
Let U be a nonprincipal ultrafilter on N . For every set A
c N.
m(A) = lim 1J n where A(n) = ( A n nl. Clearly, if A has density then m(A) = d(A). Also, it is easy to verify that m(0) = 0 and m ( N ) = 1, and that A B implies m ( A ) < m ( B ) . If A and B are disjoint then (A U B)(n) = A(n) + B ( n ) , and the additivity of m follows from this property of ultrafilter limits:
c
lim(a, U
+ bn) = lima, + limb,. U U
We leave the proof, which is analogous to that for ordinary limits, as an exercise L' (Exercise 2.5).
Exercises 2.1 If U is an ultrafilter on S, then P ( S ) - U is a prime ideal. 2.2 If U is a nonprincipal ultrafilter, then every X E U is infinite. 2.3 Let U be an ultrafilter on S. Show that the collection V of sets X C S X S defined by X E V if and only if {a E S I { b E S 1 ( a , b ) E X ) E U} E U is an ultrafilter on S X S. 2.4 Let U be an ultrafilter on S and let f : S -+ T. Show that the collection V of sets X c T defined by X E V if and only if f-'[X] E U is an ultrafilter on T. 2.5 Prove that limu (a, + 6,) = limu an + limlJ b,.
CHAPTER 1 1 . FILTERS AND ULTRAFILTERS
208
Closed Unbounded and Stationary Sets
3.
In this section we introduce an important filter on :i regular uncountable carclirial - the filter generated by the closed unbounded sets. Although the results of this sectiorl car] be formulatecl and provecl for any regular uncou~lt~able c.arcliilal. n-t1 restrict our investigations to the lei~stuncountable cardirial N I . (See Exercischs 3.5-3.9.)
c
3.1 Definition A set C wl is closed unbounded if (a) C is unbounded in w l , i.e.. s u p C = w l . (b) C is closed, i.e., every increasing sequence
of ordinals in C has its suprenlurn sup{a,
I n E W ) E C.
An in~portant,albeit simple, fact about closed unbounded sets is that tl~c>?. have the finite intersection property; this is a consequence of the following: 3.2 Lemma If Cl and Cz are closed unbounded subsets of closed unbounded.
wl,
then C l n C ,
,.v
Proof. It is easy to see that Cl n C2 is closed; if a1 < a:! < . . . is a sequence in both Cl and CL, t,hen cr = sup{n, ) n W ) E Cl n C2. To see that CLnC;! is unbounded, let y < w l be arbitrary and lct us find r k in Cl n C2such that a > y. Let us construct an increasing sequence
of countable ordinals as follows: Let a 0 be the least ordinal in Cl above y. let Do be the least 3 > 00 in C2. 'l'llen a ] E C l ,P1 E C 2 ,tincl so 011. Let a be the suprernum of { o , , ) , ~ it ~ ;is also the suprernum of ( J n The ordinal a is in both C, nrld C2 and therefore cu Cl n C2.
'l'lit*~i
Inch.. 0
The set w l of all countable ordinals is closed and unbounded. (Howevc.r. here we use a consequence of the Axiom of Choice, namely the regularity of L J :~ the supremum of every countable sequence of countable ordinals is a c*ountabltb ordinal.) Another example of a closed unbounded set is the set of all limit c:ountable ordinals (Exercise 3.2). For other, less obvious examples see Exerriscbs 3 . 3 and 3.4. As the collection of all closed urlbounded subsets of wl llas the finite int,vrsection property, it generates a filter on wl: (3.3)
F = {X
c
I
L J ~
X
> C f ~ some r (:losecl unbounded C )
The filter (3.3) is called the closed unbounded filter or1 w l .
3. CLOSED UNBOURrDED AND STAT1ONAR.YSETS
209
3.4 L e m m a If {Cn)nEwis a countable collection of closed unbounded sets, then Cn is closed and unbounded. Conseguentlp, zf {Xn)nEd is a countabl~ collection of sets i n the dosed unbounded filter F , then X. E F.
nZz0
nzo n;=,
C, is closed. ?b Proof. It is easy to verify that the intersection C = prove that C is unbounded, let y < wl; we find cr E C greater than y. We note that C = n r = o D,, where for each n, D,, = COn- .nC, ; the sets D, are closed unbounded and form a decreasing sequence: Do 2 D1 2 D2 2 . . . . Let ( a , I n E W ) be the following sequence of countable ordinals: y < a0 < (11 < . - . , ancl for each n , cxn+l is the least ordinal ~ I ID, above G , , . Let rr LW the supremum of To show that a E C, we prove that CL E D, for each n. But for any n , CY is the supremum of {akI k > 72 + l ) , and all thc a k for . . . 2 Dk 2 . . . ( k > n ) . k > n + 1 are in D, because D, 2
>
Closely related to closed unbounded sets arc stationary sets. Stntionaq sets are those sets S g wl which do not belong to the ideal dual to the c:losc(l unbounded filter. A reformulation of this leads us to the followiilg definition.
C W * is
3.5 Definition A set S C, S n C is nonempty.
stationary if for every closed unbou~ldetlwt.
Clearly. every closed unbounded set is stationary, and if S is stnt,ionary arid S C_ T C wl, then T is also stationary. Later in this section we present a n example of a stationary set that does not have a closerl unbounded subset. But first we prove the following theorem. A function f with domain S C_ wl is reyressive if f ( a ) < u for all (L # 0 .
c
3.6 Theorem A set S wl is stationary i j and onlp zj r7~e7yregressi~lefunction. j : S -+ wl is constant on an unbounded set. i n fact, f has a constan.t value on. a stationary set,
This theorem is a good example of how an ul~countablealeph such as N I differs substantially from No. On uncountable cardinals there is no analogue of the function f : w -+ w defined by f(n) = n - 1 for n > 0 , f ( 0 ) = 0 , wl~ich has each value only finitely many times. A consequence of 'I'heoreni 3.6 is that. on wl! a function that satisfies f ( a )< a repeats some value ~ l n c o u n t a boften, l~ unless the domain of f is "small," that is, not stationary. One direction of the theorem states that if S C wl is not statiorlary, t11en there is a function on S such that f ( a )< a for all a # 0 , and f takes each value a t most countably many times. Such a function is corlstructed in the following example. 3.7 E x a m p l e Let A C wl be a nonstationary set; hence there is a closed unbounded C such that A n C = 0. Let f : A -+ wl be defined as follows:
f (CL)
=
sup(C n a ) ( a E A).
CHAPTER 1 1. FILTERS AND ULTRA FILTERS
210
Because f(a)E C (see Exercise 3.1) and C n A = 0, we have f ( a ) < CL. And for each y < wl, when a E A is greater than the least element of C above 7 . thcrl f ( a )> y and so f does not have the same value for uncountably many a ' s . To prove the other direction of the theorem we need 3.8 Lemma Let {Ce I set C C wl defined by
1emm:s.
< < wl } be a collection of closed unbounded sets.
a E C i f and only if a E Ccfor all
(3.9)
A
< a (a E
Thr
wl)
called the diagonal intersection of the C ( , is closed unbounded. Proof. First we prove that C is closed. So let a0 < a1 < . - . < cr, < . - . be a11 increasing sequence of elements of C and let a be its supremurn. To prove that a is in C we show that a f C( for all < a . Let. < a . Then there is sorne k such that < ~ k k and . hence < ck,, for all 71 k . Since each a , is in C, it satisfies ( 3 . 9 ) , and so for each n k we hi~vc, a , E C€. But C( is closed and so cr, the supremum of { a n ) n ~isk in , C€. Next we prove that C is unbounded. Let y < wl and let us find ( 1 E C' greater than y. We construct an increasing sequence a0 < a1 < 02 < . . . itS follows: Let a0 = y. The set C( is closed unbounded (by Lemma 3.4) i~rldso we let crl be its least element above ao. Then we let a2 be the least element of' Cc above a1, and in general,
>
<
>
n(,,,,
nECn,
7 let us prove that cr E C. The let a be the suprernum of { a n ) n E wand We are to show that a E C€ for all < a . So let < a . There is k such t l ~ t < ak, and for all n 2 k we have a,+ l E C € ,by (3.10). But cr is the suprerrlunl of { a , and hence is in C c .
Proof of Theorem 3.6. Let S be a stationary set and let f : S -+ L J ~1)e ii regressive function. For each y < wl , let A, = f - l [{y )]; we show that for some y the set A, is stationary. Assuming otherwise, no A, is stationary and so for each -j < w l there is n closed unbounded set C, such that A, n C, = 0. Let C be the diagol~al intersection of the C,, that is, (3.11)
a E C if and only if cr E C, for all
y
< rr.
If a E C,, then a pl A, and so f ( a ) # y. Thus (3.11)means that if a E C, then f ( a ) # y for all y < a ; in other words, f ( a ) is not less than a . But C is r:losecl unbounded and hence it intersects the stationary set S. But for all 0 E S, f ( t r ) C? is less than cr. A contradiction.
3, CLOSED UNBOUNDED AND STATIONARY SETS
2 11
As an application of Theorem 3.6 we construct a stationary set S G wl, whose complement is also stationary. Note however that the construction uses the Axiom of Choice. (It is known that the Axiom of Choice is indispensable in this case.) 3.12 Example A stationary set whose complement is stationary. Let C be the set of all countable limit ordinals. For each E C there exists a n increasing sequence X, = (X,, I n f W) with limit cr. For each 7 1 . we let, f, : C --+ wl be the function f n ( a ) = X,,. For each n , because f,,(a) < a on C, there cxist,s 7, such t,hat the set is stationary. Sn = {a E C I fn(o) = We claim t h a t a t least one of the sets S, has a stationary complement. If not, then each S, contains a closed unbounded set, so their irltersection does too. Hence S,, contains an ordinal a that is greater than the supremum of But in that case the sequence X, = (X,, I 12 E U ) = ( f n ( a ) I the set n E w) = (7, I n E w) does not converge to a, a c~ntradict~ion.
f)r=O
{
~
~
)
~
~
~
a
As another application of Theorem 3.6 we prove a combinatorial theorem known as the A-lemma. Although it is possible to prove the A-lemma direct,ly, the present proof illustrates the power of Theorem 3.6. 3.13 Theorem Let { A , I i E I} be an uncountable collection of finite scts. Then there exists an uncountable J C_ I and a set A such that for all distinct i , j E J , Ai n A, = A.
We may assume that I = wl, and since the union of N I finite sets Proof, has size N I ,we may also assume that all the A, are subsets of wl. Clr:arly, uncountably many A, have the same size and so we ,assume t h a t we have a collectioi~{A, I cr < wl) of subsets of L J ~ each , of size 71, for some fixed riurnber n. Let C be the set of all a < wl such t h a t maxA( < a whenever < a . The set C' is closed unbounded (compare with Exercise 3.4). For each k < 7 1 , let Sk= {cr E C 1 /A, n a [= k ) ; there is at least one k for which Sk is stationary. For each m = 1 , . . . , k , let f,(cr) = t h e m t h element of A,; we have {,,(Q) < c k on S k . By k applications of Theorem 3.6, we obtain a stationary set T 2 Sk and a set A (of size k) such that A, n a = A for all a! E T. Now when a < 0 are in T, then A, C 0 (because 0 E C) and A, n 0 = ADnp = A, and it follows that A, n Aq = A Ibecause (A, - a) n A3 B]. Thus {A, I CY E T) is an uncountable subcollection of {A, I a < wl) which satisfies t h e theorem. 0
c
-
Exercises
3.1 An unbounded set C c w l is closed if and only if for every X C C, if sup X < w l , then s u p X E C. 3.2 T h e set of all countable limit ordinals is closed unbounded.
CHAPTER 11. FILTERS AND UL'IRAFILTERS If X is a set of ordinals, then cr is a limit point of X if for every y < tr tllerc* is E X such that y < P < ck. A countable a is a limit point of X if i111cl oilly if there exists a sequence a 0 < 01 < - . . in X such that sup{cr,, I 7 1 E ~ j = ) (I. Every closed unbounded C 2 w l r:ontains all its countable limit points. 3.3 If X is an unbounded subset of w l , then the set of all cour~ta~blc! limit points of X is a closed unbounded set. If f : w l if f
(c) < cu
is ail increasing function, then wlientlver ,€ < a. -4
w l
CY
< W ] is
it
closu~ep o i r ~ tof' f'
3.4 The set of all closure points of every increasing furiction f : closed unbounded. be a regular uricountable cardinal. A set C C_ sup C = K arlcl s u p ( C n a ) E C for all a < K. Let
K
3.5 If Cl
arid
K
W , -+ ~ J J
is
is closed unbounded if'
C2 are closed unbounded, then Cl n C2 is closed unbouacli~l
The closed unbounded filter on bounded sets.
K
is the filter generated by the c1osc:ti
url-
3.6 If X < K and each C,, a < X, is closed urlbounded, then n a < x Cc?is closed unbounded.
A set S C h: is stationary if S n C # 0 for every closed unbounded
set, C' C
K
3.7 The set { a < K I U is a lirnit ordinal and c f ( a ) = w ) is statiorlary. It' h: > NI, then the set { C L < K ) cf(a) = W I ) is stationary. 3.8 If each C.',,, 0 < K. is closecl urlboundecl, then the dtt~yonal~ Y L ~ ~ ~ . . ~ . c I ( . ~ / . o I I ( 1 1 < ti I (L E C( for it11 €, < cr) is closed unbounded.
A function with dom f
c ti is ~ g ~ e s s i uifef (a)< n for. all
(1
E don1 f . (1
# 0.
c
3.9 A set S K is stationary if and only if every regressive function on S is constant on an unbounded set.
4.
Silver's Theorem
Using the techniques introduced in Sections 2 and 3 we now prove the followirlg result on the Generalized Contiriuurn Hypothesis for singular cardinals.
-
4.1 Silver's Theorem for every n. < X, 2'<* ,N,
-
Let Nx bc a singular cardinal such that cf X > then zN" Nx+i .
Ij
We prove Silver's Theorem for the special case NA = N,, , using the theory of stationary subsets of N l . The full result can be proved i11 a similar way, usirig
4. SILVER'S THEOREM
213
the general theory of stationary subsets of K = cf X (see Exercise 4.1). Thus assume that 2 N 1 1= N,+l for all a < ol. One consequence of this assumption, to be used repeatedly in this section, is that N,N1 < Nu, for all a < wl (see Theorem 3.12 in Chapter 9). Let f and g be two functions on wl. The functions f and g are almost disjoint if there is some rr < w l such that f ( P ) # g ( 0 ) for all /3 2 a .
4.2 Lemma Let {A, I a < wl} be a family of sets such that [A-I every a < wl, and let F be a family of almost disjoint functions,
< N,
for
Then JFJ 5 NW,. Proof. Without loss of generality we may assume that A, 2 w,, for each a < wl. Let So be the set of all limit ordinals 0 < tr < wl. For f E F a1111 a E So, let f' ( a ) denote the least ,!3 such that f ( a ) < wg . As f * ( a ) < (I for every a E So,there exists, by Theorem 3.6, a stationary set S C Sosuch that f * is constant on S. Therefore f 1 S is a function from S into i J n . for soirle ,d < wl. Let us denote this function f 1 S by c p ( f ) . If f and g are two different functions in F, then p( f ) and ( ~ ( g are ) also different: even if their domains are the same, say S, t,hc?n f S # g I S becbaust. f and g are almost disjoint. Thus cp is a one-to-one rnapping with donlail~F. The values of p range over functions defined on a subset S of w l into some wp < W,, . Thus we have
A slight modification of Lemma 4.2 gives the following:
n,,,, A,
4.3 Lemma Let F C that the set T = { U < wl
I
be a fumilp of almost disjoint functions, sl~ch. lAal 5 N), is stationary. Then IF1 5 N,, .
Proof. In the proof of Lemma 4.2, let Sobe the stationary set So= ( n E T ( a is a limit ordinal). The rest of the proof is thc same. 0 Using Lemma 4.3, we easily get the following:
4.4 Lemma Let f be a function o n ol such that f ( 0 ) < N,+, Let F be a famzly of almost disjoznt functions on w , and let
Ff = {g E F I for some stationary set T Then (Ff(lNW,.
c u l , g(a) < f ( C L )
for all cr < w l . for all
tr
E
T).
CHAPTER I l. FILTERS AND ULTRAFILTERS
214
Proof. For a fixed stationary set T, the set { g E F I g(a) < f ( a )for all (L c T} has cardinality at most N,, , by Lemrna 4.3. Thus IF\ 2'1 - N,, = N,, .
<
G We now prove the crucial lemma for Silver's Theorem:
4.5 Lemma Let { A , ( a < w l ) be a family of sets such that IA,I every a < w l , a n d let F be a fanlily of almost disjoint functions,
5 N,+I
f u ~
Proof. Let I/ be an ultrafilter on wl that extends the closed unbounderl filter. Thus every set S E I/ is stationary. Without loss of generality we may assume that A, W,+ 1, for each I V < . Let us define a relation < on the set F as follows: LJ~
f
if and only if
{ a < wl I f ( a ) < g ( l u ) ) E W .
We clairn that < is a linear ordering of F . If f < y and g < II then f < because
11.
If f , g E F and f # g, then ( a ) f ( a )= g ( a ) )is at most countable and therefore not in U , and so one of the sets { a I f ( a ) < g ( a ) ) ,{ a ( y(a) < f ( 0 ) )belongs to U. Thus either f < g or y < f . It follows that < is a 1ir;ear ordering on F. Now, if f , y E F and g < f , thcri g E Ff where
F f = { g E F I for some stationary T, g ( t r ) < f ( a ) for all [r E T).
<
and by Lemma4.4, ) F f l N W , . Thus for every f E F, E F 1 g < f ) ( 5 N;,. As < is a linear ordering of the set F, it follows from Theorem 1 . l 4 in Chapter 8 that IF ( N w l + l .
<
Silver's Theorem (for X = w l ) is rlow an easy consequence of Lemma 4.5:
Proof of Silver's Theorem. For each a < wl , let A , = P(w,); as 2"'. = For every set X C W,,, let f s E ,, A, IIP tliv N o + , , we have A,I = function defined by
n
f x ( 4 = X nu,. If X # Y then f x # f v , and moreover, f x and f u are almost disjoint. This is because there is some a < wl such that X n wa # Y n for all 0 > (L. T h u s F = { f x ( X E P(w,,)) is a fanlily of a l ~ ~ i odisjoint st functions, i ~ l ~byd Le111111it
4.5, JFI < NWl+1.Therefore, 2N-1 =
G
4. SILVER'S THEOREM
Exercises
4.1 Prove Silver's Theorem for arbitrary X of uncountable cofinality. [Throughout Section 4, replace wl by K = cf X, and replace t,he sequence {N, I a < wl)by a continuous increasing sequence of cardinals {X, I a < K). Use Exercise 3 .g.]
This page intentionally left blank
Chapter 12
Combinatorial Set Theory 1.
Ramsey's Theorems
The classic example of the type of question we want to consider in this section is the well-known puzzle: Show that in any group of 6 people there are 3 who either all know each other or are strangers to each other. Wt. arc) inlplicitly assuming that the relation "X knows y" is symmetric. The argument goes follows. Consider one of these people. say ,Jot. Of t11v remaining 5 people, either there are at least 3 who know .Joe. or thcjrt. arch;it least 3 who do not know Joe. Let us assume that Peter, Paul, anc! h,l:rry krloxv Joe. If two of them, say Peter and Paul, know each other. then we havc ;3 pc'oplc who know each other (Joe, Peter, and Paul). Otherwise, Peter, Paul. and Mary are 3 people who are strangers to each other. If Peter, Paul, and hlary are 3 people who do not know Joe, the argument is similar. We now restate this problem in a inore abstract form that allows for ready generalizations. Let S be a set. For r E N, r # 0, [S]' = { X S I 1x1= r ) is the collection of all r-element subsets of S. (See Exercise 3.5 in Chapter 4 . ) Let. {A,):;: be s partition of [S]' into S classes ( S E N - (0)); i.e., [S]' = A, and A; n A j = 0 for i # j . We say that a set H S is homogeneous for the part,ition if [H]' Ai for some i; Le., if all r-element subsets of H belong to the sanw class Ai of the partition. It may help to think of the elements of the diffclrerlt c:la~sesm btling colore(1 by distinct colors. We thus have S colors (numbered 0, l , , . . , s - 1) ailcl eiic:l~ r-element subset of S is colored by one of them. Homogeneous set are nlonoch,romatic; i.e., all r-element subsets of them are colored by the same color. In this terminology, the introductory example shows that, for a set S with IS1 2 6, every partition of [SI2into two classes has n homogeneous set H with 1HI 2 3. (The classes are A. = { { X , y} E [S]" X and y know each other) and Al = {{X, y } E lsj2I X and y are stxangers to each other}.) Put, yet another way: pick a set S of a t least G points and connect t.;iibli
c
c
CHAPTER 12. COMBINATORIAL SET THEORY
218
pair of points by a line segment colored either red or blue (but not both). Tllo resulting graph then contains a monochromatic triangle (i.e., all red or all blue). Let K , X be cardinal numbers. We write K -+ (X): as a shorthand for the statement: for every set S with IS1 = ti and every partition of [S]' into S classes. there exists a homogelleous set H with [ H (> X. The negation of this st.atenl~rlt is denoted K f , (X):. We proved that 6 -+ (3);; it is an easy exercise to show that 5 f . (3);. (Sec. Exercise 1.1.) More generally, K -,(X I , . . . , X,): means: for every set S with 1 S ( = h: t i ~ i c l every partition { A i j f ~ofi [S]' into S classes there is some i and a set H c S with [H]' 2 A, and IHtI > X,. Exercises 1.2 and 1.3 exhibit some very sin~pl(, properties of these symbols. The fundamental result about partitions on finite sets is the Finite Ramst.y's Theorem. I t asserts that if S is sufficierltly large, any coloring of its T-elvmellt subsets will have monochromatic sets of the prescribed size k. 1.1 The Finite Ramsey's Theorem For any positive natural numbers k . r , s there exists a natural number n such that n -+ (k):.
The readers interested primarily in infinite sets can skip the proof wittiout. harm. We first consider s = 2 and prove that for all r,p, q E N - ( 0 ) Proof. there exists n E N for which n -+ (p, g ) ; by induction on r . Let r = 1. Then it suffices to take n = p q - 1: it is clear that if 15') = 1 1 and [S]' = S = A. U A l , we cannot have both [Ao[ p - 1 and \ A 11 5 q - l . If (Ao[ 2 p let H = Ao; otherwise, let H = A l . We now assume that the statement is true for 7. (and any p. (1) and prow it for r l. We denote the least 71 for which n + (p, q); by R(p,q ; T ) . Tlle proof now proceeds by induction on (p p). Consider a set S with /S(= t~ > 0 (as yet unspecified) and a ~)r-~rtitio[~ [S]'" = A. U A I where A. n A l = 8. The statement is true if p < 7. or rl < ,.: if p < T (q < r , respectively) we car1 take as H any subset of S with \HI = 11 (IHI = g, respectively); if p = q = 7. then either A. # O and any H E A. will do, or A I f 8 and any H E A I will do. So we 7,. (1 > 7- it1lc1 tl~cl statement is true for p', g' if p' + q' < p + g. Fix a E S, let Sa = S - (a) and define a partition {Bo,B l ) of [S"]'' 11y X E B. ( B 1respectively) , if and only if { U ) U X E A. ( A l , respectively) (for X E [Sa]'). If we choose n so large that n - 1 = R ( p l ,g'; r ) (p', g' as yet unspecified) then there will be a set Ha C Sa so that either
+
+
(ii) [H a ]" C B l and (HaI 2 g'
+
<
1. RA MSEY'S THEOREMS
219
Either way, all ( r + 1)-element sets of the form {a} U X, X E [H a ]' are thus guaranteed to be colored by the same color, so it remains to concern ourselves with ( r l)-element subsets of H a . Assume (i) occurs. By the inductive assumption, there exists 72 for which n -+ (p - l, g);' l holds; let p' = R@ - 1,g; r + 1) br the least such n. I t then p - 1 and [Hr]'+' C A,; in this follows that either there is H' C H a with IH'I case we let H = H' U { U } and notice that (HI 3 p and [HJT+lC Ao. Or, there H a with JH1'I 2 q and [H"]++' C A I ; in this case we let H = H" and is H" notice that IHJ 2 q and [H]'+l G A l . The case (ii) is handled similarly, letting g' = R(pl q - 1; r l ) ,the least n for which n -+ (p, q - 1);" holds. In conclusion, if we let n = R(p', g'; r ) for p' = R(p - l , g; r + 1) and q' = R(p,q- l ; r + l ) then n-+ @,q);+'. This concludes the proof for S = 2. In particular, taking p = g = k we have n + (k); for all k, r . The proof of the Finite Ramsey's Theorem can now be completed by induction on S . The case S = 1 is trivial and S = 2 is proved above. Assume that m has the property m -+ (k):. Let us consider a partition {A,):=, of [S]' into (S 1) classes, where S is a set of an as yet unspecified cardinality n. Then { BolB1) defined by
+
>
c
+
+
is a partition of [S]' into 2 classes. If n = R(1,l; r ) (for as yet unspecified 1) then there is H' C S, JH'I 1 , such that either [H']' 2 BOor [H']' C B 1 .We take 1 = max{m, k). In the first case, {A,}::; partitions [H']' into S classes. so our choice of l guarantees existence of H H' such that JHI 2 k and [H]' A, for some i E ( 0 , .. . S - 1). In the second case, our choice of 1 allows us to let H = H' and conclude JHI 2 k, [ H ] +E A,. Either way, we are done.
>
c
?
We remark that the exact determination of the numbers R(p, g; r ) is difficult and only a few are known despite much effort and extensive use of computers. Here is a complete list, as of now (1998): R(3,l; 2) for 1 = 3 , 4 , 5 , 6 , 7 , 8 , 9 has values 6,9,14,18,23,28,36, respectively; R(4,4; 2) = 18, R(4,5; 2) = 25. R(3,3,3;2) = 17, and R(4,4; 3) = 13. They tend to grow rapidly with increasing (p + g) and r. This is a subject of great interest in the area of mathematics known as (finite) combinatorics, but we now turn to analogous questions for infinite cardinal numbers. There the most important result is the (Infinite) Ramsey's Theorem. 1.2 Ramsey's T h e o r e m
No
-+
(No): holds f o r all r , S E N
-
(0).
In words, if r-elernent subsets of an infinite set are colored by a finite number S of colors, then there is an infinite subset, all of whose T-element subsets are colored by the same color.
CHAPTER 12. COMBINATORIAL SET THEORY
220
Proof. It suffices to consider S = N (see Exercise 1.2). We proceed by induction on r . = 1 : If [NIL= N = :U :: At then a t least one of the sets A, lilust infinite (Theorem 2.7 in Chapter 4) and we let H be such an A,. r = 2: This is a warm-up for the general induction step. Let [NI2= U:, .4,. m We construct sequences ( a n ) ~ = O , and by recursiori. WO scit a0 = 0 and let BP = {b E N I b # a* and { a o ,b ) E A, ). Then {B,O):;,: is a partition of N - { a o ) ; ~e take io to be the first i for which BP is infinitt\ (refer to the case r = 1) and let f i = B:,. All the: subseqllent trrms of tlu. sequence (a,) will be selected froni Ho, thus guaranteeing that {uo,U,,) E A,,, for all n > 0. We select a1 to be the first element of H() a r ~ dlet 8,' = { h E - 'I partitiorl of the infinite set H" I b # a1 and { a 0 ,b) E A , ) . Again, {B,1 S - l 1s Hu - { a l ) and we let i l be the first i for which B: is infinite. and H I = LI;,. Proceeding in this rnanner one obtains the desired sequences. It is obvious fro111 the construction that CL^^)^!^ is an increasing sequence and {a,,, U,,,) E A,,, all m > n . The sequerice h:; values from the finittt set ( 0 . . . . . .S - 1 ). Ilclnce there exists j arid ail irifiuite set hd such that, i,, = j f i ~ rt t l l 71 C .\I. I t remains to let H = { a , 1 n E M ) . Then H is infinite and [HI2 C d4, ~ , C Y . H I I S O { a , , ~ , , ) E AI,, = Aj for all n , m M, ~ n < m. Tt~cgeneral case is similar. )V[! assume the theorem is trur for T ancl provi) = U ::: A.. Consider an arbitrary a E N itrirl ill1 it for r + 1. So let arbitrary infinite S C N such that a 4 S. We define a partition {Bi):=(fof [S]" follows: for X E [Slr,X E BEif and only if {a) U X E A, (i.e.. we r:ulor t . 1 1 ~ T-element subset X of S by the sarne color the ( r + l)-element set { a ) U X was colored by originally). By the inductive assumption, there is an infinite H 5 S such that [H]" B, for some i. We select one such i = i ( u , S ) kind H = H ( u . S ) . (We can always use the least such i, but we use the Axiorn of Choice [ P ( N ) can bc well-ordered] t o select a particular H . The use of the Axiom of Choicc, can be avoided, at the cost of' aclditional complications.) MTcl note tllrit all s ~ t s of the form { a } U X where X E [H]"belong to A,. 'I'he rest of thc: proof imitates the r:ae r = 2 closely. We construct seqllPnc:ibs and (H,):& by recursion. WC let a0 = U, io i(0.N - (0)). (a,,);=.,, H. = H(0, N - ( 0 ) ) . Having construc.ted a,, i,,, and H,, we let a,+ l be the least element of H,, and & + I = i ( a n + l ,H , , - { ~ , + I ) ) , H,+I = H ( Q , , + I ,H n - { a n + l ) ) . The point is again to guarantee that all sets of the form {U,, u k , , . . . , u k , ) wht'rt~ 71 < k l . . . . , n < k, (and k l , . . . , k,. are mutually distinct) belong to A,,, . Again, there is j E (0, l ,. . . , s -- 1 ) sucli that Ad = { n E~ N I , , , = J ) is infinite. We let H = {U,, ( 7rc. E M ) and [iotjic:e that H is iiifiriit.cb i i r ~ r l [H]"+' Af. !,it
r e
(i.,,)r=,
-
L
We conclude this section with two simple applications of Ramsey's Tlieore~~r. 1.3 Corollary Every infinite ordered set ( P ,5 ) contains a n infinite subset S such that either any two distinct rlements of S are comparable (i.e., 5' i s rr chain) or any two distinct elements o f S are incomparable.
2. PARTITION CALCUL US FOR UNCOUNTABLE CARDINALS Proof.
22 1
Apply Ramsey's Theorem to the partition { Ao, A1 ) of IP]' wherc
A. = { {X,y) A I = {{X,y)
E
[pI2I X and y arc c:omparable),
E [pI2I X and y are incomparable}.
1.4 Corollary Every infinite linearly ordered set contains a subset similar to either ( N , <) or ( N ,>).
Proof. Let ( P , +) be an infinite linearly ordered set and let well-ordering of P. We partition [pI2as follows:
[P]" X < y and A [ = {{X,y} E [pj21 X < y ancl A. =
{{X, y} E
X 4 X
< bc some
y};
+ Y}.
Let. H be an infinite homogeneous set provided by Ramsey's Theorem. The relation < well-orders H; let H be the initial segment of H of order type w. If [ H I 2 C A. then ( H , 4 ) is similar to ( N , <); if [H]' C_ A l then ( H , + ) is similar to ( N I<). Exercises
1.l Show that 5 b , (3);.
1.2 Prove that the words "every set S" in the definition of K (X):,' rltn be replaced by "some set S." 1 . 3 Assume that K -+ (X): holds. Prove (a) If K' > K then K' + (X):. (b) If X' 5 X then K -+ (X'):. ( C ) If S' S then n (X):,,. (d) If r' 5 r then K + (X): . Prove analogous statements for K -+ (X1, . . . , X,):. Also show that K -+ ( X I , . . . ,X,): holds if and only if K --+ (X,(ll,. . . ,X,(,)): holds, where r is any one-to-one mapping of {l,.. . , S ) onto itself. 1.4 Show that for an infinite cardinal K , K (K): holds if and only if X < ~f ( K ). 1.5 Let [S]<'" = U:,[S]*. Show that there is a partition { Ao, A I ) of [NIcJ such that [H]"" n A. # 0 and [H]<"n A I # 8 for every infinite H C N. [Hint:Put X E A. if and only if E X.] -+
<
-
-+
1x1
2. Part it ion Calculus for Uncountable Cardinals Ranlsey's Theorem asserts that any partition of r-element s u hsets of an infinite set S into a, finite number of classes has an injinzte homogeneous set. Thv next natural question is, how large does S have t o be in order that every part.ition has an uncountable homogeneous set. By analogy with Ramsey's Tlleorem one might expect N I -+ (NI):. However, as the next theorem shows, this is false.
CHAPTER 12. COA4BINA'l'ORIAL SE?' THEOR)'
222
2.1 Theorem 2N"
fi
(NI);
Proof. The argument is quite similar to the one used to deduce Corollary 1.4. Let X = 2"" be the cnrdinlrlity of the continuum. Tlir s ~ R t of all r c ~ l numbers has IR( = X and is li~learlyordered by 5 . Let 6 be some well-orderilrg of R of order type X. We partition [RI2as follows: A. = {{z,y} E [R12(
X
+yandx
A l = {{x.y) E [ R ]( ~X 4 y a n d z > y). Let H be an uncountable Ilonlogeneuus set for this partition. That lneans t.lli~t either (i) for all
X,
y E H,
X 4
(ii) for all x , y E H, r
y implies
X
< y, or
inlplies
X
> y.
4 y
We show that this is impossible. Let us assume (i) holds. Let cp : p --+ H be an isomorphism of ( p , E ) and (H, 4),where p is an ordinal (necessarily, p X and p uncountable). W'(: note that ( < 71 < p implies v(<) 4 p(q) and llencc p(<) < v(r/). TIIIIS ((p(<),p(( + l ) ) I < p } is an urlcourltable collection of nlutually disjoint open irltervals in R, a11 impossibility by Tlleorem 3.2 in Ch:tpt,~r10. . . C:LSP(ii) is sinlilar.
<
For a slightly different proof see Exercise 2.1. There is a weaker posit,ivc~ result. 2.2 Erdiis-Dushnik-Miller P a r t i t i o n Theorem
N1 -. (NI. NO);.
In words, for every partition of pairs of elements of an uncountable set illto two classes, if one class has no infinite homogeneous set, then the other one has to have an uncountable homogeneous set. Let { A , B ) beapartitionof[w1l2. F o r a € wl l e t B ( a ) = { P E wl I p # tr and ( u ,8 ) C B). There are two possibilities. 1. For every uncountable X W l there exists some U E X for whicli I B ((1) 1-1 XI = N I . In this case we construct a countable homogeneous set for B l)y recursion. 1% let X. = w l and let CLO be the first element of X. for u.llic.11 ( B ( u o )n Xol = N 1 . Ure lirxt let XI = B(clO)n X, and let a l be tllu first element of X L for which I B ( a l ) n X L( = NI. In general, a t stage n + 1 we Ict Xn+1 = B ( a n )n X,, C X,, arltl let. 0 , , +1 b~ the first element of X,,, 1 for u:llic:l~ [B(a,+ l ) n X,, ,I = N I . It is clear that H = ( a , 1 T L E N } is a courltable set. and [HI2 C B. 2. In the opposite case, there is an uncountable set X C wl such that ( B( a )n X1 No for all cr E X . This tirne we construct an uncou[ltable hunlogellc-ous set for A by transfinite recursion of length w l . Having defined a one-to-on(? sequence (CL, / U < X) where tr,, E X for i ~ ~ cu h< A < c ~ 1 ,WC' c ) ~ s ( ~ I .t.llilt. v(-~
Awuf.
c
<
2. PARTITION CALCULUS FOR UNCOUNTABLE CARDINALS
223
the set U,,x B(a,) fl X of those p E X for which some {a,, 0 ) E B (for some v < A) is a t most countable (it is a countable union of a t most countable sets). Therefore, X - u,,A(B(au) U {a,)) is uncountable, and we let a~ be its first element. Clearly ax # a, and {a,, a x ) E A , for all U < A . The srlt. il H = {a, I v < w ~ satisfies ) IHI = Nl and [H]' C A . Actually, the theorem holds for any infinite cardinal K (in place of N I ) . i.e., n --+ (K,No):. The proof for regular K is a straightforward modification of tlie one just given for K = N I , and is left as an exercise. has an uncountable homogeneous To guarantee that every partition of [5J2 set, the underlying set S must be more than just uncountable.
2.3 ErdBs-Rad6 Partition Theorem
(2'(1)+
+
(N1);,, .
Assuming the validity of tlze Continuum Hypotllesis, this reduces to N 2
( N I )2,,,-
-
We denote (2"")' by A and let { A n ) n E N be a partition of [X]'. For any pair {a,0 ) E [A]%e let n ( a , 0) be the unique n for which { a , 0 ) E A, ( n ( a ,0) is the "color" of { a , P)). For every a < X we construct a transfinite sequence f, recursively as follows: fa (0) = 0; If (fa(7j) 1 7j < c) is defined (< < a ) and if there exists some a < c~ suc11 that a # fa(v) and n(f,,(v),u) = n ( f = ( ~ ) ~holds a ) for all ;rl < c, then we let L(<) to be the least such a; otherwise we stop. We note that dorn fa is an initial segment of CY (or CY itself) and f,, is an increasing sequence of elements of a. We claim that for some a < X, I dorn f, l > N I . Assume, to the contrary, that 1 dorn f,,l 5 No for all a < A. Consider t,he stJclue[ic.esg,, : rloili(j;,) 4N defined by g,,(rl) = ?L( fa (v), a ) . We note that g, = gp inlplies fa = f J : (:lc~t~.ly dorn f, = domg, = domgp = dorn fa, and if f,(v) = f $ ( ~for ) all 7j < <, then r l ( f,,(rl), a ) = n( fa(q), a) holds if and only if 7a(fd(q),a } = 7 t ( f d ( r / ) . 3 ) (for 7j < E), showing ttlcrt fa(<) = fp(<) also, by definition of f . Our assu~nptio~l implies that dorn g, is a countable ordinal; hence there are only N I . (N:") = 2u0 possibly distinct 9,'s. Therefore there exist P < a < A such that g17 = g,,. I i t l ~ i c , ~ f~ = fo- AS then n(fa(v)l P) = n ( f p ( ~ ) , P )= SD(V)= Y ~ ( v=) n ( f o ( ~ ) tfor ~) all v E y = dorn fa, it follows that fa(?) is defined (as P). This is a contradiction that proves the claim. 1% now fix a with ldom fal N I and let X = ranf,. Then 1x1 2 h', and the set X hc?s the property that, for any U , T , T ' E X , a < T , a < 7 ' . n(a. r ) = n(u, 7'); i.e., (U,r ) E An if ancl only if {a.r ' ) E A,l. Ict B,, = ( a E X 1 {a, T ) E A , for some (or all) T E X, U < r ) . 'rile collectioll (B,) is ii partition of X and 2 N I implies that for some n E N , ( B , / > N I . Clearly [BnI2 A,,, so H = B, is the desired homogeneous set. P?-oof.
>
1x1
It is not too hard to generalize this result. First, we define the transfinite sequence of beths, the cardinal numbers obtained by iterated exponentiation.
C H A P I ' E R 12. COAdBIXA'I'ORIA L SE?' 'I'HEO R \ '
224 2,4 Definition
I*= No;
I,+]= 23-; 7~= sup(^,, I u < X) for limit X # U. A more general version of the Erdos-Rado Partition Theorem can now stated. 2.5 Theorem (In)+ -. (N,);:' iiulds
107
ull
ri E
N.
Tlie proof is by iriduction or1 n, following closely the special case n = 1 in Theorem 2.3. There are an:ilogous results for ariy infinite c;irdinnl h. see Exercise 2.5. Many results of sirnilar nature, both positive :i11(1 ricbgt~tiirc~. have been obtained by researchers in the part of set theory known as infinitary cornbinatorics. One problem iri particular h a played a very iruportii~lt.r o l t l iri the developmerit of moderri set theory. It is the question of wlittt11t.r thc3t.t~ are any uncountable cardinal numbers K. for which an analogue of Ramsey's Theorem holds. 2.6 Definition All uncountable ciirdilial Ilolds for all F , s E N - { U ) .
K
is called weakly compact if
K
-+
(K)::
2.7 Theorem Weakly compact cardinals am strongly inaccessible.
PT-uof. \Ve liave to prove that a wr:iikly conipact c:arctiii;~lr; is r.egulal. a11t1 a strorig limit cardinal. ( i ) Regularity: Assume
Define a partition of ( CL,
K.
P, where X < K and 1P,I < r; ibr ;ill v < A.
=
[K]'
by:
P ) E AO if and only if thcre is v < X sucli that
E P, n r ~ t lljl E P,,:
( a ,0)E AI otherwise. Clearly (Ao, AI ) caiiriot havcb
:i
liorriogeneous set of' ctirdilltility
(iij Strong limit property: Assurne X < K 5 2" By Exercise 2.1. 2A hence K f , (X+); and, as X+ ti, ti (K);.
<
+
r;.
+ (A+):. ill
Exercises
+
2.1 Prove that 2" (K.+); for any infinite K . [Hint: Follow the proof of Tlieorem 2.1, but replace ( R . < ) by t110 scht (O? 1)" ordered lexicograplliciilly (see Chapter 4 , Sec:tioli 4).J 2.2 Prow. that K (L.NO); holrls for all infinite regular K . 4
3. TREES
225
2.3 Prove: If X is an infinite set and 5 and < are two well-orderings of X , then there is Y C X with IYI = sucl.1 that yr 5 32 if ancl only if y l < y2 holds for all yl , yz E Y. [Hint:Let A = {{X,y } E [X]'I X < y a11cl z 4 g } and B = {{..c. y} E [X]'/ :c < y and .c + y ) . Apply the Errliis-Duslluik-h3illt.r Thtsorcw1 allcl sliow t,hat B cniinot have an infinite honloger~eousset.] 2.4 Prove that (I,)+ -+ (N~)::'. [Hjllt: Induction.] 2 . For any cardinal K define expo(r;) = h-, cxp,, ( K ) = 2rx~'11(h). Prc)w: For any infinite K . (expn(tc))+ (K+):+', 111 part.ic~llar,(2")+ -4 ( K + ) : .
1x1
-
3.
,
Trees
Like partitions, trees originated in finite combinatorics. but turned nut t o bp of great ilitfarest t o set tlicorists x s ivell.
3 , l Definition A tree is a n ordered set ( T ,<) which Ili~s least ttlcb~~ent ii11(1 is sut:li that, for every X E T, the set { y E T 1 y < X ) is well-ordcred by 5 . Sce Figure 1.
Elements of T are called nodes. If X ,y E T and y < X, we say that y is :L predecessor of X and X is a successor of y . T h e unique least element of T is t h ~ root. By Theorem 3.1 in Chapter 6, the well-ordered set { y E T I y < 2:) of (211 predecessors of X is isomorphic t o a unique ordinal number h ( z ) .tllc hcight of .T. The set ?', = {:I: E 7' ( I z ( r ) = fi} is the trth. 1elrc.l o f 7'. It' I,(.z:} is i i s~ic~c.t~sso~ ordinal, X is called a successor node, otherwise it is i l limit node. 'l'l~t! least c4 for which T, = 0 is called the height of the tree T. h ( T ) . A branch in ?' is a maximal chain (i.e., n linearly ortlererl s1rI)set) iu T. ' l ' l ~ r . order type of a branch b is called its length and is denoted [ ( b ) . I t is always ill] ordinal number less than or equal t o the height of T . A branc:h whose Icngtll equals the height of' the tree is cat led cofinal. A subset T' of T is a subtree of T if for all X E T', y E T,y < X implies y E T ' . Then T' is also a tree (when ordered by <) rind TL = T, n T' for a11 u < h ( T f ) . T h e set T(,) = UOCaTo is a subtree of T , for lr~lycr 5 11(17).i i l l l l h ( ~ ( "= ) )u. If X E T, then { y E T I y < X} is a branch of T ( " )of le~lgtli cl.: hoivctvc.r, "l'(&) nliiy have other branches of l ~ i l j i t l lr , ;is nw~l1(if 11 is i i l i i i ~ i t orclinal). Finally, a set A '7' is an antichain in T if niiy two distinct elements of A are illc:omparable, i.e., X ,y E A, X # y implies that neither X < g nor y < .I,. T h e reader is strongly urged to work out Exercises 3.1 and 3.2, where solrie simple properties of these concepts are developed.
c
CHAPTER 12. COMBINATORIAL S E T THEORY
Figure 1
It is time for some examples of trees. 3.2 Example (a) Every well-ordered set ( W, 5 ) is a tree. Hence trees car] t ) ~ viewed as generalizations of well-orderings. h ( W ) is the order type of' H,': the only branch is W itself, and it is cofinal. (b) Let X be all ordinal number and A a nonempty set. Define A<" U,,* A" t o be the set of all transfinite sequences of elements of A of length less than X. We let T = A C Aand order it by C;so f 5 g for f , g E T means f C_ g. i.e., f = g r dom f . It is easy t o verify that T is a tree. For f E Y'. h ( f ) = a if and only if f E A,; i.e., T, = A . For a = B + I and f E A". f 1 0 is the immediate predecessor of f , and all f U { ( P . U)), U E A, arcb immediate successors of f . Branches in T are in one-to-one corresporlde~l(:t> with functions from X into A: if F E then {F 1 U. ( CL. < X) is brarl(:h in T. Conversely, if B is a branch in T then B is a compatible systerri of' functions and F = U B E A A . We note that all branches are cofinal. (c) More generally, if T 2 A<' is a subtree of (A", G),branches in T are in one-to-one correspondence with those functions F E A<' U A A for which F 1 a E T for all a E d o m F and either F 4 T, or F E T and F has no successors in T. We often identify branches and their correspondi~ig ftlnctions in this situation. (d) Let A = N, X = U . Let T G N < be ~ the set of all finite decreasing sequences: i.e.. f E 7' if and only if f(i) > f(j) holds for ill1 i < ,l < dorn f E N . Then T is a subtree of (N'". 2).T is a tree o f height which has no cofinal branches (see Exercise 2.8, Chapter 3 ) . ( e ) Let (R, 5 ) be a linearly ordered set. A representation of a t.rw (l'.+) /):I/ i~~tervals in R.5 ) is a one-to-one functiori which assiglis to tbacli ;r. r 7 ' Q
3. TREES
an interval @(X) in (R, 5 ) so that, for
X,
y
T,
>
(i)
X
=S y if and only if @(X) @(y);
(ii)
X
and y are incomparable if and only if @(X) n @(y)= Q).
This implies, in particular, that (@[Tj,2 ) is a tree isomorphic to (T. <). For example, the system (D, ( s E S) constructed in Example 3.18, Chapter 10, is a representation of the tree S = Seq({O, 1 ) ) = {0,1)<" (ordered by C) by closed intervals on the real line. The study of finite trees is one of the key concerns of combinatorics. Mre do not pursue it here, and turn our attention instead to infinite trees. We concern ourselves mainly with the question, under what circumstances does a tree have a cofinal branch. For trees whose height is a successor ordinal the answer is obvious: if h ( T ) = a + 1 then T. # 8 and {y E T 1 y 5 X ) is a cofinal branch in T, for any X E To. Henceforth, we concentrate on trees of limit height. Example 3.2(d) shows that there are trees of height w that have only finite branches. The next theorem, the most basic observation relevant t o our question, shows t h a t this cannot happen if the tree is sufficiently "slim." 3.3 Konig's Lemma If T i s U tree of height w , all leveEs of which are finite, then T has a branch of length U.
Equivalently, every tree of height w such that each node has finitely many immediate successors has an infinite branch. We use recursion t o construct a n infinite sequence ( c ~ >of~nodes = ~ of T so that, for each n, ( a E T I c, 5 a ) is infinite. We let c0 be the root of T and note that ( a E T I c0 5 a ) = T is infiliite. Given c, such t h a t {U E T J c,, _< a ) is infinite, we observe t h a t Proof,
where S is the finite set of all immediate successors of c, (Exercise 3.l(v)). Hence, for a t least one b E S , { a E T 1 b 5 a) is infinite, and we let c,+] be one such b. It is easy t o verify that { a E T I a 5 c, for some n E N ) is a branch in T of length W . There is an important point t o be made about this recursive construction, and many similar ones to come. The Recursion Theorem, ,as formulated in Section 3 of Chapter 3, requires a function g , whose purpose is to "compr~te" c,+lfrom c,. In the previous proof, we did not explicitly specify such a g. and indeed some form of the Axiom of Choice is needed to do so. For example, let k be a choice function for P(T).Let S, denote the set of all immediate successors of c in T. If we let g ( c , n ) = k({b E S, I { a E T I b 5 a ) is infinite)), then c , + l = g ( % , n ) defines (c, ) n E N ) in conformity with the Recursion Theorem.
CHAPTER 12. COMBINATORIAL SE?' THEORY'
228
It is customary t o ornit this kind of detailed justification. and wc3 rlo SO from now on. The reader is encouraged to supply the details of the next. fr.1~ applications of the Recursion Theorem or the Trarisfinite Recursion Theorerr) as 2111 exercise. Returning t o Kiinig's Lemma: there are several interesting ways t o generalizt. it. For example, it is fairly easy t o prove that any tree of height K whose levc~ls are finite has a branch of length K (Exercise 3.3). We r:onsider the fullowillg question: let T be a tree of height w l , each level of which is a t most cx)urit;il1lt1: does it have to have a branch of length W ] ?It turns out that the answer is ..no."
3.4 Definition A tree of height w l is called ari Aranszajn tree if all its levels itre a t most countable and if it has no branch of length wl. 3.5 Theorem Aronszajn trees of height wl exist.
Proof. We construct the levels T,, trarisfinite recursion ill such a way that (i) T,,
c U"; 1?',1
< wl, of an Aronszajll t r w
1)y
I No;
(ii) If f E T, then f is one-to-one iind
(iii) If f E To and
CI
0 < u then f 1 /Ir
E
(W
- ran f ) is infinite;
To;
(iv) For any 0 < a , any g E To, and ariy firiite X E such that f 3 g and ran f n X = 0.
-
rat1 g. tliere is f E l ; ,
Let us assume this has been done and let us show that T = U,,,l T,, is thrn an Aronszajn tree. Clearly T is a tree (by ( S ) ) , each level is at most countablrh (by (i)), and its height is wl (by (iv), each T, # Q)). If B were a branch of Icrlgth w l in T, then F = U B would be a one-to-one functiori from ul into w (by (ii)). a contradiction. It remains t o impIement the construction of T,'s. We let 'To = (0). Assuming T,, satisfying (i)-(iv) has been constructed, we let T,+ 1 = { g {~( a a, ) ) / 9 E T,,, U E - r a n g ) . It is easy to check (i)-(iv) hold for a + I . It remains to corlstruct T,, for a limit. For any g E To,fl < a , and any fir~it,t> X C (w - ran g) we construct a particular f = f (g, X ) by rccursiori as follows. ~ a0 = fl and sup{on ) n E N ) = c t . Fix :in increasing sequence ( Q ~ ) ? , S~ I I C that w - r a n f o . Having defined f, E T,,, antl Let fo = g E Ta,, arid X. = X finite X, w - ran we first take some finite 3 X,, X,, 1 2 w - riin f, (that is possible because the last set is infinite, by (ii)) and then setec:t sc)lr~c' fn+1 E ?h,,+,such that fn+1 3 f n and n ranfn+1 = Q) (possible by (iv)). j n . Clearly f : u w , f is one-to-one (because all f n are), We let f = X.) = Q),SO W - 'a11 f is infinite, arid ran f n X = 0. T l l l ~ sf rari f n well. satisfies (ij). For ,h' < a , f 1 fl = f,, 18 when p < Q,, so (iii) is satisfied We put this f = f (g, X ) irito 1; for each g E U*<, T[, and each firlit,r X C_ w - ran g. Tlius (iv) is satisfied. As I U,,,, TjI <- , T < H,, ( \ ) v
c
I,,
-
3. TREES
229
inductive assumptio~i(i)) and the number of finite subsets of w is countable. t,he set T, is a t most countable, and (i) is satisfied as well. More generally, a tree of height K ( K an uncoutltable cardinal) is called an Aronszajn tree if all its levels have cardina1it.y less than r; and there are rio branches of length K . The question of existence of such trees is very r:omplicatetl and not yet fully resolved. It is easy to show that they always exist when r; is singular (Exercise 3.4). Of particular interest are uricountable carrlinals for which all analogue of Konig's Lemma holds, i.e., no Aronszajn trees of height r; exist; such cardinals are said to have the tree p7.0pe7-ty. It turris out that stror~gly inaccessible cardinals with the tree property are precisely the weakly conlpact cardinals defined in Section 2. Exercises
3.1 Let (T, 5) be a tree. Prove
(i) The root is the unique r E T for which h ( r ) = To # Q). (ii) If u # 0 then T, n To = 0.
C).
In particular.
(iii) T = ~ ( ~ ( =~ Uo
E T is a successor node if and only if there exists y E T such that, y < X and there is no z E T for which y < z < X holds. If X is
X
a successor node then such y is unique; it is called the i m m e d i a t e predecessor of X, and X is called a n immediate successor of y. (Each node has a t most one immediate predecessor, but it may have many immediate successors.) (v) If X < y then there is a unique z E T sucll that X < z 5 y arld z is an immediate successor of X . 3.2 Prove: (i) Every chain in T is well-ordered.
(ii) If b is a branch in T, X E b, and y < X , then y E b. (iii) If b is a branch in T , then Ib n T, I = 1 for u < [(b) and J bn T,,( = U for a > ((b). Conclude that [(b) 5 h(T). (iv) Show that h ( T ) = sup{!(b)
I b is a
branch in T ) . (v) Show that each T, (a< h ( T ) )is an antichain in T . 3.3 Let (T, 5 ) be a tree of height K ( K an infinite cardinal), all levels of which are finite. Prove that T has a branch of length K . [Hint: Let U, = {X E T, I [{y E T I X 5 y}l = K). Prove that (IU,I I cr < K ) C N is bounded; let nt be its maximum (m > 1 ) . Show ttiat T has exactly m branches of length K .] 3.4 Construct an Aronszajn tree of height N,. Generalize to an arbitrary singular cardinal K.
CHAPTER 12. COMBINAllORIAL S E T THEORY
230
4.
Suslin's Problem
We proved in Chapter 4, Theorem 5.7, that the real numbers, with their usual ordering, are the unique (up to isomorphism) complete linearly ordered set, wit,llout endpoints t h a t has a countable dense subset. As an immediate consequenrc.. every collection of mutually disjoint open intervals in (R, <) is a t most coulitable (Theorem 3.2 in Chapter 10). Suslin asked in 1920 whether the ordering of real numbers is uniquely characterized by this weaker property. 4.1 Definition A Suslin line is a complete linearly ordered set without tlntlpoints where every collection of mutually disjoint open intervals is a t most, countable, but where there is no countable dense subset.
The famous Suslin's Hypothesis asserts that there are no Suslin lines. It is now known that Suslin's Hypothesis can be neither proved nor refuted from thv axioms of Zermelo-Frwnkel set theory with Choice. In this section wc establish a relationship between Suslin lines and certain kinds of trees. In the next sectiol~ we consider some additional axioms of combinatorial nature that lead t o a n answer to Suslin's problem. Theorem 3.5 makes it clear that trees of height w l with countable levels iil.tb not always "slim enough" to have a cofinal branch. However, we can impost. ki stronger condition: we can require that all antichains (not just the levels) 1)e countable. 4.2 Definition A tree of height wl is called a Suslin tree if all its antichains are at most countable and there are no branches of length W * . Clearly a Suslin tree is an Aronszajn tree, but the converse need not !)ta true. As the name indicates, there is a close connection between Suslin t19r:es and Suslin lines. It is this connection that we proceed t o establish. Let (S, <) be a Suslin line; we recall that every complete ordering is bv definitio~idense. We construct a tree T c N<W'and its representation open intervals in S using transfinite recursion of length wl. More precisely. \ve construct the levels T, C N and mappings Q, so that, for each a < w l . t ) j p
(* 1
Q
IT,/
< NO, T(a+l)= UD5,To is a tree, and
@(,+l)
=
UU5,
is a representation of T(&+') by intervals on (S,<).
We let To = (81, Q(0) = S. Assume T, C Na has been constructed and satisfies I*). For each f E T, and TL E N - {U) we put f n = f U {((I. 7 1 ) ) into TQ+l.Let Q,(f) = ( U , b); using density of ( S ,<) we pick an increasing sequence a = a1 < a2 < as < . - - < 6 and let Q,+,(fn) = (U,, u n + l ) for rL 2 1 . If ao = ~ u p { a , } ~ = < ~b we also put f o = f U {(a,O))into T,,+] ; ~ n dset Qa+1[ fo) = (ao,6). It is clear that the condition (*) holds for a + 1. Now let a < wl be a liniit ordinal. We assume that To and Qa satisfying I*) have been constructed for all 4 < o. It is then clear that T ( ~=) UU,, l>is i t
4. SUSLZN'S PROBLEM
231
<
tree, IT(")/ No, and $(Q) = Ug<, GO is a representation of T(") by intervals in S. Let f be a branch in ~ ( " 1 ; we can think of it as an element of N o . Then Q ( , ) ( f 1 0)= (ao, bci) is an interval in S and /3 < y < a implies ag 5 a, < b, 5 bp. P u t a = supa<, aa, b = infp<, bg (here we use the completeness of S);clearly a 5 b. If in fact a < b, we put f into T, and define Q,( f ) = ( U . h). It is easy t o verify t.hat T(&+')= T ( , ) U T, is a tree and @ ( , + l ) = U Qn is its representation (for example, if f , g E T, are incompatible, we consider the first p where f (P) # g@); we then have f 1 (B 1) # g 1 (P I ) , I ( p 1)) = 0 by inductive assumption. and so @(")(f ( P 1)) n so @(f) n @(g) = 0). The set T, is a t most countable, because otherwise {Q(,)(!) I f E T,} would be an uncountable collection of mutually disjoint intervals in S. Thus (*) is satisfied a t stage a . This completes the recursive construction. We let T = T, and Q = Q,. Clearly T is a tree and represents it by intervals in (S. <). WP claim that T is a Suslin tree. It is clear that T has no uncountable antichaills: the image of ariy antichain in T by the one-to-one function @ is a collectio~~ of mutually disjoint intervals in S and so it is at most countable. We next show that T has no branches of length wl.
r
+
+
+
+
U,
4.3 Claim Let (T, 5 ) be a tree where each node has at least two immediate successors. IfT has no uncountable antichains then T has no branches of length 2 w1.
Proof. Let (X, 1 a < w l ) be a chain in T , where X, E T,. The set of immediate successors of X, has a t least 2 elements, so let y,+l be a n immediate successor of X, different from x,+l. Then (y,+l I a < w l ) is an antichain in T. It remains to show that T has height w l , i.e., that each T, # 0. We proceed by contradiction. Let h < wl be the first ordinal for which TE = Q); it is clear from the construction that h has to be a limit ordinal. Let C be the set of 1 ; C is a t most countable, and all endpoints of all intervals @(g),g E ~ ( ~then cannot be dense in S. In other words, there is an interval (c, d) disjoint with C. We use it to construct a branch in ~ ( as~follows: 1 we let F(0) = 0 and note (c, d) G S = @(F(O)). Given F ( & ) E T, such t h a t (c, d) 2 @ ( F ( u ) )= ( Q , b) and t h e sequence a = a1 < a2 < ag < - . < a0 5 b used t o construct we note that C n (c, d) = 0 implies that there is a unique n E N such that (c, d) 2 (an, a n + l ) (or (C, d) G (ao,b), in case n = 0); we let F(a+ 1) = F(&) U {( CL, 7 1 ) ) for that n. At cr limit, @ ( F @ ) ) 2 (c,d) by the inductive assumption, so g = U U
no,,
>
n,
CHAPTER 12. CObIBINATORIAL SET THEORY
2 32
(i) The set of all immediate successors of each node is couritable. (ii) If X,?/E EL for limit a < h ( T ) arid { z E T t11en X = y.
I z <X)
= {z E T
I
z
<
!I).
We call trees with properties (i} ancl (ii) regular. T h e arguments just ~ornplet~etl prove tlre following theoreln. 4.4 Theorem If a Suslin line exists then there exists a r-egl~la7. S T L S ~Irtre. Z~L 'rile theorern hcls a converse.
4.5 Theorem
If a regular Susli7~tree exists then there is a Suslin line.
'I'he existence of Suslin lines is thus equivalent to the existerice o f regulal. Suslin trees. It is in fact equivalerit to the existence of Suslin trees, but W(?o ~ i l i t the urlappealing proof of this fact. Let (T, 4 ) be a regular Suslirl tree; without loss of generality n1r> Proof. (:;it1 issulne that T C_ A"' for somcaset A, and 6 is 2 (Exercise 4.3). For tni\(.11 f E T the set Sf of immediate suc:[:cssors of f is r:ountable, and we fix i~ ( l t ~ l ~ s ~ liritiar orclerilig 4~ of it, without erl(lpoints. We note that the length of ; I I I ~ branch b in T lnust be a lirnit orrlirial (because each node iri T hiis succ:essors) less than wl (because '1' is Suslill). We let B be the set of all l)ril~l(:t~os of T. nrld order it "lexicographi(:~~lty."More precisely, we view B as n s~~bsc-.t of U(T" ) CL < w1, (L. lirriit) i111d for b. bf E B, 6 # 6'. L V ~put
b < 6'
if and only if
b
r
(Q+
6' r ( o
l)
+ 1)
where ( 1 is the lemaximal chains.) Argunletnts sirni1;ir t o tlroscb used i r i tile proof of Theorem 4.7 irr Chapter 4 show that < is a lirleiir or(leriug. If b < 6' and a is as above, there exist g E Sbllrsuch that b (ck + l ) +,,1(? !I < I , : , , b' (u + 1) (by density of Ariy branch b" 2 g then satisfies b < 6" < h'. so < is derise or1 B. Similar argunients show B has rio least or grc;ltest c ~ l c ~ ~ ~ r c ~ ~ l t . . We clczirn t h a t every systenl of' ~ r l u t ~ ~disjoint ~ l l y irltervals i l l B is ilt n~ost. countable. Assume to the contrary that {(bi,bi) 1 i < w l ) is a disjoint systern. Let a, be tl.ie first ordinal where b , ( r ~ , )# b:(cr,) and let y, E Sb,l,,, be' ~ 1 1 ~ that 11 b, 1 (a, + 1) 4 g; 4 b: 1 ( I Y , + l ) . It is easy to check that {g, I i < U , ) is r i l l arltichain in T , a contradiction. Finally, we show that B does riot have a c:ountrrble tlr?rise subset. Let C' := {b,, 1 71 E N ) be one. Let cr = sup(l(b,) 1 71 E N ) be tJie suprellIun1 of the leligths of the branches in C; :is [Cl . INo arid each t ( b , ) < IJI , regularity of ul implies cl: < w l M well. But 7; # 0,so take some f E T,,, two of its i~liriiecliat.~ successors, say f l < J f2, and sollie branches b l f 1 , h2 3 j 2 . IZil ~ I I P I I 1 l ; l ~ ' c h Ill < b2, but t,he interval (01, b2) is clearly disjoint with C. I11 sullirnary, (B. <) hiis all thc. properties of a Susliri line except c:onlplot.r,less. \VC.let (B,<) t ) tlie ~ Derlrkind r o ~ ~ ~ p l e tof i o (~Bl ,<) as colistrrtct.c~r1ill
>
5. COMBINATORIAL PRINCIPLES
Section 5 of Chapter 4. I t is an easy exercise (Exercise 4.4) t o verify t h a t is a Suslin line.
233
(B. <)
Exercises
4.1 Let (S, < I ) be a Suslin line, ({a),<*) a one-element set with its unique the real line. Show that the sum of t.he ordctrt>d ordering, and ( R , is a Suslin line. sets (S,c l ) ,( { a ) ,< 2 ) , and (R, 4.2 Let ( S , <) be a Suslin line; prove t h a t IS1 5 2H". [Hint: Let T and @ be as in the proof of Tlleorem 4.4; by letti~ig5 = in the argument showing that T has height w l we get a set C of' ICJ= N1 which is dense in S. Show t h a t each n: E S is a limit of a seque~lceof clement,^ of C; then apply Exercise 3.1 iri Chapter 9.1 4.3 Show that a tree has property (ii) from the dcfi~litio~l of regularity if' 2t1ld only if it is isomorphic to a tree of transfinite sequences (as in Exn~llyle 3.2(c)). 4.4 Let (B, <) be a dense linear ordering without endpoints that h a s all t,lie properties of a Suslin line, except completeness. Show that its Drdekil~d completion (B,<) is a Suslin line. [Hint: If (B, <) had a countable dense subset, then it \voulcl be isor~iorphic to the real line. But every infinite subset. of R has a countable tle~lscb subset; so B would have a countable dense subset.] 4.5 A regular Suslin tree (T, 5 ) is normal if for all a. < P < w l and all r E Y', there exists y E T' such that X < y. A Suslin line (L, 5 ) is proper if no open interval in L has a countable dense subsrt. Show t h a t a Suslin t.rt.>e is normal if and only if the Suslin line constructetl from it as in the proof of Theorem 4.5 is proper. 4.6 Prove: If a Suslin line exists, then a proper Suslin line exists. [Hint:Define a n equivalence relation on the Suslin line ( L , 5 ) by X g if and only if the interval with endpoints X ,y contains a countable dense subset. Show that each [X] contains a countable dense s u b s e t Let = {[X]1 X E L); let [X] 6 [y] if and only if X 5 y and show t h a t ( L , 4 ) is a proper Suslin line.]
-
5.
Combinatorial Principles
Suslin's Hypothesis can be neither proved nor refuted from the axionis of ZFC. This was shown in the sixties by constructing models where the axioms of ZFC are satisfied and Suslin's Hypothesis fails, as well as such nlodels where it is true. T h e study of models of set theory requires familiarity with formal logic anti is beyond the scope of this book, except for a brief introduction in Chapter 15. However, early in the development of the subject researchers isolated ;I lumber of general principles of combinatorial nature which hold in one or another moclel of ZFC and have, as a consequence, a definite answer to Suslin's Problem, as well as to a number of other questions. If one is willing to accept such a principle as an additional axiom of set theory, one can then, by purely classical methods,
234
CHAPTER 12. COMBINATORIAL SET THEORY
without deep study of logic, prove from it results unavailable in ZFC alone. This has become the way to proceed in general topology and some parts of abstract algebra. We demonstrate this approach here by two examples. Jensen's Principl~ Diamond and Martin's Axiom. We want to stress that these axioms c10 not have the same epistemological status as the axioms of ZFC. At the present state o f set theory there are no compelling reasons to believe that one or another of thc:rrl is intuitively true. It is rnerely known that they do not lead to contr;~dict.ioll..;. assuming ZFC does not. Their significance lies in systematizing the rrlorass o t corlsistency results obtained by the technique of models. The first combinatorial prirlciple we consider is J ~ n s e n ' sPri~lciple0 .
Principle 0 There exists a sequence (W, ) cu < w l ) such that. for eacll cu < w l , Wa C P ( & ) ,Wa is a t most countable, and for any X C wl the set. {cu < wl I X n a E W , ) is stationary. Principle 0 is known to hold in the constructible model (see Chapter 15. Section 2 ) . Here we prove two of its consequences; others can be found in the exercises. 5.1 Theorem If 0 holds, then 2Nu= N 1 .
If X C w C_ wl the11 Sx = { a < w 1 I X n cu E W,) is stationary. Proof. For cu E Sx, a 2 w this implies X = ,Y n a E W a . So P ( w ) G Uai,, W,. hut. the cardinality of the last set is a t most ~ , , , 1 No = N o . N I = NI. Cl 5.2 Theorem If 0 holds, then Suslin lines exist. The key idea of the proof is as follows. As usual, we construct T = U,,,, Y;, by transfinite recursion. We want to make sure that T has no uncountable antichains. As IT1 = N 1 , there are 2 N 1 > N I subsets of T , hence more than N 1 sets X C T which our construct,ion h a s to prevent from beconling :intick~:iill..;. The crux of the matter is being able to take care of more than NI sets ill h'l steps. It is here that 0 is used. Roughly speaking, we proceed in such a way that those A E W, that are maxirnal antichains in the subtree T ( " )(:onstruc.t,etl before stage a can no longer grow -- they remain maximal antichains for tilt. rest of the construction of T. 0 implies that when X is a maximal antichcziri in T , then X n T ( ~is )a maximal antichain in T ( ~and ) X n T ( ~E W,, ) for sonie cu < w l . As X n T(") subsequently does not grow, we must have X = X n T ( ' ~ ) so in particular X is a t most countable. We now proceed with the details. One minor problem arises because 0 applies to subsets of u l ,while we pIan to construct T as a subset of W<,'. The following lemma takes care of this. 5.3 Lemma 0 implies that there exists a sequence (2, ) cr < w l ) such that, for each a < w l , 2, L wia, ZU is at most countable, and for any X C t ~ < " l the set {u! < w l ( X n w C U E 2,) is stationary.
5. COMBINATORIAL PRINCIPLES
235
Proof. We note that I w < ~ ~ I = I W"[ = 2"1) = N 1 because 2"" = N 1 by Theorem 5.1. Let F be a one-to-one mapping of W<"' onto w l . We claim that SF = {a < w1 I sup F [ w C a ]= a ) is closed unbounded. It is clear that SF is closed. Given P < wl we define recursively: a0 = 0, Qn+l = sup F [ w < ~ and ~ ~set ] ,a = sup{a, I n < W ) . It is clear that P 5 a < wl and a E SF. We now set 2, = {F- '[A]n wCa 1 A E W,) if a E SF, Z, = 0 otherwise. Clearly 2, C_ W < , and is a t most countable. If X C W<"' then F [ X ] w l ! so S = {a < wl I F [ X ]n a E W,) is stationary. Then S n SF is also stationary and a E S n SF implies 2, 3 F - ' [ F I X ] n a] n w < " = X n W < " . U
c
Proof of Theorem 5.2. We construct a particularly nice Suslin tree. 5.4 Definition (see Exercise 4.5) A regular tree ( T ,5 ) is normal if for all a < /? < h ( T )and all X E T,, there exists y E To such that X < p.
We construct Ta by transfinite recursion, so that IT,[ 5 No and T ( " + ' )= Uolu TBis normal, for all a < W'. We put To = 10) W'. Given To C W" such that lTal 4 N o and T ( ~ +is' )normal, we let To+, = { f W {(a,n ) } I f E T,, n E U ) and note that IT,+ll 5 No and T(,+') is normal. Now let a be a limit ordinal. By the inductive assumption, all T B ,,f3 < (.v, and hence also T(,) = UBCuTo, are normal, and IT(")I 5 CO<, ITo/ 5 N o . Let {C, 1 n E N ) be the at most countable collection of those elements of Z, which happen to be maximal antichains in T ( o ) .
c
5.5 Claim For each f E T(,) there is a branch b of length a such that f C_ 6 and for each n E N , there is some g E G, such that 6 2 g .
Proof. We fix an increasing sequence of ordinals (an I n E W ) such that a0 = dom f and supnEwa, = a. We construct b by recursion. n ~with dom b,, a,, there is some g E C, comparable Let bo = f . ~ i v e b, , the with 6,; otherwise, C,,U {b,) would be an antichain in T ( o )contradicting maximality of C,. If a,+l 5 domg we let b,+ l = g ; otherwise we take some 6,+ E Tar,+ L such that bn+' 2 g U b, (it exists by normality of ~ ( ~ 1 Letting ) . b= U ,,, h, we have dom b = sup{a, I n E W ) = a , and the other properties of b required by the Claim are clearly satisfied. il
>
Returning to the proof of Theorem 5,2, for each f E T(,) we choose one branch bl as in Claim 5.5, and we let Ta = {bj I f E T ( " ) } .It is clear that T("+l)is a normal tree and I T ( ~ + ' ) I No. The key observation is: each C,, remains a maximal antichain in T("+'). This is so because each b E T, is comparable with some g E C, (in fact, g C_ b). This completes the recursive construction. We let T = U,,wh Ta and note that T is a normal tree of height wl. In view of Claim 4.3, it remains to show that T has no antichains of cardinality N I . Let X be such an antichain; we call assume X is maximal.
<
CHAPTER 12. COMBINATORIAL SET 'THEORY
236
5.6 Claim Sx = { a < unbounded.
wl
I X n?'(")is
a maxinzal antichazn in
~ ( " 1 )
is clo,s~d
P7'ooJ Let 0 < wl be arbitlxry. We construct a sequence (U,, ( 71 E dL.'jI)\. recursion: (1" = B- Given a,,. the set T ( " ' ~is) a t most countable, and fur (>very f E thew is some g j E X roiilpnrable with j (otherwisr. X woulcl i a ~ tI)i. iL nlaximt~lantichain). We let tr,,+l = sup({(dom y j ) + 1 I f E T " " ~ ~ L)) ( } (I,, } j. If tr = s t i p { ~ , ( n E W ) we have 0 < 0 < L L ~and I each f E T ( ( ' )is ( ~ o t ~ i p a ~ ~ i . ~ I ~ l + ~ with some eltment of X n Tr"), so X n T(") is ;l rnaxir11:tl ;silt.ic.:hai~iill '1''"). . This shows Ss is urlbouncled. I t is tliLsy to see that Sx is clostxl. .~
(
~
1
1
)
:
1
We now coxnplete the proof of Theorem 5.2. Lemma 5.3 provides a E Ss . ( I limit, such that X nw'" E 2,. So X n ~ (is a~maximal ) antichaill i r l i~ll(l belongs to 2,. By our co~istructionarld the key observatiorl, X n Y'((') renlair~s a maxirrlal antichain in ?'(nf '1. But then i t remains a trlaxin~alantichnirl i l l l ' ! Indeetl, if j E 7' - T("+') then f (k E T("+') ;intl hence J 2 f (1 2 g t;~l. some X n F''),i.e., f is coirlparable with some y E X n ?'("). It follows that X = X n ~ ( " 1 . and in particular, (XI . IIT(fi)I5 No. rl
r
L
'rhc secorlcl conlbinatorial principle we study in this sectiorl has ~loll~xistc~ii(.i~ of Suslin liues as one of its many consequences. Before stating it we ncl~tlto ixltrocluct~some terminology.
c
5.7 Definition Let ( P , <) be an ordered set. We say that a set C P is ( : ~ J ' I ~ ( L I in P if for every p E P there is g f C stlch that p . I(l. A set D G P is dir.ct-icd if for till d l , d z E D thert? is cl E D such that dl . Id, d2 5 d. A set. A c P is it lowe7- set if a E A , p E P , p U irnplies p E A. Let C be a collectiotl of c.ofilli~l subsets of P. A set G c P is ct~lletlC-gene7ic if G is it clir~.c:te(l1owc.r set i ~ l i { l G n C # 8 for each C E C.
<
T is (lirected if and orlly if D i,s (I 5.8 Example Let (T. <) be a treti. D chain. Assume T is normal. Tlleil T ( u ) = Un,pTi, = { Y E T 1 tht:rr rxists .r- .: T,, such that z 5 g} is cofinal in T. for each a 2 h(7'). A set G T is C-grllc.l.i(, for C = {T(o)I ck < h ( T ) } if ;u~clo~llyif G is a brancdl of lellgtkl /1(7')i l l '1'.
c
The following easy theorem is
;L
furidanle~ltalfact about generic sets.
5.9 Theorem Let C be a collection of cofinal subsets of P with )Cl 5 NO. Tlrt.11. for every p E P there exists a C-generic set G such that p E G .
I
Proof. Let C = {Cn n E N } . We construct a sequence ( p , I n E N ) recursively. We let p" = p; given p,, we let p,,+l = q for some q E C,., such that l)n< Y . Fitlally we let G = { r E P I r- Pn for some 71 E N}. This nlitkrs G ii 7 i L. lower set and G is clearly directecl and C-generic.
<
8-
As an application we prove Theorem.
L:
special c.ase (for R) of t,ht. Baire C';~trgorv
5. COMBINATORIAL PRINCIPLES
237
5.10 Corollary T h e intersection of any at most countable collection of open dense sets in R is dense. Let 0 be an a t most countable collection of open dense sets iu Proof. R. Let (a, b ) be an open interval. We consider the set P of all closed intervals [a, P] c (a, b): a,B E R, a < p, and order P by reverse inclusion 2 . If 0 C R is open dense, we let CO = ( [ a , p ] E P I [a.P] 0 ) ;it is easy to verify that. C'() is cofinal in P. We let C = {Co I 0 E 0 ) ;by Theorem 5.9: there is a C-generic set G. In particular, G is directed by 2. It follows that G is a collection of nonempty closed and bounded intervals with the finite intersection property. By Theorem 3.4 in Chapter 10, n G # 0. Clearly, any X E n G belongs to ( a ,b) n
no.
The interesting question is whether C-generic set,s exist for uncount,al~tc~ C. The next example shows they do not, in general. a s us~lal. So T is the tree of all = u : ~ ordered , by we let C,, = { f E T 1 finite sequences of countable ordinals. For each tr < ct. E rarl f }. C,, is cofinal in T: if f E T, dom f = 7 1 . t,hcn g = f V { ( ? I . t t ) ) E C,, arid f g . Let G be C-generic for C = {C, 1 cr < wl ). C: is a collect.ion of' finitc sequences and, being directed, any two finite sequences in G are co~npat.ibltb. Therefore F = UG is a function from a subset of w into ~ 1 As . I ran F I 5 No. there exists -y E w l - r a n F , but this contradicts C, n G # 0.
5.11 E x a m p l e Let T
As irl our study of trees, we have to restrict ourselvc~st,o sufficiently ..slinl" ordered sets to have a hope of positive results.
5.12 Definition Let (P. 5 ) be an ordered set. We say that p. q E P a r c r:o7rrpatible if there is r E P such that p 5 r , q 5 r ; otherwise they are incompat.iblc. An antichain in P is a subset A c P such that p, q E A, p # q implies p ;mtl arc in~:o~npati ble. We 11ot.e that when (P,5 ) is a tree, p, q E P are compatible if ailcl orily if' they are c:omparable, and so our definition of antichairrs agrees with t.hp prt.>vious one for trees. An ordered set (P,5 ) satisfies the coun.table antichnin condition if rvery ;iritichain in P is at most countable. Let ti be an infinite cardinal. We can now state Martin's Axtom for K . M a r t i n ' s A x i o m MA, If (P, 5 ) is an ordered set satisfying t h c.ouilt:iblt~ ~ antichairi condition, then, for every collectiorl C of cofi~lalsubsets of P n*it.t~ ICI . IK , there exists a C-generic set. We note that MAN,,is true by Theorem 5.9?so MAN, is the first. intcrestiiig case. We prove two of its consequences here, and consider sorne further ~xarrlpl(ls in the exercises.
CHAPTER 12. COMBINATORIAL SET THEORY
2 38
5.13 Theorem MA, implies that the intersection of any collection O of 0pc.n. dense sets i n R of (01 < K is dense.
Proof. Exactly the same as the proof of Corollary 5.10, with the refereric.t~ to Theorem 5.9 replaced by a reference to MA,. P satisfies the countable antichain condition on account of Theorem 3.2 in Chapter 10. E 5.14 Corollary MA, implies 2'" > K .
Proof. For each X E R, 0, = R - { X ) is open dense. If' 2*11 < - K the11 C3 = ( 0 , I X E R) is a collection of open dense sets in R with 101 . IK , hut 0 = 0 is not dense in R.
n
In particular, MAN, implies 2N" > NI, the negation of the Continuum HIpothesis. 5.15 Theorem MAN, implies that there are no Suslin lines.
Proof. Let T be a regular S~lslintree of height wl . We let f' = { t E T 1 {(I- F r } is uncountable). It is clear that T is a subtree of T and T satisfies t h ~ condition from Definition 5.4 (although 5? need not be regular, even if T is!); in particular, if'\ N I . For each cr < w l , p ( a ) = {y E F 1 .r 5 y for some z E is cofinal in T. Let G be C-generic for C = { ~ ( a I )a < w l } . The11 G is ii cofinal branch through p. hence through T , contradicting the assumption t,hi~t T is Suslin. 1-1
T1t
<
-
E,)
' 1
Martin's Axiom (MA) is the statement that MA, holds for all infinite r; < 2"". If 2'" = N 1 , then K < 2'" implies K = No and MA holds. However. it. is consistent to have MA and 2'" > NI. Thus MA can be regarded as a ger1er;ilization of the Continuum Hypothesis. R o m Theorem 5.13 we see immediately that, if MA holds, then the intersection of any collection 0 of open dense ~111)sets of R of i01 < 2"(l is dense. MA also implies that the union of less than 2'" sets of Lebesgue measure 0 has measure 0, the Lebesgue measure is K-additivtl for all K < 2'" (not just countably ;dditive}, and many other results about R. topological spaces, and sets in general. Exercises 5.1 Show that 0 is equivalent to the statement: There exists a sequencr (IVA I cr < W ] )such that, for each a < w l , W[, C_ a&.W: is at most countable, and for any f : w l -+ wl the set { a < w l I f 1 cr E W ; ) is stationary. 5.2 Assuming 0 show that there is a Suslin tree ( T ,5 ) such that the only automorphism of (T, 5 ) is the identity mapping (a rigid Suslin tree). [Hint: Imitate the proof of Theorem 5.2; a t limit stages a make sure that no automorphism h of ?'(") such that h E 2, can be extended to itn automorphism of T ( a + l ) . ]
5. COMBINATORIAL PRINCIPLES
239
5.3 Assuming 0 show that there are 2N1 mutually nun-isomorphic norinal Suslin trees. 5.4 Assuming 0 show that there exist 2N1stationary subsets of w l such that the intersection of any two of them is at most countable. [Hint: Consider Sx = { a < w l I X n a E W,) for all X C w l . ] 5.5 Show that MA, is equivalent to the statement: If (P,5 ) is an ordered set satisfying the countable antichain condition, then for every collection C of maximal antichains with [C[ j K there exists a directed set G such that, for every C E C, there exist c E C and a E G so that c < a . 5.6 Show that there exist 2N0 subsets of W such that the intersection of any two of them is finite. [Hint: For f E {0, let A! = { f [ n 1 71 E W ) C_ W < " . Of course = 1wl.j
5.7 Assuming zN"= NI show that there exist 2Nl subsets of intersection of any two of them is at most countable.
wl
such that the
This page intentionally left blank
Chapter 13
Large Cardinals 1.
The Measure Problem
Theorem 2.14 in Chapter 8 shows that there exists no a-additive translation invariant measure p on the a-algebra of all subsets of R such that p([a.h ] ) = h-(]. for every interval [ a ,b] in R. This raises the quest.ion whether t.here exists any a-additive measure on P(R), or for that matter oil P ( S ) for any infiilitc st:t S . Of course, the counting measure, which allows the value p ( S ) = X . is ii trivial example, so we formulate the problem differently. We allow a rne~asureto have only finite values, and we further require (without loss of generality) that.
p(S) = l. 1.1 Definition Let S be a nonempty set. A (nontrivial probabilistic a-additive) measure on S is a function p : P(S)+ [0, l] such that (8) p(@)= 0, p ( S ) = 1. (b) If X C Y, then p ( X ) p(Y). (c) If X and Y are disjoint, then p ( X U Y) = p ( X ) + p(Y). (d) p({a)) = O for every a E S. is a collection of mutually disjoint subsets of S, then (e) Zf
<
A consequence of (d) and (e) is that every at most countable sulxet of S has measure 0. Hence if there is a measure on S, then S is uncountable. I t is clear that whether there exists a measure on S depends only on the cardinality of S : If S carries a measure and IS'/ = IS1 then S' also carries a measure. Irl Section 2 of Chapter 11 we constructed a nontrivial finitely additive mcuuri? on N, a function that satisfies (a)-(d) in Definition 1.1. The measurt. probleni is the question whether there exists such a function on some S that is o-additive. The measure problem is related to the question whether the Lebesgue n~easurcl can be extended to all sets of reals. Precisely: Does there exist a cr-ad(1it.i~~
CHAPTER 13. LARGE CARDINALS
2 42
measure p : P(R) -, [O, m) U { W ) such that for every Lebesgue measurable set X , p ( X ) is the Lebesgue measure of X . If there is such ark extension p of t l w Lebesgue measure then the restriction of p to F([O,l))is a nontrivial measure on S = [O,l] (satisfying (a)-(e) in Definition 1.1). Conversely, it can be proved that if a nontrivial measure exists on a set S of cardinality 2Nuthen the Lebesguc measure can be extended to a a-additive measure p : P(R) 10, W ) U { W ). The measure problem, a natural question arising in abstract real analysis. is deeply related to the continuum probleni, and surprisingly also to the subject wc touched upon in Chapter D - inaccessibIe cardinals. This problem has becorr~t~ the starting point for investigation of large cardinals, a theory that wc ~xplore further in the next section.
-
1.2 Theorem If there is a measure on 2*", then the Conlinuum fipothe.szs fails. Let us assume that 2N"= NI and that there is a measure p on the Prooj set S = wl, a function on P(S)satisfying (a)-(e) of Definition 1.1. Let I denot,o the ideal of sets of measure 0:
This ideal has the following properties: For every X E S,
(1- 3 )
{X) E
I. 00
If X, t I for all n E N , then
(1.4)
U X,
E
I.
n=O
There is no uncountable mutually disjoint collection S 2 P ( S ) (1.5)
such that X @ I for all X E S.
Property (1.4) is an irrimediate consequence of Definition 1. l (e). As for ( l .S), suppose that S is such a disjoint collection. For each n, let
S,. S is Because p(S) = 1, each S,, can only be finite, and because S = at most countable. We now construct a "matrix" of subsets of S (A,, ( a E w l , n E W ) . For each J < w l , there exists a function f t on W such that J C_ ran f t . Let us choose one f < for each J , and let A,, The matrix (A,,)
(1
=
{ J < w~ 1 f&) = a) (a < W I , n < W).
has the following properties: For every n , if a #
D, then A,,
f~Ao, =
0.
CU
(1.7)
For every a, S -
U A,, n =O
is at most countable.
1. THE MEASURE PROBLEM
243
We have (1.6) because E A,, nAa, wouId mean that f<(n) = a and f [ ( n )= D. The set in (1.7) is at most countable because it, is included in the set c2 U ( 0 ) : If # A,, for a11 n, then cu # ran fc and so 5 cr. Let a < wl be fixed. From (1.7) and (1.4) we conclude that not a11 the sets A,,, n E w,are in the ideal I: wl is the union of the sets A,, and a11 tit, lilost countable set, which belongs to I by (1.3) and (1.4). Thus for each a < wl there exists some n, E N such that A,,,, # I. Because there are uncountabIy many a < wl and only countably many n E N , there must exist some n such that the set ( a I n, = n) is uncountable. Let,
S = {Aan I na = n ) . This is an uncountable coIlection of subsets of S. By (1.6), the sets in S are mutualIy disjoint, and A,, 4 1 for each A,, E S. This contradicts (1.5). We have a contradiction and therefore the assumption that 2'" = NI must 13 be false. This proves the theorem. The proof of Theorem 1.2 can be slightly modified t,o obtain the following result. 1.8 Theorem If there is a measure o n a set S , then some cardinal weakly inaccessible. 1.9 Corollary If there is a measure o n 2'0, then 2'1) inaccessi b Ee cardinal K .
>
K
K
< IS1 is
for some weakly
To prove Theorem 1.8 we follow closely the proof of Theorem 1.2. We assume that for some set S there is a function p : P ( S ) 4 [O,l ] that satisfies Definition 1.1. We have shown that then there exists an ideal I on S that satisfies (1.3). (1.41, and (1.5). Let (1.10) (1.11)
K
= the least cardinal such that for some S of size
there exists an ideal I on S with properties (1.3), (1.4), and (1.5) and K
I is an ideal on S = K with properties (1.3), (1.4), and (1.5).
(Of course, if such an ideal exists on some S of size
K,
then there is one on
K.)
1.12 Lemma For every X then X, E I.
< K, if
{Xrl),-,<~are such that X , E I for all 11 < X,
Proof. Otherwise, there exists some X < K and {X,),<x such that X, E I for each 11 but U ,,, X , # I . We may assume that the X, are mutually disjoint. because we can replace each X, by X i = X, - U{X, I v < p ) , and U , A X: = U,,<,X,. Let J={YEXI UX,€I). ,EY
CHAPTER 13. LARGE CARDINALS
244
We verify that J is a ideal on A (e.g., A $ J because U,EAX,)$ I). For ilaclr rj C A, {Q) E J because X , C I ; thus J satisfies (1.3). Similarly. one verifies that .I satisfies (1.4) and (1.5) ,.;well. [For (1.5) we use thtb assumption that, the X, are mutually disjoint.] Thus J is an ideal on A < K with properties (1.3), (1.4), and (1.S), c:orltr.:irjr to the assumption (1.10). LI: 1.13 Corollary If X C
1 .l4Corollary
Proof.
K
K
and
1x1<
then X E I
K,
is an ur~coz1r~t(ible regular cardinal.
is uncourltable bccause every a t most countable subset of' K ljt?lorigs t o I. And K is regt~lar,because otherwise K would be t,he u r ~ i o of' ~ l I~ISS than K sets of' size < K . and therefort. woulcl belong t o the icleal. ;L r,r~~lt,riicli(.tic,~~ K
i7 We can now finish the proof of Theorern 1.8. 1 .l5 Theorem
K
is weakly inacces,sible.
Proof.
111 view of Corollary 1.14 it suffices t o prove that K is a limit (:firdinal. Let us therefore assume that K is a successor cardinal, K = N,+1. For each < K we choose a function f( on w, such that C ran f( i i 1 1 ~ llct
c
proof o f Theorem 1.2, one shows that the matrix (Amq) has t.11~ following properties:
As
~ I Ithe
(1.16) (1.17)
# 0,then A,,, n Ao, = 0. For every o.Jn A,,,I 5 Nu.
For every q , if a
U
-
7)
U"
Still fcdlowir-igtlie proof of Theorem 1 -2, but using Lemma 1.12 in place of (1. A ) . we show thiit for eacli cr < w,+l tliere exists some < W" S I K :th?lt. ~ ~ An,, I Aud the same argument as before procluces a collection S , of size H,,+ 1 ( t hclrc.f<~r.c. uncountable), of mutually disjoint sets $ I. This contradicts the property ( l .S). .., ancl therefore K cannot be a successor cardinal. -7
i
A measure p is two-valued if it takes only two values, 0 and 1. We return t o two-valued measures in the next section. as a starting point for the t heor~r(of large cardinals.
1.l8Definition Let p be a measure on S, A set A C S is an atom if p ( A) > 0 and if for every X C A, either L ~ ( X=) 0 or p(A - X) = 0. The measure /L is atomless if there exist no atoms.
2 . THE MEASURE PROBLEM
245
We conclude this section with the proof of the following dichotomy due to Stanislaw Ularn. 1.19 Theorem If there a measure then either there exzsts a two-valued measure or there exists a measure on 2'(1.
1.20 Lemma Let p be an atomtess measure on S .
(a) For every E > 0 and every X that 0 < p(Y) 5 E . (h) For every X C S there is a Y
CS
with p(X) > 0 there is a Y
E X such that p(Y)
-
cX
such
+/1(~).
Pro0 f. (a) Let X. = X, and for each n, we find an X,+l C X, such that 0 < ( X , ) . This is possible because X, is not an atom and therefore there is an X C X, such that p(X) > 0 arid p(Xn - X) > 0. A S p(Xn) = p(X) + p(X, - X), it follows that either X,,, = X or X,+, = X, - X has the desired property. Clearly, 0 < p(X,,) 5 &/L(x) for cvvry n, and part (a) follows. (b) Let X C S be such that p(X) = m > 0. By transfinite recursiorl on n. < w l we construct a disjoint family of subsets Y, of X as follows: Ltlt Yo C X be such that 0 < p(Yo) 5 m / 2 ; if p(UBi, Y3) < m/2, we c h o o s ~ Y, C X - U,,, Y, such that 0 < p(Y,) 5 7 - p(U,,, Yo). It. follours from (1.5) that there exists an a a t which the construction stops. i.e..
p(Ua<, Yo) = m/2* Proof of Theorem 1.19. Assume that tllere exists a ~rie<sure p or1 a set. S. I f there exists an atorn A C_ S, we define a two-value(l nle~wurev on A a.. follows: .(X) = p(X)/p.(A)for all X C A. If p is atomless, we define a family {X, I s E Seq) of subsets o f ' s indrxed by finite 0-1 sequences s E Seq = U ~-, { O ,l)". Tllr sets X, are defined by recursion on the length of S: For the empty sequez-ice fl, let Xfl = S. C 'lven X,, we let X,-o and X,-, be subsets of X, such that X,-l = X, - X,.-o :illcl p(XSno)= p(X,-,) = dP(xs). Thus P(X,) = 1/2" where n is the lesgtll of S . Furthermore, for each f E {O, we let Xf = XI,.. Note that if f # g then Xf n X, = 0, and that p(X5) = 0 for each f . Now we define a measure v on the set (0, 1IWas follows:
nrEo
As p is a measure, it is easily verified that v has properties (a). (b). (c) and of Definition 1.1. Property (d) follows from the fact t,llat jl(XI) = 0 fbr txii(.11 ((2)
f E (0,l)". Thus if there exists a measure then either t-here exists one on 2'(1 in which case 2'1) > K for some weakly inaccessible cardinal K . or else there exists a two-valued measure, an alternative that we investigate in the next section.
CHAPTER 23. LARGE CARDINALS
246
2.
Large Cardinals
In the preceding section we proved that if there exists a nontrivial a-additivfs measure, then there exists a weakly inaccessible cardinal. This result is just one example of a vast body of results, known as the theory of large cardinals. 111 this section we give some more examples of large cardinal results and introdur-e the most studied prototype of large cardillals - measurable cardinals. 2.1 Lemma l f p is a two-valued measure on S, then
is a nonprincipal ultrajilter on S, and m
(2.2)
{Xn);P=,
2s
such that ?r', E U for each n, then
r) X,, E U . rc=O
Proof. An easy verification. U is nonprincipal because p is nontrivial. a n t l satisfies (2.2) because p is a-additive. C1 Property (2.2) is called 0-completeness. T h e converse of Lemma 2.1 is also true: if U is a 0-complete nonprincipal ultrafilter on S, then the function p : P ( S ) --+ (0,l ) defined by
is a two-valued measure on S. Thus the problem whether two-valued measures exist is equivalent t o the problem of existence of nonprincipal a-complete ultrafilters. We now investigat,~ this question. First we generalize the definition of 0-completeness. 2.3 Definition Let K be an uncountable cardinal. A filter F on S is K-completc~ if for every cardinal X < K , if X, E F for all a < X, then X, E F. An ideal I on S is K-complete if for every cardinal X < K , if X,, C I for i l l l n < X, then U ,,, X, E I . A filter is K-complete if and only if its dual ideal is K -complete. An N I complete filter is also called 0-complete or countably complete; it Inearis t h a t X,-, E F whenever all X, E F. Similarly for ideals.
nmcA
nreP=,
2.4 Lemma If there exists a nor~principal0-complete u1trafilte.r- then there f i r ists an uncountable cardinal K and a nonprincipal K-complete ult7.afilter on K .
Let K be the least cardinal such that there exists a nonprincipal U Proof. complete ultrafilter on K , and let U be such an ultrafilter. We wish t o show that. U is K-complete (note that because U is 0-complete, K must b e uncountable).
2. LARGE CARDINALS
247
Let I be the ideal dual to U: I = P ( K ) - U. I is a nonprincipal a-complete prime ideal on K ; we show that I is K-complete. If not,, then there exists X < K and {X,),,, such that X, E I for euch V but U,,, X, f I. We may assume that t h e X, are mutually disjoint. Let
J is a a-complete prime ideal on X; it is nonprincipal because X, C I for eacli V E X. T h e dual of J is a nonprincipal a-complete ultrafilter on X. But X < K and that contradicts the assumption that K is the least cardinal on which there is a nonprincipal a-complete ultrafilter. Thus U is K-complete. 2.5 Definition A measuruble cardinal is an uncountable cardinal there exists a non principal K-complete ultrafilter.
K
on which
T h e discussion leading to Definition 2.5 shows that measurable cardinals are related t o the measure problem investigated in Section 1. The existence o f a measurable cardinal is equivalent t o the existence of a nontrivial two-valued a-additive measure. We devote t h e rest o f this section t o measurable cardinals. 2.6 Theorem Every measurable cardinal i s strongly inaccessible.
Proof. We recall that a cardinal is strongly inaccessible if it is regular. uncountable, and strong limit. Let K be a measurable cardinal, and let U be a nonprincipal K -complete ultrafilter on K . Let I be t h e prime ideal dual to the ultrafilter U . Every singleton belongs to I, and by K-complet.eness, every X C K of cardinality < K belongs t o I. If K were singular: the set K would also have to belong to I, by K-completel~ess. Hence K is regular. Now let us assume that K is not strong limit. Therefore, there is X < K such that 2 9 K , so there is a set S C {0,1}* of cardinality n. O n S t l ~ e r cis also a nonprincipal K -complete ultrafilter, say V. For each CI < X, exactly one of the two sets
belongs to V; let us call that set X,. Thus for each a < X we have X, E V, and by K-completeness, X = X, is also in V. But there is at most one function f in S t h a t belongs t o all t h e X,: the value of f a t a is determined < 1, a contradiction since by the choice of one of the sets in (2.7). Hence V is nonprincipal. Hence K is a strong limit cardinal, and therefore stroilgly inaccessible.
1x1
One can prove more than just inaccessibility of measurable cardinals. We give several examples of properties of measurable cardinals that. can be obtained by elementary met hods.
248
CHAPTER 23. LARGE CARDINALS
Let us recall (Section 3 in Chapter 12) t h a t a n uncountable cardinal K hks t h e tree property if there exists no Aronszajn tree of height K . The following t l ~ o r c i nshows t h a t every mc,?surable cardinal h c u t,he tree property.
2.8 Theorem Let K be a measurable cardinal. If T is a t r m of hezght K suc.11 that euch node has less than K z~n~nediatc: successors, then T has U 67.(171(:/). of length K .
Proof, For each a < K , let G be the set of all S C T of' height cr. Since K is a strongly inaccessible cardinal. it follows, by induction 011 0.t h a t ?l,' 1 < for all 0 < K . Hence IT1 5 K , and therefore IT[ = K . Let U b e :L nonprinc:ipid K-cotnplete ultrafilter on T. We>find a branch of length K as follows. By irlductiorl on cr we show t h a t fi>r eacll there exists a unique s, E T,, such t h a t
arid t h a t sa < SO when a < 0. First, let so be the root of T. Given S , , . and assuming (2.9)) we note that the set { t f T s, 5 1 ; ) is thc clisjoint union of { S , ) and the sets { t E T I u 5 t ) where U ranges over all inlrnchdiatt. successors of S,. Since U is :L K-r:orriplete ultrafilter arld ,S,, hiis kwer t h a n h. irnmecliate successors, there is a unique inlinediate successor 11 = S,, l suc:h t h t
{ t E 7' I S n + l < - t ) E U. nTllerl 11 is a limit ordinal less than ti and {S,),<,, S , E 'l; . iu.cl , sucl1 tlliit S,, < S O if c t < I j , and all the S,, siitisfy (2.9), then we have
because U is K -complete. T h e set Sv = { S E T,, 1 S < l: for sonle 1: C S ) is rlu~lernptyand of size less than K ; it follows t h a t there is it unique s E S,, for whir:li { t f T I s 5 t ) E U , and we let S, be this S. I t is clew that rh = { S , , I cr < ti) is a branch in 7'.of lrtngth K .
'L.
111Sect ion 2 of Chapter 12 we ir1troduc:ed weakly coirlpact car(1inals. We ~tiit.0 without proof the following etluivaltlnce t h a t gives ailother cllar:~c:tcbrizntiorlo f weakly compact cardinals. 2.11 Theorem The following a7.e equivalent, fo7- e ~ ~ e uncountable ry c [ L ~ . ( ~ zK T :L(L~ (U) K + (K);. (6) K + (K): f o ~all positive integers T and S. ( c j K zs strongly inaccessible and has the tree property.
Since measurable cardinals are strongly inaccessible and have tlle tree property, it follows t h a t every meilsurable cardinal is weakly cornp;~{:t.Ttlis (.a11i ~ l s o bc: provt?d directly, by showing K -+ ( K ) : . We conclude this introrlur:t,ion to liirgc* r:;irdirlals by presenting the proof of the special c ~ u for e r = s = 2.
2. LARGE CARDINALS
249
2.12 Theorem If ti is a measurable cardinal, thcn every partition, two sets has a homogeneous set of cardinalit?/ K .
of [ti]%n.to
Proof. Let { P l P , 2 ) be a partition of [ K ] ' . TO find a hon10ger1eo~sset,, let I/ be a nonprincipal K-complete ultrafilter on ti. For each a C K , let
S: = { O E K I f l f n and { r r . J ) E P I } .
S: = { D E
K
I D#
n and {o.MJ E R 2 } .
Exactly one of SA,S: belongs to U. Let
Z1 = {n l S: E U), Z2 = { n l S:
E
U}.
Since Z1 U Z2 = K , either Z1 or Z2 is in U. Lvt us assume that Z1 E U iul(1 let us find H K of cardinality K such that [HI2 PI. We construct H = { a < I < K ) by recursion. At step y,we have construct~d an increasing sequence
c
(Q€
of elements of Z1 such that for each
Because SA, E U for all
I I < 7)
c < 77 < y ,
E < y , the set
is in U (by K-completeness) and therefore contains solrle Q greater than 2111 (L:, [ < y. Let Q, be the least such a. Then (2.13) holds for all < 17 < 3 + 1. Let H = {a( I E < K ) , If C < 77; then o , E S:,, nrld so {nZ.ct,JJE P ! . D [HI2 2 PI and therefore H is homogeneous for the partition.
Exercises A strongly inaccessible cardinal cardinals < ti is stationary.
K
is a Mahlo cardinal if the set of all rcguln~.
2.1 If K is Mahlo, then the set of all strongly inacc:cssible carclitlals < K is stationary. [Hint: The set of all strong limit 0 < t i is closed unbounded.]
Let K be a measurable cardinal. A nonprincipal K-complete ult,rafilt.er I: o t l is normal if every regressive function f with dom f E U is constant on sorne set A E U. In Exercises 2.2-2.5, let U be a nonprincipal K-cotnj~lete11ltrafiltc.r on K . For f and g in K ~ let , K
r:nrdirlals by presenting the yl.oof of the special ccue for r
= s = 2.
CHAPTER 13, LARGE CARDINALS
250
2.2
is an equivalence relation on
K"
Let W be t h e set of all equivalerice classes of equivalence class of f . Let
If] < 191 if
= on
{Q.< K I f ( 4 < 9 I a ) )
K";
let [ f ] denote t.ho
E
2.3 < is a linear ordering of W . 2.4 < is n well-ordering of W. [Hint: Otherwise, there is a sequence (f, I 71 E N ) such that if,] > [ f , + l ] . Let X, = { a I f,(a) > f,+,(a)) and let X = X,. If a E X . then f o ( a ) > f l ( a ) > . . - . a rontradictioll.1 2.5 Let h : K 4 K he the least fi~nction(in (W, <)) with the property that. for a l l y < K . , { o < K I h ( a ) > y ) U~ . Let V = { X G K I h l [ X ] E U ) . Show that V is a normal ultrafilter.
nrZP=,
Thus for every measurable cardinal K , there exists a normal ultrafilter on In Exercises 2.6-2.9, U is a nornial ultrafilter on K .
K.
2.6 Every set A E U is stationary. [Hint: Use Exercise 3.9 in Chapter 11 and
the definition of normality.] 2.7 Let X < K. be a regular cardinal and let EA = { a < K ( c:f(u) = X ) . T h e set Ex is not in U. [Hint: Assume that Ex E U . For each Q E EA. < X ) be a n increasing sequence with litnit c t . For v:ic,ll let {X,€ ( < X there is yt and At E U such that X,< = yg for all a E A ( . Let A t . Then A E U, but A contains only one element, namely sup{yc A = E < A}; a contradiction.] 2.8 T h e set of all regular cardinals < K is in U. [Hint: Otherwise, tlitl set S = {cu < K [ c f ( a ) < C Y ) is in U , and because cf is a regressive function on S , there exists X < K such t h a t { a < K I c:f(a) = X) E U ; it contradiction.] 2.9 Every measurable cardinal is a Mahlo cardinal. [Hint: Exercises 2.6 ancl 2.8.1
YA
Chapter 14
The Axiom of Foundation 1. Well-Founded Relations T h e notion of a well-ordering is one of the key concepts of set theory. It emerged in Chapter 3, when we introduced natural numbers; the fact that the natural numbers are well-ordered by size is essentially equivalent to the Induction Principle. We studied well-orderings in full generality in Chapter 6 , and we saw many applications of them in subsequent chapters. It is thus of great interest t o consider whet her the concept allows further useful generalizations. It turns o u t that, for rrlarly purposes, the "ordering" stipulation is unimportant; it is the "well-" part, i.e., the requirement that every nonempty subset has a minimal element, that is crucial. This leads t o the basic definition of this section. 1.1 D e f i n i t i o n Let R be a binary relation in A , and let X C A . M.('I say that. a E X is an R-minimal element of X if there is no X E X such that x R a . R is well-founded on A if every nonempty subset of A has an R-minimal element. T h e set {X E A 1 x R a ) is called the R-extension of a in A and is denotcil extR(a). Thus a is an R-minimal element of X if and only if extR(a) n X = GI.
1.2 E x a m p l e
(a) T h e empty relation R = 0 is well-founded on any A , empty or not. (b) Any well-ordering of A is well-founded on A . In particular, E , is wellfounded on a , for any ordinal number a . (c) Let A = P(,);then E A is well-founded on A . (d) E A is well-founded on A if A = V, (n E N) or A = V, (see Exercise 3.3 in Chapter 6 ) . (e) If (T, <) is a tree (Chapter 12, Section 3) then < is well-founded on T. T h e next two lemmas list some sim p le properties of well-founded relations.
CHAPTER 14. 'THE AXIOM O F FOUNDAY'I(1N
252
1.3 Lemma Let R be a well-four~ded7elatio.n on A. (a) R is antireflexive i n A , i.e.. aR(t is false, for all a E A . (6) R is asymmetric in A , in(!., c~Rbzmplies that bRa does not hold. ( c ) T h e w is no finite seq.r~e7~c:e (ao.(11, . . . . a,,) such that a l n u o . (hz RCL I. ... . a,, Ra,,- 1 , aoRa,. ( d ) There is n o infinite sequentltr ((1, I i E N ) of ~ l e m e n t sof A S T L ~ . ~tistlt ), a,+ I Ru, holds for all i f N .
Proof. (a) If uRa then X = {a) # fl does not have an R-minimal element. (b) If aRb illid bRa then X = { U ,b ) # 8 does not have an R-niininlal elemt~nt. (c) Otllerwise, X = {ao,. . . . a,, ) 111~sno R-rninimal element. ((l) X = {a, 1 i E N ) would Iiiive rio R-niinirnal clement if' (d) hilecl.
n 1.4 Lemma* Let R be a hinary .r.elation in A such that there is n o infinitc~ .sequence (a, I i f N) of elements uf A for which U , , + ~ Rholds U ~ for all i E N . T h e n R i s well-founded un A.
This is a converse to Lemma 1.3(d). In this chapter we clo not ass urn^! t 1 1 ~ Axiom of' Clioice. The few results wliere it is needed are marked by a n :~c,tcbrisk. Under the assumption of the Axiom of Choice, a relation R is woll-fountlet1 if a i d only if there is no infinite "R-rlecrcasing" sequcnt:t3of rleriierlts of .4.
Pr.oof. If R is not well-fbuntlccl on A , then A has a nonempty subset X with tllc property that, for every a E X. there is some b E X such that bRa Iiolcls. ~ ~ . ( a o , .. . . c l , ) . Choose a0 E X . Then choose u1 E X such that I L ~ R Giver1 choose a,,, E X so that U , + ~ R'rhe ~ , . ~*esulting sequence (a, ( i E N) satisfies C! a , + l R a , for all i f N. The importance of well-founded relations rests on the fact that, like wellorderings, they ttllow proofs by in(1ur;tion and con~t,ru~:tior~s of fui~~.tiolis rttc*ursion.
Lt.t P be some p ~ u p ~ r t ~Ass~~rnrl y. li~.ut.R 1.5 The I n d u c t i o n Principle a well-founded relation on A artd, fo7 all z E A, (*)
zf P ( y ) i~o1d.sfor.
T h e n P ( x ) holds for all
X
C
(111
,.v
y C extR(:c), ~ / L C T LP(x)
A.
Otherwise, X = (3: E A I P ( x ) does not hold) # fl. Lct U I ) t l Proof. a n R-niinimal element of X. Then P ( a ) fails, but P ( y ) llolds for all yRn. cont,r(zdicting (*). i
I . WELL-FOUNDED RELATIONS
253
1.6 The Recursion Theorem Let G be an operation. Assume that R is a well-founded relation on A. Then there is a unique function f on A such that. for all X E A, f ( 4 = G(f t ~ X ~ R ( Z ) ) .
T h e proof of Theorem 1.6 uses ideas similar t,o those employed t o prove o t lltlr Recursion Theorems, such as the original one in Section 3 of Chapter 3, and Theorem 4.5 in Chapter 6 . It is convenient to state somct definitions and prove a simple lemma first. 1.7 Definition A set B C A is R-transitive in A if extR(x) C B holds for all X E B. In other words, B is R-transitive if it has the property that X- E B a i ~ d y& imply y E B. I t is clear that the union and the intersection of any collection of R-t.rnnsit ivc. subsets of A is R-transitive. 1.8 Lemma For every C C_ A there is a smallest R-t.r*ansitiveB
c c B.
CA
such, that
Let B. = C, B,+1 = {y E A ( !jRx holds for some X E B,), B = U ,,, B,. I t is clear that B is R-transitive anci C G B. h~loreover,for any R-transitive B', if C B' then each B, B', by induction, and hence Proof. 00
B
c B'.
c
c
cl
I
Proof of the Recursion Theorem. Let T = {g g is a funct.ion, (ion1 is R-transitive in A, and g(x) = G ( g 1 ext R(x)) holds for all X E don1 g ) . We first show that T is a compatible system of functions. n 1 y l (.T) # Consider g l , g2 E T and assume that X = {.X G dom g1 n d o ~ !p2 g2(x)) # 0. Let a be an R-minimal element of X. Then yRa implies E dozn !/l. y € dom.92, and g, (y) = g2(y). Hence gl(a) = g , e x t R ( a ) = g2 / extR(u) = &(a), a contradiction with a E X. We now let f = U T . Clearly f is a function, dorn f = U{donlg I g E 7') is R-transitive, and, if X dom f , then X E d o m g for some g E T. g J . so f (X) = g ( x ) = G ( g t ~ x ~ R ( x=) )G(f t extR(5)). I t remains t o prove that dorn f = A. If not, there is a n R-minimal elerliezlt. U of A - dorn f . Then extR(a) dorn f and D = dorn f U { a ) is R-t,ransitivct. We define g by
c
g(x) = f(x) for X c dom f ; g(a) = G ( f 1 e x t ~ ( a ) ) . Clearly g E T, so g C f and a E dorn f , a contradictioil. Uniqueness of f follows by the same argument that was used t o prove that T is a compatible system.
CHAPTER 14. THE AXIOM O F FOUNDATIOhr
254
In Chapter 6 we used transfinite induction and recursion to show that every well-ordering is isomorphic to a ullique ordinal number (ordered by E ) . In the rest of this section we generalize this important result to well-foundcd relations. We recall Definition 2.1 in Chapter 6;: a set T is trunsitiuc! if c,vclry r~lc~irier~t of T is a subset of T.
1.9 Definition A transitive set T is well-founded if and only if the relation E?. is well-founded on T, i.e. for every X T , X # 0, there is a E X such that a n X =0. Ordinal numbers, P ( w ) , V, (n E N), and V, are examples of transitivc well-founded sets (Exercise 1.2). The next theorem contains the main idea.
1.10 Theorem Let R be a well-founded relation on A. There is a unique function f on A such that
holds for all
X
E A. The set T = ran f is transitive and well-founded,
Proof. The existence and uniqueness of f follow immediately from the Recursion Theorem; it suffices to take G(z)=
ran E
if z is a function; otherwise.
T = ran f is transitive. This is easy: if t E T then t = f (X) for some :I: E A. so s E t = {f(y) I yRx) implies s = f ( y ) for some y E A, so S E T. T is well-founded. If not, there is S C T , S # 8, with the property that for every t E S there is S E S such that S E t . Let B = f-'[S]; B # lil because f maps onto T. If X E B then f (X) E S so there is S E S such that S E f ( x ) = {f(y) I y h } . Hence s = f(y) for some yRx, and y E f-'IS] = B. We have shown that for every X E B there is some y E B such that yR.x, a G contradiction with well-foundedness of R. The mapping f is in general not one-to-one. For example. f (X)= 0 whenever X is an R-minimal element of A (i.e., extR(x) = 0). Similarly, f ( X ) = {g} whenever all elements of extR(x)# 0 are R-minimal, and so on.
1.11 Definition R is extensional on A if X # y implies extR(X)# extR(y). tor all X , y E A. Put differently, R is extensional on A if it has the property that. if for all z E A, z f i if and only if zRy, then X = y. 1.12 Theorem The function f from Theorem 1.10 is one-to-one if and only zf R is extensional on A. If it is one-to-one, it is an isomorphism between (A, R ) and (T,ET).
Proof. Assume that R is extensional on A but f is not one-to-one. Then X = {X E A 1 there exists y E A, y # X , such that f ( x ) = f ( y ) ) # 0. Let a be an R-minimal element of X and let b # a be such that f (a) = f (h). By extensionality of R, there exists c E A such that either cRa but not cRh, or cRh but not cRa. We consider only the first case; the second case is similar. Fkom cRa it follows that f (c) E f (a) = f (b). As f ( b ) = { f ( 2 ) I zRh), there exists some dRb for which f(c) = f(d). We have c # d because cRb fails. But this means that c E X, contradicting the choice of a as an R-minimal element of X . Assume that R is not extensional on A. Then there exist a , h E A, a # h. for ~ = f(h) which extR(a) = extR(b). We then have f (a) = f I e x t ~ ( a )=] f [ e x t(b)] showing that f is not one-to-one. Finally we show that f one-to-one implies f is an isomorphism. Of course aRb implies f (a) E f (h), by definition of f . Conversely, if f (a) E f (b) then f (a) = f ( X ) for some X Rb. As f is one-to-one, we have a = X , so a Rh. The corollary of Theorems 1.10 and 1.12 asserting that every extensio~liil well-founded relation R on A is isomorphic to the membership relation on ;r uniquely determined transitive well-founded set T is known as Most,owski's Collapsing Lemma. I t shows how to define a unique representative for each class of mutually isomorphic extensional well-founded relations, and thus gener iiI'lzes Theorem 3.1 in Chapter 6 (see Exercise l .S). Exercises
1.1 Given (A, R) and X E A, say that a E X is an R-least element of X if a is an R-minimal element of X and aRb holds for all b E X, h # a . Show: if every nonempty subset of A has an R-least element then ( A . R) is 21 well-or der ing. 1.2 Prove that € A is well-founded on A when A = P(i3),A = V, for 11 E N. and A = V,. 1.3 Let (A, R) be well-founded. Show that there is a unique function p, defined on A and with ordinal numbers as values, such that for all n: E A, p(x) = sup{p(y) + 1 I y h } . p(x) is called the rank of s in (A. R). 1.4 (a) Let A = a , R =E,, where cy is an ordinal number. Prove that p(x) = X for all X E A. (b) Let A = V,, R = E A . Prove that p(x) = the least n such t,hat X E V,+, , i.e., V, = {X E V, 1 p(x) < n ) . 1.5 Let ( A , R) be a well-ordering and let f : A --+ T be the function from Theorem 1.10. Prove that T = cu is an ordinal number and f = p is a n isomorphism of (A, R) onto (a,E,). 1.6 (a) Let A be a transitive well-founded set and let R = E A . If f : A --t T is the function from Theorem 1.10 then T = A and f ( x ) = :c for all X
E A.
(b) If A and B are transitive well-founded sets and A and (B, E B ) are not isomorphic.
# B then (A, € A )
256;
2.
CHAPTER 14. THE AXIOM OF FOUNDATION
Well-Founded Sets
In our discussion of Russell's paradox in Section 1 of Chapter 1 we touched upon the question whether a set can be an element of itself. We left it unanswt?rc~cl and, in fact, an aaswer was never needed: all results we proved ur~tilnow lloltl. whether or not such sets exist. Nevertheless, the question is philosophic:;~Hy interesting, and important for more advanced study of' set theory. Wt. riow 11;ivt. the tecllnical tools needed to examine it in some detail. We start with a simple, but basic, observation. 2.1 Lemma For any set X there exists a smallest transitive set contuinin!g X us a subset; it is called the transitive closure of X and is denoted 'I'C(X).
-
Proof. We let X. = X, = UX, {IJ I y E X for solnex r X,,) arid T C ( X ) = U{Xn 1 n E N ) . It is clear that X C T C ( X ) and T C ( X ) is transitive. It T is any transitive set such that X C T, we see by inductiorl that X, C 1' C! for a11 n E N , and hence T C ( X ) 2 T. 2.2 Lemma y E T C ( X ) zf and onhj if there is a finite sequence (xo,X I .. . . ~ 7 ~ tJmt ~ 1 1zo = X , x,+1 E s, for i = 0. l , .. . , ? L - 1, and X, = y .
.X,,)
Proof. Assume y E T C ( X ) = Ur=oX, and proceed by inductk~ri. If y E X. it suffices to take the setluellce (X, p). If y C X,,, we have E .r +E .Y,, for svlrle a. By iilductive iissun~ption,there is ii sequence ( n r l. ,. . . .I:,,) wl~cbrts X, for i = 0 , . . . . ? l . - 1, ailcl P, = 2. We let X,,+] = ;O atlcl zo = X , c:onsider (xO,. . . . X,, X, + 1 ) . Corlversely, given (xo,. . . . ;L,,) wtlcre xo = X and X,+ 1 E .c, for i = 0. l . . . . . E 71 - 1. it is easy to see, by inductio~i, that X, E X, 2 T C ( X ) ,for all i 5 7 1 . 2.3 Definition A set X is called well-founded if T C ( X ) is a transitive w ~ l l founded set.
-
For transitive sets, this definition agrees with Definition 1.9 because T C ( X ) when X is transitive. The significance of well-foundedness is revealecl \ ~ tyh(. t'ollowirlg theorern.
2.4 Theorem (a) If X is ulell-founded then there is no sequerice (X, I I r E N ) sucFt t h t ~ f X. = X and X,,+ 1 E X, f o r (111 7~ E N . (h)* If there is no sequence ( X , I 11 E N ) such that X. = X and X,,+] E X,, for all n E N then X is well-foun.ded, In particular, a well-founded set X cannot be an element of' itself (let X, = S for a11 n ) and one cannot have X E Y and Y E X for any Y ((:c~rlsitlerthe. sccluence ( X , Y, X , Y , X , Y. . . . , ) ) or any other "circ~~lar" situat,ioil.
2. kVELL-FOUNDED SETS
257
Proof. (a) Assume X is well-founded and (X, 1 n E N ) satisfies X. = X ! X,+ E X, for all n E N . We have X. = X TC(X) and, by induction, X,, E T C ( X ) for all n 1. The set {Xn I n 2 1) C T C ( X ) does not have an E-minirilal element, contradicting well-foundedness of TC(X). (b) Assume X is not well-founded, so T C ( X ) is ilot a transitive well-foundt~cl set RS defined in 1.9. This means that there exist,s Y T C ( X ) . Y # Q). with the property that for every y E Y there is z E Y such that z E p. Choose some y G Y. By Lemma 2.2 there is a finite sequence ( X o ,. . . X,, ) such that X. = X , X,+l C X, for i = 0 , . . . . ? I . - 1, and X, = y. W c?xtend it to an infinite sequence by recursion, using the Axiolzi of Choice. Clloose X,+ to be some a E Y such that z E y = X, (so X,+ 1 E X, I-I Y). Given Xn+k E Y choose E IIY similarly. The resulting sequenctl (X, I 71 E N ) is as required by the Theorem.
c
>
c
It is time to reconsider our intuitive understanding of sets, U'e recall Cantor's original des(:riptioii: a set is a collection into a whole of definite. distinct objt1c.t~ of our intuition or our thought. It seems reasonable to iiiterpret this i ~ r~~ealiilig s that the objects have to exist (in our mind) before they can be collected illto scc a set. Let us accept this position (it is not the oi~lypassible onc, as in Section 31, ancl recall that sets are the only objects we are interest.etl ill. Suppose we want to form a set "for the first time." That means that. there are as yet no suitsableobjects (sets) in our mind, ancl so the only collection wc can form is the empty set 0. But now we have something! V) is now a dcfiilite object in our mind, so we can collect the set (0). At this stage there are two objects in our mind, 0 and {a}, and we can collect various sets of these. 111 fact.. thcre are four: 0, {V)), {{8}}, and (0, (8)). Next we forin sets made of t.hese four objects (there are eight of them), and so on. Below, we describe this procedure rigorously, and show that one obtains by it precisely all the well-founrletl sets.
2.5 Definition (The cumulative hierarchy of well-founded sets.) v 0 =
V,+
l =
V, =
0; ?'(V,)
U VD
for all a; for all limit n # 0.
Dca
We note that V1 = {O}, V2 = (0, {@l), V3 = {g, {V)}: { { 0 ) ) ,{Q). {Q)}}}alxl I.;, for n 5 w have beer1 defined in Exercise 3.3 in Chapter G, and some of' thcilproperties stated in Exercises 3.4 and 3.5 in Chapter G (see also Exercises 1.2 and 1.4(b) in this chapter).
258
CHAPTER 14. THE AXIOM OFFOUNDATION
2.6 Lemma (a) If X E Va and y E X then y E V# for some ,O < CL . (b) I f @ < a then Vp C V,. (c) For all a , V, is transitive and weEl-fowded.
Proof. (a) We proceed by transfinite induction on a. T h e statement is clear w l ~ i 1 a = 0 or a # 0 limit. If X E V,+l then X C V,, so y E X implies y E I,:, and we let 0 = a. (b) Again we use transfinite induction on a. Only the successor stage is 11011trivial. We show t h a t V, C V,,+ By (a), if X E V, then X Up,,, V#. By inductive assumption, U#<, V. /u V,. We conclude t h a t X C V, and heix:e X E V,+1. It follows that Vp 2 V, C holds for all 0 5 a. (c) Combining (a) and (b) gives that X E V, implies X c Ub,n V$ C V(,. showing that V, is transitive. We next show that each V, is well-founded. Let Y 2 V,, Y # 0. Let 0 bcb the least ordinal for which Y n Vp # 0;clearly 0 5 a. Take any X E Y n V,j: by Lemma 2.6(a), y E X implies y E V, for some y < 0, hence y Y. This shows t h a t X is an €-minimal element of Y . As V, is transitive. it. is well-founded. D
c
2.7 Theorem A set X is well-founded if and only if X E V, for some ordinal
Proof. 1) Assume that X is well-founded. It is easy to check that T C ( X ) U { X ) = T C ( { X ) ) is a transitive well-founded set. Let Y = {X E T C ( { X ) ) 1 .T E Va for some a ) ; it suffices t o prove t h a t Y = T C ( { X ) ) . If not, there is ;in c-minimal element a of TC({X}) - Y. For each y E a we then have y E Y: let f (y) be the least a such t h a t y E V;,. T h e Axiom Schema of Replacemellt guarantees that f is a well-defined function on a. We let sup f [ U \ . T11011 V,. and U E V,+1, (:orltriltiit:t.ir~:: y E Vf(y) V7 holds for each y E a , so a a $Y. 2) If X E V,, we have X c V, by transitivity of V,, so also T C ( X ) C K,. Any Y T C ( X ) , Y # 0, has a n €-minimal element, because Va is transitive and well-founded. We conclude that T C ( X ) is well-founded, and, by definitiori. '1 so is X . d
Earlier we proposed an intuitive understanding of the word "collection" that assumes t h a t the objects being collected exist "previously." There is thus a process of collecting more and more complicated sets stage by stage, described rigorously by the cumulative hierarchy V,, and yielding all well-founded sets. On the other hand, a set which is not well-founded does not fit this understanding, because the existence of a sequence (X, 1 n E N ) where X. 3 X I3 X:! 3 . . . implies an infinite regress in time. These considerations suggest that, from our present position, only well-founded sets are "true" sets. T h e reasoriableness ot'
2. WELL-F0 UNDED SETS
259
this attitude can be further confirmed by observing that all axioms of set theory we have considered (so far) remain true if the word "set" is replaced everywhere by the words "well-founded set." We illustrate this by a few examples, and leave the rest of the axioms as an exercise. 2.8 Example
(a) The Axiom of Extensionality. All elements of a well-founded set are wellfounded ( X E V, implies X V,). Therefore, if well-founded sets X and Y have the same well-founded elements the11 X = Y . (b) The Axiom of Powerset. The powerset P ( S ) of a well-founded set S is well-founded, (If S E V, then S C V,, so P(S) C P(V,) = V,+, and P ( S ) E V,+z is well-founded.) Thus for any well-founded set S there is a well-founded set P ( = P ( S ) ) such that for any well-founded X, X E P if and only if X C S. (c) The Axiom Schema of Comprehension. Let P ( x ) be any property of X (in which the word "set" has been everywhere replaced by "well-founded set"). If A is well-founded, {XE A I P ( x ) ) C_ A is also well-founded.
s
These and similar arguments can be converted into a rigorous proof of the fact that all the axioms of set theory we have adopted so far are consistent with the assumption that only well-founded sets exist. Doing so requires a rigorous formal analysis of what is meant by a property, a subject of mathematical logic that we cannot pursue here. Nevertheless, this fact. the sharpened intuitive understanding of the way sets can be collected in stages that we described above, and an observation that all particular sets needed in mathematics are well-founded (see Exercise 2-31, make most set theorists inctude the statement that all sets are well-founded among their axioms for set theory. The Axiom of Foundation are well-founded.
(Also called the Axiom of Regularity.) All sets
It should be stressed that, whether or not one accepts the Axiom of Foundntion, makes no difference as far as the development of ordinary mathematics in set theory is concerned. Natural numbers, integers, real numbers and funct,ions on them, and even cardinal and ordinal numbers have been defined, and their properties proved in this book, without any use of the Axiom of Foundation. As far as they are concerned, it does not make any difference whether or not there exist any non-well-founded sets. However, the Axiom of Foundation is very useful in investigations of models of set theory (see Chapter 15). Exercises
2.1 For any set X , TC({X}) is the smallest transitive set containing X as an element. 2.2 (a} V, E V,+1 - V,. (b) ,O < CY implies V' C V,. (c) CY E V,+1 - V, for all ordinals a.
CHAPTER 14. THE AXIOM O F FOUNDATI0;Y
260
2.3 Assume that X and Y are well-founded sets. Prove that {X. Y ) . (X. Y ) . X U Y , X n Y, X - Y , X X Y , X', U X . (h111 X. and I.:~II X ;LIT' wr.11founded. Prove that N , 2.Q, and R are well-founded sets. 2.4 Show that the Axioms of Existe~ice,Pair. U ~ ~ i oInfinity. n, and Choic:c~. as well a s the Axiorr~Scherna of Replar:enler~t,remain true if tllc~n.or(l "set" is replaced everywhere by the words "well-founded set." [Hirlt: Us(. Exercise 2 -3.] 2.5 An Induction Principle. Lcit P be solzle property. Assunle that. for all well-founded sets X ,
(*
>
if P ( y ) Ilolcls for all y E X. t.ht.11P ( x ) .
Conclude that P ( x ) 11oIdsfor a11 well-founded X . 2.G A R.ecursion Theorelzl. Let G be a11 operation. 'l'here is ;.L ul~icltlclr ) p f a ~ . i i tion F such that F ( x ) = G ( F 1 X) for all well-fou~~ded :c. F ( r ) = V] fill. a11 non-well-founded X . 2.7 Use Exercise 2.6 to show t.11at there is a unique operation p (rank) stlc,li that p(x) = sup{p(y) + 1 I y E X ) for well-founded X , p(:c) = { { V ] ) ) otherwise. Prove that V, = { X I p(x) E a ) holds for ztll ordinals 0 . 2.8 The Axiorn of Foundation car1 be used to give a rigorous definition of 01.der types of linear orderings (Chapter 4, Section 4). If 2l = (A, <) is a li11ear ordering, let CY be the least ordinal number such that therc is a linc;~~. ordering 2l' = (A', <') isolilorphic to U, arld of rank p(%') = n . (Cf. Exvrcise 2.7.) We define T ( % ) = {U' ( 2l' is iso~liorphicto 2l and p(2l') = 0 ) . This is a set because T ( % ) C_ V , + ] . Prove that 2l is isomorphir. t o 23 = ((B. 4)if and o~llyif' r(2.l) = r(C13). I . S O ~ ~ Ol?/pc.s T ~ ) o/ fLi\rt)iZST~ trary structures car1 be (lcfined in the same way.
3.
Non- Well-Founded Sets
For reasons discussed in the previous section, most set theorists accept the ,4xiorn of Foundation as part of their axiomatic system for set theory. Nevertheless. alternatives allowing non-well-founded sets are logically coi~sisteilt.have a (.PI.tain intuitive appeal, and lately liave found some applications. We clis(:uss two such "anti-foundation" axioms in this section. An alternative intuitive view of' sets is that the objects of' wllicli a stat is composed have to be definite arid distinct when the set is "finished," but 11ot ~~ecessarily beforehand. They may be formed as part of the same process that leads to the collection of the set. A p ~ i o r i ,this does not exclude the possibilitv that a set could become one of its own elements, perhaps even its only element. How could such sets be obtained'? It turns out that natural generalizations of results in Sections 1 and 2, in particular of Theorems 1 . l 0 and 1.12, lead t o non-well-founded sets. It is convenient to restate these results using some llrw terminology.
3. NON- WELL-FOUNDED SETS
261
3.1 Definition A graph is a structure (A, R ) where R is a binary relation in A. A pointed gruph is (A, R,p) where (A, R) is a graph and p E A # fl. A decoration of (A, R ) or (A, R,p) is a function f wit8hcjom f = A such that. for all X E A, f (X)= if (Y) l Y R ~ .
If f is a decoration of a pointed graph (A, R, p), the set f (p) is its value. In this terminology, Theorems 1.10 and 1 . l 2 imply, respectively: Every well-founded graph has a unique dec,orat,io~i. Every well-founded extensional gra p h has an injcctive (one-to-one) decorat ion, Moreover, a set is well-founded if and only if it is the value of a decoration of some well-founded graph. To prove this last remark, note that, if (A, R, p) is a well-foundecl graph and f a decoration, f ( p ) E ran f = T where T is transitive and well-founded by Theorem 1.10, so f (p) C_ T is well-founded. Conversely, given a well-founded set X , let A = TC({X)), R = € A , p = X , and f = IdA. It is easy to verify that (A, R,p) is R well-founded extensional pointed graph, f is an injective tlecoration of it, and f (p) = X (see Exercise 2.1). Let us now consider decorations of non-well-founded graphs. I11 the picture of ( A , R, p ) , elements of A are represented by dots, aRh is depicted as an arrow from h to a , and the "point" p is circled. 3.2 Example (a) See Figure l (a). Let A = {a) be a singleton, R = {(a, a ) ) , ancl p = U . Let S be the value of some decoration f of (A, R.p). Then S = { S ) . (b) See Figure l(b). Let A = {a, b} where a # h, R. = {( U , b), (h, a ) ) , and p = a. ). If T is the value of some decoration of ( A , R, p) then T = { (7') (c) For a more complicated example of a non-well-founded set, and t i hint of possible applications, we consider a task that arises in the study of programming languages. One has a set D of "data" and a set P of "programs." A program T E P accepts data as "inputs" and returns data m "out,puts": mathematically, it is just a function from D to D. This is straightforward when D n P = 0, but interesting complications arise when programs accept other programs (or even themselves) as input, a situation quite sonlmorl in programming. As a simple specific example, we let P = {T), D = {V), T}?and we stipu1at.c that the program T accepts any input from D and returns i t as output. witliout any modification; i.e., ~ ( 0=) 0, T ( T ) = T. We then have 7r = {(0,0),( T , T)} = {{{0}}, { { . t r } } } . Such a set can be obtained as the value of ally decoration of the graph depicted in Figure l ( c ) .
These examples show that we obtain non-well-fc~uncledsets if we postulate that (at least some) non-well-founded graphs have decoratior~s.But first t l ~ r r e is an interesting question to consider: what is the nlearli~~g of equality for rlollwell-founded sets? The Axiom of Extensionality, a sine quu non of set theory.
CHAPTER 14. THE AXIOM OF FOUNDATION
262 a S = {S}
8
Figure 1 tells us that, for any sets X and Y. in order to show X = Y it suffices to establish that X and Y have the same elements. For well-founded sets this is an intuitively effective procedure, because the elements of X and Y have bee11 formed "previously." But if, for example, X = { X ) and Y = { Y ) , trying to use Extensionality to decide whether X = Y begs the question. Some strong-~r principle (compatible with the Axiom of Extensionality, of course) is needed t o make the determination. The axiom we consider below implies existence of n m i y non-well-founded sets, a s well as a principle for deciding their equality. I t is a natural generalization of Theorem 1.10, obtained by dropping the wsumptiori that R is well-founded. 3.3 The Axiom of Anti- Foundation
Every graph has a unique decoration.
It is known that this axiom is consistent with Zermelo-Fraenkel set theory
3. NON- WELL-FOUNDED SETS
(without the Axiom of Foundation). 3.4 Definition A set X is reflexive if X
=
{X)
3.5 T h e o r e m The A ~ o m of Anti-Foundation implies that there exists a unique reflexive set.
Let (A, R , p ) be the pointed graph from Example 3.2(a). By t,lle Proof. Axiom of Anti-Foundation it has a decoration f ; X = f (p) is a reflexive set. If Y is also a reflexive set, we define g on A by g(u) = Y . It is clear that g is also 0 a decoration. By uniqueness, f = g and hence X = Y. Next we develop a general criterion for determining whether two dec.oratiuns have the same value. It involves an important notion of bisimulation, which originated in the st,udy of ways how one (infinite) process can imitate another. 3.6 Definition Let (Al, R1) and (A2,Rz) be graphs. For B C_ A I X A2 WP define B+ A1 X Az by: ( a l , a 2 )E B+ if for every X , E e x t ~( ,a l ) there exists 2 2 E extR,(a2),and for every x2 E extR,(a2) there exists z l E cxt,?, ( c l l ) . so
that ( 2 1 , x2) E B. We say that B is a bisimulation between ( A l , R , ) ant1 (A2. R2) if B+
B.
3.7 Lemma
(i) I f B C_ C {ii)
0 is a
C A1 X A2
then B+ C G+.
bisimulation.
(iii) The union of any collection of bisimulations is a bisimulation,
-
Proof. (i) and (ii) are obvious. (iii) If B U,,I B, and Bi C B: for all i E I, then B: C_ B+ for all i E I by (i), so B = U,,, B; 2 U,,, B: G B+. It follows immediately that, for any given (Al, R , ) , (A2, R2), there exists largest b2sirnulat2on
B
=
U{B
;I
A1 X A2 I B is a bisimulation}.
The next lemma connects this concept with decorations. 3.8 Lemma Let f 1 , f 2 be decorations of (Al, R I ) , (Az.R 2 ) , respectivelg. Set B = { ( a ] ,a2) Al X Az I f l ( a l ) = f2(a2)}. Then B is U bisimulation.
CHAPTER 14. THE AXIOM OF FOUNDATION
264 Proof.
We prove that B = B'. We have
( a l l a2) E B if and only if f ( a l ) = f2(a2) if and only if for every xl E extR,( a l ) : f l ( x l ) E f 2 ( a 2 ) and vice versa if and only if for every :cl E e x t ~((l1) , there exists x2 E e x t ~ , ( c ~ ~ ) such that f ( x l ) = f2(x2)and vice versa if and only if for every x l E e x t ~( ,a l ) there exists 5.2 E extRz(c12j such that (xl, x2) E B and vice versa if and only if ( a l , a 2 ) E B'.
3.9 Definition Pointed graphs (A l , Rl , p l ) and (A2,R 2 ,p2) are bisimulatio?~ equizlalent if there is a bisimulation B between (A l . R , ) and (A2, Rz) s1ic.h t . t ~ t (p1,p2) E B. (Equivalently, if (p1, p 2 ) E B.)
The next theorem provides the promised criterion for equality. 3.10 Theorem Let f l be a decoration of (A1, R I ,p l ) and let f 2 be a decoratzon of (A2, RZ, p 2 ) . The Axiom of Anti-Foundation implies that f and f 2 have the same value if and only if (A1,R I ,p1) and (A2,Rz,p2) are bisimulation equi~lnlent.
-
3.11 Corollary Assuming the Axiom of Anti-Foundation, X Y if and on.!?/ if the pointed graphs ( T C ( { X ) ) ,E , X ) and (TC({Y)), E , Y) are bisimulatio~t. equivalent. Before proving Theorem 3.10 we need a technical lemma.
3.12 Lemma (a) For i = 1 , 2 let fi be a decoration of (AigR Ipi)>et be the smallest &-transitive subset of A, .such that p, E A*.Let R, = R, n ( K X K ) and f, = fi I K. Then f, is a decoration of (Ai, l & , ~ ~ ) . (bj Let B be a lrisi~rrulationOetwecn (Al, R I,P I) and (A2, R 2, p 2 ) . Then B = B n (Al X A2) is a bisimulation between (A1,R1) and ( A 2 , R 2 ) . If ( p l ,p a ) E and r a n B = B then d o m B =
z.
Proof We recall from the proof of Lemma 1.8 that = C,,, whert: = { p t ) , CZqn+l = {y E At I yRzx holds for some x E C,,,) = U { e x t ~ , ( : c )I s E C,,,). In particular, e x t ~( X, ) = extK(x) for x E X . It follows immediately that is a decoration and B is a bisimulation. The last statement implies that dom B is an K-transitive subset of hence also an Rl-t ransitive subset of A 1 . If - , p2) E B then p1 E d o r n g , and we conclude that i;C dom B.Similarly. Az r a n E . D
X,
c
3.
NON- WELL-FO UNDED SETS
265
Proof of Theorem 3.10. Assumethat f l ( p l ) = f 2 ( p 2 ) .Then B = { ( a l , a s )E A1 X A2 I f l ( a 1 ) = f 2 ( a 2 ) ) is a bisimulation by Lemma 3.8, and ( p l , p 2 )E B. Hence ( A I ,R1,p1 ) and ( A z ,R2, pz) are bisimulation equivalent. Conversely, let B be a bisimulation satisfying - - ( p 1,p2) c B . From Lemma 3.12 we obtain a bisimulation B between ( A l ,R L )and ( A 2 .R 2 ) ,and decorations f l and We now define a graph (A, R) a s follows:
fi.
-
A = B = { ( a l , a 2 )I a l
EX,a2 EX.( ~ 1 . ~ E2 B); )
R = { ( ( b l ,b Z ) ,( a l , a z ) )E A
X
A I (br,al)
and (b2,az) E
G).
We define functions Fr and F2 on A by:
We note that F l ( ( a l , a z ) ) = % ( a ) = (%(h) I h % l ) = i { F ~ ( ( b l . b ~l ) ) bl%al, b p z a p , (br, bp) E B) = { F ~ ( ( bbl 2, ) ) I (bl , bz)R(a1:u p ) } ,so F1 is a decoration on ( A ,R). Similarly, F2 is a decoration on ( A . R). The Axioin of AntiFoundation implies F1 = F2, SO in particular f l ( p l ) = fi(p1) = Fl((p1,p z ) ) = F2 ( ( P I ,~ 2 ) =) Z ( p 2) = h (p2). Other "anti-foundation" axioms have been considered. For example, one (:all generalize Theorem 2.12 and obtain the following.
3.13 The Axiom of Universality decoration.
E v e q extensional graph has an injective
The Axiom of Universality is also consistent, but it is incompatible with the Axiom of Anti-Foundation: as the next theorem shows, it provides many more non-well-founded sets.
3.14 Theorem The Axiom of Universality implies that there exist collections of reflexive sets of arbitrary cndinalzty. Proof. Let A be any set, and let R = { ( a , a ) ( U E A ) be the identity relation on A. If a # b then extR(a) = { a ) # { h ) = e x t R ( b ) , so ( A , R) is extensional. If f is some injective decoration of ( A ,R) then f ( a ) = { f ( a ) ) holds for each a E A, and a # b implies f ( a ) # f ( b ) , by injectivity of f . Hence { f ( a ) ( a E A ) is a collection of reflexive sets of the same cardinality as A. Stronger results along these lines can be found in the exercises. Exercises 3.1 Construct pointed graphs whose value S has the property (a) S = (0,S ) ; (b) S = (0, S ) ; (C) S = N u {S).
CHAPTER 14. THE AXIOM O F FOUNDATIOAT
266
3.2 Show that Bf = holds for the largest bisimulation 3.3 For given graphs (A I , R I ) , ( A 2 , R2), (A3, R3) show
B.
(i) IdA, is a bisimulation between (A1, R1) and (A1, R I ) . (ii) If B is a bisimulation between ( A l , RI) ancl (A2. RZ) then B-' is a bisimulation between (A2, R2) and (A1, R1). (iii) If B is a bisimulation between (A1,R1) ancl (A2, R 2 ) ,arid C bt?twer.11 (A2, R 2 ) and (ASIR:,), then COB is a bisinlulation between ( A 1 , R I ) and (A37 R3). Conclude that the notion of bisimulation equivalerlce is reflexive, syrrimetric, and transitive. 3.4 Show that any two of the followirlg pointed graphs are bisirnulatiorl equivalent: (i) Exarnple 3.2(a); (ii) Example 3.2(b);
(iii) ( N , ((71
+ 1,n)(
71
N),O);
(iv) ( N ,>, 0). 3.5 Let (A, R) be a graph. Defii~eby trarisfinite recursion:
W,
-U
WO for limit o
# 0.
a<&
Show that there exists h such that WA = WA+1. Prove that (WA,R n 1%';) is well-founded. We call WA the well-founded part of ( A , R). 3.6 If W is the well-foundeti part of (A, R) and f l , f2 arc decorations of ( A , R ) then f l 1 W = f 2 1 W . 3.7 Let (A, R, p) be an extensiorial pointed graph, W its well-founded part. ;tnd p 4 W. Assuming the Axiom of Universality, show that therrb arc sets of arbitrarily large cnrtlinality, all elements of which arc! v:iluc>s of this graph. [Hint: Take the union of ;ill arbitrary collection of disjoint c:opies o f (A, R); identify their well-founded parts, show that the resulting str1ic:ture is extensionnl, and consider its injective decoration.) b
Chapter 15
The Axiomatic Set Theory The Zermelo-Fraenkel Set Theory With Choice
1.
In the course of the previous fourteen chapters we introduced axioms which. taken together, constitute the Zermelo-Fraenkel set, t,heory with Ch0ic.c (ZFC). For the reader's convenience, we list all of t,llein here.
There exists a set whic11 has no elements
The Axiom of Existence
The Axiom of Extensionality If every element, of X is an eleinei~tof Y ancl every element of Y is an element of X , then X = Y.
The Axiom Schema of comprehension Let P(z) be ;i property of :I-. For any A, there is B such that a: E B if and onIy if z C A and P ( x ) holds.
The Axiom of Pair ifx=Aorx=B.
ForanyAandB,thereisCsuchthat.?:ECif;~ildo~ily
The Axiom of Union X C A for some A E S.
For any S, there is I/ such that
The Axiom of Power Set if X S.
c
The Axiom of Infinity
X
E
U if and only if
For any S, there is P such that X E P if and only
An inductive set exists.
CHAPTER 15. THE AXIOMATIC SET THEORY T h e Axiom Schema of Replacement Let P[x, y ) be a property such that for every X there is a unique y for which P ( x , y ) holds. For every A there is B such that for every X E A there is y E B for which P ( x , g ) holds. T h e Axiom of Foundation T h e Axiom of Choice
All sets are well-founded.
Every system of sets has a choice function.
(The reader might notice that some of the axioms are redundant. For example, the Axiom of Existence ancl the Axiom of Pair can be proved from the rest .) We have shown in this book that the well-known concepts of real analysis (real numbers and arithmetic operations on them, limits of sequences, contiriuous functions, etc.) can be defined in set theory and their basic propert.ios proved from Zermelo-Fraenkel axioins with Choice. A similar asscrtioii [:an t)c. 111adeabout any other branch of contemporary mathematics (except category theory). Fundamental objects of topology, algebra, or functional analysis (siy. topological spaces, vector spaces, groups, rings, Banach spaces) are r:uston~arily defined to be sets of a specific kind. Topologic, algebraic. ancl analytic properties of these objects are then derived from the various properties of sets. which can be themselves in their turn obtained as consequences of the axionls of ZFC. Experience shows that all theorems whose proofs mathematicians accept on intuitive grounds call be in principle proved from the axionis of ZFC. In this serise, the axiomatic set theory serves a s a satisfactory unifying foundation for mathematics. Having ascertained that ZFC completely codifies current matherzlaticnl pri1c.tice, one might wonder whether this is likely to be the case also for mathematical practice of the future. To put the question differently, can all true mathematical theorems (including those whose truth h<% not yet been demonstratecl) be proved in Zermelo-Fraenkel set theory with Choice? Shoulcl the answer he "yes," we would know that all as yet open mathematical questions can be. a t least in principle, decided [proved or disproved) from the 'axioms of ZFC alo~~r.. However, matters turned out differently. Mathematicians have been baffled for decades by relatively easily f~rmulat~etl set-theoretic problems, which they were unable to either prove or disprove. A typical example of a problem of this kind is the Continuum Hypothesis (CH): Every set of real numbers is either at most countable or hCmthe cardinality o f the continuum. We showed in Chapter 9 that CH is equivalent to the statct~nc~r~t 2'" = N l r but we proved neither 2Nn = N I nor 2N" # N I . Another sirrlilitr problem is the Suslin's Hypothesis, and others arose in topology and measure t,k~eory(see Section 3). Persistent failures of all attempts to solve these problems led some mathematicians to suspect that they cannot be solved at all a t the current level of mathematical art. The axiomatic approach to set theory makes it possible to formulate this suspicior~as a rigorous, mathematically verifiable conjecture that, say, the Continuurrl Hypothesis cannot be decided from t.l~e
1. THE ZERMELO-FRAENKEL S E T THE0R.Y CVITH CHOICE
269
axioms of ZFC (which codify the current state of mathematical art, as we have already discussed). This conjecture has been shown correct by work of Kurt Godel and Paul Cohen. First, Godel demonstrated in 1939 that the Continuum Hypothesis car~not be disproved in ZFC (that is, one cannot prove 2N0 # NI). Twenty-four years later, Cohen showed that it cannot be proved either. Tlleir techniques were later used by other researchers to show that the Suslin's Hypothesis and many other problems are also undecidable in ZFC. We try to outJine some of Godel's and Cohen's ideas in Section 2, but readers more deeply interested in this matter should consult some of the more advanced texts of set theory. The foregoing results put the classical open problenls of set theory in a new perspective. Undecidability of the Continuum Hypothesis on the basis of' our present understanding of sets as reflected by the axioms of ZFC means that some fundamental property of sets is still unknown. The task is now to find this property and formulate it as a new axiom which, when added to ZFC? would decide C13 one way or another. At one level, such axioms are very easy to come by. We could, for example. add to ZFC the axiom [r2N(1= N1 ." Unfortunately, one could instead add the axiom
and get a different, incompatible set theory. Or how about the axiokn ';2N0= Cohen's work shows that this, also, produces a consistent set theory, incompatible with the previous two. Furthermore, it is possible to create two subvarieties of each of these three set theories by adding to them either the :txiol~l asserting that Suslin's Hypothesis holds or the axiom asserting that it fails. In the opinion of some researchers, that is where the matter now stands. Similarly as in geometry, where there are, besides the classical euclidean geometry. also various noneuclidean ones (elliptic, hyperbolic, etc.), we have the cantorian set theory with 2N" = N1 as an axiom, and, besides it. various nont:iintorial~wt. theories, wit h no logical reasons for preferring one to another. Such an attitude is hardly completely satisfactory. To solve the continuuni problem by arbitrarily adding 2'" = as an axiom is clearly cheating. It, is certainly not the way we have proceeded in the previous chapters. Whenever we accepted an axiom: (a) It was intuitively obvious that sets, as we understand them, have the property postulated by the axiom (some doubts arose in the case of the Axiom of Choice, but those were discussed at length.) (b) The axiom had important consequences both in set theory and in other mathematical disciplines; some of these consequences were in fact equivalerlt to it. So far, there appears to be very little evidence that tile axiom "'2 = Nd+17 (or any other axiom of the form "'2 = N,) satisfies either (a) or (b).
CHAPTER 15. THE AXIOMATIC SET THEORY
270
In fact, no axioms that would be intuitively obvious in the sarne way :is the axioms of ZFC have been proposed. Perhaps our intuition in these matters has reached its limits. However, in recent years it has been disc:ovcred that it number of open questiorls, irl ~);uticularin the field called clescriptiv~scat theory, has an intimate conriect ioil with various large cardinals! sucli as tliost> we mentioned in Chapters 9 and 13. The typical pattern is that such q l ~ ~ s t , i o i ~ s (:a11 be answered one way tfisu~i~irig an appropriate large (:rtrrli~lalexists. nrltl have an opposite answer :tssuming it does not, with the answer obtained with the help of large cardinals being preferable in the sense of being much morr, "n;ttural," "profound," or "beautiful." Tile great amount of research into these ~llikttersperformed over the last 40 years has protluced thth very rich, often very subtle and difficult theory of large cardinals, whose c?sthetic:aypcal rnakihs it difficult not to believe that, it c1t:scribes true itsyttcts of the univcbrstlof st.t t1ir:ory. We cliscuss sorrie of these results very briefly in Section 3.
2.
Consistency and Independence
111orcler to understand ~ ~ i e t ~ i ~for o tsllowing ls consisteocy and i~lclepenilcnt:t~ of t110 Continuuni Hypothesis with respt.c:t t.o Zermelo-F'raenkel axiorr~sfor set tllc-)or!.. let u s first iiivcstigate a similar, but ~rluchsimpler, probleni. In Chapter 2. defined (strictly) ordered sets ;is pairs ( A , <) where A is ;i set, ii~lcl< is i l t l ;~synlrnctricand transitive binary rt3lation in A. Ecluivalcntly. oat? nligltt s a y t.11at a11 orclerecl set is a str~lcturo( A . <), which si~t~isficbs the following nxiolns. The Axiom of Asymmetry
'rlirlre are no a and 6 such that
T h e Axiom of Transitivity
For a11 U , b, and c, if a < 6
i
d
U
< 6 tirltl b < ( 1 .
O < (:. tllrtn
U
< (..
WC?car1 also say that the Axio~lto f Asyrrlnletry nrtcl the Axiom of ?'rnnsitivit\. c.ornprise an axiomatic theory uf O T - ~ C T and that ordered scts ;ire ~rlodetsof' this ;~xiornatictheory. Finally, let us forr~lulateyet another axiom. T h e Axiom of Linearity
For all
U
tiiid O, either
U
< b or a
=
b or b < U .
Let us now twk whether the Axiorll of Linearity can be provecl ur clisprovchti in our axiomatic theory of order. First, let us assume that the Axion~of Linearity can be proved in the th(hory of 01-clcr. The11 every model of t l l ~theory of order woultl have t.o siit.istj~i~lst) the Axiorn of Linearity, L: logical consequence of t h t theory. Irl il rnorthf i ~ r t ~ i l i i \ i terminology, every orclerirlg woulii hive to be a linear ordering. But this is fiilsc.: an c?xampleof a model for the theory of order in which the Axioni of' linear it.^ does not hold is in Figure l (a). Next. let us msurrle t.hktt t,lit: Axiorrl of Linct~ritycar1 bt? rlisl~rovetli t 1 ttlth theory of orcler. 'I'llen every nlodel of the theory of order would havcb t.0 sat isf!also the nttgatiol~of the Axior~tof Linearity. In other. wort-ls. c've1.y orclcl~.irlg
2. CONSISTENCY AND INDEPENDENCE
Figure 1: (a) (P((0, l ) ) ,C ) ; (b) ((0. (0) ), E ) .
would have to be nonlinear. But this is again false; see Figure l(b) for an example of a model for the theory of order in which the Axiom of Linearity holds. We conclude that the Axiom of Linearity is undecidable in the theory of order. Let us now return to the question of undecidability of the Cont.inuun1 Hypothesis in ZFC. By analogy with the preceding exalnple, we see that. in order to show that the Continuum Hypothesis cannot be proved in ZFC, one has to construct a model of ZFC in which the Continuuill Hypothesis faiIs, and. sinlilarly, in order to show that the Continuum Hypothesis cannot be disprovcrl ill ZFC, one has to construct a model of ZFC in which the Conti~luurnHypothesis holds. In the rest of this section, we outline constructions of just such models. But a technical point has to be clarified first. To describe a model for the theory of order, i.e., an ordered set,, we have to specify the members of the model (by choosing the set A ) and the meaning of the relation "less than" (by clhoosilkg ;i binary relation < in A ) . Similarly, to describe a model for set theory, we have t.o specify the members of the rnodel and the meaning of the relation "belongs to" in the model. But, in view of the paradoxes of set theory, it is too optirnist,ic.t o expect that the members of a model for set theory call themselves be gatt~erctl into a set. (Actually, such models are not entirely impossible; their existenc:~ cannot be proved in ZFC, but can be proved if one extends ZFC by soinchlarge c:ardinal axiom - see Section 3). To circumvent this difficulty, models for set theory have to be described bj. a pair of properties, M ( x ) and E(x, y). Here M(x) reads "X is a set of the modtl" and E(x, y) reads "X belongs to y in the sense of the model." A trivial example of a model for set theory is obtained by choosing M ( x ) to be the property "X is a set" and E(x, y ) to be the property "X E y." The model simply consists of all sets with the usual membership relation. The first nontrivial model for set theory was constructed by Kurt Godel in 1939 and became known as the constntctiblc mudcl. Godel wanted to find
CHAPTER 15. THE AXIOMATIC SET THEORY
272
a model in which the Contiriuum Hypothesis would hold, that is, in which 2u0 = N1. Cantor's Theorem asserts that 2'0 > N I ; in other words N 1 is the smallest cardinal to which the cardinality of the continuum can be equal. Thew c.onsiderations suggest a search for a model which would contain as few sets as possible. Godel constructed such a irlodel by transfinite recursion, in stages. At. each stage, sets are put into the moclel only if their existence is guaranteed bv one of the axioms of ZFC. To begin with, the Axioni of Infinity and the Axiom Schema of Comprehension guarantee existence of the set of natural numbers, and so we let
Next, if P ( n ) is any property with parameters from U , then the Axiorn Schema of Comprehension postulates existence of the set {n E W ( P ( n ) holds in ( W , E ) ) , and we have to put it into the model. Therefore, we let
L, = {X c - L. I X = {n E L. IP(n) holds in (Lo,E ) ) for some property P with parameters from L. ) . In order to show that LI exists, it is necessary first to give formal definitior~s of logical concepts, such as "property" and "holds in," in set theory. Such definitions can be found in most textbooks of mathematical logic, but they arcB beyond the scope of this brief outline. Existence of L1 itself then follows from the Axiom of Power Set and the Axiom Schema of Comprehension, so L1itself has to be in the model. Notice that L. C L 1 : We can obtain X = k for any k E L. by taking "n C k" as the property P(n). Next we repeat the argument of the previous paragraph with L1 in place of Lo. Since L1is in the model, all of its subsets which are definablr in ( L 1 . C ) l)!some property P must be put into the model, as well as the set L2 of all suc.h subsets. In this fashion one defines Lg, L4, . . . . At stage W , we merely take the union of the previously constructed L,:
(its existence follows from the Axiom Schema of Replacement and the Axiorn of Union), and then continue as before. The recursive definition of the operation L thus goes as follows:
L. = w ; L,+l = {X C L, I X is definable in (L,, E ) by some property P with parameters from L,);
La =
U La if cr is a limit ordinal, a > 0. B
2. CONSISTENCY AND INDEPENDENCE
c
It is easy to see that L, L p whenever a < 0. A set is called constructible if it belongs to L, for some ordinal a . We are tlow ready to describe the constructible model. Sets of the model are going to be precisely the constructible sets [that is, M(x) is the property "X is constructible"]. The membership relation in the model is the usual one: If X and y are sets of the model, 2: belongs to y in the model if and only if X E y [that is, E(x,g ) is the property "X E y"]. Of course, it has to be verified that the constructible model is indeed a model for ZFC, that is, satisfies all of its axioms. As an illustration, we show that the Axiom of Pair holds in the constructible model: For any A and B in the model, there is C in the model such that, for all X in the model, X belongs to C in the model if and only if X = A or X = B. Taking into account the definition of the constructible model, this amounts to showing: For any constructible sets A and B , there is a constructible set. C such that, for all constructible X , X E C if and only if X = A or X = B. Let constructible sets A and B be given; we first prove that { A , B) is also a constructible set. Indeed, if A E L,, B E Lp, and y = max{a,p), then A. B C L, and the set {A, B ) is definable in (L,, E ) as { X E L, I X = A or X = B ) . We can conclude that (A, B ) E L,+1 and is, therefore, constructible. If one now sets C = {A, B), then C is a constructible set which clearly has the required property. The previous proof actually establishes a stronger result than mere validity of the Axiom of Pair in the constructible mode!; i t demonstrates that the operation of unordered pair, when performed in the model on two sets from the model, results in the usual unordered pair of these two sets. We say that the operation of unordered pair is absolute. A similar detailed analysis of the set-theoretic concepts involved shows that all the remaining axioms of ZFC hold in the constructible model, and that many of the usual set-theoretic operations and notions are absolute. The notion of natural number is absolute (that is, a constructible set is a natural number in the sense of the constructible model if and only if it is really a natural number) and so is the notion of ordinal number. Iiowever, some concepts are not absolute; among the most important examples are the operation of power set and the notion of cardina! number. The reason for this is that, say, the power set of U in the sense of the constructible model consists of all constructible subsets of w , while the real power set, of W consists of all subsets of W . So clearly the power set of w in the model is a subset of the real power set of W , but not necessarily vice versa. Indeed, it, is this phenomenori which allows us t o "cut down" the size of the continuum in the constructible model. Finally, the notion of constructibility is itself absolute: A set is constructible in the sense of the constructible model if and only if it is constructible. But all sets in the model are constructible! We can conclude that every set in the lnodel is constructible in the sense of the model. This argument proves that the following statement holds in the constructible model. The Axiom of Constructibility
Every set is constructible.
CHAPTER 15. THE AXIOMATIC SET THE0R.Y' To summarize, the constructible model satisfies all axioms of ZFC ancl, in addition, the Axiom of Constructibility. Consequently, t h e Axiom of Cotlstructibility cannot be disproved in ZFC, and can be added to it without danger of producing contradictions. It turns out that there are many remarkable t t l t b orems one can prove in ZFC enrirhecl by t-he Axioln of Cotistructibility: 2111 o t them then also hold in the constructible model, and thus caririot be disprovcltl in ZFC. Godel himself showed that t,he Axiorri of Constructibility irnplies t t w Generalized Continuum Hypothesis: 2"tt = N +
for a11 ordinals cr.
Ronald Jensen used the Axiom of Constructibility to construct a linearly ordcretl set without endpoints in which every system of mutually disjoint intervals is a t most countable, but which has no a t most countable dense subset, nnct t.1lits showed the failure of the Suslin's Hypothesis in this model. Many other tlt!op results of a similar character have been obtained since. Here we merely indicatc how the Axiom of Constructibility is uset1 t o show that 2"(1 = N I . Tlie proof is based on a futida~nentalresult in mathematical logic. tht) Skolem-Lijwenheim Theorern. This theorem asserts that for any structure ( A ;R , . . . F, . . . ) where R is a binary rdation, F is n unary function. etc.. tlltlrt. is an at n ~ o s tcouniable B A such that any property with parameters frotri U holds in (A; R.. . . , F , .. . ) precisely when it holds in ( B ; R n B 2 , .. . , F I B , . . . ). (This is an abstract, generalized versioti of such theorems as "Every group has ;l11 a t most countable subgroup," etc. I t also generalizes Theorem 3.14 i r ~Chaptor 4.) Let now X C W . T h e Axiom of Coristructibility guarantees t h a t X E L , , + I for sorne, possibly uncountablc, orclirial a. This means t h a t there is a proprrt,y P such t h a t n E X if and only if P(n) holds in (L,, E ) . By the Skolem-Lijwenheitl~ Theorem, there is an at most countabie set B C L, such that ( B . E ) s;itisfies the slttrie statements as (L,, , C ) . I11 particular, 7~ E X if ant1 only if P ( / / ) holds in ( B , E ) . Moreover, the fitct that t i structure is of the form ( L r jE, ) f i ~ r sorne ordinal ,O can itself be expressed by a suitable statemerlt, wliic:l.1 1loltl.i; ill (L,. E ) and thus also in ( B . E ) . Frorri all this, one can conclucle that ( B . E ) is (isomorphic to) a structure of the form ( L a , E ) for some, necessarily ;it 11lost countable, ordinal P. Since X is definable in ( B , E ) , we get X E L a + , . 1%can conclude t h a t every set of natural numbers is constructed at, so~rlc~ U4
.
2. CONSISTENCY AND INDEPENDENCE
275
T h e question of independence of the Continuttm Hypothesis from the axion~s of ZFC (that is, of showing that it cannot be proved in ZFC) remained open much longer. Finally, in 1963 Paul Cohen announceit the discovery of a inethod which enabled him to construct a model of ZFC ill which t h e Continuum Hypothesis failed. We devote the rest of this section to an outline of some of his ideas. Let us again consider the universe of all sets as described by Zermelo-Fraenkel axioms with Choice. T h e only information about 2") we have been able t,o derive is provided by Cantor's Theorem: 2'" > No (more generally, cf(2"") > No, see Lemma 3.3 in Chapter 9). In particular, 2'" = N I is a ilistinct possibilit,y (and becomes a provable fact if the Axiom of Construc.tibi1ity is also assunled). In general, it is, therefore, necessary t o add "new" sets t,o our universe in order to get a model for 2'" > NI. We concentrate our attention on the task of adding just one " ~ F L W "set of natural numbers X . For the time being X is just a symbol devoid of cont.ent,, a name for a set yet t o be described. Let us see what orie could say about it. A key t o the matter is a realization t h a t one cannot expect to have complete information about X . If we found a property P which would tell us exactly )) which natural numbers belong to X , we could set, X = {n E W I P ( ~ Latld conclude on the basis of the Axiom Schema of Conlprehension that X exists in our universe, and so is not a "new" set. Cohen's basic idea was t h a t partial descriptions of X are sufficient. He described the set X by a col1ec:tion of "approximations" in much the same way as irrationaI rlutnbers can be approximated by rationals. Specifically, we call finite sequences of zeros and ones cunditiuns; for example, Q), ( l ) , ( l , 0, l ) , ( l , l , O), { l , l ,0, l ) are conditions. Wt? view these conditiolis as providirlg partial information about X in the following sense: If the kth entry in a condition is 1, that condition determines that k E X . If it is 0, the conditioli determines that k # X . For example, {l,1 , 0 , 1 ) det,crtnines t h a t 0 E X , 1 E X , 2 # X , 3 E X (but does not determine, say, 4 E X either way.) Next, it should be noted t h a t adding one set X to the universe immediately gives rise to many other sets which were not in the universe originally. sucli IIS w - X , W X X , X 2 , P ( X ) ,etc. Each condition, by providing some information about X . enables us to make some conclusions also about these other sets. ant1 about the whole expanded universe. Cohen writes p It P ( p forces P ) to indic:at,c that information provided by the condition p determines t h a t the property P holds. For example, it is obvious that
(because, as we noted before, {l,1 , 0 , 1 ) IF 3
E
X , and 5 E w is true) or
(because ( 1 , 1 , 0 , 1 ) It- 2 # X). It should be noticed that conditions often clasl~:For example, { l , l , 0) IF (0, l ) E
x2,
276
CHAPTER 15. THE AXIOMATIC SET THEORY
while { l , 0, l ) It (0, l) @
x2-
To understand this phenomerion, the reader should think of p It P ( X ) as a conditional statement: If the set X is as described by the conditior~p, then X has the property P. So ( l ,1,O) It ( 0 , l ) E X' means: If 0 E X , 1 E X . and 2 $ X , then ( 0 , l ) E X'. while (1, 0: 1) 1l- (0.1) @ X 2 means: If 0 E X . 1 # X , and 2 E X , then ( 0 , l ) # X? If we knew the set X , we would of course be able t o determine whether it is as described by the condition ( 1 , 1 , 0 ) or by the condition (1,0,1) or perhaps by another of the remaining six conditiotls of length three. Since we cannot know the set X , we never know which of' these conditions is "true"; thus we never are able t o decide whether ( 0 , l ) E x2 or not. Nevertheless, it turns out that there are a great many properties o f X which can be decided, because they must hold n o matter which coriditions are "true." As an illustration, let us show that every condition forces t h a t X is infinite. If not, there is a condition p and a natural number k such that. p Il- "X has k elements.". For example, let p = (1,0,1) and k = 5: we show t h a t ( 1 , 0 , 1 )It "X has five elements" is impossible. Consider the condition q = (1.0,1,1,1,1,1). First of all, the condition q contains all information suppliecl by p: 0 E X, 1 # X , 2 E X . If the conclusion t h a t X has five elements coul(l be derived from p, it could be derived from q also. But this is absurd because clearly q It "X has at least six elements" ; namely, q It- 0 E X , 2 E X, 3 E X . 4 E X , 5 E X , 6 E X . T h e same type of argurilent leads t o a contradictiorl for any p and k. As a second and last illustration, we show t h a t every condition forces t h a t X is a "new" set of natural numbers. More precisely, if A is any set of natural numbers from the "original" universe (before adding X ) , then every conditiori forces that X # A. If not, there is a condition p such that p It X = A. Let us again assume that p = ( 1 , 0 , 1 ) ; then p Il- 0 E X , 1 @ X , 2 E X . Now there ilt.c. two possibilities. If 3 E A, let ql = ( 1 , 0 , 1 , 0 ) . Since q l contains all inforriiat,ioll supplied by p, (11 I t X = A. But this is impossible, because It- 3 @ X . whert>as 3 E A. If 3 # A, let qz = ( 1 , 0 , 1 , 1 ) . We can again conclude t h a t (12 1k X = .4 and get a contradiction from qz IF 3 E X and 3 # A. Again, a similar argun~erit works for any p. Let us now review what has beer1 accomplished by Cohen's construction. T h e universe of set theory has been extended by adding t o it a "new," "imaginary" set X (and various other sets which can be obtained from X by set-theoretic operations). Partial descriptions of X by conditions are available. These cl(:scriptions are not sufficient t o decide whether a given natural number belongs to X or not, but allow us, nevertheless, t o demonstrate certain statements about X , such as t h a t X is infinite and differs from every set in the original universth. Cohen has established t h a t the descriptions by conditions are sufficient t o show the validity of all axioms of Zerrnelo-Fraenkel set theory with Choice in t h e extended universe. Although adding one set of natural numbers t o the universe does not increase the cardinality of t h e continuum, one can next take the extended universe aild.
3, THE UNIVERSE OF SET THEORY
by repeating the whole construction, add to it another "new" set of natural numbers Y. If this procedure is iterated N 2 times, the result is a model in which there are at least N 2 sets of natural numbers, i.e., 2N(12 N2. Alternatively, one can simply add N 2 such sets at once by employing slightly modified conditions. Cohen's method has been used to construct models in which 2N" = N, for any N, with cf(N,) > No. It can also be used to build models in which Suslin's Hypothesis holds or fails to hold, models for Martin's Axiom MA,, as well as models for many other undecidable propositions of set theory. The methods described in this section can also be applied to showing that the Axiom of Choice is neither provable nor refutable from the other axioms of Zermelo-Fraenkel set theory. It is not refutahle because one can define the constructible model and prove that it is a inodel of ZFC using onlg the axioms of Zermelo-F'raenkel set theory without Choice. [A choice function for any system X of constructible sets can be defined roughly as follows: Se1ec.t from each A E S ( A # $3) that element which is constructed at the earliest stage - that is. belongs to L. or t o L,+1 for the least possible a. Should there be several. select the E-least element of L. or the one whose definition in (L,, E ) comes first in the alphabetic order of all possible definitions.] On the other hand, the Axiom of Choice is not provable in Zermelo-Fraenkel set theory either, because, Collen has shown, it is possible to extend the universe of set theory by adding to it a "new" set of real numbers without adding any well-ordering of this set, and thus obtain a model of set theory in which the set of real numbers cannot bc! wellordered. Consistency of the Axiom of Foundation with the remaining ;~xioiils of ZFC is also established by the method of models. A model for set theory with the Axiom of Foundation is obtained by letting M(x) be the property "X is a well-founded set," and E(x,y) be "X E y." The proof that this is indeed a model satisfying the Axiom of Foundation follows the lines of Example 2.8 in Section 2 of Chapter 14.
3.
The Universe of Set Theory
In this last section we give some thought t o possibilities of extending ZermeloF'raenkel set theory with Choice by additional axioms. Our interest here is in axioms which could with some justification be regarded as true. One possible candidate for such a new axiom has been introduced in Section 2; it is the Axiom of Constructibility, We have seen there that the Axiom of ConstructibiIity has important consequences in set theory; for example, it implies the Generalized Continuum Hypothesis and existence of counterexamples to Suslin's Hypothesis. In recent years it has also been shown that the Axiom of Constructibility can be a powerful tool for other branches of abstract mathematics, and a series of important results, including some of great elegance and intuitive appeal, has been derived from it in model theory, general topology. and group theory. On the other hand, there do not seem to be any good intuitive reasons for belief that all sets are constructible; the ease with which the method of forcing from Section 2 establishes the possibility of existence of
CHAPTER 15. THE AXIOMATIC SET THEORY
2 78
nonconstructible sets might suggest rather the opposite. Moreover, a1l of thc aforementioned results can be proved equally well from other axion~s,sc~rrieof' them weaker than the Axiom of Constructibility, and some of them act,u~~IIj. contradicting it. However, the most serious objections t o acceptirig the Axiorii of Constr uc:t,it ~ i l ity a t the same level as the axioms of ZFC stem fro111the fact that it yields s ~ t l l ( ~ rather unnatural consequences in rlcscriptive set theory, a branch of niat,he.matics concerned with detailed study of the complexity of sets of r e d ntlrnbers. As descriptive set theory provides sortie of the most important insights ir1t.o tlirb foundations of set theory, we outline a few of it,s basic problerris :rtid res~ilts next.
1% introduced Borel sets in Section 5 of Chapter 10 as sets of real ~lutrlb(~l-s t h a t are particularly simple. Their detailed study confirms this by sho\j7ing that Borel sets exhibit particularly nice behavior: Every Borel s ~ t is , elitlicbr a t most countable or it contains tr perfect subset, and so, in accordarlc:~wit,ll the Cotitinuum Hypothesis, it I l i ~either cardinality 5 No or 2Nli. Siniilarij-. every Borel set is Lebesgue rneasur:rble, and has rriany otlitir r~icepropc.rt,ic.s. Descriptive set theory attempts to extend these results to more complex stlt,s. How d o we obtain such sets? Count;rble unions arld irlterset:tions of Borel scts are BoreI, so we cannot use them to obtain "new" sets. Cotltirluous futictiotls are well-understood, simple functions, but it turns out that trri irnage of Bort.1 set by it continuous function (while 11opefuIIy still "simple" enough) neetl tiot bit a Borel set. We define analytic sets as those sets t h a t are irringes of'a Bol.c.1 set by a cor~tinuousfunction, arid cuanalytic sets as cotnpleirierlts of aniilytic, sets (in R). T h e usual notation is C: for the collectior~of ;ill irrialytiu sets ali(l II; for the collection of all r:o;uielytk: sets. Orle ctul then pn,recd fitrtli(:l-iui{l clefir~erecursively ;W the c:olloction of all coritirluous itr~sgesof H:, sc3ts I and II!+i as the collrctiori of ill1 cornplernelits of sets i ~ iC,,+1. 'l'his is thtt p r o j ~ c t i uhiera~rhg, ~ and sets that 1)t:long to some C:, (or n!,) ;ire tllr p r r ~ ~ r r : l i ~ v sets. We would expect projective sets t o behave nicely, beirig still rather sitr~l>lcb. arid the classical descriptive st2t tliclory as developed I,y Nikoli~iLuzili ii1i(1 Ilis students confirttls this to art tlxtetit. For example: it is kriowtl t h i ~ till1 arlnlytic, ancl coanalytic sets are Lebesgue tneiisurable, and that all uric:ountablt? analyt,ic: sets contain cr perfect subset. As to going further, ill ZFC with t.he Axioni of' Cotistructibility one gets discouragi~~g results: there exist C; (or I l l ) sets t.hi~t ;ire not Lebesgue measurabli?, irr~dtliere exist rincountirble c:oirnalytic ( I l l ) a.ts without a perfect subset. Another important classical result is t h a t and C : sets liaw the so-(:allc>d r ~ d u c t i u nproperty and C : and IIi sets do not have it. (A class o f sets is si~itl to have the reduction property if for all A, B E r there exist A'. B' E r stt(.ll t h a t A' A , B' C B , A' U B' = A U B , arid A' fl B' = C?.) O n e rriight expect nf,C:, Il:, etc., sets to have the reduction property, and CA, U:. et(... to fail to have it. but the Axiorrl of Constructibility leaves tlie case 71 = 1 at1 odd exception and proves instead tll;~tall C!, sets for n 2 2 li:rvt~ t h r wdiictinti property and all I l k sets, n 2 2 , fail t,o liave it.
c
g.
3. THE UNIVERSE O F SET THEORY
279
Surprisingly, a more satisfactory answer can be obtained by usirlg sonle of t h e so-called large cardinal arciams. T h e inaccessible cardinal numbers clcfint.(l in Chapter 9 provide the simplest examples of large cardinals. We remind t.he reader t h a t a cardinal number K > N o is called inaccessible if it is regular and a limit of smaller cardinal numbers. I t is shown in Section 2 of Chapter 9 that the first inaccessible cardinal (assumin g it exists) must be greater than N I , N1, . . . , N,, . . . , N N l , . . . , NNxl,. . . , NHH,, . . . , etc.. hence the adjrctive "large." I t is known t h a t the existence of inaccessible cardinals cannot be proved in ZFC; in fact, the construction of a model for ZFC in which there are iio inaccessible cardinals is rather simple. If the const,ruct,ible universe L contains no inaccessible cardinals (in the sense of L), then it is sudi a model. Otherwisc. we take t h e least inaccessible cardinal 6 in L and r:otlsider a model, set,s of which are precisely the elements of Ls, and the membership relation is the usual enc. It is quite easy to use the inaccessibility of 6 ant1 prove that all axioms of ZFr are satisfied in this model. T h e fact t h a t 6 is tile Ile~st.inaccessible rarc.linii1 ill L guarantees t h a t there are no inaccessible cardinals i l l this model. So tlxiste~~ce of inaccessible cardinals is independent of ZFC. Unlike the Continuum Hypot,hesis or the Suslin's problem, however, it is not possibIe to prove that illacc(>ssible cardinals are consistent with ZFC by way of constructing an appropriate model (doing so would contradict the celebrated Second Irlconlpleteness Theorctlil of Giidel). This means that a set theory with an inac:cessible carc1in:tl is essellt.iidly stronger than plain ZFC. T h e assumption of existence of inaccessible carc1in;lls requires a "leap of faith" (similar to that requiretl for iicr:ept;incc3o f the Asin111 of Infinity), but some intuitive justification for t . h ~ plausibility o f n~iiki~ig it (.;it1 be given. I t goes roughly as follows. Matlie~nati<:iansordinarily consider infinite c:ollcr:tions, sucll as t.11~sot o f natu ral numbers o r the set of real numbers. as finislicd. c.omyletclt1 totti1it.i~~. O n t h e other hand, it is not in the power of an ordinary m(zthemat,icia~~ to regard the collection of all sets as a finished totality, i.e., a set - it woulrl 1e;itl t o a contrlldiction. We call the collections whir:li nliithematicians ordilltirily view as finished the sets of the first order; they are the sets we h a w bt.frll con cerned with until now. Let us now placr o~trselvesi11 a position of ;in extraordin ary, more abstract, mathematician, who has cxanlined the t t ~ i i v ~ r s c ~ of the first-order sets and then collected all those sets into a new finished tot.ality. a second-order set V. Having gotten V, he can, of course. use the 1nct.11odso f Chapters 1- 14 t o form various other second-order sets. s u c l ~iw V - w. N I X 1,'. P ( P ( V ) ) , 0 = {a E V I 0 is a n ordinal), 0 1, O 0 + W ! etc. [Not,ii,e t,tltit 0 is the (second-order) set of all first-order ordin:~ls.] Thci int,uit.ive argttnlc%ilts which justified the axioms of ZFC for the first-order mathctmaticiaii itiso co~lviiicc~ the second-order mathematician of the validity of these axioms in his "sec:o~ldorder" universe of set theory. (Notice that Russell's Paradox is avoidcc1 t.liis viewpoint: T h e second-order mathematician can form t,he set R of 2111 first.order sets which are not elements of themselves, but R is not, a first-order set., so clearly R 4 R, and no contradictions appear. Of course, t,he sc>rotltl-c)~.dt.r mathematician still cannot form the "set" of all sei:orlci-order sets wllicll are llot elements of themselves; this is a task for a third-order ~nathematician.)1%now
+
+
CHAPTER 15. THE AXIOMATIC SET THEORY claim that 0, the least second-order ordinal, is an inaccessible cardinal in t h e second-order universe. Certainly 0 > No and 0 is a Iimit of (the first-order) cardinal numbers, If ( K , I L < a) is a sequence of ordinals less than 0 having length CY < 0 ,then both CY and all K , are first-order ordinals, and the arguments we used to justify the Axiom Schema of Replacement again convince us that, a first-order mathematician should be able to coIlect { K , I L < a ) into a firstorder set. But then sup,,, K , is a first-order ordinal, so sup,,, K , < 0 , ancl 0 is regular. We can conclude that the set-theoretic universe of the second-orcler mathematician satisfies the axiom "There exists an inaccessibIe cardinal." Let us now return to descriptive set theory, in particular to the questio~l whether all countable coanalytic sets have a perfect subset. Robert Solovay discovered a profound connection between this and inaccessible cardinals: If every uncountable coar~alyticset has a perfect subset, then NI is an inaccessible cardinal in the constructible universe L. (We pointed out in Section 2 t,hat thc notion of cardinality is not iibsolute, so the "real" cardinal N I need not be '.Hl in the sense of the model L" if not all sets belong t o L.) In fact, more than this is true:
(*l
For any real number a , N I is inaccessible in L[a],
where L[a] is a model constructed just as L, but starting with Lo[a] = W IJ {a) (it is the least model of set theory containing all ordinals and the real number a ) . Conversely, from the "large cardinal axiom" (*) it follows t h a t a11 uncountable II: sets have a perfect subset. In addition, (*) further implies t h a t all u~lcountableC: sets have a perfect subset (and thus cardinality 2'") arid t h a t all Cl and Ill sets are Lebesgue measurable (but it does not imply t,hr existence of perfect subsets of uncountable Ill sets, nor measurability of C:! sets). Overall, the axiom (*) produces a mild improvement upon the results obtainable from the Axiom of Constructibility. To get better results, we havcb t o assume existence of cardinals rnuch larger than mere inaccessibles. But b(:fot.t' outlining some of them, we have to make another digression. A very important role irl modern descriptive set theory is played by a certain kind of infinitary game. Let us consider a garne between two players, I and 11. described by the following rules. A finite set M of possible moves is given. 'l'lw players choose rnoves, taking turns a r ~ deach moving n times. That is, player I begins by making a move p1 E M ; player I1 responds by making a move qt E ;If. Then it is 1's turn t o make a move p2 E M , t o which I1 responds by p l a y i ~ g (12 E M , etc. T h e resulting sequence of moves ( p l ,(11 , pz, (12, - . - , Pn, (1), is ciille(1 a play. A set of plays S is specified in advance and known to both players. Player I wins if (p],.. . , qn) E S; player I1 wins if ( p l , . . , q,) # S. Many intellectual games between two players, including checkers, chess, and go. Ciul be rrlathematically represented in this abstract form by choosing suitable A f i . 7 1 . and S (some artificial conventions have to be adopted, e.g., in order to allow fur draws). A fundamental notion in game theory is t h a t of a strategy. A strategy for player I is a rule which tells him what move to make at each of his turns.
3. THE UNIVERSE O F SET THEORY
281
depending on the moves played by both players previously. If a strategy for player I has the property that, by following it, I always wins, it is called a winning strategy for I. Similariy, one can talk about a winning strategy for 11. The basic fact about games of this type is that one of the players always has a winning strategy; in other words, the game is determined. The reason is rather simple: If there is a move p1 such t h a t for every ql there is a move p2 such that for every (12 . . . there is a move p, such that for every qn the play (pl,ql,. . . , p n , qn) E S , then obviously player I has a winning strategy. If the opposite holds, it means that For every p1 there is ql such that for every p2 there is q2 such that . . . for every p, there is q, such that the play ( p l ,q l , . . . , p,, q,)
# S.
But in this case, I1 is the player with a winning strategy. We can now modify the game by allowing each player infinitely many turns. The play now is an infinite sequence of moves: (pl, ql , pz, q 2 , . - . ). The payoff set. S is a set of infinite sequences of elements of M , and I is the winner if and only if ( p l ,ql , p2, q2, . . . ) E S. The game played according t o these rules is referred to as G.?. It is no longer obvious that such games are determined, and as u matter of fact they need not be. Using the Axiom of Choice one car] construct a payoff set S M N for any M with IMI > 2 such that neither player has a winning strategy in the game Gs.(The argument is quite similar t o the one we used t o construct an uncountable set without a perfect subset in Example 4.1 1 of Chapter 10.) T h e interesting question is whether the game Gs is determined for "simple" sets S. In order to study this question, we take the finite set A d to be a natural number m; infinite sequences of elements of m can then be viewed as expansions of real numbers in base m, and the payoff set S can be viewed as a subset of R. In this way we can talk about payoff sets being Borel, analytic, etc. A profound theorem of D. Anthony Martin states that all games with Borel payoff sets are determined. The situation a t higher levels of the projective hierarchy is similar to that discussed in connection with the question of existence of perfect subsets, but the large cardinals irivolved are much larger. To be specific, the Axiom of Constructibility can be used to construct games with analytic (E:)payoff sets that are not determined. The work of Martin and Leo Harrington showed that determinacy of all games wit h C : (or, equivalently, H:) payoff sets is equivalent t o a certain large cardinal axiom. It would take us too far to try to state this axiom here (it is known technically as "For all real numbers X , X# exists."); for our purposes it suffices t o say that this axiom implies Solovay's assumption (*), and much more (for example, it implies the existence of Mahlo cardinals and weakly compact cardinals in the constructible
CHAPTER 15. THE AXIOMATIC SET THEORY
282
universe L), and is itself a consequence of the existence of measurable cardinals. So if a measurable cardinal exists therl all analytic and coanalytic games arc) determined. T h e existence of' measurable cardinals cloes riot suffice t o prove determinacy of all games with E: (or II:) payoff sets. Thc work of Martin. John Steel, and Hugh Woodin in the early nineties showed t h a t there are 1argc. i:i~rdinals(much larger than the first mvasurable orw) whosc c?xist,t:nc*einlpIic.~i tllc Axiom of Projective Deteminacy: all galnes with projective payoff sets it1.t' determined. Conversely, the Axiom of Projective Determinacy implies ~xist~eri<:c~ of models of set theory with these large c.ardinals. W h a t reasons are there for thc belief that something like tlie Axionl ot' Projective Determinacy is true'! Besides its own intrinsic plausibilit.y, it st,rms from the fact that determinacy of infinitary games at n certain level in t . t l c b projective hierarchy implies all tlie nice descriptive set-t heorctic pro pert i ~ that s the sets at or near t h a t level arc expected t.o have. For examplc, tletet.mina(:~. of ):I games implies that all unr:ou~lt;lbleIIfand C : sets 1 1 ; a~ perfect. ~ sl11)si.t. ilncl that all E: arrd II; sets are Lcibesgue measurable. T h e determiriacy of C: games implies that all uncountal)ltb II: and C : sets liave a perfect subsrt. ; r ~ l c l that all and II: sets art! Lltbibsgur measurable. Moreover, it also g r t u ~ r a l i z ~ ~ s the reciuctiori property correctly t o tlie third level of the projective hic.rarc:ll>p; i.e., it implies that Hi sets have tlie reduction property and sets do not. T h e full Axiom of Projective Deterrriinacy 11% ns its consecluences the Lebcsgut. rrieasurability of all projective sets, tlie fact that every uncountable projectivcl set has a perfect subset, and the "correct" behavior of the reduction property ( ,E , , E l . . - sets have it, E:, H i , E:, n i , . . . do not). among 111;uiy others not stated here. It is the results like these that are convincing workcm it1 descriptive set theory t h a t P D rriust be true. What emerges from the above c:onsiderations is a hierarchy postu1at.ing t.tie r.xisteiice of larger and larger r:ardiniils and providing better and better approximations t o the ultimate truth about the universe of sets. Further support for this gerit:ral picture has corn(! from the work on undecidable statcArrlentsiri ;u.itllmetic. By aritllrnetic we Inearl t h e theory of Peano arithnletic whoso iixioi 11s were given in Chapter 3; it is easy to establish t h a t Peano arithmetic is equivalent t o the theory of finite sets; i.e., the theory that is obt;iint!cl frorrl ZFC t ~ j , o ~ n i t t i ~ the l g Axiom of' Infinity (ill the resulting tllt:ory, only finite sets (:all bcl proved t o exist). It has been know11 since the fundamental work of Kurt Giiclrl in 1931 that tllere are true statements about natural rlunlbers (or finite sclts) t h a t cannot be decided (eitlier proved or clisproved) from the axiorils of Pctutlo arithmetic (or the theory of fir~itcsets). T h e statements are true because tlley can be proved in ZFC, but the use! of a t least sorrie infinite sets is c~sseriti:~lfor ttie proofs. However, Godel's exmnples of such staternents are based on logic:i~l (:onsiderations (similar to Russell's Paradox) and have no transparent matliernatic:il rneaning. Iri 1977, Jeffrey Paris discoverecl the first siniple nlcitlie~i~aticxl exarnples of such statements, and others have been found subsequently. One of' the rriost interesting is Theorem 6.7 in Chapter 6, the clainl t h a t every Gooilstein sequelice terrr~ir~ates with ii vt.illue of 0 after a firlite nurnber of steps. Our proof there uses irifiriite ordinals; tile work of Paris showed that some use of
)A:
)::
3. THE UNIVERSE OF SET THEORY infinity is necessary. Another example is a version of the Finite Ramscy's T11eorem discussed in Section 1 of Chapter 12. It is possiblc! to determine the sizt. of infinite sets needed t o prove a particular statement, and it turns out that. Ilerc, too, there is a hierarchy of theories postulating the esist,erice of larger and l:irgc?r infinite sets and providing better and better approximations t o the truth about natural numbers (or finite sets). In most cases these throries are subtheories of ZFC (so few mathematicians doubt that they are t,ruc?),but there are examples of (rather complicated) staternents of arithmetic that carinot be decided even in ZFC (but can be decided, for example, in ZFC with all inaccessible c.ardinal). so the hierarchy obtained from the study of the strength of arithmetic stat,rtnetlts merges into the hierarchy of large cardinals needed for the study of the strr?ngth of statements about real numbers arising in descriptive set theory. with ZFC being just one of the stages. Even the techniques used t o prove the rcsuIts bout arithmetic are closely related t o those used to study large cardinals; they rely heavily on such concepts as partitions, trees, and games. Most of the work discussed in this section is relatively recent, atid by no means complete. Both the theory of large cardirlals and t h e study of the undecidable statements of arithmetic are very active research arec% wllere new insights and interconnections continue to be discovered. As Godel assures us in his Incornpleteness Theorem, no axiomatic theory can decide all staternents of arithmetic o r set theory. We can thus feel corifident t h a t the eltterpristb of getting closer and ctoser t o the ultimate truth about the mathelnatical universe will continue indefinitely.
This page intentionally left blank
Bibliography [l]Peter Aczel. Non-well-founded sets. Stanford University Center for the
Study of Language and Information, Stanford, CA, 1988. With a foreword by Jon Barwise [K. Jon Barwise].
[2l Keith Devlin. The joy of sets. Springer-Verlag, New York, rrecond edition, 1993. Fhdamentals of contemporary set theory.
[3l F. R Drake and D. Singh. Infamediate set theory. John Wiley & Sow Ltd., Chichester, 1996.
[4] Herbert B. Enderton. Elements of set thtmy. Academic Preaa parcourt Brace Jovanovich Publishers], New York, 1977. 151 Paul Erdb, An&& Hajnal, Attila MAtC, and Richard WO. Combinatmid set theory: portition &ons for cuniinab. North-Holland Publishing Co., Amsterdam, 1984. (61 b n a l d L. Graham, Bruce L. Rothschild, and Joel H. Spencer. hmey theory. John Wiley & SOWInc., New York, second edition, 1990. A WileyIntemcience Publication.
Naive set thtmy. Springer-Verlag, New York, 1974. [7] Paul R. Halm-. Reprint of the 1960 edition, Undergraduate I?exts in Mathematics. [8] Jam= M. Henle. An outline of set themy. Springer-Verlag, New York, 1986. [g] Thomas Jech. Set t h q . Springer-Verlag, Berlin, second edition, 1997.
[l01 Winfried J w t and Martin We-. Discovering modern set theory. I. American Mathematical Society, Pddence, RI, 1996. The M c s . [l11 Akihiro Kanamori. Thc h i g h infinite. Springer-Verfag, Berlin, 1994. Large cardinals in set theory from their beginnings. [l21 Alexander S. Kechris. Classical h M i p t i u e set theory. Springer-Verlag, New York, 1995.
286
BIBLIOGRAPHY
[l31 Kenneth Kunen. Set theory. North-Holland Publishing Co., Amsterdam, 1983. An introduction to independence prooh, Reprint of the 1980 original. [l41 Yiannia N. Mtxchovakis. Descrigtiue set fheorfl. North-Holland Publishing Co.,Amsterdam, 1980. [l51 Yiannie N. M O B C h o ~ .Notes on set theory. Springer-Verlag, New York, 1994. [l61 Judith Rnitman. Introduction to modern set thsory. John Wiley & Sow Inc., New York, 1990. A Wiey-Interscience Publication. [l71 Itobert L.Vaught. Set theory. Birkh$iwr Boston Inc., Boston, MA, second edition, 1995. An introduction.
Index absolute, 273 accumulation point, 184 addition, 52,172,176 addition of cardinal numbers, 93 addition of ordinal numbers, 119 additive function, 147 aleph, 131 aleph naught No, 78 algebraic number, 101 almost diejoint functions, 213 analytic set, 278 antichain, 225, 237 antieyrnmetric relation, 33 A r o m j n tree, 228 associativity, 13, 94, 122,159, 160 asymmetric relation, 33 at m a t countable, 74 atom, 244 atomless meaaure, 244 automorphism, 60 Axiom of Anti-Foundation, 262 Axiom of Constructibility, 273 Axiom of Countable Choice, 144,152 Axiom of Existence, 7 Axiom of Extensionality, 7 Axiom of Foundation, 259 Axiom of Infinity, 41 Axiom of Pair, 9 Axiom of Power Set, 10 Axiom of Regularity, 259 Axiom of Union, 9 Axiom of Universality, 265 Axiom Schema of Comprehension, 8 Axiom Schema of Replacement,112 axiom, 3 Bake function, 200
Banach, SteEan, 149 M e , 146 beth, 223 binary operation, 55 binary relation, 19 biiimulation, 263 Bore1 set, 194 bounded sequence, 181 bounded set, 86,161 branch, 225 Brouwer-Kleene ordering, 85 Cantor set, 185 Cantor's Theorem, 96 Cantor, Georg, 1, 90,99,101 Cantor-Bernstein Theorem, 66 cardinal number, 68, 130, 140 cardiiity, 65, 68 cartesian product, 21,57 Cauchy sequence, 181 chain, 34 characteristic function, 91 choice function, 138 dosed interval, 179 closed set, 182 closed unbounded filter, 208,212 cl& unbounded set, 208, 212 closed under operation, S5 closure, 60 closure point, 145, 212 d y t i c set, 278 cohal sequence, 143 cohdity, 163 cofhite Bet, 202 Cohen, Paul, 101,139,269,275-277 collection, 9 commutativity, 13, 94
comparable, 34 compatible, 26, 237 complete linear ordering, 87 completion, 87 complex number, 179 composition, 21 computation, 48,115 concatenation, 62 condensation point, 190 condition, 275 connectivee, 5 constant, 7,57 constructible model, 271 conatructible set, 273 continuous function, 145, 182 continuum, 92 Continuum Hypotheeie, 101,141,268 convergence, 145 coordinate, 57 countable additivity, 150 countable set, 74 counting measure, 153, 204 cumulative hierarchy, 257 cut, 88 De Morgan Laws,14 decimal exparusion, 86 decoration, 261 decreclrsing eequence, 144 Dedekind cut, 88 Dedekind finite set, 97 A-lemma, 211 denae linear ordering, 83 density, 203 derived W, 191 deecriptive set theory, 278 determined, 281 diagonal intereection, 210, 212 diagoaalization, 90 diflerence, 13 digit, 174 disjoint, 14 distributivity, 14, 94, 122, 160 division, 173, 178 domain, 19 double induction, 46
element, 1 empty quence, 46 empty set, 698 equiptent, 65 equivalence dam, 30 equivalence relation, 29 ErdbDushnik-Miller Theorem, 222 Erd&M6 Theorem, 223 expansion, 178 exponentiation, 54 exponentiation of cardinals, 95 exponentition of ordinals, 122 extension, 24 extensional relation, 254 hctorial, 48 Fibonacci eequence, 50 filter, 201 finite additivity, 204 finite character, 144 finite intemction property, 180,202 finite eequence, 46 finite set, 6 9 , n h e d point, 69 forcing, 275 function, 23 function into, 24 function on, 24 function onto, 24
Giidel, Kurt, 3, 101, 269, 271, 274, 279,282,283 GCH, 165 generic set, 236 Goodstein quence, 125,126 graph, 261 greatest element, 35 greatest lower bound, 35 Hahn, Hana, 149 Hahn-Banach Theorem, 148 Hamel h i s , 146 Harringbn, h,281 Hartog number, 130 Hausdorff'e Formula, 168 hereditarily finite, 114
INDEX Hilbert, David, 101 homogeneous set, 217 ideal, 202 identity, 4 image, 20 inaaeasible cardinal, 162,168 inclusion, 12 incomparable, 34 incompatible, 237 increasing function, 105 increasing sequence, 160,181 Induction Principle, 42,44,114,260 inductive set, 40 idmum, 35 infinitary operation, 198 infinite product of carttindn, 157 infinite sequence, 46 infinite set, 69 infinite eum of cardinals, 155 initial ordinal, 129 initial segment, 104 injective function, 25 integer, 171 integer part, 174 intersection, 7, 13, 14 inverse, 20 inverse image, 20 invertible function, 25 isolated point, 184 hmorphic structures, 59 isomorphism, 36,58 isomorphiam type, 260 Jensen'e Principle, 234 Jensen, h n a l d , 274 kcomplek filter, 246 KBnig'a Lemma,227 KBnig'e Theorem, 158 least element, 35 least upper bound, 35 Lebesgue measure, 152 length, 62 lexicographic ordering, 82
limit, 160,181 limit cardinal, 162 limit point, 212 linear functional, 148 linear ordering, 34 linearly independent, 146 linearly ordered set, 34 lower bound, 35 Mahlo cardinal, 249 mapping, 23 Martin'e Axiom, 237,238 Martin, Donald Anthony, 281,282 mcurimal element, 35 measurable cardinal, 247 meaaure, 150, 204,241 measure problem, 241 minimal element, 35 model, 270 monochromatic set, 217 monotone function, 69 Mostoweki'e Lernma, 255 multiplication, 54, 172, 177 multiplication of cardinals, 94 multiplication of ordinale, 121 n-ary operation, 57 n-ary relation, 57 n-tuple, 56, 118 natural number, 41 node, 225 nondecreasing quence, 181 normal form, 124 nonnal ultra;filter,249 o n e b o n e function, 25 open interval, 179 open set, 182 operation, 7, 55 order type, 79, 113, 260 ordered n-tuple, 56 ordered field, 178 ordered pair, 17 ordered set, 33 ordering, 33 ordinal number, 107
INDEX
palindrome, 64 parameter, 5 Paris, Jeflrey, 282 partition, 31 Peano arithmetic, 54 Peano, Giuseppe, 54 perfect set, 185 power set, 10 prime ideal, 205 principal filter, 201 principal id&, 202 Principle of Dependent Choices, 144 product of an indexed eystem, 26 projection, 62 Projective Determinacy, 282 projective set, 278 proper subset, 12 P~~P&Y 1, quantifiers, 5 Ramaey's Theorem, 218,219 range, 19 rank, 255 rational number, 173 real number, 89 Recursion Theorem, 47, 115,260 reflexive relation, 29 regressive function, 209,212 regular cardind, 161 relation, 29 restriction, 24 R d l ' s Paradax, 3 R d l , Bertrand, 2 quence, 46 set, 1 set of all seta, 2 set of representatives, 32, 139 setofeets, 2 U-additive, 150, 151, 241 U-algebra, l51 a-complete filter, 246 Silver'e Theorem, 212 singleton, 17 singular cardinal, 160
Skolem-Ewenheim Theorem, 274 fhlovay, Robert, 152, 280, 281 statement, 6 Etationary set, 209,212 Steel, John, 282 strategy, 280 strict ordering, 33 strong limit cardinal, 168 structure, !X, 58 eubaequence, 181 subaet, 6,10 successor, 40, 51 successor cardinal, 162 successor ordinal, 108 eum of linear orderings, 81 supremum, 35,109 Swlin tree, 230 Swlin's Problem, 180,230,268 Suslin, Mikhail, 230 symmetric difference, 13 symmetric relation, 29 system of sets, 9
ternary relation, 22 Tichonov's Theorem, 153 total ordering, 34 transcendental number, 101 transfinite sequence, 115 transitive doare, 256 transitive relation, 29 transitive set, 107 translation invariance, 150 tree, 225 tree property, 229, 248 trivial filter, 201 trivial ideal, 202 trivia) measure, 204 W e y ' s Lemma, 144 twwdued measure, 206,244 U-limit, 206 ultrafilter, 205 unary operation, 55 unary relation, 22 unbounded set, 161 uncountable set, 90
INDEX union, 9, 13 universe, 58 unordered pair, 9 upper bound, 35 value, 23 variables, 4 Venn diagra,ma, 12 weakly compact cardinal, 224 well-founded relation, 251 well-ordered set, 44, 104 well-ordering, 44 Well-Ordering Principle, 142 winning strategy, 281 W d n , Hugh, 282
Zermelo, Ernst, 11,138,139 ZermeleFkaenkel Set Theory, 267 Zorn'e Lemma,142