THE ARCHÉ PAPERS ON THE MATHEMATICS OF ABSTRACTION
THE WESTERN ONTARIO SERIES IN PHILOSOPHY OF SCIENCE
A SERIES OF BOOKS IN PHILOSOPHY OF SCIENCE, METHODOLOGY, EPISTEMOLOGY, LOGIC, HISTORY OF SCIENCE, AND RELATED FIELDS
Managing Editor
WILLIAM DEMOPOULOS
Department of Philosophy, University of Western Ontario, Canada
Department of Logic and Philosophy of Science, University of California, Irvine
Managing Editor 1980–1997
ROBERT E. BUTTS
Late, Department of Philosophy, University of Western Ontario, Canada
Editorial Board
JOHN L. BELL,
University of Western Ontario
JEFFREY BUB,
University of Maryland
PETER CLARK,
St Andrews University
DAVID DEVIDI,
University of Waterloo
ROBERT DiSALLE,
University of Western Ontario
MICHAEL FRIEDMAN,
Indiana University
MICHAEL HALLETT,
McGill University
WILLIAM HARPER,
University of Western Ontario
CLIFFORD A. HOOKER,
University of Newcastle
AUSONIO MARRAS,
University of Western Ontario
JÜRGEN MITTELSTRASS,
Universität Konstanz
JOHN M. NICHOLAS,
University of Western Ontario
ITAMAR PITOWSKY,
Hebrew University
VOLUME 71
THE ARCHÉ PAPERS ON THE MATHEMATICS OF ABSTRACTION
Edited by
ROY T. COOK
A C.I.P. Catalogue record for this book is available from the Library of Congress.
ISBN 978–1–4020–4264–5 (HB)
ISBN 978–1–4020–4265–2 (e-book)
Published by Springer, P.O. Box 17, 3300 AA Dordrecht, The Netherlands.
www.springer.com
Printed on acid-free paper
All Rights Reserved
© 2007 Springer
No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
For Alice, who has kindly tolerated the company of many abstractionists, and one in particular
Contents
Foreword
Notes on the Contributors
Acknowledgements
Introduction
Part I The Philosophy and Mathematics of Hume’s Principle
Is Hume’s Principle Analytic? G. Boolos
Is Hume’s Principle Analytic? C. Wright
Frege, Neo-Logicism and Applied Mathematics P. Clark
Finitude and Hume’s Principle R. G. Heck, Jr.
On Finite Hume F. MacBride
Could Nothing Matter? F. MacBride
On the Philosophical Interest of Frege Arithmetic W. Demopoulos
Part II The Logic of Abstraction
“Neo-logicist” Logic is not Epistemically Innocent S. Shapiro & A. Weir
Aristotelian Logic, Axioms, and Abstraction R. T. Cook
Frege’s Unofficial Arithmetic A. Rayo
Part III Abstraction and the Continuum
Reals by Abstraction R. Hale
The State of the Economy: Neo-logicism and Inflation R. T. Cook
Frege Meets Dedekind: A Neo-logicist Treatment of Real Analysis S. Shapiro
Neo-Fregean Foundations for Real Analysis: Some Reflections on Frege’s Constraint C. Wright
Part IV Basic Law V and Set Theory
NewV, ZF and Abstraction S. Shapiro & A. Weir
Well- and Non-well-founded Extensions I. Jané & G. Uzquiano
Abstraction & Set Theory Bob Hale
Prolegomenon to Any Future Neo-logicist Set Theory: Abstraction and Indefinite Extensibility S. Shapiro
Neo-Fregeanism: An Embarrassment of Riches A. Weir
Iteration One More Time R. T. Cook
Foreword
In September 2000 the Arché Centre launched a five-year research project entitled the Logical and Metaphysical Foundations of Classical Mathematics. Its goal was to study the prospects, philosophical and technical, for abstractionist foundations for the classical mathematical theories of the natural, real and complex numbers and standard set theory. Funding was provided by the then Arts and Humanities Research Board (now the Arts and Humanities Research Council) for the appointment of full-time postdoctoral research fellows and PhD students to collaborate with more senior colleagues in the project, and at the same time the British Academy awarded the Centre additional resources to establish an International Network of scholars to be associated with the work. This was the beginning of the serial ‘Abstraction workshops’, of which the Centre had staged no fewer than eleven by December 2006. We gratefully acknowledge the generous support of the Academy and Council, sine qua non.
The project seminars and Network meetings generated, and continue to generate, a large number of leading-edge research papers on all aspects of the project agenda. The present volume is the first of what we hope will be a number of anthologies of these researches. With two exceptions (the contribution by the late George Boolos and the co-authored paper by Gabriel Uzquiano and Ignacio Jané), the papers that Roy Cook has collected in the present volume are all authored by sometime members of the project team or of the British Academy Network. Their broad focus, as he explains, is on some of the more technical issues thrown up by the Abstractionist project, and it is anticipated that subsequent volumes may have a more purely metaphysical or epistemological emphasis.
I would like to thank Roy Cook for all his hard work putting the volume together, and Bill Demopoulos for sponsoring its publication in the Western Ontario Series in Philosophy of Science.
Special thanks go to the members of the core team and the Network not just for their direct contributions to the researches of the project but for their continuing affirmation, by their active participation, of the wider interest and importance of
the neo-Fregean enterprise in the landscape of contemporary philosophy of mathematics.
CJGW
St. Andrews 6/07
The Logical and Metaphysical Foundations of Classical Mathematics
Sometime project team members: Crispin Wright, Peter Clark, Roy Cook, Philip Ebert, Bob Hale, Fraser MacBride, Paul McCallion, Darren McDonald, Nikolaj Jang Pedersen, Agustin Rayo, Marcus Rossberg, Andrea Sereni, Stewart Shapiro, Chiara Tabet, Robert Williams
Auditor: Kit Fine
British Academy International Network members: Alexander Bird, Robert Black, Robin Cameron, William Demopoulos, Richard Heck, Keith Hossack, Daniel Isaacson, John Mayberry, Michael Potter, Adam Rieger, Ian Rumfitt, Peter Simons, William Stirton, Peter Sullivan, Alan Weir
Notes on the Contributors
George Boolos was Professor of Philosophy at Massachusetts Institute of Technology, and the co-author of Computability and Logic (with Richard Jeffrey, Cambridge 2007) and the author of The Logic of Provability (Cambridge 1995).
Peter J. Clark is Professor of the Philosophy of Science and Head of the School of Philosophical and Anthropological Studies in the University of St Andrews. He works primarily in the philosophy of physical sciences and mathematics and was editor of the British Journal for the Philosophy of Science 1999–2005.
Roy Cook is Visiting Assistant Professor of Philosophy at Villanova University, and an associate fellow of Arché. He has published papers in the philosophy of language, logic, and mathematics, focusing primarily on semantic, soritical, and set-theoretic paradoxes, and Fregean and neo-Fregean philosophies of mathematics.
William Demopoulos is a member of the Department of Philosophy of the University of Western Ontario and the Department of Logic and Philosophy of Science of the University of California, Irvine. He has published articles in diverse fields in the philosophy of the exact sciences, and on the development of analytic philosophy in the twentieth century.
Bob Hale is Professor of Philosophy at the University of Sheffield, and an Associate Director of Arché. He works mainly on topics in the epistemology and metaphysics of mathematics and modality. His publications include Abstract Objects (Blackwell 1987), and, together with Crispin Wright, The Blackwell Companion to the Philosophy of Language (Blackwell 1997) and The Reason’s Proper Study: Essays towards a Neo-Fregean Philosophy of Mathematics (Oxford 2001).
Richard Heck is Professor of Philosophy at Brown University and an associate fellow of Arché. He has published extensively on historical, conceptual, and technical issues emerging from Frege’s philosophy of mathematics. Philosophy of language and philosophy of logic are his other main areas of interest. He is now working on a book on philosophy of language and another on the development of Frege’s mature philosophy (co-authored with Robert May).
Ignacio Jané is Professor of Philosophy in the Department of Logic and the History and Philosophy of Science of the University of Barcelona. His main interests are in the foundations of mathematics, philosophy of mathematics, and philosophy of logic. His recent papers include “Reflections on Skolem’s Relativity of Set-Theoretical Concepts” (Philosophia Mathematica, 2001), “Higher-Order Logic Reconsidered” (The Oxford Handbook of Philosophy of Mathematics and Logic, 2005), and “What is Tarski’s Common Concept of Consequence” (The Bulletin of Symbolic Logic, 2006).
Fraser MacBride is a Reader in the School of Philosophy at Birkbeck College, London. He previously taught in the Department of Logic & Metaphysics at the University of St. Andrews and was a research fellow at University College, London. He has written several articles on the philosophy of mathematics, metaphysics, and the history of analytic philosophy, and is the editor of Identity & Modality (Oxford, 2006) and The Foundations of Mathematics and Logic (special issue of The Philosophical Quarterly, vol. 54, no. 214, 2004).
Agustin Rayo is Associate Professor of Philosophy at MIT and an associate fellow of Arché. He works mainly in the philosophy of language and the philosophy of logic.
Stewart Shapiro is the O’Donnell Professor of Philosophy at The Ohio State University and a Professorial Fellow in the Research Centre Arché at the University of St. Andrews. His publications include Foundations without foundationalism: a case for second-order logic (Oxford, 1991), Philosophy of mathematics: structure and ontology (Oxford, 1997), and Vagueness in context (Oxford, 2006).
Gabriel Uzquiano is a Tutorial Fellow in Philosophy at Pembroke College and a CUF lecturer in Philosophy at the University of Oxford. He has published articles in metaphysics, philosophical logic, and the philosophy of mathematics.
Alan Weir is Professor of Philosophy, University of Glasgow. He has published articles on logic and philosophy of mathematics in a number of journals including Mind, Philosophia Mathematica, Notre Dame Journal of Formal Logic, and Grazer Philosophische Studien and contributed chapters to a number of volumes devoted to these areas.
Crispin Wright is Bishop Wardlaw Professor at the University of St Andrews, Global Distinguished Professor at New York University, and Director of the Research Centre, Arché. His writings in the philosophy of mathematics and logic include Wittgenstein on the Foundations of Mathematics (Harvard 1980); Frege’s Conception of Numbers as Objects (Aberdeen 1983); and, with Bob Hale, The Reason’s Proper Study (Oxford 2001). His most recent books, Rails to Infinity (Harvard 2001) and Saving the Differences (Harvard 2003), respectively collect his writings on central themes of Wittgenstein’s Philosophical Investigations and those further developing themes of his Truth and Objectivity.
Acknowledgements
The Editor wishes to thank the following:
Oxford University Press, Kluwer Academic Publishers, Analysis, The British Journal for the Philosophy of Science, The Journal of Philosophical Logic, The Journal of Symbolic Logic, The Notre Dame Journal of Formal Logic, Philosophia Mathematica, and Philosophical Books for permission to reprint the papers that follow. Detailed individual citations are included with the papers.
Crispin Wright, Director of Arché: Philosophical Research Centre for Logic, Language, Metaphysics, and Epistemology, for providing the foreword.
William Demopoulos for proposing, and securing, the publication of this work in the Western Ontario Series in the Philosophy of Science.
Charles Erkelens and Lucy Fleet at Springer for their guidance and encouragement.
The administrative staff at the Arché Centre in St Andrews (Gill Gardner, Sylvia Rescigno, and Sharon Coull) and at Villanova University (Elvia Beach and Terry DiMartino) for constant assistance in the practical aspects of preparing this volume.
Marguerite Nesling for converting a number of the papers from hardcopy to electronic format.
The Arts and Humanities Research Board (now the Arts and Humanities Research Council) for support in the form of a postdoctoral research fellowship, which was held by the editor during the initial stages of this volume.
Introduction
As noted in the foreword, the papers included in this volume concentrate (much of the time, at least) on philosophical questions that are intimately tied up with the interesting, and sometimes puzzling, mathematical properties of abstraction principles. As a result, the introduction you are about to read will follow suit – concentrating on philosophical issues that have their roots in the mathematical characteristics of abstraction principles as well as philosophical problems whose solution would seem to require a somewhat technical approach. This focus should not be read as any sort of value judgment regarding the worth of technical versus non-technical work on abstraction principles, or within the philosophy of mathematics more generally. Instead, this focus on philosophical problems that are linked to mathematical aspects of abstraction reflects the fact that there has, in the last decade or two, been an immense amount of valuable work on Fregean-inspired abstraction principles and their philosophical importance. To attempt to cover all of this work, or even all such work that has some connection to the Arché Centre, would require several volumes the size of the present one. Hence the narrower focus.
The volume is divided into four sections. The first contains papers of a general sort (which can also serve as a helpful introduction to the subject for those less familiar with the literature), although the majority of these nevertheless address distinctly technical issues, at least indirectly. The remaining three sections are devoted to three topics which have come under increasing study and scrutiny after the apparent success of the account of arithmetic based on abstraction. The second section (“The Logic of Abstraction”) contains three papers that examine the role of logic (in particular, higher-order logic) within the abstractionist framework. The third section (“Abstraction and the Continuum”) contains papers that attempt to extend the abstractionist account to the theory of the real numbers, as well as papers critically evaluating such attempts. The fourth and final section (“Basic Law V and Set Theory”) is devoted to attempts to reconstruct set theory (or something like it) within the abstractionist framework – usually by adopting some consistent variant of Frege’s original Basic Law V.
Even with our focus narrowed to the more technical aspects of abstraction principles, however, the range of topics and problems addressed in the papers
to follow is vast. Therefore, in the interest of providing a reasonably concise and easily digestible introduction to the subject, there will be no attempt to discuss every issue that arises in the following chapters. Instead, the remainder of this introduction will proceed as follows: First, a brief sketch of the origin of interest in abstraction principles, i.e. Frege’s logicism and its failure, will be provided. Second, we will briefly examine the philosophical framework underlying the resurrection of interest in abstraction principles, a view often called Neo-Fregeanism, Neo-Logicism, or Abstractionism. Next we will look at brief sketches of the philosophical and technical work underlying the abstractionist reconstructions of arithmetic, analysis, and set theory. Then we shall survey three general types of problem that such reconstructions face, and conclude with a brief discussion of indefinite extensibility, a notion that has become of central importance in much of the work attempting to solve problems of the sort covered in the previous sections. Before moving on, a comment needs to be made regarding terminology. As already noted, the philosophical view (or views) under discussion in the remainder of this volume have been called, at various times and places, Neo-Fregeanism, Neo-Logicism, and Abstractionism. In the remainder of this essay the term “Abstractionism” will be used. The reasons for this are simple: “Neo-Logicism” is misleading, since it would seem to imply that the view is a new version of logicism, while, as we shall see, it is no such thing. 
“Neo-Fregeanism”, while perhaps not misleading in this way, is, in the editor’s opinion, better reserved for the more general view in the philosophy of language, clearly Fregean in nature, that (usually) underlies the philosophy of mathematics discussed in the chapters to follow (although even here there is further confusion, since this term is also used to refer to a collection of views associated with the work of certain Oxford philosophers such as Gareth Evans and John McDowell). One could presumably hold such Fregean views regarding language without believing in the fundamental importance of abstraction principles (and vice versa – see Agustin Rayo’s “Frege’s Unofficial Arithmetic” [2002], reprinted as chapter 10 below). When reading the essays collected in this volume, however, one should keep in mind that these terms are (unfortunately) used for the most part interchangeably.
1.
Abstraction and logicism
An abstraction principle is any formula of the form:
(∀α)(∀β)(@(α) = @(β) ↔ E(α, β))
where “@” denotes a unary function mapping entities of the type ranged over by α (usually concepts, objects, or sequences of such) to objects, and “E( , )” is an equivalence relation on those same entities. The general idea behind abstraction principles is that they allow us to introduce new
terms (and thus presumably to gain privileged epistemological access to the corresponding objects) by defining the identity conditions for the referents of the novel terms using linguistic resources that are already understood (i.e. those resources occurring in the equivalence relation “E( , )” – in most cases “E( , )” is either a purely logical formula, or one composed of logic plus previously introduced abstraction operators). Thus, an abstraction principle is meant to act as an implicit definition of sorts, providing (so the story goes) an account of the meaning of novel terms of the form “@(α)”.
Perhaps the first notable abstraction principle occurs in Frege’s attempt at a logicist reconstruction of arithmetic (and, in fact, all of mathematics). Frege notes, in the Grundlagen [1974], that the standard (higher-order) Peano axioms for arithmetic follow from the abstraction principle now known as Hume’s Principle (the explicit derivation of the Peano axioms from Hume’s Principle was “extrapolated” from Frege’s comments in Crispin Wright [1983], George Boolos [1990a], Richard Heck [1993], and Boolos & Heck [1998], among others). Hume’s Principle (which Hume himself did not state, and whose name derives from a rather charitable reading of a comment in Hume’s Treatise) is the claim that, given two arbitrary concepts P and Q, the number of P’s is identical to the number of Q’s if and only if the P’s and the Q’s can be put in a one-to-one correspondence. More formally, we have:
HP : (∀P)(∀Q)(NUM(P) = NUM(Q) ↔ (P ≈ Q))
where P ≈ Q abbreviates the second-order formula stating that P and Q are equinumerous. We can formulate rather natural definitions of arithmetical notions such as ‘natural number’, ‘successor’, and ‘addition’ in terms of the numerical operator “NUM”.
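The general schema and Hume's Principle can be made concrete on a small finite domain. The following is only a sketch under loud assumptions: concepts are modelled as sets of objects, the names `concepts`, `equinumerous`, `make_abstraction` and the three-object domain are hypothetical, and the abstracts are arbitrary tokens living outside the toy domain. Hume's Principle proper requires numbers to be objects of the domain, which no finite domain can supply, so this illustrates only the shape of the biconditional.

```python
from itertools import combinations

def concepts(domain):
    """All concepts over a finite domain, modelled as frozensets of its objects."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def equinumerous(p, q):
    """P ≈ Q: pair off one object from each side until one side runs out."""
    p, q = set(p), set(q)
    while p and q:
        p.pop()
        q.pop()
    # A one-to-one correspondence exists exactly when both ran out together.
    return not p and not q

def make_abstraction(equiv):
    """Build an operator @ such that @(a) == @(b) exactly when equiv(a, b)."""
    representatives = []  # one stored item per equivalence class seen so far

    def abstract(item):
        for token, rep in enumerate(representatives):
            if equiv(rep, item):
                return token
        representatives.append(item)
        return len(representatives) - 1

    return abstract

DOMAIN = ("a", "b", "c")  # hypothetical toy domain
NUM = make_abstraction(equinumerous)

# Hume's Principle, instance by instance: NUM(P) = NUM(Q) iff P ≈ Q.
hp_holds = all((NUM(p) == NUM(q)) == equinumerous(p, q)
               for p in concepts(DOMAIN) for q in concepts(DOMAIN))
distinct_numbers = {NUM(p) for p in concepts(DOMAIN)}
print(hp_holds, len(distinct_numbers))
```

Note that `equinumerous` never counts: it only pairs objects off, which mirrors the way the right-hand side of the principle uses resources understood prior to the introduction of “NUM”.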
The fact that, given these definitions, the second-order Peano axioms for arithmetic follow from Hume’s Principle is quite notable as a mathematical theorem independent of any philosophical motivation, and the result has come to be called Frege’s Theorem (for a detailed examination of this result, and various streamlined versions of it, see Richard Heck’s “Finitude and Hume’s Principle” [1997a], reprinted as chapter 4 below). Frege, of course, wanted to reduce all of arithmetic to logic, thus defending (at least some of) mathematics from the Kantian charge of being synthetic a priori (logic, presumably, being analytic if anything is!). Thus, he rejected Hume’s Principle as the ultimate foundation for arithmetic, since it contains ineliminable occurrences of the cardinal number operator. (More famously, he also rejected Hume’s Principle, in its primitive form, since it was susceptible to the Caesar Problem. See section 7 of this introduction for further discussion of this issue.) As a result, Frege formulated a second abstraction principle, one that mapped each concept onto a unique object – its extension. Unlike Hume’s
Principle, Frege’s Basic Law V:
BLV : (∀P)(∀Q)(EXT(P) = EXT(Q) ↔ (∀x)(Px ↔ Qx))
contains only logical vocabulary (as long as talk of extensions is logical). Basic Law V was, in essence, an early attempt at formulating (in an a priori, logical manner) the foundations of what we would now call set theory. Using Basic Law V, Frege was able to reconstruct Peano Arithmetic on what appeared to be a purely logical basis. The first step was to define numbers to be certain sorts of extensions – the number of a concept P is the extension of the concept “(extension of a) concept equinumerous with P”, or, more formally:
NUM(P) =df EXT((∃Y)(x = EXT(Y) ∧ Y ≈ P))
Given this definition, Frege was able to prove Hume’s Principle (now a theorem of Frege’s logic, and not a primitive non-logical definition of cardinal number) and thus prove the second-order Peano axioms for arithmetic. So if Basic Law V was, as Frege hoped, a logical truth, then logicism (at least regarding arithmetic) would be demonstrated.
Since not all of us are convinced logicists, something must have gone wrong – something discovered by Bertrand Russell. In a letter dated June 16, 1902, Russell wrote to Frege, humbly pointing out that the crucial axiom (Basic Law V) that provided the power needed to reconstruct arithmetic within logic also seemed to allow for the derivation of a contradiction. Although Russell’s actual presentation of the paradox that now bears his name is a bit muddled in the original missive, the derivation of a contradiction from Basic Law V is well known, and need not be rehashed here. Frege attempted to fix the problem, but failed to find a convincing replacement for Basic Law V. Russell, meanwhile, along with Alfred North Whitehead, attempted his own reconstruction of mathematics from basic, a priori principles in the monumental Principia Mathematica [1910–13]. Although the Principia was (likely) consistent, in the long run it turned out to be no more convincing than Frege’s Grundgesetze.
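Readers who want to see the difficulty with Basic Law V at work can do so on a toy scale. The sketch below is an illustration rather than the formal derivation, and everything in it is an assumption made for the example: concepts are modelled as subsets of a two-object domain, `ext` is any assignment of an object to each concept, and the helper names are hypothetical. For every such assignment, the Russell-style concept (the objects that are the extension of some concept not holding of them) shares its extension with a different concept, so the left-to-right direction of Basic Law V fails.

```python
from itertools import combinations, product

def concepts(domain):
    """All concepts over a finite domain, modelled as frozensets of its objects."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def russell_concept(ext):
    """Objects that are the extension of some concept which does not hold of them."""
    return frozenset(ext[p] for p in ext if ext[p] not in p)

def blv_counterexample(ext):
    """Return concepts (P, R) with P != R but EXT(P) == EXT(R), or None."""
    r = russell_concept(ext)
    for p in ext:
        if p != r and ext[p] == ext[r]:
            return p, r
    return None

def all_assignments(domain):
    """Every way of assigning an object of the domain to each concept."""
    cs = concepts(domain)
    for values in product(domain, repeat=len(cs)):
        yield dict(zip(cs, values))

DOMAIN = ("a", "b")  # hypothetical toy domain
every_assignment_fails = all(blv_counterexample(ext) is not None
                             for ext in all_assignments(DOMAIN))
print(every_assignment_fails)
```

The exhaustive loop is feasible only because the domain is tiny; the reasoning behind `russell_concept` does not depend on finiteness, which is why the failure is a contradiction in Frege's system and not merely a shortage of objects.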
2.
Abstractionism
After Frege’s failed attempt at utilizing abstraction principles in a logicist framework, this sort of principle lay unstudied for three-quarters of a century. Interest in abstraction of this sort was rekindled, however, by the publication of Crispin Wright’s Frege’s Conception of Numbers as Objects [1983]. Wright noted (perhaps among others, see Parsons [1965] and Hodes [1984]), in essence, that Frege’s project consisted of four basic steps:
(1) Recognize Basic Law V as an axiom of logic.
(2) Formulate suitable definitions of numerical notions in terms of the extensions provided by Basic Law V.
(3) Derive Hume’s Principle.
(4) Derive arithmetic from Hume’s Principle (i.e. Frege’s Theorem).
Wright revived interest in Frege’s project, founding a new project that has come to be called, variously, Neo-Logicism, Neo-Fregeanism, and Abstractionism, and replacing the above blueprint with the following alternate plan:
(1) Lay down Hume’s Principle as an implicit definition of cardinal number.
(2) Derive arithmetic from Hume’s Principle (i.e. Frege’s Theorem).
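The second step of this plan rests on Frege-style definitions of zero and of the predecessor relation in terms of “NUM”. A minimal sketch of those definitions over a finite toy domain follows. It is an assumption-laden illustration, not a rendering of Frege's Theorem: the domain, the token-valued `NUM`, and the helper names are all hypothetical, and because the domain is finite the largest number has no successor, which is precisely what cannot happen once numbers are themselves objects in the domain.

```python
from itertools import combinations

def concepts(domain):
    """All concepts over a finite domain, modelled as frozensets of its objects."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def equinumerous(p, q):
    """P ≈ Q, decided by pairing objects off without counting them."""
    p, q = set(p), set(q)
    while p and q:
        p.pop()
        q.pop()
    return not p and not q

def make_num():
    """A token-valued NUM satisfying Hume's Principle on the concepts it sees."""
    reps = []

    def num(p):
        for token, rep in enumerate(reps):
            if equinumerous(rep, p):
                return token
        reps.append(p)
        return len(reps) - 1

    return num

DOMAIN = ("a", "b", "c")  # hypothetical toy domain
CONCEPTS = concepts(DOMAIN)
NUM = make_num()
ZERO = NUM(frozenset())  # the number of the empty concept
NUMBERS = sorted({NUM(p) for p in CONCEPTS})

def precedes(m, n):
    """Some concept F and object y under F have n = NUM(F), m = NUM(F minus y)."""
    return any(n == NUM(f) and m == NUM(f - {y})
               for f in CONCEPTS for y in f)

def natural_numbers():
    """Everything reachable from ZERO along the predecessor relation."""
    reached, frontier = {ZERO}, [ZERO]
    while frontier:
        m = frontier.pop()
        for n in NUMBERS:
            if precedes(m, n) and n not in reached:
                reached.add(n)
                frontier.append(n)
    return reached

pairs = [(m, n) for m in NUMBERS for n in NUMBERS if precedes(m, n)]
zero_has_no_predecessor = not any(precedes(m, ZERO) for m in NUMBERS)
predecessor_is_one_one = all((m1 == m2) == (n1 == n2)
                             for m1, n1 in pairs for m2, n2 in pairs)
print(zero_has_no_predecessor, predecessor_is_one_one,
      natural_numbers() == set(NUMBERS))
```

The three checks correspond to familiar Peano-style facts: zero is not a successor, the predecessor relation is one-to-one, and every number in the model is reachable from zero.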
Of course, as was already noted, such a view (misleading nomenclature such as “Neo-Logicism” notwithstanding) does not deserve the title ‘logicism’, at least not in the traditional sense of the word as used by Frege, his fans, and his critics. Hume’s Principle, with its primitive and ineliminable occurrences of arithmetical terms (i.e. “NUM”), just does not have the character of a logical law or theorem (a point made strenuously and convincingly by George Boolos in “Is Hume’s Principle Analytic?”, the essay that opens this volume). Thus, abstractionists have had to look elsewhere for their defense of Hume’s Principle as something suitably basic to provide the foundations of arithmetic. The answer, according to abstractionists, is to note that what is important about logicism is not so much the reduction of mathematics to logic, but rather the fact that this reduction (had it been successful) would have gone a long way towards providing an account of certain aspects of mathematical knowledge that, ideally, we would like to be able to explain. In particular, the true advantages of logicism were that it purported to explain the a priori character of mathematical knowledge (assuming that the a priori character of purely logical knowledge is unproblematic) and it purported to explain the analyticity of mathematical truths (at least, this is important for those non-Quineans who retain a fondness for the analytic/synthetic distinction in the first place). The solution, then, is to retain these goals, while widening the scope of our means for achieving these goals to something more than pure logic. Along these lines, Wright and those who follow him deny that Hume’s Principle is a logical truth. Instead, Hume’s Principle, so it is argued, is (or is something like) an implicit definition of the “NUM” operator – one that explains the meaning of statements of identity of cardinal numbers.
Since Hume’s Principle is a definition, we can come to know its consequences a priori in the same manner (or, at least, in a suitably similar manner) in which we obtain a priori knowledge of the consequences of more pedestrian definitions. Frege’s Theorem ensures that all of second-order Peano arithmetic follows from Hume’s Principle plus standard second-order logic, so (since presumably second-order logic is a priori knowable and second-order consequence preserves a priori knowledge) it follows that we can, using the abstractionist recipe, obtain a priori knowledge of all of second-order arithmetic. (The question of analyticity is strictly speaking separate from that of aprioricity, and
has been less of a focus for the abstractionists than it was for Frege himself, although Bob Hale has rekindled interest in this issue.) There are, unsurprisingly, deep questions regarding how Hume’s Principle and Frege’s Theorem accomplish this epistemological feat. In particular, there are deep worries regarding the connection between our reconstruction of arithmetic within the abstractionist framework and actual arithmetic practice: How do we know that the knowledge gained from Frege’s Theorem is, in fact, knowledge about the ordinary natural numbers (and not some isomorphic surrogate)? And how do we determine whether our (supposed) a priori knowledge of the former allows for an explanation of the a priori status of everyday mathematics? William Demopoulos’ “On the Philosophical Interest of Frege Arithmetic” [2003] (reprinted below as chapter 7) develops a sustained examination and critique of this aspect of the project (although the reader is encouraged to consult Fraser MacBride’s two contributions to this volume as well). Of course, even if the view in question is not, really, a version of logicism, the above sketch makes it clear that logic plays a crucial role in the abstractionist account of mathematical truth and mathematical knowledge. Defending the claim that second-order logic preserves the relevant epistemological properties is one outstanding lacuna in the abstractionist literature, although it is not one they are unaware of. The most sustained discussion of the issues is to be found in Stewart Shapiro and Alan Weir’s “Neo-Logicist Logic Is Not Epistemically Innocent” [2000], reprinted below as chapter 8. 
Setting the role of logic aside, however, there is much of interest to be said regarding: (a) the notion of implicit definition required for such an abstractionist project, (b) the more general abstractionist accounts of meaning and reference which might allow for such implicit definitions to succeed, and (c) the metaphysical account of abstract objects that would allow for our epistemological access to them to proceed via such stipulations. Although all of these and more are touched on (and often discussed in some depth) in the essays that follow, they are not the primary focus of this volume or the papers included in it. Instead, we are here interested in those philosophical problems that stem from mathematical issues arising within the abstractionist project. Thus, we shall move on to examine those aspects of abstractionism that are of a more technical nature (the reader interested in more straightforwardly philosophical aspects of the abstractionist project, such as the topics mentioned at the top of this paragraph, can do no better than to consult Fraser MacBride’s “Speaking with Shadows: A Study of Neo-logicism” [2003], although the first two essays in this volume, both titled “Is Hume’s Principle Analytic?”, by George Boolos [1997] and Crispin Wright [1999], also provide much useful philosophical background material). Arithmetic, it would seem, is, in one sense, the big success story of abstractionism, since the technical results, at least, seem to be for the most part settled – all that remains is sorting out the philosophical problems and issues
that result from this abstractionist reconstruction. As we shift our attention to abstractionist accounts of other mathematical theories, however, we shall see that things are not always so successful even within the purely technical aspects of the project. Thus, our next task is to quickly survey the extension of this project to set theory and real analysis.
3.
Abstractionist real numbers
One of the two obvious test cases for extending any philosophy of mathematics past an initial account of the natural numbers is to attempt to reconstruct the continuum (the other test case is to provide an adequate account of sets or something like them, the subject of the next section). Abstractionism is no exception here, and it did not take long for both believers and critics to wonder what shape an abstractionist account of real analysis might take. Although various accounts differ in the details (and this difference tends to depend on varying attitudes towards Frege’s Constraint, see below), Bob Hale’s initial reconstruction (as found in his “Reals By Abstraction” [2000], reprinted as chapter 11 below) and those that follow are similar to the following, at least from a mathematical perspective. The first step in an abstractionist account of the real numbers is to note that we are already provided with the natural numbers via Hume’s Principle. We can obtain the integers from these by adding an additional abstraction principle to our theory – something like the following Difference Abstraction Principle (the universal quantifiers here are restricted to natural numbers and the arithmetical operators on the right-hand side of the biconditional are the standard operations on the natural numbers): DAP : (∀x)(∀y)(∀z)(∀w)(DIFF(x, y) = DIFF(z, w) ↔ x + w = y + z) This principle provides us with an object corresponding to the difference between two natural numbers – in other words, DAP provides us with (a priori access to) the integers. With the integers in hand, we can obtain the rational numbers from these by adding another abstraction principle to our theory – the following Quotient Abstraction Principle will do the trick (here, the initial universal quantifiers are restricted to integers, i.e. 
objects that are in the range of the DIFF operator, and the arithmetic operators on the right-hand side of the biconditional are the standard operations on the integers – these are definable in terms of DIFF and second-order logic): QAP : (∀x)(∀y)(∀z)(∀w)(QUO(x, y) = QUO(z, w) ↔ ((y = 0 ∧ w = 0) ∨ (y ≠ 0 ∧ w ≠ 0 ∧ x × w = y × z))) Note that the Quotient Abstraction Principle provides us, not only with the rational numbers, but also with an extra, ‘bad’ object: QUO(a, 0) for any integer a. This object results from the fact that we assume that our abstraction
operators are total functions, and thus certain unintended instances (such as division by zero in the present case) nevertheless result in abstracts. The presence and role of ‘bad’ objects will be discussed in section 6 of this introduction. Now that we have the rationals, we can obtain the reals by applying an abstraction principle that simulates Dedekind-style cuts on the rationals, such as the following Cut Abstraction Principle (here the universal quantifiers are restricted to non-empty bounded concepts holding only of non-‘bad’ rationals, i.e. objects in the range of the QUO operator other than the ‘bad’ object): CAP : (∀P)(∀Q)(REAL(P) = REAL(Q) ↔ (∀x)((∀y)(P(y) → y < x) ↔ (∀y)(Q(y) → y < x))) It is possible (although non-trivial) to prove that the objects provided by CAP form a complete ordered field, i.e. that they are isomorphic to the standard classical continuum (see Shapiro’s “Frege Meets Dedekind: A Neologicist Treatment of Real Analysis” [2000], reprinted as chapter 14 below, for details). Thus, the abstractionist position can account for not only the natural numbers, but the classical theory of the real numbers as well (from a technical perspective, at least). As a technical note, there seems to be no reason why, at this last step, we could not have applied, instead of CAP which encodes Dedekind’s notion of cut within the abstractionist framework, an abstraction principle that, when applied to sequences of rationals, provides the real numbers along the lines of Cauchy’s methodology. There is no formal reason why we could not formulate such a principle (e.g. let the abstraction principle in question map functions from the naturals to the rationals onto objects). The possibility of such alternate constructions raises a host of philosophical issues, however. 
Not least among them are the following questions: If it turns out that both CAP and an appropriate Cauchy-sequence principle are legitimate abstraction principles, then how are we to determine whether they provide us with access to the same objects? If not, then which one delivers the genuine real numbers (as opposed to merely an isomorphic copy)? Such questions are intimately tied up both with Frege’s Constraint and with the Caesar Problem, both of which will be discussed below.
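The equivalence conditions on the right-hand sides of DAP and QAP can be checked mechanically on small finite ranges. The following is only a minimal sketch, not part of the abstractionist construction itself: it assumes Python's built-in integers and `fractions.Fraction` as stand-ins for the abstracts, and the names `diff`, `quo`, and `BAD` are hypothetical labels introduced purely for illustration.

```python
from fractions import Fraction
from itertools import product

BAD = "BAD"  # hypothetical sentinel for the single 'bad' abstract QUO(a, 0)

def diff(x, y):
    """Stand-in for DIFF(x, y) on naturals: the integer x - y."""
    return x - y

def quo(x, y):
    """Stand-in for QUO(x, y) on integers; every pair with y == 0 gets BAD."""
    return BAD if y == 0 else Fraction(x, y)

def dap_holds(bound):
    """Check DIFF(x, y) = DIFF(z, w) <-> x + w = y + z for naturals below bound."""
    return all((diff(x, y) == diff(z, w)) == (x + w == y + z)
               for x, y, z, w in product(range(bound), repeat=4))

def qap_holds(bound):
    """Check the biconditional in QAP for integers in [-bound, bound]."""
    ints = range(-bound, bound + 1)
    return all(
        (quo(x, y) == quo(z, w))
        == ((y == 0 and w == 0) or (y != 0 and w != 0 and x * w == y * z))
        for x, y, z, w in product(ints, repeat=4)
    )

print(dap_holds(6), qap_holds(3))
```

Both checks succeed on these ranges. In particular, every pair with a zero second component is sent to the one sentinel value, which mirrors the way QAP yields a single ‘bad’ object rather than one per numerator.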
4. Abstractionist sets
The second natural extension of a foundational account of mathematics is to produce some account of set theory (or, at the very least, to provide some other theory that can do the work for which we normally invoke the theory of sets). The most notable attempt to provide such an account within the abstractionist framework is due to George Boolos, one of the most outspoken critics of the abstractionist view itself.
In “Iteration Again” [1989], Boolos compared and contrasted the iterative and limitation-of-size conceptions of sets. The former proposes to solve the problem posed by Russell’s paradox by claiming that sets must be formed in an infinitary step-by-step process, while the latter avoids paradoxes by claiming that only collections that are (in some sense) not too ‘big’ determine sets. One version of the limitation-of-size conception (the one Boolos used) can be formulated by defining ‘X is too big’ as ‘there is a bijection between X and the entire domain’. Boolos formulated an abstractionist version of the limitation-of-size conception of set along these lines. Letting “Big(P)” abbreviate the second-order formula asserting that there is an onto function from P to the entire domain, Boolos’ abstraction principle for extensions, called New V, is: New V : (∀P)(∀Q)(EXT(P) = EXT(Q) ↔ ((∀x)(Px ↔ Qx) ∨ (Big(P) ∧ Big(Q)))) New V provides a distinct object (an extension, or, more loosely, set) for each collection of objects provided that collection is smaller than the entire domain – concepts that hold of as many objects as there are in the domain, however, all receive the same abstract, the ‘Bad’ object (again, see below for discussion of ‘bad’ objects). Given New V, we can define a set to be the extension of a small concept: Set(x) =df (∃P)(x = EXT(P) ∧ ¬Big(P)) One object is the member of another object if and only if the second object is the extension of a concept which holds of the first object, or, in symbols: x ∈ y =df (∃P)(y = EXT(P) ∧ P(x)) (Note that ‘Bad’ objects can have, and be, members.) Given these definitions, New V entails many of the standard set theoretic axioms – extensionality, empty set, pairing, separation, replacement, and choice all follow (the union axiom does not follow on the above definitions, since the union of the singleton of the ‘bad’ object is not a set. 
Slight reformulations of this axiom do follow, however – for details see the chapters in section IV of this volume). In addition, the axiom of foundation holds if restricted to the pure sets (i.e. those sets that can be ‘built up’ from the empty set – see Gabriel Uzquiano and Ignacio Jané’s “Well- and Non-Well-Founded Extensions” [2004], reprinted as chapter 16 of this volume, for an in-depth examination of non-well-founded sets within the abstractionist framework). Thus, the only axioms that fail to follow, in some sense or another, are the powerset axiom and the axiom of infinity. It is easy to see why the axiom of infinity fails – if we take as our domain the hereditarily finite sets built from a single urelement (to serve as the ‘bad’ object), then the resulting model satisfies New V (since all ‘small’, i.e. finite, concepts receive extensions – the corresponding sets – while all ‘big’, i.e. infinite, concepts can be mapped onto our single urelement). In other words,
although New V entails that there must be infinitely many objects, it does not entail that there need be any non-‘Bad’ concept that holds of infinitely many objects. The proof that powerset fails is non-trivial, however – readers are encouraged to consult chapter 15 of this volume, “New V, ZF, and Abstraction” [1999] by Stewart Shapiro and Alan Weir, for the technical details. Given that New V does not allow us to reconstruct all of standard Zermelo–Fraenkel set theory, work has been done exploring other abstractionist routes to set theory. Among these is Roy T. Cook’s “Iteration One More Time” [2004], which formulates an abstractionist version of the iterative conception of set based on an abstraction principle called Newer V. Newer V entails the extensionality, empty set, pairing, separation, powerset, and choice axioms, but implies neither the axiom of infinity nor the replacement axiom. Other approaches include Bob Hale’s “Abstraction and Set Theory” [2000], which formulates an alternative version of the limitation of size conception, and Stewart Shapiro’s “Prolegomenon to Any Future Neo-Logicist Set Theory: Abstraction and Indefinite Extensibility” [2003], which examines the general conditions under which a restricted version of Basic Law V (such as New V) will entail various set-theoretic principles (these three papers are reprinted below as chapters 20, 17, and 18 respectively). Although one could debate how much we should worry about abstraction principles failing to imply powerset or replacement, there is no ignoring the fact that the failure of natural abstractionist accounts of set theory to provide a proof of the axiom of infinity is just that, a failure. Presumably, a successful defense of abstractionism will require a development of a set theory (or surrogate for it) that is stronger than any of the existing proposals, since any set theory which fails to guarantee the existence of any infinite sets is unlikely to be adequate to our needs. 
To be fair, there are abstraction principles that imply all the axioms of second-order Zermelo–Fraenkel set theory (Alan Weir considers such principles in his “Neo-Fregeanism: An Embarrassment of Riches” [2004], reprinted here as chapter 19). Unlike New V or even Newer V, however, these principles do not seem to codify plausible ‘definitions’ of the notion of set or collection – the sort of conception that could underlie successful a priori introduction of the notion of set or extension into our discourse. Instead, these principles seem tailor-made to provide all of the set-theoretic axioms, and it is unlikely that anyone could have conceived of them without extensive prior knowledge of advanced set-theoretic methods (e.g. formulation of typical ‘distractions’ requires an understanding of notions such as strongly inaccessible cardinal). Thus, unlike the case of arithmetic and real analysis, there seems to be much more work needed of a purely technical nature before the abstractionist can make any claim to have explicated the aprioricity and analyticity of set theory. The main technical problem is to find an appropriate abstraction principle for extensions that is satisfied only on uncountable domains of the right sort
(presumably, something like an inaccessible rank). As of the time of writing this introduction, there does not seem to be any plausible abstraction principle that will do the job, although there is interesting work leading in this direction (e.g. see Shapiro’s “Prolegomenon to Any Future Neo-Logicist Set Theory. . . ” and the later sections of Cook’s “Iteration One More Time”).
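The definitions of Set and ∈ given in this section, together with the remark that the union of the singleton of the ‘Bad’ object is not a set, can be illustrated in a toy setting. The sketch below is an illustration under stated assumptions rather than a model construction: it assumes the hereditarily finite model described above, represents extensions of finite concepts as Python `frozenset` values, and uses a hypothetical sentinel `BAD` for the shared extension of all ‘Big’ concepts. It cannot enumerate the infinite domain, so it only evaluates the definitions on a few sample objects.

```python
BAD = "BAD"  # hypothetical sentinel: the extension shared by all 'Big' concepts

def ext_small(members):
    """Extension of a small (here: finite) concept, given by listing its instances."""
    return frozenset(members)

def is_set(obj):
    """Set(x): x is the extension of a small concept."""
    return isinstance(obj, frozenset)

def member(x, y):
    """x ∈ y: y = EXT(P) for some concept P holding of x."""
    if y == BAD:
        # BAD = EXT(x = x), and the universal concept holds of everything.
        return True
    return is_set(y) and x in y

empty = ext_small([])
singleton_bad = ext_small([BAD])

def in_union_of_singleton_bad(x):
    """x falls under the union of {BAD} iff x is a member of some member of {BAD}."""
    return any(member(x, y) for y in singleton_bad)

samples = [BAD, empty, singleton_bad, ext_small([empty])]
print(is_set(singleton_bad))                               # the singleton of BAD is a set
print(all(in_union_of_singleton_bad(x) for x in samples))  # its union holds of every sample
```

Since the concept ‘is a member of a member of the singleton of the ‘Bad’ object’ holds of every object sampled (and, by the definition of membership, of every object whatsoever), it is coextensive with the universal concept. It is therefore ‘Big’, and its extension is the ‘Bad’ object rather than a set.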
5. The first problem: too many abstraction principles
The first general problem plaguing the abstractionist project is that there seem to be too many abstraction principles. What is required, and what we, at present, fail to have, is some general criterion for distinguishing between acceptable and unacceptable abstraction principles. Clearly, Basic Law V, being inconsistent, is on the unacceptable side of the field, while Hume’s Principle, the pride and joy of abstractionism, is (it is hoped) on the acceptable side (if not, then presumably some suitable modification of it is, such as Finite Hume, discussed in the next section). The problem, however, is that mere consistency is not enough for acceptability, and as a result, we need some further guide to distinguishing the good from the bad. The initial formulation of this problem is (as is almost always the case in these debates) due to George Boolos (in Boolos [1990a]), who pointed out that there are abstraction principles that are consistent, but which are nevertheless incompatible with Hume’s Principle (or, in fact, with any abstraction principle guaranteeing the existence of infinitely many objects). Assuming that Hume’s Principle is acceptable if anything is, it follows that inconsistency, while sufficient for rejecting an abstraction principle as unacceptable, is not necessary. Crispin Wright [1997] provided perhaps the best-known example of such an abstraction principle: his aptly-named Nuisance Principle (here FSD(P,Q) abbreviates the second-order formula asserting that the symmetric difference of P and Q, that is, the collection of objects that are either P-and-not-Q or Q-and-not-P, is finite): NP : (∀P)(∀Q)[NUI(P) = NUI(Q) ↔ FSD(P, Q)] The Nuisance Principle can be satisfied on domains of any finite cardinality (in which case all concepts receive the same nuisance), but can be satisfied on no infinite domain. Thus, the Nuisance Principle, although consistent, is as unacceptable an abstraction principle as is Basic Law V. 
The reason for the unacceptability is different, however. At first glance, the Nuisance Principle appears to derive its unacceptability, not solely in terms of its own formal properties, but rather in terms of its interaction with other principles (such as Hume’s Principle). The existence both of inconsistent abstraction principles, and of pairs of individually consistent but incompatible abstraction principles, has given rise to a collection of problems that have been labeled The Bad Company Objection. Chief amongst the concerns falling under this heading are:
(1) The Existential Challenge: Given the existence of problematic principles of the same general form as Hume’s Principle (such as Basic Law V and the Nuisance Principle), what reason do we have for thinking that there are any good abstraction principles (including Hume) which have the privileged status the abstractionist claims for them?
(2) The Epistemological Challenge: Even if one is convinced that there are good abstraction principles that can play a foundational role such as the one envisioned for Hume’s Principle, in general how do we tell the good principles from the bad?
The epistemological challenge, although clearly important, will be sidestepped here, since we are interested in those problems that are intimately connected to the mathematics of abstractionism. The existential challenge, however, is, or at least can be easily approached as, a logical/mathematical issue – i.e. what proof- or model-theoretic features will guarantee that an abstraction principle is acceptable? One common response to this version of the Bad Company Objection (one first put forward by Wright’s “Is Hume’s Principle Analytic?” [1999] and finessed by Shapiro and Weir in “New V, ZF, and Abstraction” [1999], both reprinted below) is to require that an abstraction principle be conservative in a certain sense. The intuitive philosophical idea is this: An acceptable abstraction principle is meant to be a definition of the abstracts that it introduces, but it is also meant to be no more than this. As a result, the principle in question should have no substantial consequences for those objects in the domain that are not abstracts. Put simply, Hume’s Principle might entail all sorts of interesting claims about numbers, and even interesting claims regarding the numbers corresponding to certain collections of cats, but Hume’s Principle should not imply any substantial non-numerical claim about cats (a numerical claim would be one containing at least one occurrence of the NUM operator). Hume’s Principle can be proven to be conservative, as we would expect. On the other hand, the Nuisance Principle turns out to be non-conservative, as we would hope (since it entails, for example, that there must be only finitely many cats). So at first glance the conservativeness constraint would seem to be doing the job that it was designed to do. There are problems, of course. For one, New V, the most promising abstractionist reconstruction of set theory so far (even if far from fully satisfactory), is non-conservative. 
The technical details can be found in Shapiro and Weir’s [1999] paper, but the informal idea is easy to grasp. Within the language of New V we can define the ordinals in the usual way – ordinals are just transitive pure sets, well-ordered by membership. By the familiar reasoning of the Burali-Forti paradox, we can conclude as usual that there is no set of all ordinals. Within the context of New V, however, this means that the collection of ordinals is ‘Big’ – i.e. there is an onto function from the ordinals to the entire universe. But, since the ordinals are well-ordered by membership, this
imposes a well-ordering on the entire universe. So New V is not conservative, since it implies that the universe can be well-ordered (and, within second-order logic, we can express this claim using no set-theoretic terminology). It is worth noting that Cook’s [2004] iterative variant of abstractionist set theory fares no better on this score. The problems with the conservativeness requirement do not stop with the fact that it would seem to rule out principles (such as New V) that we might otherwise have wished to be acceptable. In addition, it turns out, as Alan Weir shows in his “Neo-Fregeanism: An Embarrassment of Riches” [2004], that there are consistent yet incompatible abstraction principles that pass the conservativeness constraint. Weir calls such principles distractions, and he shows, further, that if we try to strengthen the conservativeness constraint in various natural ways in order to avoid such pairs of distractions, analogous problems arise in the meta-theory. There are a number of other criteria that have been proposed for narrowing down the list of potentially good abstraction principles. One suggestion is that the equivalence relation on the right-hand side of the biconditional accurately reflect the mathematical content mastered when we actually first learn the mathematical theory in question. In other words, the criterion for identity of the mathematical objects in question, provided by the abstraction principle, should clearly reflect the criterion by which we actually learned to identify and distinguish the objects in question. (The various abstractionist set theories are of particular relevance here, since there does not seem to be one single notion of set underlying our mathematical practice, but a number of competing notions, which are reflected in the competing reconstructions such as New V and Newer V.) 
Critics of this approach, however, have suggested that such requirements confuse something like the order of discovery with the order of explanation (e.g. see MacBride’s “On Finite Hume” [2002] and “Could Nothing Matter” [2003], reprinted as chapters 5 and 6 below). According to this line of thought, abstraction principles are intended to provide a story about how we might come to know mathematical truths a priori, but there is no reason to think that the actual route that we took in first coming to know these truths is necessarily anything like the privileged route provided by the abstraction (since the initial knowledge could even have been a posteriori!).
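The cardinality facts behind the Bad Company examples of this section can be checked by brute force on small finite domains. On a finite domain with n objects, a principle of the form @(P) = @(Q) ↔ E(P, Q) has a model exactly when the equivalence classes of concepts under E can be mapped one-to-one into the n objects. The sketch below counts those classes; the function names are hypothetical, and concepts are represented (as an assumption of the sketch) by subsets of {0, ..., n-1}. It says nothing about infinite domains, where the Nuisance Principle fails for reasons a finite search cannot exhibit.

```python
from itertools import combinations

def concepts(n):
    """All concepts (subsets) of the domain {0, ..., n-1}."""
    domain = range(n)
    return [frozenset(c) for k in range(n + 1) for c in combinations(domain, k)]

def count_classes(n, equivalent):
    """Number of equivalence classes of concepts under `equivalent`."""
    representatives = []
    for p in concepts(n):
        if not any(equivalent(p, q) for q in representatives):
            representatives.append(p)
    return len(representatives)

def satisfiable_on(n, equivalent):
    """True when the classes can be mapped one-to-one into the n objects."""
    return count_classes(n, equivalent) <= n

def equinumerous(p, q):
    """Right-hand side of Hume's Principle, restricted to a finite domain."""
    return len(p) == len(q)

def finite_sym_diff(p, q):
    """FSD(P, Q): on a finite domain every symmetric difference is finite."""
    return len(p ^ q) < float("inf")

for n in range(1, 6):
    print(n, satisfiable_on(n, equinumerous), satisfiable_on(n, finite_sym_diff))
```

For each domain size tried, Hume’s Principle turns out to need n + 1 abstracts (one per cardinality from 0 to n) and so has no model of that size, while the Nuisance Principle needs only one abstract and is satisfiable. This is the finite half of the incompatibility described above.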
6. The second problem: Too many objects
One of the main (supposed) advantages of abstractionism is that abstraction principles imply the existence of more objects than we would expect from logic and definitions alone. Some (including, of course, Boolos) have objected to this, on the grounds that logic (or analytic statements, or a priori knowledge more generally) should not imply the existence of all (or most) of the objects studied by working mathematicians:
. . . It was a central tenet of logical positivism that the truths of mathematics were analytic. Positivism was dead by 1960 and the more traditional view, that analytic truths cannot entail the existence either of particular objects or of too many objects, has held sway ever since. (Boolos [1997], pp. 249–250)
Nevertheless, abstractionism is hopeless without the assumption that at least some existential claims are analytic, or a priori knowable, or something similar – the position in question is (on one reading) nothing more than a detailed philosophical account of how such is possible. So for our purposes here we will ignore such general worries regarding ontological excess. This ontological success seems to come at a price, however, since the very abstraction principles (or, sometimes, natural generalizations of them) that provide us with the ontology of standard mathematics have a tendency to imply the existence of more objects than are strictly needed for the reconstruction of the mathematical theory in question. Unfortunately, these additional objects are often unwanted or inconvenient. The first such unwanted object is ‘anti-zero’. Hume’s Principle implies that, in addition to the countable infinity of finite numbers, at least one other number exists, namely the number of the universal concept denoted by “x = x” – this ‘number’ is anti-zero. The standard account of cardinal numbers as developed in ZFC implies that there is no largest cardinal number, however. Thus, as was first pointed out by George Boolos, the theory of cardinals derived from Hume’s Principle seems to contradict the spirit, if not the letter, of the standard theory of cardinality as derived in Zermelo–Fraenkel set theory, where there can be no cardinal number of all objects. It is important to note that there is no formal contradiction here. One can easily construct a model which satisfies both Hume’s Principle and the (second-order) axioms of Zermelo–Fraenkel set theory – just take any set-theoretic model of second-order ZFC, and interpret the numerical operator in Hume’s Principle as mapping each concept onto the appropriate ZFC cardinal, if the concept’s extension is set-sized, and mapping all other concepts onto some other object. 
Boolos’ point, rather, must be that there is no model of Hume’s Principle plus second-order ZFC where the ZFC cardinal numbers are exactly the cardinal numbers as defined by Hume’s Principle (under the same ordering). There are a number of obvious moves one could make here, although each has its problems. Among them are:
(a) We might deny that ZFC provides an account of all the cardinal numbers, arguing instead that through this means we only get a model of the cardinal numbers corresponding to set-sized concepts (while Hume’s Principle provides us with a theory of all the cardinal numbers). While attractive, this option seems to challenge the idea that set theory (however it is formalized) can play the foundational role traditionally ascribed to it (a role that abstractionists presumably would prefer it to retain, hence the interest in set-theoretic abstraction principles such as New V).
(b) We might adopt a (positive) free logic, so that some instances of the numerical operator fail to designate objects (such as the instance that purports to refer to anti-zero). This strategy, however, seems open to two problems. First, given the abstractionist’s rather lenient criteria for when a term refers (that it occur in a true statement of the appropriate sort), this response seems somewhat ad hoc. Second, if abstractionists make it part of their official view that some numerical terms can fail to refer, then this allows the critic of abstraction to ask why it is not possible that all numerical terms fail to refer (or to argue that since some numerical terms fail to refer, then it seems unlikely that we can know a priori that other numerical terms do refer). For a detailed discussion of free logic within the abstractionist context, see Shapiro and Weir’s “Neo-Logicist Logic Is Not Epistemically Innocent” [2000], reprinted below.
A final strategy, however, is to replace Hume’s Principle with some suitably modified version, such as Finite Hume (here “Inf(P)” abbreviates the second-order claim that there are infinitely many P’s): FHP : (∀P)(∀Q)(NUM(P) = NUM(Q) ↔ ((P ≈ Q) ∨ (Inf(P) ∧ Inf(Q)))) Finite Hume’s Principle provides a cardinal number for each finite concept, but maps any concept with an infinite number of instances onto the same, ‘Bad’ object (assuming that the logic is not free). Frege’s Theorem still holds for Finite Hume’s Principle (since the finite cardinals, i.e. the natural numbers, behave just as they do in the case of Hume’s Principle). There is no largest cardinal number, however, since the ‘Bad’ object that is the value “NUM” assigns to any infinite concept cannot be interpreted coherently as a number at all (to see this, it is enough to note that Finite Hume maps concepts of differing cardinalities onto the ‘Bad’ object in any uncountable model). 
So Finite Hume does not entail the existence of any strange cardinal numbers, such as anti-zero. It does, however, provide us with a generic ‘Bad’ object, just as the Quotient Principle QAP and New V were seen to do earlier. While such an additional object does not, like anti-zero, seem to violate any intuitions regarding the order-type of the cardinal numbers, it does bring with it different, yet equally serious, problems of its own. In examining such ‘Bad’ objects, however, let us return our attention to the ‘Bad’ object provided by New V and similar extension-forming principles, as it is this object that has attracted the most attention in the literature. Now, all existing abstraction principles that purport to provide us with something like extensions or sets also provide at least one unwanted, ‘Bad’ object – in fact, if the extension-forming operator EXT is a total function then they must, since the claim that each concept receives a unique extension is contradictory. Typically (as is the case with New V) all of the concepts which are too ‘badly behaved’ to determine sets get mapped on to a single object, the ‘Bad’ extension, and in the case of New V, the ‘Bad’ object is the extension of any concept that is equinumerous to the entire domain.
Now, unlike anti-zero, where the jury is still out regarding whether or not it is a genuine number, the ‘Bad’ extension is clearly not a genuine extension or set at all. It is merely an artifact of the particular abstractionist means for obtaining the things that we do want, i.e. the other extensions. Since treating the ‘Bad’ extension as some novel, until now unrecognized, yet real set seems implausible, the other option would be to treat it exactly as just described – as an artifact of the fact that we are treating our abstraction operators as defining total functions. The first problem is, of course, the seeming unavoidability of such ‘Bad’ objects in the first place. Why should our account of set theory (or rational numbers, or perhaps natural or cardinal numbers) seem to require the existence of an additional, and unwanted, object in the first place? Shouldn’t it be possible to provide a foundational account of any mathematical theory that entails the existence of all, and crucially, only, the objects required by that theory? This sort of question, while important and intuitively quite troubling, is also rather loosely formulated. There are other problems associated with the existence of ‘Bad’ objects (in particular, the ‘Bad’ extension) that are a good bit more precise, however. If the ‘Bad’ extension is merely an artifact of our theory, and not a ‘real’ extension in some sense, then presumably any proof of a crucial set-theoretic result based on New V should not depend on the existence of the ‘Bad’ extension. In other words, any set-theoretic axiom that turns out to be true (given New V) should be true solely in virtue of the set-theoretic ‘behavior’ of the genuine extensions (and thus should not depend for its truth on the existence or ‘behavior’ of the ‘Bad’ extension). This requirement seems reasonable. 
Unfortunately, at least for the present attempts at reconstructing set theory within an abstractionist framework, it seems like a requirement that cannot be met. The problem is that, if we are not allowed to make use of the existence of the ‘Bad’ extension, we lose the proof that there are infinitely many sets. Boolos provides the following proof that New V entails the existence of at least two objects (this is the initial part of his proof that New V entails the existence of infinitely many objects):
Let Ø be the concept [x : x ≠ x] . . . since there is at least one object (e.g. EXT(x ≠ x) or EXT(x = x)), Ø is small, Ø ≠ V, and EXT(x ≠ x) ≠ EXT(x = x). ([1989], p. 90, notation modified to fit that used here)
Notice that the proof makes explicit reference to the ‘Bad’ extension (i.e. EXT (x = x)). This is not accidental – the result depends on the existence of the ‘Bad’ extension in order to guarantee that finite concepts are not ‘Big’. There does not exist any proof from New V to the pairing axiom that does not, in some way, make use of the ‘Bad’ extension. To see why, consider New V interpreted in a free logic that allows EXT to fail to take on a value when applied to ‘Big’ concepts. In such a logic, there will be
a one-element model of New V – just let the domain contain (for example) the empty set. Then there are two concepts – the empty one, and the one holding of the empty set. Since the latter is ‘Big’ (it is, in fact, the entire universe), it need not receive an extension, so we can just map the empty concept onto the empty set, and we have our model. Thus, there seems to be a real problem in the way that abstraction principles for extensions such as New V behave. On the one hand, they seem to imply the existence of unwanted objects – in particular, the ‘Bad’ object. On the other hand, this object seems necessary in order to ‘bootstrap’ our way up to a proof that there are infinitely many sets. No satisfactory solution to this dilemma has been presented as of yet. Although the reader might be forgiven for thinking that this is, already, more than enough problems, a look at abstractionist reconstructions of real analysis is in order. As noted above, the abstraction principle generating quotients of integers also generated a ‘Bad’ object, but this seems less worrisome than anti-zero or the ‘Bad’ extension, since the traditional theory of the rationals was already plagued with a similar problem (i.e. the ill-definedness of 1/0). More troubling, however, is the use of cut abstraction in the final step of the construction. Now, applying this particular abstraction principle to the domain does not, at first glance, seem to present any problems – we obtain, in fact, exactly the real numbers (or something isomorphic to them) and nothing else. The problem possibly arises, however, when we ask the following question: If we can use abstraction to take cuts on the rationals, as we did to obtain the reals, then is it permissible to take, as objects, the cuts on any linear order, by applying an appropriate abstraction principle? 
If the answer is “No”, then we seem faced with another particularly difficult instance of the Bad Company Objection – how are we to determine when we can, and when we cannot, apply an abstraction principle to a linear order to take cuts as objects? If the answer is “Yes”, however, then we are besieged by another worry: Generalizing such cut abstraction to any linear ordering whatsoever generates a large ontology (in the worst case, proper class sized). This is, at best, extremely surprising in a view that emphasizes its epistemic conservativeness. Additionally, some of the more powerful versions of generalized cut abstraction are incompatible with other, somewhat attractive abstraction principles, such as New V and variants of it (for a discussion of generalized versions of cut abstraction see Cook’s “The State of the Economy: Neo-Logicism and Inflation” [2002], reprinted as chapter 12 below, and criticism of it in Bob Hale’s “Reals by Abstraction”, [2000] chapter 11). Thus, in a number of ways the ontological power of abstractionism seems to backfire – the very strength of the view, the fact that it purports to provide us with an account of how we can have a priori knowledge of the existence and properties of those abstract objects studied by mathematicians, also is one of its weaknesses, since it also seems to provide us with a priori knowledge of the existence of until now unrecognized and, once recognized, unwanted
objects such as anti-zero and the ‘Bad’ extension. Like the Bad Company objection before it, a satisfactory solution to this problem (or, better, this family of problems) would seem to be a matter of determining where to draw certain lines: How do abstractionists determine which principles (and which formulations of certain principles) will provide them with access to the objects required for mathematics without also entailing the existence of additional objects that are both unnecessary and, at times, inconvenient?
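The contrast drawn in this section between New V with a total EXT operator and New V read in a free logic can be explored by exhaustive search on very small domains. The following sketch is an illustration only: the helper names are hypothetical, concepts are represented by subsets of {0, ..., n-1}, and ‘Big’ is computed using the fact that on a finite domain a concept can be mapped onto the whole domain only if it holds of every object. The free-logic reading is modelled, as an assumption of the sketch, simply by leaving EXT undefined on ‘Big’ concepts.

```python
from itertools import combinations, product

def concepts(n):
    """All concepts (subsets) of the domain {0, ..., n-1}."""
    domain = range(n)
    return [frozenset(c) for k in range(n + 1) for c in combinations(domain, k)]

def big(p, n):
    """On a finite domain, P maps onto the whole domain only if P is the whole domain."""
    return len(p) == n

def new_v_rhs(p, q, n):
    """Right-hand side of New V: coextensive, or both Big."""
    return p == q or (big(p, n) and big(q, n))

def has_model(n, total):
    """Search for an EXT satisfying New V on {0, ..., n-1}.

    total=True:  EXT must be defined on every concept.
    total=False: EXT is left undefined on Big concepts (free-logic reading).
    """
    args = [p for p in concepts(n) if total or not big(p, n)]
    for values in product(range(n), repeat=len(args)):
        ext = dict(zip(args, values))
        if all((ext[p] == ext[q]) == new_v_rhs(p, q, n) for p in args for q in args):
            return True
    return False

for n in (1, 2, 3):
    print(n, has_model(n, total=True), has_model(n, total=False))
```

With a total EXT the search finds no model on any of these domains, which fits the claim that New V forces infinitely many objects. Once EXT may be undefined on ‘Big’ concepts, a model appears on the one-element domain (the empty concept is sent to the sole object), which is the one-element model described above.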
7. The third problem: What objects?
The final major problem of interest here is the notorious Caesar Problem. Frege first points out the problem in the Grundlagen, where he considers an abstraction principle introducing directions (here the initial quantifiers range over lines, and “//” is the relation of parallelism): (∀a)(∀b)(DIR(a) = DIR(b) ↔ a//b) After pointing out that this definition provides us with the means for identifying directions, and distinguishing distinct directions from one another, he points out that: . . . this means does not provide for all cases. It will not, for instance, decide for us whether England is the same as the direction of the earth’s axis – if I may be forgiven an example which looks nonsensical. Naturally no one is going to confuse England with the direction of the Earth’s axis; but that is no thanks to our definition of direction. (Frege, [1974], §66, pp. 77–78)
Looking at this from a technical perspective, we can see the problem as follows: Abstraction principles, such as Hume’s Principle and New V, whose right-hand side can be expressed in purely logical vocabulary, place no constraints on which object in a particular domain plays the role, say, of seven, or the empty set (things are slightly more complicated in the case of abstraction principles, such as those used to construct the reals, where the equivalence relation on the right contains other abstraction operators). All that determines whether a particular set can serve as the domain of a model of either of these principles is the cardinality of the set – if the set is the right size, then any object in the set can be any number or set (the only requirement is that each object can play the role of at most one number, or one set). Much has been written on the Caesar Problem, but approaches to it generally take one of three routes: First, we can deny it is a problem, adopting a sort of structuralist approach to abstractionism where it does not matter whether Caesar turns out to be the number two, as long as we are guaranteed that some object plays this role. (Although the Caesar problem is not his main target, Crispin Wright’s “Neo-Fregean Foundations for Real Analysis: Some Reflections on Frege’s Constraint” [2000], chapter 13 below, draws connections between ante rem structuralism and abstractionism, and is particularly relevant
here.) Second, we can attempt to reformulate our abstraction principles in more complicated ways (e.g. by inserting modal operators in appropriate places or the like) so that the reference of numerical terms is more determinate. Third, we might argue that although abstraction principles alone do not determine which object, in particular, is picked out by a certain numeral, abstraction principles plus other background constraints do determine numerical reference uniquely.

Which of these approaches is most promising has yet to be determined. In fact, as the literature grows, new variations on the Caesar Problem seem to sprout up at least as fast as attempts to solve them. Most important among these are:

The Counter-Caesar Problem: How do we guarantee that particular Fregean numerals denote the same object as their natural language counterparts (e.g. does NUM(x = x) denote the same thing as the English locution “zero”)?

The Julio César Problem: How do we guarantee that the cardinal numbers provided by Hume’s Principle denote the same kind of objects as are denoted by mathematical terms occurring in natural language (e.g. does NUM(x = x) denote the same kind of thing as the English locution “zero”)?

The C-R Problem: How do we determine whether abstracts provided by distinct abstraction principles are identical or distinct (e.g. is the complex number 0, provided by the appropriate abstraction principle, identical to the real number 0, provided by a different abstraction principle)?

Although the Caesar Problem (and its cousins) results from certain formal characteristics of abstraction principles, responses to it tend to be less technical. Nevertheless, a number of the chapters included below contain extended discussions of it. (The reader is also encouraged to consult MacBride [2005] and Cook and Ebert [2005] for further discussion of variants of the Caesar Problem.)
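The technical point behind the Caesar Problem, that an abstraction principle constrains only how many objects are needed and not which objects serve as the abstracts, can be illustrated on a toy finite domain. The following Python sketch is only an illustration under stated assumptions: the four lines, their slopes, the candidate objects, and all function names are hypothetical and do not come from the text. It enumerates the assignments of objects to lines that satisfy the direction principle and confirms that England, or Caesar, can serve as a direction in some of them.

```python
from itertools import permutations

# Hypothetical toy domain: four lines, each given by its slope,
# and four arbitrary objects that might serve as directions.
lines = {"l1": 0, "l2": 0, "l3": 1, "l4": 2}   # line name -> slope
objects = ["england", "caesar", "a", "b"]


def parallel(a, b):
    """Two lines count as parallel when they have the same slope."""
    return lines[a] == lines[b]


def satisfies_direction_principle(direction):
    """Check DIR(a) = DIR(b) <-> a // b for every pair of lines."""
    return all(
        (direction[a] == direction[b]) == parallel(a, b)
        for a in lines
        for b in lines
    )


def all_models():
    """Build assignments from injective maps of slopes to objects and
    keep those that satisfy the direction principle."""
    slopes = sorted(set(lines.values()))
    models = []
    for chosen in permutations(objects, len(slopes)):
        by_slope = dict(zip(slopes, chosen))
        direction = {name: by_slope[slope] for name, slope in lines.items()}
        if satisfies_direction_principle(direction):
            models.append(direction)
    return models


models = all_models()
print(len(models))
print(any(m["l1"] == "england" for m in models))
print(any(m["l1"] == "caesar" for m in models))
```

Every injective choice of three objects for the three parallelism classes passes the check, so nothing in the principle itself singles out one assignment over another.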
8. Indefinite extensibility
As the literature on these problems and other issues has grown, the notion of indefinite extensibility has become more and more central to purported solutions. One promising line of attack on both the ‘too-many-abstraction-principles’ class of problems and the ‘too-many-objects’ class of problems has been to suggest that we restrict our attention to those abstraction principles that provide abstracts only for concepts which are not indefinitely extensible. Of course, this does little to help us until we know what indefinite extensibility is. Bertrand Russell seems to be the first person to discuss this notion when considering the cause of the various set-theoretic paradoxes:
The contradictions result from the fact that . . . there are what we may call self-reproductive processes and classes. That is, there are some properties such that, given any class of terms all having such a property, we can always define a new term also having the property in question. Hence we can never collect all of the terms having the said property into a whole; because, whenever we hope we have them all, the collection which we have immediately proceeds to generate a new term also having the said property. ([1906], p. 144)
The term “indefinite extensibility” is due to Michael Dummett, however, who extended Russell’s idea as follows: An indefinitely extensible concept is one such that, if we can form a definite conception of a totality all of whose members fall under the concept, we can, by reference to that totality, characterize a larger totality all of whose members fall under it. ([1993], p. 441)
It has become standard to use the term ‘definite’ for those concepts that are not indefinitely extensible. The ordinal numbers provide perhaps the clearest example of an indefinitely extensible collection. Consider any definite collection of ordinals (i.e. a set of ordinals). Given such a collection, we can immediately form a conception of an ordinal not in that collection (i.e. the ‘next’ ordinal: either the successor of the greatest ordinal in the collection in question, or the supremum of the collection in question). As a result, there seems to be a sense in which we can never collect together all of the ordinals into a definite totality, since we could repeat this reasoning on such a collection to obtain an ordinal that is not in such a collection of all ordinals – contradiction (this is essentially just the reasoning behind the Burali-Forti paradox). An indefinitely extensible concept is thus one which allows for a certain sort of iteration – any time we have collected together some definite sub-collection of things falling under that concept, we can find a new object that is not in that collection. In fact, the ordinals are not only a clear example of the notion in question, but their structure seems to be fundamental to indefinite extensibility itself, since this iterability suggests that any indefinitely extensible collection will contain a structure isomorphic to the ordinals (it is worth noting, however, that Dummett would reject this Russellian characterization of indefinite extensibility). Thus, one way of characterizing indefinitely extensible concepts is “those concepts that are like the ordinals in relevant ways”. As Peter Clark points out in his contribution to this volume (“Frege, Neo-logicism and Applied Mathematics” [2004], chapter 3 below), another way of picking out the indefinitely extensible concepts is to just note that they are the ones whose extensions do not form sets.
But neither of these suggestions, intuitively helpful as they are, does the abstractionist any real good. The abstractionist, remember, wishes to use the notion of indefinite extensibility in order to formulate a restricted version of Basic Law V (and other abstraction principles) which will provide an adequate
set theory. As a result, no characterization of indefinite extensibility (such as those above) which uses set-theoretic notions can be of use, since using set theoretic notions to formulate one’s implicit definition of set would introduce a rather vicious circle into the picture. Thus, the abstractionist needs some neutral formulation of the notion in question. As of yet, no completely adequate account of indefinite extensibility has been found, at least none that is of the sort that could be mobilized by the abstractionist wishing to use it in formulating various abstraction principles. This is not to say, of course, that no work of interest has been carried out – on the contrary, at least half of the papers in the present volume make at least passing reference to the importance of this problem, and almost all of the papers in the last section, on set theory, contain detailed discussion of the issue. Of particular interest is Stewart Shapiro’s “Prolegomenon to Any Future Neo-Logicist Set Theory. . . ” (chapter 18 below), which contains both a detailed examination of indefinite extensibility as discussed by philosophers such as Dummett and Russell, as well as a sustained technical examination of what formal characteristics a successful abstractionist account of the notion requires.
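The iteration behind the ordinal example in this section can be made concrete in the finite case. The following Python sketch assumes the von Neumann representation of ordinals as hereditarily finite sets (an assumption of the illustration, not something the text commits to), and the function names are hypothetical. Given any finite collection of such ordinals, it produces an ordinal outside the collection; in the finite case a nonempty collection always has a greatest member, so the supremum is that member and the new ordinal is its successor.

```python
def successor(ordinal):
    """Von Neumann successor: alpha + 1 is alpha together with {alpha}."""
    return ordinal | frozenset([ordinal])


def supremum(collection):
    """Least upper bound of a finite set of von Neumann ordinals: their union."""
    result = frozenset()
    for ordinal in collection:
        result |= ordinal
    return result


def new_ordinal(collection):
    """Return an ordinal that is not a member of the given finite collection.

    The successor of the supremum is strictly greater than every member,
    so it cannot already belong to the collection.
    """
    return successor(supremum(collection))


zero = frozenset()
one = successor(zero)
two = successor(one)

collection = {zero, two}
for _ in range(3):
    fresh = new_ordinal(collection)
    assert fresh not in collection  # the new ordinal really is new
    collection.add(fresh)
    print(len(fresh))  # a finite von Neumann ordinal n has exactly n members
```

The loop can be continued for as long as one likes, which is the finite shadow of the reasoning described above: however the collection is extended, the same recipe yields a further ordinal.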
9. One last thing
Although the bulk of the literature on abstraction and its mathematics, and the majority of the papers to follow, focus either on the actual formalization of arithmetic, analysis, and set theory, or on the three major sorts of problem just outlined, there are of course many other crucial questions regarding abstraction to be answered and many other avenues to be explored. While space considerations preclude detailed discussion of them here, at least one of them deserves brief mention before moving on to the papers themselves. The question in question is this: In what ways can the abstractionist’s formal results be adopted or adapted by their philosophical opponents? For example, in “Frege’s Unofficial Arithmetic” [2002] (chapter 10 below) Agustin Rayo utilizes Frege’s Theorem (and corollaries of it) to provide a distinctly non-Fregean (in fact, somewhat Quinean) account of arithmetic (and, in particular, applied arithmetic). While Rayo suggests that the account he provides is at least inspired by Frege’s own views (views Frege held after he abandoned logicism), the project he sketches is worked out against a philosophical background quite different from the one assumed by most abstractionists. The point, to put it bluntly, is this: even if abstraction principles are not definitions in the sense the abstractionist suggests, they might nevertheless play some crucial role in our epistemological account of mathematics. It is thus hoped that this collection will serve as a repository of work on the technical aspects of abstraction principles which can be utilized by both the abstractionist himself and also by adherents of different, competing philosophical
accounts of mathematics (even if the majority of the actual papers are working within the standard Fregean abstractionist picture).
References
Black, M. [1965], Philosophy in America, Ithaca, Cornell University Press.
Boolos, G. [1989], “Iteration Again”, Philosophical Topics 17: 5–21.
Boolos, G. [1990a], “The Standard of Equality of Numbers”, in Boolos [1990b]: 3–20.
Boolos, G. (ed.) [1990b], Meaning and Method: Essays in Honor of Hilary Putnam, Cambridge, Cambridge University Press.
Boolos, G. [1997], “Is Hume’s Principle Analytic?”, in Heck [1997b]: 245–261, reprinted below as chapter 1.
Boolos, G. [1998], Logic, Logic, and Logic, Cambridge, MA, Harvard University Press.
Boolos, G. & R. Heck [1998], “Die Grundlagen der Arithmetik §82–83”, in Boolos [1998]: 315–338.
Burgess, J. [1984], Review of Wright [1983], Philosophical Review 93: 638–640.
Clark, P. [2004], “Frege, Neo-logicism and Applied Mathematics”, in Stadler [2004]: 169–183, reprinted below as chapter 3.
Cook, R. [2002], “The State of the Economy: Neologicism and Inflation”, Philosophia Mathematica 10: 43–66, reprinted below as chapter 12.
Cook, R. [2003], “Aristotelian Logic, Axioms, and Abstraction”, Philosophia Mathematica 11: 195–202, reprinted below as chapter 9.
Cook, R. [2004], “Iteration One More Time”, Notre Dame Journal of Formal Logic 44: 63–92, reprinted below as chapter 20.
Cook, R. & P. Ebert [2005], “Abstraction and Identity”, Dialectica 59: 121–139.
Demopoulos, W. [2003], “On the Philosophical Interest of Frege Arithmetic”, Philosophical Books 44: 220–228, reprinted below as chapter 7.
Fine, K. [2002], The Limits of Abstraction, Oxford, Clarendon Press.
Frege, G. [1974], Die Grundlagen der Arithmetik, J.L. Austin (trans.), Oxford, Basil Blackwell.
Frege, G. [forthcoming], Grundgesetze der Arithmetik, C. Wright et al. (trans.), Oxford, Oxford University Press.
Hale, R. [2000], “Reals by Abstraction”, Philosophia Mathematica 8: 100–123, reprinted below as chapter 11.
Hale, R. [2000], “Abstraction and Set Theory”, Notre Dame Journal of Formal Logic 41: 379–398, reprinted below as chapter 17.
Hale, B. & C. Wright [2001], The Reason’s Proper Study, Oxford, Oxford University Press.
Heck, R. [1993], “The Development of Arithmetic in Frege’s Grundgesetze der Arithmetik”, Journal of Symbolic Logic 10: 153–174.
Heck, R. [1997a], “Finitude and Hume’s Principle”, Journal of Philosophical Logic 26: 589–617, reprinted below as chapter 4.
Heck, R. (ed.) [1997b], Language, Thought, and Logic, Oxford, Oxford University Press.
Hodes, H. [1984], “Logicism and the ontological commitments of arithmetic”, The Journal of Philosophy 81: 123–149.
MacBride, F. [2000], “On Finite Hume”, Philosophia Mathematica 8: 150–159, reprinted below as chapter 5.
MacBride, F. [2002], “Could Nothing Matter?”, Analysis 62: 125–135, reprinted below as chapter 6.
MacBride, F. [2003], “Speaking with Shadows: A Study of Neo-logicism”, British Journal for the Philosophy of Science 54: 103–163.
MacBride, F. [2005], “The Julio César Problem”, Dialectica 59: 223–236.
Parsons, C. [1965], “Frege’s Theory of Number”, in Black [1965]: 180–203.
Rayo, A. [2002], “Frege’s Unofficial Arithmetic”, Journal of Symbolic Logic 67: 1623–1638, reprinted below as chapter 10.
Russell, B. [1902], “Letter to Frege”, in van Heijenoort [1967]: 124–125.
Russell, B. [1906], “On Some Difficulties in the Theory of Transfinite Numbers and Order Types”, Proceedings of the London Mathematical Society 4: 29–53.
Shapiro, S. [2000], “Frege Meets Dedekind: A Neologicist Treatment of Real Analysis”, Notre Dame Journal of Formal Logic 41: 335–364, reprinted below as chapter 14.
Shapiro, S. [2003], “Prolegomenon to Any Future Neo-Logicist Set Theory: Abstraction and Indefinite Extensibility”, British Journal for the Philosophy of Science 54: 59–91, reprinted below as chapter 18.
Shapiro, S. & A. Weir [1999], “New V, ZF and Abstraction”, Philosophia Mathematica 7: 293–321, reprinted below as chapter 15.
Shapiro, S. & A. Weir [2000], “Neo-Logicist Logic Is Not Epistemically Innocent”, Philosophia Mathematica 8: 160–189, reprinted below as chapter 8.
Stadler, F. (ed.) [2004], Induction and Deduction in the Sciences, Dordrecht, Kluwer Academic Publishers.
Uzquiano, G. & I. Jané [2004], “Well- and Non-Well-Founded Extensions”, Journal of Philosophical Logic 33: 437–465, reprinted below as chapter 16.
van Heijenoort, J. (ed.) [1967], From Frege to Gödel: A Sourcebook in Mathematical Logic, Cambridge, MA, Harvard University Press.
Weir, A. [2004], “Neo-Fregeanism: An Embarrassment of Riches”, Notre Dame Journal of Formal Logic 44: 13–48, reprinted below as chapter 19.
Whitehead, A. N. & B. Russell [1910–1913], Principia Mathematica, 3 vols., Cambridge, Cambridge University Press.
Wright, C. [1983], Frege’s Conception of Numbers as Objects, Aberdeen, Aberdeen University Press.
Wright, C. [1997], “On the Philosophical Significance of Frege’s Theorem”, in Heck [1997b]: 201–244.
Wright, C. [1999], “Is Hume’s Principle Analytic?”, Notre Dame Journal of Formal Logic 40: 6–30, reprinted below as chapter 2.
Wright, C. [2000], “Neo-Fregean Foundations for Real Analysis: Some Reflections on Frege’s Constraint”, Notre Dame Journal of Formal Logic 41: 317–334, reprinted below as chapter 13.
I
THE PHILOSOPHY AND MATHEMATICS OF HUME’S PRINCIPLE
IS HUME’S PRINCIPLE ANALYTIC?¹
George Boolos

The reduction, however, cuts both ways. It is not easy to see how Frege can avoid the seemingly frivolous argument that if his reduction is really successful, one who believes firmly in the synthetic character of arithmetic can conclude that Frege’s logic is thus proved to be synthetic rather than that arithmetic is proved to be analytic.
Hao Wang²
There are a number of issues on which Crispin Wright and I disagree, some of them substantive and some merely terminological. For example, we disagree over whether the term “analytic” can be suitably applied to HP and whether a derivation of arithmetic from HP would establish a doctrine appropriately called “logicism.” I also have certain reservations, which I shall set out later, about his notions of explanation and reconceptualization. However, I think the areas of agreement about the interest of Frege’s derivation of arithmetic are both wide-ranging and far more significant than those of disagreement. In particular I want to endorse Wright’s closing suggestion that “the problems and possibilities of a Fregean foundation for mathematics remain [wide?] open” and the remark made earlier in his paper that “The more extensive epistemological programme which Frege hoped to accomplish in the Grundgesetze is still a going concern.” I also want to emphasize that I consider Wright to have made a great scientific contribution in showing contemporary readers how the deduction of the Peano postulates from HP could be carried out and in formulating the conjecture, subsequently verified, that HP is consistent.³

¹ This article first appeared in Richard G. Heck Jr., ed., Language, Thought, and Logic, Oxford, Oxford University Press (1997). Reprinted by kind permission of Oxford University Press. A version of this paper was presented to a 1994 American Philosophical Association symposium on the topic of logicism. Crispin Wright was the co-symposiast and Charles Parsons the commentator. Michael Dummett much dislikes the designation “Hume’s Principle” because the remark in Hume’s Treatise (I, III, I, para. 5) which Frege cited with approval and from which the name derives, presupposes the doctrine that a number is an item composed of units, a doctrine which Frege is presumed to have refuted. Since this paper first appeared in a Festschrift for Michael, I used the designation “HP” instead. Cf. Chomsky and “LF.”
² Wang, Hao, “The Axiomatization of Arithmetic,” Journal of Symbolic Logic 22 (1957), pp. 145–158, reprinted in Wang, Hao, A Survey of Mathematical Logic, Peking, Science Press (1963), pp. 68–81. The quotation, together with other extremely interesting observations, appears on p. 80.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 3–15. © 2007 Springer.

The first issue I want to take up is whether a derivation of arithmetic from HP vindicates logicism. My view is: no logic, no logicism. It is clear what has to be established in order to show the truth of something we can call logicism with a clear conscience. Arithmetic has to be shown to be provable from an extension by definitions of a theory that is logically true. In technical parlance, arithmetic has to be interpreted in a logically true theory. It cannot be, trivially: Arithmetic implies that there are two distinct numbers; were the relativization of this statement to the definitions of the predicate “number” provable by logic alone, logic would imply the existence of two distinct objects, which it fails to do (on any understanding of logic now available to us). Wright states that if it has to be made out that HP is a truth of logic, then “the prospects are unimproved,” the prospects, I take it, being those for establishing a species of logicism. I infer that he does not consider HP to be a truth of logic. Nor do I: the principle implies the existence of too many objects. So I do not conclude, as Wright does, that the proof of Frege’s theorem by itself establishes logicism. It only shows the beautiful, deep, and surprising result that arithmetic is interpretable in Frege arithmetic, a theory whose sole nonlogical axiom is HP. Wright argues, though, that since HP is analytic, the proof yields “an upshot still worth describing as logicism, albeit rather different from the conventional understanding of the term.” I might be prepared to agree that something describable as logicism in a different understanding of that term would have been established if HP had been shown to be analytic or akin to something properly called a definition. But I doubt that it can be.
Having to discuss whether HP is analytic is rather like having to consider whether hydrogen sulfide is dephlogisticated. One can certainly see reasons why one might be tempted to call H₂S dephlogisticated: but if I am right in thinking that to dephlogisticate is to combine with oxygen, there are conclusive reasons for not doing so. The main reason why the notion of analyticity is all but useless in discussing propositions of mathematics like HP is that, although an analytic statement is supposed to be one that is true in virtue of the meanings of the terms contained in sentences expressing it (and syntactic features of those sentences), the phrase “true in virtue of meanings” leaves it indeterminate how much mathematics may be used to get from facts about meanings to the truth of the statement, or, more exactly, how much mathematics it is allowable to use in deriving the statement (or the statement that that statement is true) from reports of meanings. In brief, we are not told how strong the mathematics is that “in virtue of” permits. The stronger the mathematics permitted, the greater the number of analytic mathematical truths, of course. The point, in essence, is due to Gödel and is different from the objection raised by the question “Why mathematics rather than geology?”

In the interest of trying to get at what’s really at issue between Wright and myself, however, I shall ignore the standard difficulties presented by “analytic,” including the uncertainty what the interest or point of classifying a statement as analytic is and the worry that complex logical argumentation might itself create semantic content,⁴ and suppose that I understand the concept sufficiently well, well enough at least to know what’s meant by calling “all vixens are foxes,” etc. analytic and by saying that there is a semantic connection between “vixen” and “fox.”

³ Wright, Crispin, Frege’s Conception of Numbers as Objects, Scots Philosophical Monographs, vol. 2, Aberdeen, Aberdeen University Press (1983). The derivation is on pp. 154–168. The discussion of number-theoretic logicism III is on pp. 153–154.

At the outset, let me acknowledge that I have no knock-down argument that will persuade a diehard defender of the claim that HP is analytic to abandon the view. All I shall offer are what strike me as some rather, and perhaps sufficiently, weighty considerations against that position. At first glance, HP might certainly seem analytic. In its statement “number” means “cardinal number” and, one would naturally wonder, isn’t it a matter of the semantic connection between “cardinal number” and “one–one correspondence” that two concepts have the same cardinal number just when things falling under one of them can be put in one–one correspondence with those falling under the other? Isn’t the cardinal of x the same as that of y just when there’s a one–one correspondence between x and y, and that because of what “cardinal number” means? So isn’t the left-hand side of HP close enough in meaning to the right-hand side for it to count as analytic?
Doesn’t the left-hand side have the same sense as the right?⁵

Let me begin to respond to this argument by recalling two features that analytic statements have been traditionally supposed to enjoy: first, they are true; secondly and roughly speaking, they lack content, i.e., they make no significant or substantive claims or commitments about the way the world is; in particular, they do not entail the existence either of particular objects or of more than one object. (It may be held that some analytic statement might entail the existence of at least one object, as will be the case if every logical truth counts as analytic.) Some have been tempted by the idea of analytic statements that happen not to be true, e.g., “the present king of France is a royal.” On the view in question, the semantic connection between “king” and “royal” suffices to ensure the analyticity of the entire statement, despite the failure of its subject to denote. But analytic statements are, and (since we are playing along) are analytically, analytic truths, and the view may be put aside. The example is worth noting, however, for, as I am going to suggest later, HP suffers from a defect similar to that of “the present king of France is a royal,” which would not be analytic even if there were presently a (unique) king of France, since, of course, it would not be analytic that there is one.

⁴ This possibility is suggested by a remark of Frege about condensation in §23 of his Begriffsschrift.
⁵ Thanks here to Arthur Skidmore.

The main significant worry for the defender of the analyticity of HP concerns the quite strong content that it appears to possess. HP has consequences having to do with certain features of the domain of objects over which its first-order variables range, in particular with the number of those objects there are. Much of the most interesting work in mathematical logic in the last 20 years or so has dealt with comparisons of strength of various logical and mathematical statements, examining which well-known theorems of mathematics can be derived from which logical principles (and vice versa!) in which background theories. We now know that Frege arithmetic is equi-interpretable with full second-order arithmetic, “analysis,” and hence equi-consistent with it. Learning that HP is analytic would not help us in the slightest with the problem of assessing the strength of various theorems, fragments, and subtheories of analysis, all of which would, I suppose, have to count as analytic. The first part of my worry about content is that HP, when embedded into axiomatic second-order logic, yields an incredibly powerful mathematical theory. Wright will say: Hooray! Math is analytic after all. But we don’t know what follows from its being so and we will have to study the subanalytic to see what (logically) entails what just as hard as before. It is known that HP does not follow (a word I will not surrender) from the conjunction of two of its strongest consequences: the (interesting) statements that nothing precedes zero and that precedes is a one–one relation.
If HP is analytic, then it is strictly stronger (another non-negotiable term) than some of its strong consequences. It is also known that arithmetic follows from these two statements alone, and that arithmetic is strictly weaker than even their disjunction.⁶ Faced with these results, how can we really want to call HP analytic?

Frege, for a lengthy stretch of his career, held that the existence of infinitely many objects could be seen to follow from a set of principles and definitions that could, by his lights, be counted as analytic. He abandoned the view in 1906, according to Dummett, when he realized that his attempted patch to Basic Law V would not work. It is doubtful that Russell could be considered a logicist in the full sense of the term while writing Principia, whose stated aim is to analyze the notions employed in mathematics, not to show arithmetic to be a branch of logic. Despite the Gödel incompleteness theorems and Russell’s protestations that the axiom of infinity was no logical truth, it was a central tenet of logical positivism that the truths of mathematics were analytic. Positivism was dead by 1960 and the more traditional view, that analytic truths cannot entail the existence either of particular objects or of too many objects, has held sway since. Wright wishes to overthrow the tradition, but it should be asked how a statement that cannot hold if there are only finitely many objects can possibly be thought to be analytic, a matter of meanings or “conceptual containment.”

⁶ For proofs of these results, see my “On the Proof of Frege’s Theorem,” in Benacerraf and His Critics, Adam Morton and Stephen Stich, eds., Cambridge, MA, Blackwell (1996), pp. 143–159.

On the symbolization that I prefer, HP reads:

∀F∀G(#F = #G ↔ F ≈ G)

where “F ≈ G” is an abbreviation for a second-order formula expressing that there is a one–one correspondence between the objects falling under the concept F and those falling under the concept G. We need not here write out the formula, but must remember that it contains some first-order quantifiers. We must also remember the grammatical category of “#,” “octothorpe”: it is a function-sign, which when attached to a monadic second-order variable like “F,” produces a term of the same type as individual variables that occur in “F ≈ G.” It is essential to the proof of Frege’s theorem that octothorpe be so construed. Thus octothorpe denotes a total function from concepts to objects. Logic, plus the convention that function signs like octothorpe denote total functions, will guarantee that ∀F∃!x #F = x is true. It will not guarantee that HP is. HP entails, as Wright has put it with exemplary force and Cartesian clarity, that there is a partition of concepts into equivalence classes, in which two concepts belong to the same class if and only if they are equinumerous. If there are only k objects, k a finite number, then, since there are k + 1 natural numbers ≤ k, there will be k + 1 equivalence classes, viz. a class containing each concept under which zero objects fall, a class containing each concept under which exactly one object falls, . . . , and a class containing each concept under which all k objects fall. (We need not here assume that concepts are individuated extensionally.)
Thus, if there are only k objects, there is no function mapping concepts to objects that takes non-equinumerous concepts to different objects, for there won’t be enough objects around to serve as the values of the function, since k + 1 are needed. So if HP holds—even if only the left–right direction (the same direction as in the fatal Basic Law V) holds— there must be infinitely many objects. One person’s tollens is another’s ponens, and Wright happily regards the existence of infinitely many objects, and indeed, that of a Dedekind infinite concept, as analytic, since they are logical consequences of what he takes to be an analytic truth. He would also regard the existential quantification of HP (over the positions occupied by octothorpe) as analytic. But what guarantee have we that there is such a function from concepts to objects as HP and its existential quantification claim there to be? I want to suggest that HP is to be likened to “the present king of France is a royal” in that we have no analytic guarantee that for every value of “F,” there is an object that the open definite singular description “the number
belonging to F” denotes. I shall also suggest that there may be some analytic truths in the vicinity of HP with which it is being confused. I hope that the suggestions will do justice both to the thought that there is a strong semantic connection between “the number of . . . ” and “one–one correspondence” and to the traditional idea that analytic truths do not entail the existence of a lot of objects. Our present difficulty is this: just how do we know, what kind of guarantee do we have, why should we believe, that there is a function that maps concepts onto objects in the way that the denotation of octothorpe does if HP is true? If there is such a function then it is quite reasonable to think that whichever function octothorpe denotes, it maps non-equinumerous concepts to different objects and equinumerous ones to the same object, and this moreover because of the meaning of octothorpe, the number-of-sign or the phrase “the number of.” But do we have any analytic guarantee that there is a function that works in the appropriate manner? Which function octothorpe denotes and what the resolution is of the mystery how octothorpe gets to denote some one definite particular function that works as described are questions we would never dream of trying to answer. (Harold Hodes’ article “Logicism and the ontological commitments of arithmetic”⁷ contains much wisdom about these mysteries of mathematical reference.) Nevertheless, it would seem that if there is such a function, then whichever function octothorpe does denote, it also does the trick.⁸ Thus, I am moved to suggest, very tentatively and playing along, that the conditional whose consequent is HP and whose antecedent is its existential quantification might be regarded as analytic. The conditional will hold, by falsity of antecedent, in all finite domains. By the axiom of choice, the antecedent will be true in all infinite domains, but then, we may suppose, nothing will prevent the consequent from being true.
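The claim that the existential quantification of HP fails in every finite domain can be checked by brute force for very small domains. The following Python sketch is an illustration only, under two assumptions that are not in the text: concepts are treated extensionally (a concept is simply a subset of the domain), and equinumerosity on a finite domain is tested by comparing sizes. The function names are hypothetical. For k = 1, 2, 3 it tries every function from concepts to objects and counts those satisfying HP.

```python
from itertools import combinations, product


def concepts(domain):
    """All subsets of the domain, standing in for extensional concepts."""
    return [
        frozenset(subset)
        for size in range(len(domain) + 1)
        for subset in combinations(domain, size)
    ]


def equinumerous(f, g):
    """On a finite domain a one-one correspondence exists iff the sizes agree."""
    return len(f) == len(g)


def satisfies_hp(number_of, all_concepts):
    """Check #F = #G <-> F ≈ G for every pair of concepts."""
    return all(
        (number_of[f] == number_of[g]) == equinumerous(f, g)
        for f in all_concepts
        for g in all_concepts
    )


def hp_witnesses(k):
    """Count the functions from concepts to objects satisfying HP with k objects."""
    domain = list(range(k))
    all_concepts = concepts(domain)
    count = 0
    for values in product(domain, repeat=len(all_concepts)):
        number_of = dict(zip(all_concepts, values))
        if satisfies_hp(number_of, all_concepts):
            count += 1
    return count


# With k objects there are k + 1 equinumerosity classes (sizes 0, 1, ..., k).
results = {k: hp_witnesses(k) for k in (1, 2, 3)}
for k, count in results.items():
    print(k, k + 1, count)
```

The search is exhaustive only for these tiny domains; the general result rests on the counting argument in the text, since k + 1 classes cannot be sent to k objects without a collision.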
I also find plausible the suggestion that the right-to-left half of HP, which states that if F and G are equinumerous, then their numbers are identical, is analytic. It is the left-to-right half, which states that if F and G are not equinumerous, then their numbers are distinct, that blows up the universe. (E.g., consider the concept non-self-identical; call its number zero. Now consider the concept identical with zero; call its number one. By the left-to-right half of HP: since the concepts are not equinumerous, zero is not one.) The analogy with Basic Law V is obvious. Frege divided Basic Law V into Va, the left-to-right half, and its converse Vb. It was the left-to-right half that gave rise to Russell’s paradox. Vb has considerable claim to being regarded as a logical truth: (a) it is valid under standard semantics, thanks to the axiom of extensionality; (b) if the Fs are the Gs, as the antecedent asserts, then whatever “extension” may mean, the extension of the Fs is the extension of the Gs; and (c) if the antecedent holds, then the concepts F and G bear a relation to each other that Frege called the analogue of identity. Thus under each of three familiar systems of formula-evaluation, Vb can never turn out false.

In the case of both HP and Basic Law V, we have a principle whose left-to-right half requires that there be a function from concepts to objects respecting certain non-equivalences of those concepts. Unless enough objects exist, these nonequivalences cannot be respected. All that the right-to-left halves demand is that the equivalences be respected, as they can be trivially, by mapping all concepts to one and the same object. ∀F∀G(∀x(Fx ↔ Gx) → #F = #G), which has the same form as Basic Law Vb, can equally justifiably be claimed to be a logical truth, and the stronger ∀F∀G(F ≈ G → #F = #G) much more plausibly thought analytic, in virtue of the meaning of “#,” than its converse.

There is a further difficulty, or at any rate a further aspect of the same difficulty: If numbers belonging to concepts F and G are supposed to be identical if and only if F and G are equinumerous, then how do we know that, for every concept, there is such a thing as a number belonging to that concept? We should not be led astray by the concision, symmetry, and apparent familiarity and obviousness of #F = #G ↔ F ≈ G into ignoring the fact that octothorpe is a function sign (for a function of higher type). Like constants and the usual sort of function sign, it may help in concealing significant existential commitments. (Perhaps because of that danger, Quine, concerned with ontology and logic’s role in its study, almost entirely avoids constants and function signs in his textbook Methods of Logic.)

7 Hodes, Harold, “Logicism and the Ontological Commitments of Arithmetic,” Journal of Philosophy 81 (1984), pp. 123–149.
8 Hartry Field has made a similar suggestion in his review of Wright’s Frege’s Conception of Numbers as Objects, which is reprinted in Field, Hartry, Realism, Mathematics, and Modality, Oxford, Blackwell (1989), pp. 147–170.
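Two of the claims made above can be checked by brute force on small finite domains: that no assignment of objects to concepts satisfies HP when the domain is finite, and that the right-to-left half is satisfied trivially by sending every concept to one object. The following is only an illustrative sketch under stated assumptions: concepts are modelled as subsets of the domain, equinumerosity as equality of size, and all function names are hypothetical.

```python
from itertools import combinations, product

def concepts(domain):
    """All subsets of the domain, standing in for the concepts over it."""
    items = list(domain)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def equinumerous(f, g):
    # On finite sets a one-one correspondence exists iff the sizes agree.
    return len(f) == len(g)

def satisfies_hp(assign, cs):
    """HP: #F = #G if and only if F and G are equinumerous."""
    return all((assign[f] == assign[g]) == equinumerous(f, g)
               for f in cs for g in cs)

def satisfies_right_to_left(assign, cs):
    """Only the half: if F and G are equinumerous, then #F = #G."""
    return all(assign[f] == assign[g]
               for f in cs for g in cs if equinumerous(f, g))

def hp_has_witness(n):
    """Try every function from concepts to objects on an n-element domain."""
    domain = range(n)
    cs = concepts(domain)
    for values in product(domain, repeat=len(cs)):
        if satisfies_hp(dict(zip(cs, values)), cs):
            return True
    return False

if __name__ == "__main__":
    for n in (1, 2, 3):
        # A domain of n objects has n + 1 size classes of concepts,
        # so the n available objects cannot keep them all apart.
        print(n, hp_has_witness(n))
```

The search is exhaustive, so it is feasible only for very small domains; the general argument is the counting one given in the comment.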
An analogy may help: if volumes are supposed to be translation- and rotation-invariant, finitely additive, and non-trivial, with singletons and balls of radius r having volumes 0 and 4πr³/3, respectively, then, as the “paradoxical” Banach–Tarski theorem shows, not every bounded set of points in three-space has a volume. It would thus be illegitimate to introduce a sign for a totally defined function from bounded sets of points in three-space to real numbers and assume that the function was translation-invariant, etc. And one had therefore better not say: it is analytic that volume is translation-invariant, etc., and it is analytic that there is always such a thing as the volume of any bounded set of points in three-space, for the conjunction of the two statements claimed to be analytic is false. Similarly, if numbers are supposed to be identical if and only if the concepts they are numbers of are equinumerous, what guarantee have we that every concept has a number? 9 Or, if we take ourselves to know that with every concept there is functionally associated some object, then how do we know that the associated object is a number belonging to F?

9 Profound thanks here to Peter Clark.
It will be useful here to formulate HP in a way that expressly brings out its existential commitments. Let Numbers be the statement: for every concept F, there is a unique object x such that for every concept G, x is a number belonging to G if and only if F is equinumerous with G. Is Numbers analytically true? I see no reason at all to believe that it is analytic that for every F, there is such a (unique) object x. To reply that it is, since Numbers follows from HP, and HP is analytic, would seem to beg a question that ought not to be begged. Even more strongly, I don’t see any reason to think that it’s analytic that objects can be so assigned to concepts that any two concepts are assigned the same object if and only if they are equinumerous. It is not only the existence of a function of higher type making such an assignment of objects to concepts that seems synthetic to me: the weaker modal claim that objects can be so assigned strikes me as synthetic as well. I repeat that one person’s ponens is another’s tollens and admit again that I don’t have a knock-down argument against Wright’s view.

I now want to raise some objections to Wright’s notion of a reconceptualization and his use of the term “explanation.” Discussing Frege’s (more-or-less) analogous case of directions and parallelism, Wright says, “we have the option . . . of re-conceptualizing, as it were, the state of affairs which is described on the right. That state of affairs is initially given to us as the obtaining of a certain equivalence relation . . . ; but we have the option, by stipulating that the abstraction is to hold, of so reconceiving such states of affairs that they come to constitute the identity of a new kind of thing . . . of which, by this very stipulation, we introduce the concept.” Part of the problem with this suggestion is this: in HP, numbers belonging to concepts are themselves among the objects over which the first-order variables on the right-hand side range.
Talk of reconceptualizing a state of affairs would be in order only if the objects supposedly introduced by stipulation were new, objects that had not been previously quantified over. Whether old objects can be chosen to be identical or not under the right conditions would not seem to be a matter that it could be up to us to decide. It is here that the analogy between directions and lines and numbers and concepts breaks down: no one supposes that directions are any sort of constituent of lines, but on the Fregean treatment of number, numbers quite definitely are objects that both fall under concepts and are associated with concepts, as their numbers. However, when the objects allegedly introduced by this sort of stipulation are already objects quantified over in the equivalence relation, unexpected, and sometimes unwelcome, results can occur when we attempt to identify certain of them. We can’t, for example, stipulate that old objects be assigned to concepts in such a way that if some old object falls under one concept but not another then the two concepts are to be assigned different objects.
Wright says, “The concept of direction is thus so introduced that that two lines are parallel constitutes the identity of their direction. It is in no sense a further substantial claim that directions exist and are identical under the described circumstances. But nor is it the case that, by stipulating that the principle is to hold, we thereby forfeit the right to a face-value construal of its left-hand side and thereby to the type of existential generalization which a face-value construal would license.” All well and good for directions, maybe, but what if the objects introduced on the left are already among those discussed on the right? Could there not then be a danger that a “substantial further claim” about those very objects, taken together, would be entailed? And of course there is such a danger: the generalized biconditional, or the biconditional with its free variables, taken as an axiom, might then entail that, e.g., there are many, many objects, too many for it to be capable of being regarded any longer as analytic. One might think: but does that not automatically show that HP isn’t analytic? How can an analytic truth be false in certain domains, indeed false in all the finite domains? There is of course a reply that is ready to hand, viz. that it’s analytically false that the objects that exist constitute any one of those finite domains. The response strikes me as incredible, but again, I don’t have a knock-down argument against the analyticity of HP, only a bunch of considerations. (Heidegger would hardly have welcomed the response, “Because, analytically, there is always the number of things that there are; so there couldn’t have been nothing rather than something.”) One final remark on reconceptualization. How can one call the left-hand side of HP a reconceptualization of the right if it can’t always be made to hold whenever the right-hand side does? 
Of course if the variables range over a set, one can always pick some new objects to play the role of the numbers belonging to subsets of that set, but why is one so sure one can do this if there is no set of objects over which the variables range?

Wright’s idea that the role of HP is that of an explanation also worries me. In Frege’s Conception of Numbers as Objects, Wright writes: “the fundamental truths of number theory would be revealed as consequences of an explanation: [note the colon] a statement whose role is to fix the character of a certain concept.” 10 In the present paper, Wright calls HP “a principle whose role is to explain, if not exactly to define, the general notion of cardinal number.” Wright is impressed by the form of HP: a biconditional whose right limb is a formula defining an equivalence relation between concepts F and G and whose left limb is a formula stating when the cardinal numbers of F and G are the same. Since the sign for cardinal numbers does not occur in the right limb, can one not appropriately say that HP explains the concept of a cardinal number by saying what it is for two cardinal numbers, both referred to by expressions of the form “the number of . . . ” to be identical? 11 Certainly. HP states a necessary and sufficient condition for an identity #F = #G to hold. Moreover the formula defining this condition doesn’t contain #. So if one wants merely to sum up this state of affairs by saying that HP explains the concept of cardinal number, I would not object. However, it is hard to avoid the impression that more is meant, that Wright holds that to call a statement an explanation of a concept is to assign it an epistemological status importantly similar to the one it was once thought analytic judgments, including definitions, enjoy. It is to this further suggestion that I wish to demur. I can’t help suspecting that Wright is using “explanation,” at least in the phrase “explanation of a concept,” as a term of art, as a member of the same family circle as “analytic,” “definition,” or “conceptual truth,” that the only reason he does not call HP an “analytic definition” is that it is not of the form: Definiendum(x) ≡ Definiens(x), and that he supposes it to be a super-hard truth like “all bachelors are unmarried” or “all equivalence relations are transitive.” The phrase “whose role” occurs in both quotations and may suggest that Wright thinks that HP has one and only one [pre-eminent] role, for “whose” seems in both places to mean “of which the” rather than “of which a.” This thought seems to me to be incorrect. HP might be taken as an axiom, the sole (non-logical) axiom in some axiomatization of arithmetic. It might be a sentence we want to show to be needlessly strong for some purpose, e.g., deriving arithmetic. It might serve as something to be obtained from Basic Law V. It might be used as an example of a beautiful proposition. Etc. etc. But there’s no such thing as the [unique] role of HP.

10 Wright (1983), p. 153.
It is certainly true that one of the ways in which HP can be used is to fix the character of a certain concept. Here’s how: lay Hume down. Then the concept the number of . . . will have been fixed to be such that numbers belonging to concepts will be the same if and only if the objects falling under one of the concepts are in one–one correspondence with those falling under the other. But Hume is no different in this regard from any other statement that we might choose to take as an axiom. The axiom of choice fixes the concept of set in a similar manner. Laid down, it determines that for any set of disjoint nonempty sets, there is a set with exactly one member in common with each of those sets. The principle of mathematical induction fixes the character of the natural numbers. The statement that bananas are yellow fixes the character of the concept of a banana. So nothing is said when it is said that one of the roles of HP is to fix the character of the concept of cardinal number. And HP doesn’t have a unique role. 11 I am grateful to Wright and Richard Heck for helpful comments on the whole of this paper but am particularly grateful to them here.
Let me now defend myself about the “bad company” argument. What I think I was doing was illustrating that what is called (unfortunately, as Wright has stated) “contextual definition” is not, in general, a permissible way of introducing a concept. I didn’t mean to be arguing that it never was and gave the example of the principle governing truth-values as another example of a legitimate contextual definition. Different examples had different purposes. I cited Hodes’ splendid observation that the relation-number principle (the relation-number belonging to R is identical with that belonging to S if and only if R and S are isomorphic relations) leads to the Burali-Forti paradox in order to point out that Basic Law V was not an isolated case and that HP might well be expected to be powerful if consistent (as it is). I gave the example of parities in order to show that one couldn’t say that a contextual definition is OK if only it is consistent. (I had thought of nuisances, but I seemed to recall actually having heard of the “parity” of a set, and the notion is in any case a natural one.) The example of a principle true iff there are no more than two members was designed to show that one didn’t need heavy involvement with set theory to find a contextual definition incompatible with HP. And did I ever say that it would be impossible to demarcate the good contextual definitions from the bad? I merely said that it would seem to be a problem we have no hope of solving at present. I have to reserve judgment on the question whether Wright has solved the problem, but I certainly hope he has. Wright says I was wrong to say that there is no notion that V**, my revision of Basic Law V, is analytic of; what is true, he says, is “that there is no prior, no intuitively entrenched notion, no notion given independently, which V** is analytic of.” I happily accept the correction.

I now want to make a somewhat conciliatory remark.
I have been aspersing, at great length, the idea that HP is an analytic truth, all the while taking “analytic” to bear something like the sense it has in current philosophical discourse, namely, “truth in or by virtue of meanings.” I think that is the sense in which Wright uses the term too. But there may be another notion of analyticity on which the analyticity of HP might well be more plausible. It is the idea of Gödel’s, as outlined in both his paper “Russell’s mathematical logic” and his 1951 Gibbs lecture to the American Mathematical Society, 12 according to which a proposition is analytic if it is true “owing to the nature of the concepts occurring therein.” 13 Concepts, he says in the Gibbs lecture, “form an objective reality of their own, which we cannot create or change, but only perceive and describe.” By reflection, which of course includes philosophical or mathematical or other intellectual work, we can sometimes arrive at an understanding of the natures of certain concepts that is sufficient to enable us to see the truth of certain propositions in which they occur. With the passage of time, our understanding of those concepts may improve and the truth of ever more analytic propositions become evident to us. Perhaps, as Shoenfield has ironically suggested, the rejection of the “axiom” of constructibility is one example of improvement in our perception of the meaning (in Gödel’s sense) of “set” or of the nature of sets. The thought that understanding of abstract objects may be achieved through a sort of perception of them, which is crucial to Gödel’s conception of the analytic, will certainly strike many contemporary philosophers as unacceptably mystical and at any rate highly implausible. (Perhaps, paradoxically, there is even a tinge of materialism in the suggestion that our knowledge of abstract objects arises from “something like a perception” of them: could there not be ways in which we interact with abstracta that yield knowledge of them that are not at all like perception?) But if—IF—a Gödelian notion of analyticity could be made out, then HP might well be among the first candidates for this new sort of analytic truth. Perhaps by taking the thought in the right way, we can “see” that if nothing exists, then zero, at least, has to exist, for it is then the number of things there are, and therefore that something does exist after all, but then there have to exist two things, for . . . This Fregean argument may strike one, as it does me, as a good example of the kind of reflection Gödel might have thought showed that the proposition that there are infinitely many natural numbers is analytic, on his understanding of “analytic,” if not on that of most of us who use the word. Maybe in the end we would also thus “see” the truth of HP.

12 See also George Boolos, “Introductory Note to Kurt Gödel’s ‘Some Basic Theorems on the Foundations of Mathematics and their Implications’,” in Kurt Gödel, Collected Works, vol. III, Unpublished Essays and Lectures, Solomon Feferman et al., eds., Oxford, Oxford University Press (1995), pp. 290–304.
13 Gödel also describes propositions as analytic if they are true in virtue of the meanings of the terms expressing them, but it should be understood that his notion of meaning is much broader than that of “linguistic” meaning. For example, Gödel held that it is a matter of the meanings of “set” and “∈” that the axioms of set theory hold. The difference between the sense he attached to “meaning” and “concept” would not seem to be particularly significant.
But even on such a Gödelian view of the analytic, at least two difficulties would confront the view that HP is analytic. The first is that (it is not neurotic to think) we don’t know that second-order arithmetic, which is equi-consistent with Frege Arithmetic, is consistent. Do we really know that some hotshot Russell of the 23rd century won’t do for us what Russell did for Frege? The usual argument by which we think we can convince ourselves that analysis is consistent—“Consider the power set of the natural numbers . . . ”—is flagrantly circular. Moreover, although we may think Gentzen’s consistency proof for PA provides sufficient reason to think PA consistent, we have nothing like a similar proof for the whole of analysis, with full comprehension. We certainly don’t have a constructive consistency proof for ZF. And it would seem to be a genuine possibility that the discovery of an inconsistency in ZF might be refined into that of one for analysis. Saying exactly which theories are known to be consistent is a difficult problem made even more difficult when one hears of respected mathematicians telling of their failed attempts to prove Q inconsistent, but ZF and analysis, and therefore also Frege arithmetic, are theories that are surely in the black area, not the grey. While we may regret that these theories may well be consistent and that it
would probably be wise to bet on their consistency, we must not despair: we do not know that they are and need not yet give up hope that someone will one day prove in one of them that 0 = 1. Uncertain as we are whether Frege arithmetic is consistent, how can we (dare to) call HP analytic? One final worry, perhaps the most serious of all, although one that may at first appear to be dismissible or silly or trivial: as there is a number, zero, of things that are non-self-identical, so, on the account of number we have been considering, there must be a number of things that are self-identical, i.e., the number of all the things that there are. Wright has usefully dubbed this number, #[x: x = x], anti-zero. On the definition of ≤, according to which m ≤ n iff ∃F∃G(m = #F ∧ n = #G ∧ there is a one–one map of F into G), anti-zero would be a number greater than any other number. 14 Now the worry is this: is there such a number as anti-zero? According to Zermelo–Fraenkel set theory, there is no (cardinal) number that is the number of all the sets there are. The worry is that the theory of number we have been considering, Frege Arithmetic, is incompatible with Zermelo–Fraenkel set theory plus standard definitions, on the usual and natural readings of the non-logical expressions of both theories. To be sure, as Hodes once observed in conversation, if #α is taken to denote the cardinal number of α when α is a set and some favorite object that is not a cardinal number when α is a proper class, then HP will be a theorem of von Neumann set theory. But on that definition of #, # will not be translatable as “the cardinal number of.” ZF and Frege arithmetic make incompatible assertions concerning what cardinal numbers there are. And of course, the response “Well, these are just formalisms; the question of their truth or falsity doesn’t arise or makes no sense” is hardly available to one claiming that HP is analytic, i.e., an analytic truth.
So one who seriously believes that it is has to be bothered by the incompatibility of the consequence of Frege arithmetic that there is such a number as anti-zero with the claim made by ZF + standard definitions (on the natural reading of its primitives) that there is no such number. It is thus difficult to see how on any sense of the word “analytic,” the key axiom of a theory that we don’t know to be consistent and that contradicts our best-established theory of number (on the natural readings of its primitives) can be thought of as analytic.
14 By the Schröder–Bernstein theorem, which can be proved in second-order logic, ≤ is anti-symmetric: if m ≤ n ≤ m, then m = n.
IS HUME’S PRINCIPLE ANALYTIC? 1 Crispin Wright
1. It was George Boolos who, following Frege’s somewhat charitable lead at Grundlagen §63, first gave the name, “Hume’s Principle”, to the constitutive principle for identity of cardinal number: that the number of F’s is the same as the number of G’s just in case there exists a one-to-one correlation between the F’s and the G’s. The interest—if indeed any—of the question whether the principle is analytic is wholly consequential on what has come to be known as Frege’s Theorem: the proof, prefigured in Grundlagen §§82–3 and worked out in some detail in Wright [1983], 2 that second-order logic plus Hume’s Principle as sole additional axiom suffices for a derivation of second-order arithmetic—or, more cautiously, for the derivation of a theory which allows of interpretation as second-order arithmetic. (Actually I think the caution is unnecessary—more of that later.) Analyticity, whatever exactly it is, is presumably transmissible across logical consequence. So if second-order consequence is indeed a species of logical consequence, the analyticity of Hume’s Principle would ensure the analyticity of arithmetic—at least, provided it really is second-order arithmetic, and not just a theory which merely allows interpretation as such, which is a second-order consequence of Hume’s Principle. What significance that finding would have would then depend, of course, on the significance of the notion of analyticity itself. Later I shall suggest that the most important issues here are ones which are formulable without recourse to the notion of analyticity at all—so that much of the debate between Boolos and me could have finessed the title question. Boolos wrote that “having to discuss whether Hume’s Principle is analytic is rather like having to consider whether hydrogen sulphide is dephlogisticated” 3 —a question formulated, I suppose he meant, in a discredited theoretical vocabulary. That would be consistent, of course, with there being a good question nearby of which that was merely a theoretically unfortunate expression; it would also be consistent with there being enough sense to the theoretically unfortunate question to allow of a negative answer in any case. I myself do not believe that when the dust settles on analytical philosophy’s first century, our successors will find that the notion of analyticity was discredited by any of the well-known assaults. In particular, the two core lines of attack in “Two Dogmas of Empiricism”, namely, that the notion resists all non-circular explanation and that no statement participating in general empirical theory can be immune to revision, set an impossible—Socratic—standard for conceptual integrity and confuse analyticity with indefeasible certainty, respectively. What is undeniable, though, is that the status and provenance of analytic truths, and the cognate class of a priori necessary truths, would have to be a lot clearer than philosophers have so far managed to make them before a positive answer to our title question could be justified and shown to have the sort of significance which early analytical philosophy would have accorded to it. Boolos thought the situation was of the second kind noted: that the question is theoretically flawed but allows of well-motivated—though less than “knockdown”—arguments for a negative answer. To the best of my knowledge—I’m drawing just on three of his papers 4 which are reprinted in the excellent Demopoulos collection, 5 plus his ipsonymous paper in Richard Heck’s volume for Michael Dummett 6 —he proffered exactly five such arguments. In what follows I shall briefly explore how a character I shall call the neo-Fregean might respond to each of these arguments. Each is interesting, some are very searching, but—if I’m right—none does irreparable damage.

1 This paper first appeared in the Notre Dame Journal of Formal Logic 40, [1999], pp. 6–30. Reprinted by kind permission of the editor and the University of Notre Dame.
2 At pp. 158–169. An outline of a proof of the Peano Axioms from Hume’s Principle is also given in the Appendix to Boolos [1990]. The derivability of Frege’s Theorem is first explicitly asserted in Charles Parsons [1964]; see remark at p. 194. My own ‘rediscovery’ of the theorem was independent. I do not know what form of proof Parsons had in mind, but the reconstruction of the theorem is trickier than Frege’s own somewhat telegraphic sketch suggests. For an excellent recent overview of the ins and outs of the matter—early on, they remark that
§§82–3 offer severe interpretative difficulties. Reluctantly and hesitantly, we have come to the conclusion that Frege was at least somewhat confused in these two sections and that he cannot be said to have outlined, or even to have intended, any correct proof there (p. 407)
—see Boolos and Heck [1998].
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 17–43. © 2007 Springer.
2.
2.1 The ontological concern
The ontological concern is epitomized in the following passage:

I want to suggest that Hume’s Principle is to be likened to ‘the present King of France is a royal’ in that we have no analytic guarantee that for every value of ‘F’, there is an object that the open definite singular description, ‘the number belonging to F’ denotes . . . Our present difficulty is this: just how do we know, what kind of guarantee do we have, why should we believe, that there is a function that maps concepts to objects in the way that the denotation of octothorpe [that is: ‘#’, Boolos’s symbol for the numerical operator] does if HP is true? . . . do we have any analytic guarantee that there is a function that works in the appropriate manner? 7

3 Boolos [1997], p. 247.
4 Boolos [1987, 1990, 1986].
5 Demopoulos [1995].
6 Boolos [1997], Heck [1997].
The basic thought is that Hume’s Principle says too much to be an analytic truth. As normally conceived, analytic truth must hold in any possible domain. On a (purportedly) more relaxed conception, some analytic truths are allowed to hold in any non-empty domain. But how can a principle which entails—indeed, is strictly stronger than is necessary to entail—that there are infinitely many objects—indeed infinitely many objects of a special sort—possibly count as analytic?

Here is the neo-Fregean reply. There is, to be sure, a perfectly good sense in which whatever is entailed by certain principles together with truths of logic may be regarded as entailed by those principles alone. In this sense it is undeniable that Hume’s Principle does entail the existence of infinitely many objects—at least if second-order consequence is a species of entailment. But the manner of the entailment is important. Hume’s Principle is a second-order universally quantified biconditional. As such, we are not going to be able to elicit the existence of any objects at all out of it save by appropriate input into (instances of) its right-hand side. Thus we get the number zero by taking the instance of Hume’s Principle:

Nx: x ≠ x  =  Nx: x ≠ x  ↔  x ≠ x  1≈1  x ≠ x    (1)
together with its right-hand side as a minor premise. Compare the fashion in which we derive the direction of the line a from an instantiation of Frege’s illustrative equivalence for directions:

Da = Da ↔ a//a    (DE)
together with its right-hand side as a minor premise: the necessary truth, modulo the existence of line a, that that line is parallel to itself. Sure, in the case of zero the minor premise:

x ≠ x  1≈1  x ≠ x    (2)
can be established in second-order logic. So the existence of zero follows from this truth of logic, together with Hume’s Principle. If, accordingly, the latter can be regarded as, in all relevant respects, having a status akin to that of a definition, then the existence of zero is a consequence of logic and definitions. But that was exactly the classical account of analyticity: the analytical truths were to be those which follow from logic and definitions. So the existence of zero would be an analytic truth. And now with that in the bag, as it were, nothing stands in the way of regarding

x = 0  1≈1  x = 0    (3)

as also an analytic truth, since it follows in second-order logic given only that there is such a thing as zero. But that is the right-hand side for the application of Hume’s Principle which, following Frege, we use to obtain the number one. So its existence is also analytic. We may now proceed in similar fashion to obtain each of the finite cardinals from putatively analytic premises, in second-order logic. Our result is thus not quite—when done this way— that it is analytic that there is an infinity of finite cardinals, but rather that of each of the finite cardinals, it is analytic that it exists. Doubtless this will be equally offensive to the traditional understanding of analyticity—the (as nearly as possible) existentially neutral understanding of analyticity—called forth in the above quotation from Boolos. But my point now is simply that, for the reasons just sketched, that understanding of analyticity had to be in jeopardy all along provided there is a starting chance that Hume’s Principle has an epistemic status relevantly similar to that of a definition.

In sum: on the classical account of analyticity, the analytical truths are those which follow from logic and definitions. So if the existence of zero, one, and so on, follows from logic plus Hume’s Principle, then provided the latter has a status relevantly similar to that of a definition, it will be analytic, on the classical account, that n exists, for each finite cardinal n. The idea which standardly accompanies the classical conception, that—with perhaps a very few, modest exceptions—existential claims can never be analytically true, is thus potentially in tension with the classical conception. If Hume’s Principle has a status not relevantly different from that of a definition, then we learn that the classical conception will not marry with this standardly accompanying idea.

7 Boolos [1997], p. 251. See more generally pp. 248–254, ibid.; also Boolos [1987] at p. 231 and Boolos [1990] at pp. 246–8. (The latter references are to the pagination in Demopoulos [1995].)
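The step-by-step route to the finite cardinals described above (zero from the empty concept, one from the concept of being identical with zero, and so on) can be pictured with a small sketch. It is only an illustration under a loudly stated assumption: the hypothetical function `number_of` uses the size of a finite collection as a stand-in for the numerical operator, so the sketch models the pattern of the argument and does nothing to establish that the numbers exist.

```python
def number_of(concept):
    """Stand-in for the numerical operator N (assumption: size of a finite set)."""
    return len(concept)

def finite_cardinals(steps):
    """Obtain 0, 1, 2, ... in turn, each as the number of those obtained so far."""
    obtained = []
    for _ in range(steps):
        # The concept used at this step: "x is one of the numbers obtained so far".
        # At the first step nothing falls under it, as with "x is not self-identical".
        concept = frozenset(obtained)
        n = number_of(concept)
        # The new number differs from every earlier one, since this concept is
        # not equinumerous with any concept used at an earlier step.
        assert all(n != m for m in obtained)
        obtained.append(n)
    return obtained

if __name__ == "__main__":
    print(finite_cardinals(4))
```

Each pass through the loop corresponds to one application of an instance of Hume’s Principle with an already established right-hand side; nothing in the loop yields all the finite cardinals at once, which matches the remark that existence is obtained for each finite cardinal in turn.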
The core of the neo-Fregean stance is that Hume’s Principle does have such a status: that it may be seen as an explanation of the concept of cardinal number in general, covering the finite cardinals as a special case. Boolos asks, “If numbers are supposed to be identical if and only if the concepts they are numbers of are equinumerous, what guarantee have we that every concept has a number?” 8 Earlier he suggested, in the passage quoted, that there is no such guarantee—or anyway no “analytic guarantee”—proposing a parallel between the principle and the statement, “The present King of France is a royal”—something which is analytically true, modulo its existential presupposition. This is also Hartry Field’s position in his critical notice of Wright [1983]. 9 But I think this seemingly sane and reserved position is unstable. Consider the case of direction again. How do we know there are any objects which behave in the way that the referents of direction terms ought to behave, given their introduction by the direction equivalence (DE), that is, given that they are identical just in case the associated lines are parallel and distinct just in case they are not? Shouldn’t we just say that provided there are such things as directions in the first place, that will be the condition for their identity and distinctness? Well, if this were the right view of the matter, there could be no objection to making the presupposition explicit. The following principle would then count as absolutely analytic: that for any lines a and b,

\[ ((\exists x)(\exists y)(x = Da \;\&\; y = Db)) \rightarrow (Da = Db \leftrightarrow a \mathbin{/\!/} b) \tag{4} \]

8 Boolos [1997], p. 253.
9 Field [1984].
But think: how are we to understand the antecedent of this? The condition for its truth must now incorporate some unreconstructed idea of what it is for contexts of the form, ‘ p = Da’ and ‘q = Db’ to be true—unreconstructed because Field and Boolos have just rejected the proposed sufficient conditions for the truth of such contexts, where ‘ p’ and ‘q’ are, respectively, direction terms, incorporated in DE. However, no other such sufficient condition has been proposed. So, if we side with Field and Boolos, we don’t have the slightest idea, actually, of what satisfaction of the antecedent of the supposedly more modest and reserved formulation could consist in. True, the reserved formulation could be made to raise an intelligible issue if relativised to an antecedently given domain of quantification—the issue would be whether any of the objects thereby already recognised, perhaps certain equivalence classes, are appropriately identified and distinguished in the light of relations of parallelism among lines. But Frege, remember, was trying to address the question how we come by and justify the conception of a domain of abstracta in the first place. If it is insisted that abstraction principles always stand in need of justification by reference to an antecedently given domain of entities, that’s just to presuppose—not argue—that they are useless in that project. And it is so far to offer no alternative conception of how the project might be accomplished. The neo-Fregean contention, by contrast, is that, under the right conditions, such principles are available to fix the truth-conditions of contexts of identity for a certain kind of thing and thereby—given appropriate input on their right-hand sides—to contribute towards determining that, and how it is possible for us to know that things of that kind exist. 
Boolos’s question “If numbers are supposed to be identical if and only if the concepts they are numbers of are equinumerous, what guarantee have we that every concept has a number?”, raises a doubt in the way he presumably wished to do only if it is granted that the existence of numbers is a further fact, something which the (mere) equinumerosity of concepts may leave unresolved. But the neo-Fregean’s intention in laying down Hume’s Principle as an explanation is so to fix the concept of cardinal number that the equinumerosity of concepts F and G is itself to be necessary and sufficient, without further ado, for the identity of the number of Fs with the number of Gs, so that nothing more is required for the existence of those numbers beyond the equinumerosity of the concepts. This idea is discussed more fully in the early sections of
Wright [1997] and in Bob Hale’s [1997]. The key idea is that an instance of the left-hand side of an abstraction principle is meant to embody a reconceptualisation of the type of state of affairs depicted on the right. Here is not the place to pursue this crucial idea further. My point is merely that Boolos’s question either ignores this aspect of the neo-Fregean position or assumes it is ill-conceived.
2.2 The epistemological concern
A recurrent element in Boolos’s misgivings about Hume’s Principle concerns its proof-theoretic strength—more accurately, the strength of the system which results from its addition to axiomatic second-order logic. In part this concern relates to the ontological issues just reviewed. But there is a separate strand, nicely captured by a passage towards the end of “Is Hume’s Principle Analytic?” Boolos was the first to show that second-order logic plus Hume’s Principle is equi-interpretable with second-order arithmetic, and hence that each is consistent if the other is. 10 But he was not himself inclined to take that result as settling the question of the consistency of Hume’s Principle. He writes:

. . . (it is not neurotic to think) we don’t know that second-order arithmetic . . . is consistent. Do we really know that some hotshot Russell of the 23rd Century won’t do for us what Russell did for Frege? The usual argument by which we think we can convince ourselves that analysis is consistent—“Consider the power set of the set of natural numbers . . . ”—is flagrantly circular . . . Uncertain as we are whether Frege arithmetic is consistent, how can we (dare to) call HP analytic? 11
Now, I do not myself know whether disclaiming knowledge of the consistency of Frege arithmetic is neurotic or not. But we must surely look askance at the presupposition of the concluding question, which arguably—as did Quine—confuses analyticity and certainty, or anyway insists that certainty is a precondition for warranted analyticity claims. That seems to me a great mistake. There is nothing incoherent in the idea that we can be defeasibly justified in believing or claiming to know that a proposition is true which, if true, is analytic. The neo-Fregean claim, remember, is that Hume’s Principle serves as an explanation of the concept of cardinal number. If it harbours some subtle inconsistency, then of course it fails as such an explanation—just as Basic Law V failed as an explanation of a coherent notion of set. But we can surely be fairly confident—though by all means with our eyes open—that Hume’s Principle is successful in that regard, and correspondingly confident that it enjoys the kind of truth possessed by any successful implicit definition—and hence is analytic in whatever may be the attendant sense.

10 Boolos [1987]. For a detailed proof, see the first appendix to Boolos and Heck [1998].
11 Boolos [1997], pp. 259–60.
2.3 The concern about the universal number
The construction of the finite cardinals on the basis of Hume’s Principle relies entirely on the legitimacy of applying the numerical operator to some necessarily empty concept at the first stage, the concept not self-identical being the standard choice. On the face of it there should accordingly be no obstacle to applying the operator to the complement of any such concept, so arriving at the universal number, anti-zero—the number of absolutely everything that there is. Certainly Hume’s Principle as standardly formulated poses no obstacle to such an application. As Boolos puts it,

As there is a number, zero, of things that are non-self-identical, so, on the account of number we have been considering, there must be a number of things that are self-identical, i.e., the number of all the things that there are. 12

Now, Hume’s Principle can be no less dubious than any of its consequences, one of which, then, is the claim that there is such a number. But

. . . the worry is this: is there such a number as anti-zero? According to [ZF] there is no cardinal number that is the number of all the sets there are. The worry is that the theory of number we have been considering, Frege arithmetic, is incompatible with Zermelo–Fraenkel set theory plus standard definitions . . . one who seriously believes that [HP is an analytic truth] has to be bothered by the incompatibility of the consequence of Frege arithmetic that there is such a number as anti-zero with the claim made by ZF plus standard definitions . . . that there is no such number. 13
12 Boolos [ibid.], p. 260.
13 Ibid.

This objection, Boolos wrote, although it “may at first appear to be dismissible as silly or trivial”, is “perhaps the most serious of all”. It’s certainly an arresting objection, about which there is a good deal to say. Clearly there would be great discomfort in regarding any principle as analytically true if the cost of doing so was regarding Zermelo–Fraenkel set theory as analytically false. A first rejoinder would be that any such upshot would depend on cross-identification of the referents of terms in Frege arithmetic and terms in Zermelo–Fraenkel set theory—the “standard definitions” to which Boolos alludes. Who said numbers like anti-zero had to be sets, after all? However the more general worry underlying Boolos’s point—the worry about the coherence of Hume’s Principle with standard set theory—need not depend on such cross-identification. Grant the plausible principle (to which I shall return below) that there is a determinate number of F’s just provided that the F’s compose a set. Zermelo–Fraenkel set theory implies that there is no set of all sets. So it would follow that there is no number of sets. Yet for all we have so far seen, the property, set, lies within the range of the second-order quantifiers in Hume’s Principle and the usual proof, via the reflexivity of equinumerosity, should therefore serve to establish, to the contrary, that there
is such a number. So there would seem to be a collision with Zermelo–Fraenkel set theory in any case, whether or not anti-zero is identified with a set.

However I think there is good reason to expect a principled and satisfying response to this general trend of objection. Consider the direction equivalence, DE, again. The reflexivity of the relation, . . . is parallel to . . . , ensures in the presence of DE that a has a direction, no matter what straight line a may be. But the question arises: what of the implications of DE for the case where a and b fail to be parallel because they are not even lines, as for example my hat fails to be parallel to my shoe? We might have been tempted to allow that the D-operator is totally defined—to allow that every object, without restriction, has a direction: in the case of an object which fails to be parallel to anything else because it is merely not a line, this would then be a direction that nothing else has. But a moment’s reflection shows that is not an option: if the failure of parallelism between my hat and my shoe is down to the unsuitability of either object to be parallel to anything, then by the same token they are not self-parallel, and DE provides no incentive to regard either as having a direction at all. Moral: just as not every object is suitable to determine a direction, so we should not assume without further ado that every concept—every entity an expression for which is an admissible substituent for the bound occurrences of the predicate letters in Hume’s Principle—is such as to determine a number.

That’s only a first step, of course. What is wanted for the exorcism of anti-zero is nothing less than grounds for affirming that whereas the concept, not self-identical, or any other self-contradictory concept, is a suitable case for application of the numerical operator, its complement is not. Here are two, independent such lines of thought: The first line is directed specifically at anti-zero.
To accept Frege’s insight that statements of number are higher-level—that they state things of concepts—is quite consistent with the familiar observation that a restriction is needed which he does not draw. The basic case in which the question, how many F’s are there? makes sense—or at least has a determinate answer— is that of a special class of substitutions for ‘F’: what are sometimes called ‘count nouns’, or expressions for ‘sortal concepts’. While it is by no means the work of a moment to make this notion sharp, the usual intuitive understanding is that a sortal concept is one associated both with a criterion of application— a distinction between the things to which it applies and those to which it does not —and a criterion of identity: some principle determining the truth values of contexts of the form, ‘X is the same F as Y ’. ‘Tree’, ‘person’, ‘city’, ‘river’, ‘number’, ‘set’, ‘time’, ‘place’, are all, in at least certain uses, sortal concepts in the intended sense. By contrast, ‘red’, ‘composed of gold’, ‘large’—in general, purely qualitative predicates, predicates of constitution, and attributive adjectives—although syntactically admissible substituents for occurrences of the predicate letters in higher-order logic, are not. Call the latter class of expressions: mere predicables. Where F is a mere predicable, then, the suggestion is that the question, how many F’s are there? is ceteris paribus
deficient in sense and ‘the number of F’s’, accordingly, has no determinate reference. It is easy to see that ‘is self-identical’ is a mere predicable. For reflect that—prescinding from any cases of vagueness—mere predicables do nevertheless subserve determinate questions of cardinal number when their scope is restricted to that of some specific sortal concept: thus there can be a determinate number of red apples in the bowl, of gold rings in the jeweller’s window, and of large women at the reception. So if ‘self-identical’ were a sortal concept, it should follow that there can be determinate numbers of red self-identicals in the bowl, golden self-identicals in the jeweller’s window, and large self-identicals at the reception. However since ‘F and self-identical’ is equivalent to ‘F’, it follows that there can be no such determinate number wherever there is no determinate number of Fs. So self-identity is not a sortal concept. If we take it that, save where F is assured an empty extension on purely logical grounds, 14 only sortal concepts, and concepts formed by restricting a mere predicable to a sortal concept, have cardinal numbers, it follows that there is no universal number.

To be sure, this first consideration will of course not engage the question whether we may properly conceive of a number of all ordinals, or all cardinals, or all sets—in general, cases where we are concerned with the results of applying the numerical operator to concepts which are (presumably) sortal but “dangerously” big. And as we saw, a variant of Boolos’s objection, that there is a potential clash of Hume’s Principle with Zermelo–Fraenkel set theory, does equally arise in those cases. However, a principled objection to the idea that there should be determinate numbers associated with these concepts may be expected to issue from the second line of thought, which concerns the tantalising notion of indefinite extensibility.
14 A plausible general principle (suggested to me by Bob Hale) of which this exception would be a special case would be this: that a non-sortal concept, F, may nevertheless have a determinate cardinal number if every sortal restriction of it has the same cardinal number. This would not, of course, legitimate anti-zero, since the cardinality of sortal restrictions of the form, Gx & x = x, will vary with that of G. But it would save the standard Fregean definition of zero. (Would there be any instances of this principle other than those mere predicables which are necessarily uninstantiated?) More generally, we might—indeed, ought to—allow that a non-sortal F may determine a number if we know that all and only F-things are G, where G is sortal and non-indefinitely extensible. (But again, are there any such cases?)

As noted a little while ago, it seems natural and well motivated to suppose that the Fs should have a determinate cardinal number just when they compose a set. But a long tradition in foundational studies would argue that set-hood cannot be the right way to conceive of Frege’s intentionally all-inclusive domain of objects: that Cantor’s paradox shows, in effect, that there can be no universal set—no absolutely all-embracing totality which is subject, for example, to the operations and principles that provide for the proof of Cantor’s theorem. That is not the same as saying that unrestricted first-order quantification is illegitimate—a concession which would, of course, be fatal to Frege’s
whole project. The point is rather that the objects that lie in the range of such unrestricted quantification compose not a determinate totality but one that is, in the phrase coined by Michael Dummett, “indefinitely extensible” 15 —a totality of such a sort that any attempt to view it as a determinate collection of objects will merely subserve the specification of new objects which ought, intuitively, to lie within the totality but cannot, on pain of contradiction, be supposed to do so. I do not know how best to sharpen this idea, still less how its best account might show that Dummett is right, both to suggest that the proof-theory of quantification over indefinitely extensible totalities should be uniformly intuitionistic and that the fundamental classical mathematical domains, like those of the natural numbers, or the reals, should also be regarded as indefinitely extensible. But Dummett could be wrong about both those points and still be emphasising an important insight concerning certain very large totalities— ordinal number, cardinal number, set, and indeed “absolutely everything”. If there is anything at all in the notion of an indefinitely extensible totality—and there are signs that the issue is now being taken up in productive ways 16 — one principled restriction on Hume’s Principle will surely be that F and G not be associated with such totalities. So that is a second definite programme for understanding how, in particular, not self-identical might determine a cardinal number even though self-identical does not. Indeed, when the range of both individual and higher-order variables is unrestricted, the complement of any determinate finite concept is presumably always an indefinitely extensible totality.
2.4 The concern about surplus content
This is the objection I find it hardest to be sure I properly understand. Here is one of Boolos’s expressions of it:

It is known that Hume’s Principle does not follow . . . from the conjunction of two of its strong consequences: . . . that nothing precedes zero and that precedes is a one–one relation. If HP is analytic, then it is strictly stronger . . . than some of its strong consequences. It’s also known that arithmetic follows from these two statements alone . . . faced with these results, how can we really want to call HP analytic? 17
15 Dummett first introduced this notion—which of course ultimately derives from one strand in Russell’s Vicious Circle Principle—in his [1963] (reprinted in M. Dummett, Truth and Other Enigmas, London: Duckworth 1978, pp. 186–201). It is central to the argument of the concluding chapter of Dummett [1991]. See also his “What is Mathematics About?” in Dummett [1993] at pp. 429–45.
16 See for instance Clark [1998], Oliver [1998], and Shapiro [1998].
17 Boolos [1997], p. 249.

The objection is developed and endorsed by Richard Heck in recent work, 18 and I shall rely on his interpretation of it. Heck emphasises that there is a long conceptual leap involved in advancing to the concept of cardinal number enshrined in Hume’s Principle in full generality for one whose previous acquaintance with cardinal number—a pre-Cantorian as it were—is restricted to finite arithmetic and its applications. The length of the leap is reflected in the results about the proof-theoretic strength of various systems, including Fregean arithmetic—i.e., Hume’s Principle plus second-order logic—second-order Peano arithmetic and certain intermediaries which, building on work of Boolos’s, Heck demonstrates. 19 Here is his conclusion:

. . . HP, conceptual truth or not, cannot be what underlies our knowledge of arithmetic. And no amount of reflection on the nature of arithmetical thought could ever convince one of HP, nor even of the coherence of the concept of cardinality of which it is purportedly analytic. Granted, any rationalist project of this sort will have to invoke a distinction between the ‘order of discovery’ and the ‘order of justification’. But the objection is not that Hume’s Principle is not known by ordinary speakers, nor that there was a time when the truths of arithmetic were known, but HP was not. It is that, even if HP is thought of as ‘defining’ or ‘introducing’ or ‘explaining’ our present concept of cardinality, the conceptual resources required if one is so much as to recognise the coherence of this concept (let alone HP’s truth) vastly outstrip the conceptual resources employed in arithmetical reasoning. Wright’s version of logicism is therefore untenable. 20

18 Heck [1997a].
19 See Heck [1997a], Section 4.
20 Heck [1997a], pp. 597–8.

Heck goes on to consider whether some form of Hume’s Principle restricted to finite concepts might be resistant to the particular objection, that is, whether such a principle might be appreciable as a correct digest of its constitutive principles by one possessed just of the conceptual resources deployed in finite cardinal arithmetic and its applications. That is an interesting question, on which he offers interesting formal and informal reflection. But I have a prior difficulty in seeing that the original objection, concerning the conceptual excess of Hume’s Principle over second-order Peano arithmetic, does any serious damage to any contention that the neo-Fregean should want to make. Grant that a recognition of the truth of Hume’s Principle cannot be based purely on analytical reflection upon the concepts and principles employed in finite arithmetic. The question, however, surely concerned the reverse direction of things: it was whether access to those concepts and validation of those principles could be achieved via Hume’s Principle, and whether Hume’s Principle might in its own right enjoy a kind of conceptual status that would make that result interesting. The latter is, in effect, exactly the question raised by our title. But no particular view of it can be motivated merely by the reflection that the conceptual resources involved in Hume’s Principle, insofar as an extension of the notion of cardinal number to the infinite case is involved, considerably exceed those involved in ordinary arithmetical competence. More: it is unclear how anyone wishing to demonstrate the analyticity of arithmetic could clear-headedly acquiesce in the rules of debate implicit
in Heck’s discussion. Those rules require that one canvasses some principle which is supposedly analytic of ordinary arithmetical concepts in the precise sense that it could be recognised by reflection as systematising those ordinary concepts and their proof theory. But, of course, an axiom could, in that sense, be analytic of a thoroughly synthetic theory, and itself as synthetic as that theory. (There might be a single such axiom which could be reflectively recognised as systematising exactly Euclidean geometry.) To be sure, it is a necessary condition of the success of the neo-Fregean project that the relevant principle does more than generate a theory within which arithmetic can be interpreted—there has to be a tighter conceptual relationship than that. But it is no necessary condition for the satisfaction of this necessary condition that there be no conceptual surplus of the axiom over the theory. And it is no sufficient condition of the analyticity of such an axiom that there be none; for again, a reflectively correct digest of a synthetic theory will be itself synthetic.
2.5 The concern about bad company
Boolos’s final objection is perhaps the most interesting and challenging of all. It begins with the excellent observation that there are close analogues of Hume’s Principle, specifically, principles taking the form of second-order abstractions, linking the obtaining of a (second-order logically definable) equivalence relation on concepts to the identity condition for certain associated objects, which are self-consistent (that is, the systems consisting of second-order logic plus one of these principles are, arguably, consistent) yet which are inconsistent with Hume’s Principle. A nice example is what I have elsewhere called the Nuisance Principle (NP). The nuisance associated with the concept F is the same as the nuisance associated with the concept G just in case the symmetric difference between F and G—the range of things which are either F or G but not both—is finite. Straightforward set-theoretic reasoning leads to the conclusion that any universe in which NP is satisfied must be a finite one. 21 But it is, apparently, a self-consistent principle—it does have finite models. If Hume’s Principle is analytic, then NP is analytically false. But with what right could we make that claim—isn’t the analogy between the two principles near enough perfect?

21 For details see Wright [1997] at pp. 221–5.

This challenge—there dubbed the ‘Bad Company’ objection—is treated in some detail in Wright [1997] on which Boolos’s “Is Hume’s Principle Analytic?” was commentary. My suggestion in that paper was that the first step to disarming it is a deployment of (something very close to) Hartry Field’s notion of conservativeness. A principle, or set of principles, is conservative with respect to a given theory when, roughly, its addition to that theory results
in no new theorems about the old ontology. 22 One could hope that Hume’s Principle will be conservative with respect to any theory for which second-order Peano arithmetic is conservative (that is, one would hope, any theory whatever). By contrast, the consistent augmentation of any theory, T, by NP will result in a theory of which it is a consequence that all categories of the original ontology of T are at most finitely instantiated. No pure definition could permissibly have that effect. So no merely conceptual-explanatory principle—no principle whose role, as that of abstractions is supposed to be, is merely to fix the truth-conditions of a range of contexts featuring a new kind of singular-term forming operator and is otherwise to be as close as possible to that of a pure definition—can permissibly have it either. Since it has consequences for the size of extensions of concepts which are quite unrelated to that which it purportedly serves to introduce, NP thus cannot be viewed as such a conceptual-explanatory principle. Moreover, any abstraction principle which clashes with Hume’s Principle by requiring the finitude of any domain in which it is to hold will be in like case. And indeed any abstraction principle which places an upper bound, finite or infinite, on the size of the universe will be non-conservative with respect to some consistent theory of things other than the abstracts it concerns. The particular analogy is therefore broken: Hume’s Principle, there is undefeated reason to hope, is conservative with respect to every consistent theory concerning things other than its own special ontology—the cardinal numbers. (That is, note, a kind of weak analyticity: if there were a possible world in which Hume’s Principle failed, it would have to be by dint of its misrepresentation of the nature of the cardinals in that world.) NP and its kin, by contrast, come short by this constraint.
22 A tidied version of the characterisation offered in Wright [1997] (at note 49, p. 232) would be as follows. Let

\[ (\forall \alpha_i)(\forall \alpha_j)\,\bigl(\Sigma(\alpha_i) = \Sigma(\alpha_j) \leftrightarrow \alpha_i \approx \alpha_j\bigr) \tag{$\Sigma$} \]

be any abstraction. Introduce a predicate, Sx, to be true of exactly the referents of the Σ-terms and no other objects. Define the Σ-restriction of a sentence T to be the result of restricting the range of each objectual quantifier in T to non-S items—thus each sub-formula of T of the form (∀x)Ax is replaced by one of the form (∀x)(¬Sx → Ax) and each sub-formula of the form (∃x)Ax is replaced by one of the form (∃x)(¬Sx & Ax). The Σ-restriction of a theory θ is correspondingly the theory containing just the Σ-restrictions of the theses of θ. Let θ be any theory with which Σ is consistent. Then Σ is conservative with respect to θ just in case, for any T expressible in the language of θ, the theory consisting of the union of (Σ) with the Σ-restriction of θ entails the Σ-restriction of T only if θ entails T. The requirement on acceptable abstractions is, then, that they be conservative with respect to any theory with which they are consistent. (The tidying referred to, for which I am indebted to Alan Weir, consists in having the reference to the Σ-restriction of θ, rather than as originally one simply to θ, in the clause for ‘conservative with respect to θ’.)

An abstraction is acceptable only if it is conservative with respect to every consistent theory whose ontology does not include its proper abstracts. It is a logical abstraction just in case its abstractive relation is definable in higher-order logic. The company kept by Hume’s Principle is thus, we may presume, that of conservative, logical abstractions. But are these all Good Companions?
Recent critics of neo-Fregeanism have observed that they are not, so that the fifth concern extends beyond the point that Boolos himself took it to. I pursue the matter in the first Appendix.
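The finite side of the contrast between Hume’s Principle and NP can be checked by brute force. The sketch below is illustrative only and rests on a simplifying assumption not argued for in the text: that in a standard model over a domain of n objects an abstraction principle can hold only if its abstracts, one per equivalence class of concepts, can all be found among those n objects. The function names are hypothetical. The code does not reproduce the set-theoretic reasoning by which NP excludes infinite universes; it shows only why NP has finite models while Hume’s Principle has none.

```python
from itertools import combinations


def concepts(n):
    """Every concept extension (subset) over the domain {0, ..., n-1}."""
    return [frozenset(c) for k in range(n + 1) for c in combinations(range(n), k)]


def count_abstracts(n, equivalent):
    """Count the equivalence classes of concepts under `equivalent`."""
    representatives = []
    for concept in concepts(n):
        if not any(equivalent(concept, rep) for rep in representatives):
            representatives.append(concept)
    return len(representatives)


def equinumerous(f, g):
    # Abstraction relation of Hume's Principle, for finite extensions.
    return len(f) == len(g)


def finite_symmetric_difference(f, g):
    # Abstraction relation of NP: in a finite domain every symmetric
    # difference is finite, so any two concepts are related.
    return True


def has_model_of_size(n, equivalent):
    """Assumed criterion: the n objects must be able to supply every abstract."""
    return count_abstracts(n, equivalent) <= n


for n in range(1, 6):
    print(n, count_abstracts(n, equinumerous),
          count_abstracts(n, finite_symmetric_difference))
```

Over n objects equinumerosity sorts the concepts into n + 1 classes (sizes 0 through n), one more than the domain can supply, whereas NP’s relation yields a single class, so a single nuisance suffices.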
3. It should now be apparent why I suggested earlier that my debate with Boolos could as well have proceeded, near enough, without recourse to the notion of analyticity. The point is simply that each of Boolos’s objections is, in effect, independent of the problematical aspects of that notion: what was really bothering him was not whether Hume’s Principle is analytic, but whether it is true, and whether and how we might be warranted in regarding it as being so. Thus, without any really significant loss, the five points of concern might be formulated as:

1. With what right do we regard ourselves as warranted in accepting a principle with such rich ontological implications—how do we know that there is any function which behaves as the referent of octothorpe must?
2. What warrant do we have for confidence that the strong theory—Fregean arithmetic—to which Hume’s Principle gives rise is a consistent theory?
3. Is not its inconsistency with Zermelo–Fraenkel set theory (plus standard definitions) a strong ground for doubting the truth of Hume’s Principle?
4. What warrant is there for accepting a principle which is supposed to provide a foundation for arithmetic yet has so much surplus content over arithmetic?
5. With what right do we accept a principle which seems to be on all fours with other consistent principles which are inconsistent with it?
These are all good concerns, and I hope I have indicated, point-by-point, something of the direction in which the neo-Fregean should try to launch respective responses to them. The crucial point remains that the notion of analyticity is not required to formulate the concerns. What is really at stake, rather, is the nature of our entitlement to Hume’s Principle. A worked-out account of the notion of analyticity, in all its varieties, might well provide an answer to the question. But the answer the neo-Fregean wants to give is not hostage to the provision of such an account. Let me rapidly recapitulate that answer. The neo-Fregean thesis about arithmetic is that a knowledge of its fundamental laws (essentially, the Dedekind-Peano axioms)—and hence of the existence of a range of objects which satisfy them—may be based a priori on Hume’s Principle as an explanation of the concept of cardinal number in general, and finite cardinal number in particular. More specifically, the thesis involves four ingredient claims: 23 (i) that the vocabulary of higher-order logic plus the cardinality-operator, octothorpe or ‘Nx: . . . x . . . ’, provides a sufficient definitional basis for a statement of the basic laws of arithmetic; 23 I here rely again on formulations given in Wright [1998a].
(ii) that when they are so stated, Hume’s Principle provides for a derivation of those laws within higher-order logic; (iii) that someone who understood a higher-order language to which the cardinality operator was to be added would learn, on being told that Hume’s Principle governs the meaning of that operator, all that it is necessary to know in order to construe any of the new statements that would then be formulable; (iv) finally and crucially, that Hume’s Principle may be laid down without significant epistemological obligation: that it may simply be stipulated as an explanation of the meaning of statements of numerical identity, and that—beyond the issue of the satisfaction of the truth-conditions it thereby lays down for such statements—no competent demand arises for an independent assurance that there are objects whose conditions of identity are as it stipulates.
The first and third of these claims concern the epistemology of the meaning of arithmetical statements, while the second and fourth concern the recognition of their truth. With which of them would Boolos disagree? Even with a qualification I will come to in a minute, I think he had no quarrel with the first; nor, of course, with the second, which is just the point proved by Frege’s Theorem. And to accept just these two claims, of course, is already to acknowledge a substantial Fregean achievement: the analytical reduction of the primitive vocabulary of arithmetic to a base that contains just one nonlogical expression, the cardinality operator; and a demonstration that, on that basis, the fundamental laws of arithmetic can be reduced to just one: Hume’s Principle itself. The qualification concerning the first claim concerns the interpretation of the phrase “sufficient definitional basis”. No question of course but that Frege shows how to define expressions which comport themselves like those for successor, zero, and the predicate ‘natural number’, thus enabling the formulation of a theory which allows of interpretation as Peano arithmetic. But—as we remarked right at the start—it is one thing to define expressions which, at least in pure arithmetical contexts, behave as though they express those various notions, another to define those notions themselves. And it is the latter point, of course, that is wanted if Hume’s Principle is to be recognised as sufficient for a theory which not merely allows of pure arithmetical interpretation but to all intents and purposes is pure arithmetic. How is the stronger point to be made good? Well, I imagine it will be granted that to define the distinctively arithmetical concepts is so to define a range of expressions that the use thereby laid down for those expressions is indistinguishable from that of expressions which do indeed express those concepts. 
The interpretability of Peano arithmetic within Fregean arithmetic ensures that this has already been accomplished as far as all pure arithmetical uses are concerned. So any doubt on the point has to concern whether the definitions of the arithmetical primitives which Frege
offers, based on Hume’s Principle and logical notions, are adequate to the ordinary applications of arithmetic. Did Frege succeed in showing how the concepts of arithmetic, as understood both in their pure and applied uses, can be understood simply on the basis of second-order logic and the numerical operator, as constrained by Hume’s Principle, or could someone fully understand the entirety of the construction without having the slightest inkling of the ordinary meaning of arithmetical claims? The matter needs more detail than I will offer here, but I think it’s clear that Frege did succeed in the more ambitious task, and a crucial first step in seeing that he did so is to realise that Hume’s Principle provides for the proof of a very important principle, dubbed $N^q$ by Bob Hale, to the effect that for each numeral, ‘$n_f$’, defined in Frege’s way, we can establish that
$n_f = Nx\colon Fx \;\leftrightarrow\;$ there are exactly n Fs
where the second occurrence of ‘n’ is schematic for the occurrence of an arabic numeral as ordinarily understood. 24 It follows that each Fregean numeral has exactly the meaning in application which it ought to have. That seems to me sufficient to ensure that Hume’s Principle itself enforces the interpretation of Fregean arithmetic as genuine arithmetic, and not merely a theory which can be interpreted as such. If this is right, then the key philosophical issues must concern the third and fourth claims. The importance of the third claim derives from the consideration that Hume’s Principle is not, properly speaking, an eliminative definition: it allows the construction of uses of the numerical operator which it does not in turn provide the resources eliminatively to define. Its claim to serve as an explanatory basis for arithmetic must therefore depend on its ability somehow to explain such uses in a non-strictly definitional fashion. 
Arguing the point requires stratifying occurrences of the numerical operator in sentences of Fregean arithmetic according to the degree of complexity of the embedding context, and making a quasi-inductive case: first, that a certain range of basic uses are unproblematic, and second, that at every subsequent stage, the type of occurrence distinctive of that stage may be understood on the basis of an understanding of the mode of occurrence exemplified at the immediately preceding stage. There are some complications with this; I’ve tried to work through the point in some detail elsewhere, 25 and will not repeat the detail here. For what it’s worth, it is Michael Dummett, rather than Boolos, who has been the most vociferous opponent of the third neo-Fregean claim.
24 I reproduce in the second Appendix the proof of this claim given at pp. 366–8 of Wright [1998].
25 See Section V of Wright [1998] and, for a supplementary consideration in response to an objection of Dummett’s, Section VI of Wright [1998a].
It is the fourth claim (the claim that Hume’s Principle can be laid down as an explanatory stipulation, without further epistemological obligation) which seems to me to be the heart of the issue. Boolos was indeed uncomfortable with this claim, suspecting that more had been smuggled into the notion of explanation in this setting than was consistent with the seeming modesty of the explanatory thesis. But I do not feel that I have understood his reservations terribly well. If nominalism is a misconception (if it is possible to know of abstract entities and their properties at all) then it has to be because we have so fixed the use of statements involving reference to and quantification over such entities as to bring the obtaining of their truth-conditions somehow within our powers of recognition. And whatever this fixing consisted in, it has to have been something we did by way of determination of meaning, and it should therefore have involved no epistemological obligations which are not involved in the construction of concepts and the determination of meanings generally. I really do not see why the fashion in which Hume’s Principle (if it indeed succeeds in doing so) determines the truth-conditions of statements which configure the cardinality operator with second-order logical concepts, should be epistemologically any more problematical than any definition or other form of stipulation whose effect is to fix the truth-conditions of statements containing a targeted (type of) term. It is of course (always) another question whether those truth-conditions are satisfied: something which a definition, without supplementary considerations, is powerless to determine. But a good abstraction principle always determines very explicitly what those supplementary considerations are to be: you have only to look at its right-hand side. If there are good reservations about this way of looking at Hume’s Principle, I do not think that they have yet been compellingly formulated. 
Whatever the ultimate assessment of that issue may prove to be, it is my hope that the foregoing overview of Boolos’s misgivings about the analyticity of Hume’s Principle may serve as a reminder of two things: first (we owe it to Frege to recognise) that there is still an unresolved debate to be had about the viability of something that is, in all essential respects, a Fregean philosophy of arithmetic and real and functional analysis; 26 second, that the progress made in the modern debate is owing in very considerable measure to George’s brilliant and unique articles on the issues.
26 This is a point that Boolos enthusiastically accepted:
. . . I want to endorse Wright’s . . . suggestion that the problems and possibilities of a Fregean foundation for mathematics remain [wide?] open and [his] remark . . . that ‘the more extensive epistemological programme which Frege hoped to accomplish in Grundgesetze is still a going concern’. (Boolos [1997], p. 246).
For interesting preliminary steps towards the extension of the neo-Fregean programme to the classical theory of the reals, see Bob Hale [2000].
Appendix A. Conservativeness and modesty
In their [2000], Shapiro and Weir observe that there are pairs of abstractions which result by various kinds of selection for φ in (D)
(∀F)(∀G)( F = G ↔ ((φF & φG) ∨ (∀x)(Fx ↔ Gx))) 27
which are jointly unsatisfiable yet which are presumably conservative in the germane sense. For instance, take φ respectively as ‘is the size of the universe and some limit inaccessible’ and ‘is the size of the universe and some successor inaccessible’. (The Neo-Fregean should resist any tendency to impatience at the rarefied character of the example. These notions are definable in higher-order logic.) Any instance of schema (D) entails that some F is φ. So the two indicated abstractions respectively entail that the universe is limit-inaccessible sized and that it is successor-inaccessible sized. It cannot be both. Yet neither implication places any overall bound on the size of the universe, so these abstractions do not involve the kind of non-conservativeness which NP entrained. Still, they cannot both be in good standing. And if either is not, then it seems that neither should be. But by what (well-motivated) principle might they be excluded? What virtue does Hume’s principle have which they lack? What is intuitively salient about any D-schematic abstraction (henceforward “Distraction” 28) is that, the entailment notwithstanding, it provides no motive to believe that there is a concept which falls under its particular selection for ‘φ’: the result is obtained merely by exploitation of the embedded antinomy. For on the assumption of (∀F)¬(φF) any Distraction entails Basic Law V:
(∀F)(∀G)( F = G ↔ (∀x)(Fx ↔ Gx))
and thereby Russell’s Paradox. Such abstractions thus have no more bearing on the truth of the relevant ‘(∃F)φF’ than instances of the following schema have:
(∀F)(F is φ-terological ↔ (F does not apply to itself or φF))
which likewise, on the assumption of (∀F)¬φF, entail the well-known Heterological paradox:
(∀F)(F is heterological ↔ F does not apply to itself).
27 This is schema (D) discussed in some detail in Wright [1997]; see pages 216 and following.
28 Alan Weir’s puckish term.
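The claim that any instance of schema (D) entails that some F is φ can be illustrated on finite domains by a simple counting argument. The Python sketch below is an illustration only and not part of the original argument. It assumes standard semantics, with concepts modelled as subsets of a finite domain and abstracts drawn from that same domain; all function names and the two sample selections for φ are hypothetical. Under those assumptions an instance of (D) is satisfiable on a domain just if the abstracts it demands (one shared by all φ-concepts, plus one for each non-φ concept) do not outnumber the objects.

```python
from itertools import combinations


def concepts(domain):
    """Return every concept on the domain, modelled as a frozenset of objects."""
    items = list(domain)
    return [
        frozenset(chosen)
        for size in range(len(items) + 1)
        for chosen in combinations(items, size)
    ]


def abstracts_required(domain, phi):
    """Count the distinct abstracts an instance of schema (D) demands.

    All concepts satisfying phi share a single abstract; every other
    concept needs an abstract of its own, since for non-phi concepts the
    right-hand side of (D) reduces to coextensiveness.
    """
    all_concepts = concepts(domain)
    phi_concepts = [f for f in all_concepts if phi(f)]
    shared = 1 if phi_concepts else 0
    return (len(all_concepts) - len(phi_concepts)) + shared


def satisfiable_on(domain, phi):
    """True if the domain has enough objects to serve as the required abstracts."""
    return abstracts_required(domain, phi) <= len(list(domain))


# Hypothetical selections for phi, chosen only for illustration.
def nothing_is_phi(f):
    return False  # with no phi-concepts the schema collapses into Basic Law V


def nonempty_is_phi(f):
    return len(f) > 0
```

With the empty selection for φ, a k-membered domain would need 2^k abstracts, so the schema fails on every finite domain, which is the finite shadow of the appeal to Russell’s Paradox in the text. The sketch says nothing about the infinite cardinalities with which the Shapiro and Weir examples are concerned.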
Again, we can select for ‘φF’ that F is the size of the universe and some limit inaccessible, or the size of the universe and some successor inaccessible, or that F applies to God, or the Devil . . . and proceed to infer that the universe is limit-inaccessible in cardinality, or successor inaccessible, or that God, or the Devil, exists. It is long familiar how Liar-family paradoxes can occur not merely in contexts of self-contained aporia but may be exploited to yield unmotivated a priori resolutions of intuitively unrelated issues. The Cretan and the Curry Paradox are the best known examples of the latter. The schema for φ-terologicality, and Distractions as a class, merely provide two more. This perspective offers the option of a ‘holding’ response to the Shapiro/Weir objection: “You persuade me”, the neo-Fregean may say, “that the general idea that a concept may be defined by stipulation of its satisfaction-conditions is somehow confounded by the possibility of pairwise incompatible yet consistent instances of the rubric for φ-terologicality and I will concede that the neo-Fregean conception of an abstraction principle is put in similar difficulties by conservative yet pairwise incompatible instances of (D).” This response is dialectically strong. Who would suppose that roguish cases like “heterological” and instances of φ-terologicality somehow show that we may no longer in good intellectual conscience regard the general run of definitions of the form:
X is F if and only if . . . X . . .
as successful in fixing concepts? But then someone who had no other objection to the claim of Fregean abstractions to play the role of truth-condition fixers for the kinds of context that feature on their left-hand sides should not be fazed by roguish instances of (D). 29 It is only a holding response, however. It refurbishes one’s confidence that it has to be possible to draw the distinction which the neo-Fregean needs, but it does not draw it. 
The fact remains that just as a general explanation is owing of which are the pukka definitions of satisfaction-conditions, and which may be dismissed as rogues, so we still need a characterisation of which are the good abstractions and which are the (conservative but still) bad Distractions. In Wright [1997], motivated in part by the desire to legitimate Boolos’s axiom New V:
(∀F)(∀G)( F = G ↔ ((Big(F) & Big(G)) ∨ (∀x)(Fx ↔ Gx)))
(where F is Big just if it has a bijection with self-identity) I ventured an additional conservativeness constraint which would be tolerant of at least some instances of schema (D) but would reject the majority. Roughly, it was that those consequences of such an abstraction which follow by exploitation of its “paradoxical component” have to be in ‘independent good standing’. I shall here attempt briefly to clarify and assess this proposal.
29 Cf. Wright [1997], pp. 220–1.
Distractions entail conditionals of the form:
¬(∃F)(φF) → (∀F)(∀G)( F = G ↔ (∀x)(Fx ↔ Gx))
The immediate intent of the proposed constraint is that anything derivable by the reductio of the antecedent of such a conditional afforded by its paradoxical consequent should be in independent good standing. New V fares well by this proposal: that there is a concept which is Big should presumably be a result in ‘independent good standing’ however that idea is filled out, for that self-identity itself is Big follows from the definition of ‘Big’ in second-order logic. Of course, any abstraction will entail some such conditional. So the proposed constraint is quite general. How does Hume’s principle fare by it? Well enough, presumably, though in a different way. We may, for instance, obtain a relevant conditional by selecting ‘at least countably infinite’ for φ. But this time the resources required to make good the consequences of the denial of the antecedent are afforded not just by second-order logic but by Hume’s Principle itself, via its independent proof of the infinity of the number series. Indeed, it is just because it independently entails that denial that we are able to show that Hume’s Principle entails the selected conditional in the first place. By contrast, the kinds of roguish Distraction illustrated presumably fail the test. The only resources they have to show, e.g. that the universe is limit-inaccessible, or successor inaccessible, or whatever, are those furnished by the inconsistency of Basic Law V and the consequent modus tollens on the relevant conditional. 
So: an abstraction is good only if any entailed conditional whose consequent is Basic Law V (or, therefore, any other inconsistency) is such that all further consequences which can be obtained by discharging the antecedent are in independent good standing, as may be attested by their derivation in pure higher-order logic (like the case of New V) or their independent derivability from the abstraction in question (like the case of Hume’s Principle). But this is unclear in a crucial respect: what is the relevant sense of ‘independent derivability’? Clearly it would not be in keeping with the intended constraint if there were merely some collateral derivation of just the same suspect kind. The ‘independent derivation’ must be bona fide, must not proceed by “paradox-exploitative” means, as I expressed the matter. But what does that mean? In particular, how might it be characterised so as not to outlaw any proof by reductio ad absurdum? One possible response, the one I offered in Wright [1997], was that a relevantly narrow sense of “paradox-exploitative” may be captured by reinvoking the previous (Fieldian) notion of conservativeness in the following way: a derivation from a conservative abstraction is paradox-exploitative just if there is a representation of its form of which any instance is valid and of which some instance amounts to a proof of the non-conservativeness of another abstraction. For instance, the derivation of the successor-inaccessibility of the universe from the Distraction canvassed above is paradox-exploitative because it may be schematised under a valid form of which another instance is a
derivation, from the appropriately corresponding Distraction, that the universe contains exactly 144 objects. The only Distractions which are good are those which are both conservative and such that any of their consequences which may validly be derived by paradox-exploitative means, in the stipulated sense, may also validly be derived by non-paradox-exploitative means. Otherwise put: the second conservativeness constraint is that the paradox-exploitative derivations from an abstraction have to be conservative with respect to the results obtainable from it non-paradox-exploitatively. That was the essence of my previous proposal. In practice, its application would work like this. We would be defeasibly entitled to accept any (presumably) conservative abstraction, A, from which we had so far been able to construct no paradox-exploitative derivation, that is, no proof of a valid form of which another instance demonstrated the non-conservativeness of another abstraction. But once we had such a derivation, it would then be inadmissible to accept A until we had found another non-paradox-exploitative derivation from it of the same conclusion: a formally valid derivation of which, so far as we could tell, no other instance was a proof of the non-conservativeness of another abstraction. That is apt to seem uneasily complex and less clearly motivated than one would wish. And one might worry about its reliance on our ability to judge non-paradox-exploitative derivations. However, the play with ‘paradox-exploitation’, and its characterisation in terms of non-conservativeness, may now seem inessential. The basic idea was that some abstractions (the Distractions and some others) are at the service of non-cogent proofs. We can tolerate this in particular cases so long as such proofs are matched by cogent ones of the same things. 
The natural—surely correct—objection to the derivation of, say, the successor inaccessibility of the universe from the appropriate Distraction is that it is unconvincing because “You could just as well prove the opposite—or anything—like that”, where “like that” means: by laying down a different (presumably consistent) Distraction and reasoning in just the same way. So a natural thought would be that we should ban those distractions—or abstractions generally—some of whose consequences are such as to deserve that complaint. That would suggest the following stipulation: that an abstraction A is unacceptable, at least pro tempore, if every proof it has yielded of some consequence C is such that, schematised so that any instance of it is valid, some other (conservative) abstraction yields a proof of the same form of something inconsistent with C. But there are still a number of salient concerns. First, it is not clear that any purpose is served by the continuing insistence on derivations of a given valid form. Why not just say that pairwise incompatible but individually conservative abstractions are ruled out—however the incompatibility is demonstrated— and have done with it? For think: if each such pair can be shown to be incompatible by proofs of a given single form, then the more complex formulation of the constraint is unnecessary; but if some pair cannot—if no derivation of C
from A is of a valid form shared by some derivation of not-C from A*, then there will still be pairwise incompatible but conservative abstractions which survive the new test. So there will still be Bad Company, for which some further treatment will need to be devised. Besides, how is the proposed constraint meant to be applied to semantical (model-theoretic) demonstrations of consequence, for which of course, in the case of higher-order abstractions, there need be no way of effectively locating a corresponding derivation in higher-order logic? This whole direction was stimulated by the desire to save some ‘good’ Distractions, par excellence New V. It is therefore germane that, as Shapiro has since observed, New V itself is in any case non-conservative! Specifically, it entails that the universe can be well-ordered, and hence that the non-abstracts can. 30 This result, to be sure, does not show that there is nothing to be gained from attempting to refine the second conservativeness constraint of my [1997], that it has no point. But it should occasion a re-think of the motivation for the general direction. I think there is something else amiss with the rogue Distractions, something which the second proposed constraint may well indirectly approximate but does not bring out with sufficient clarity. Start from the point that definitions proper should be innocent of substantive implications for the universe over which they range. Abstractions cannot in general match that, since in conjunction with logically (or other forms of metaphysically) necessary input, they may carry substantive implications for the abstracts whose concept they serve to introduce and hence (since those abstracts will be viewed, at least by neo-Fregeans, as full-fledged participants in the universe) at least some substantive implications for the universe as a whole. 
But to the extent that it is proposed to regard them as meaning-constituting stipulations, and hence as approximating definitions as nearly as possible, the character and scope of such implications needs to be curtailed. In brief: the requirement has to be that the only implications they may permissibly carry for the, as it were, enlarged universe in which their own abstracts participate must originate in what they imply—whether proof- or model-theoretically—about the abstracts they specifically concern. Hume’s Principle, for instance, implies of any object whatever that it participates in an at least countably infinite universe; but it carries that implication only via its entailment of the infinity of the cardinal numbers. This is a different requirement to Field conservativeness. A non-Field conservative abstraction—one that, as we put it intuitively above, entails new results about the prior ontology—may of course violate it. But it is possible for an abstraction to be immodest—for it to carry implications for other objects in 30 See Weir and Shapiro [1999]. In rough outline: we can derive the Burali-Forti paradox on the assumption that the concept, Ordinal, is not Big; but if Ordinal is Big, then there is a 1–1 correlation between Ordinal and x = x. So x = x may be well ordered by that correlation.
the universe which cannot be shown to originate in implications it carries for its own proper abstracts, without thereby being demonstrably non-conservative. Consider the Limit-inaccessible Distraction again. As noted, this entails that the universe is limit-inaccessibly sized. But because, unlike NP, it places no upper bound on the size of the universe, it is not non-conservative in the way that NP is: it limits the extension of no other concept. And if it is non-conservative in any other way, we have yet to see how. However, it is immodest. For its requirement that the universe have a certain kind of cardinality does not originate in any requirement that it imposes on its own abstracts. It is easy to overlook the force of “originate” here. The Limit-inaccessible Distraction, for instance, entails that any finite concept is non-φ. So it will allow singleton concepts to generate ‘well-behaved’ abstracts (abstracts whose identity and distinctness is governed by ordinary extensionality) of which there should therefore be no fewer than there are objects in the universe. 31 Thus this particular Distraction will indeed entail that its own abstracts are limit-inaccessible in number, from which the limit inaccessibility of the universe follows. 32 But (this is the crux) the result about the abstracts is not needed for the proof of the limit-inaccessibility of the universe. The Distraction provides no way of recognising the limit-inaccessibility of the universe which goes via a prior recognition of what it entails about its own proper abstracts. Rather the inference is the other way about: the proof that the Distraction entails that result about the universe as a whole is needed in order to obtain the result about its own abstracts. That is immodesty. 
Conservativeness constrains the kind of consequences which an acceptable abstraction is allowed to have: it is not allowable that there be any claim exclusively concerning the non-abstracts which was previously unprovable but which the abstraction, coupled with previous theory (now explicitly restricted to the previous ontology), enables us to prove. Modesty, by contrast, constrains the kind of ground which an acceptable abstraction can provide for consequences, not per se non-conservative, about the ontology of a theory in which that abstraction participates: such consequences must be grounded in what it requires of its own proper abstracts. But although the two constraints may seem different in character in this way, they are aspects of a single point au fond. Remember that the role of a legitimate abstraction, as I have repeatedly stressed, is merely to fix the truth-conditions of a class of contexts featuring a novel term-forming operator. It cannot have more than that role and yet retain the epistemically undemanding character of a meaning-stipulation. Logical abstractions, to be sure, are so designed that, consistently with their playing just this role, logical resources may enable us to show that there are abstracts of the kind they concern and to establish things about them.
31 Assuming that there are no fewer singleton concepts than there are objects.
32 On standard cardinal-arithmetical assumptions.
But no abstraction can be deemed to discharge the intended limited role successfully
if, in conjunction with some consistent theory, it carries implications for the combined ontology which cannot be shown to derive from implications it has for its own abstracts. Non-conservativeness is (normally) one graphic way of failing that test. But if even a conservative abstraction entails some conclusion about the combined ontology which cannot be justified by reference to what it entails about its own abstracts, then knowledge of the truth of the abstraction cannot be founded in stipulation. Such an abstraction implicitly claims something about the world which might—for all we have shown to the contrary—be justified by reference to what it entails about its own abstracts; that is why we cannot accuse it of non-conservativeness. But equally, so long as we have no such justification, we have no defence against the suggestion that the abstraction is known only if we know that the world must be that way in any case, whether or not the abstracts themselves make it so. That would seem to demand knowledge about how the world would be even if the abstracts did not make it so. And that in turn is a substantial piece of collateral information which, by being prerequisite if we are to claim to be justified in laying down the abstraction in the first place, gives the lie to any claim that the abstraction is justified merely as a meaning-stipulation. In sum: an abstraction is modest if its addition to any theory with which it is consistent results in no consequences (whether proof- or model-theoretically established) for the ontology of the combined theory which cannot be justified by reference to its consequences for its own abstracts. 
And again, justification is the crucial point: an abstraction may fail this constraint even though every consequence it has for the ontology of a combined theory may be seen to follow from things it entails about its proper abstracts; in particular, it will not count if, as in the case of the Limit-inaccessible Distraction, a consequence for the combined ontology is needed as a lemma in the proof that the abstracts have a property from which that very consequence follows. Further clarification is needed of several matters: what kinds of proof should count in favour of the modesty of an abstraction—what it is to show that an abstraction independently carries certain implications for its own abstracts; whether the modesty constraint is effective against the general run of pairwise incompatible but (presumptively) conservative abstractions illustrated by Shapiro and Weir; what other constraints on Good Companions may be properly motivated. At the time of writing, these are largely open issues.
B. Proof of the principle, $N^q$
B.1 Stage-setting
We assume the standard recursive definitions of the numerically definite quantifiers:
$(\exists_0 x)Fx \leftrightarrow (\forall x)\neg Fx$
$(\exists_{n+1} x)Fx \leftrightarrow (\exists x)(Fx \,\&\, (\exists_n y)(Fy \,\&\, y \neq x))$,
and let ‘$n_f$’ abbreviate Frege’s definiens for $n$. Define ‘$Pxy$’ (immediate predecession) as
$(\exists F)(\exists w)(Fw \,\&\, y = Nv\colon Fv \,\&\, x = Nz\colon [Fz \,\&\, z \neq w])$
Define ‘$\mathrm{Nat}(x)$’ ($x$ is a natural number) as
$x = 0 \lor P^{*}0x$
where ‘$P^{*}xy$’ expresses ancestral predecession. Let ‘$(\exists R)(F \; 1\text{-}1_R \; G)$’ express that there is a one–one correspondence between $F$ and $G$. We take three lemmas from the proof of the Peano axioms from HP outlined in the concluding section of Frege’s Conception (numbering as there assigned):
Lemma 51: $(\forall x)(\mathrm{Nat}(x) \to x = Ny\colon [\mathrm{Nat}(y) \,\&\, P^{*}yx])$. Every natural number is the number of its ancestral predecessors.
Lemma 52: $(\forall x)(\mathrm{Nat}(x) \to \neg P^{*}xx)$. No natural number ancestrally precedes itself.
Lemma 5121: $(\forall x)(\forall y)(\mathrm{Nat}(x) \,\&\, \mathrm{Nat}(y) \to (Pxy \to (\forall z)((\mathrm{Nat}(z) \,\&\, (P^{*}zx \lor z = x)) \leftrightarrow (\mathrm{Nat}(z) \,\&\, P^{*}zy))))$. If one natural number immediately precedes another, then the natural numbers which ancestrally precede the second are precisely the first and those which ancestrally precede the first.
Finally, recall that Frege’s 0 is $Nx\colon x \neq x$ and that each successive $(n+1)_f$ is $Nx\colon [x = 0 \lor \dots \lor x = n_f]$. Each of these objects qualifies as a natural number in the light of the above definition of ‘$\mathrm{Nat}(x)$’. Proof: $0_f$ qualifies by stipulation; $(n+1)_f$ qualifies if $n_f$ does: take ‘$F$’ in the definition of ‘$Pxy$’ as ‘$[x = 0 \lor \dots \lor x = n_f]$’ and ‘$w$’ as ‘$n_f$’ to show that $P(n_f, (n+1)_f)$; then reflect that $Pxy \to P^{*}xy$ and that $P^{*}xy$ is transitive. (Frege’s Conception, Lemmas 3 and 4, respectively.)
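The recursion for the numerically definite quantifiers stated at the start of this section can be checked mechanically on small finite domains. The following Python sketch is an illustration only and forms no part of the proof. It assumes that concepts may be modelled as finite sets, and the names `exists_n` and `recursion_matches_counting` are hypothetical. The function `exists_n` uses nothing but the two clauses of the recursion, reading ‘exactly n things other than x are F’ as a claim about the concept with x removed.

```python
from itertools import combinations


def exists_n(n, concept):
    """Evaluate 'there are exactly n Fs' by the recursion alone.

    Base:  exactly 0 things are F  iff  nothing is F.
    Step:  exactly n+1 things are F  iff  some x is F and exactly n
           things other than x are F.
    """
    concept = frozenset(concept)
    if n == 0:
        return not concept
    return any(exists_n(n - 1, concept - {x}) for x in concept)


def recursion_matches_counting(domain):
    """Compare the recursion with ordinary counting for every concept on the domain."""
    items = list(domain)
    for size in range(len(items) + 1):
        for chosen in combinations(items, size):
            for n in range(len(items) + 2):
                if exists_n(n, chosen) != (len(chosen) == n):
                    return False
    return True
```

Calling `recursion_matches_counting(range(4))` compares the two readings for all sixteen concepts on a four-membered domain and for each n from 0 to 5.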
B.2 Proof of $N^q$ for Frege’s natural numbers
Induction Base: To show $Nx\colon Fx = 0_f \leftrightarrow (\exists_0 x)Fx$, it suffices to reflect that the left-hand side holds just if $(\exists R)(Fx \; 1\text{-}1_R \; x \neq x)$, which in turn holds just if $\neg(\exists x)Fx$. 33
Induction Hypothesis: Suppose $Nx\colon Fx = n_f \leftrightarrow (\exists_n x)Fx$. We need to show that it follows that $Nx\colon Fx = (n+1)_f \leftrightarrow (\exists_{n+1} x)Fx$.
33 As George Boolos remarked to me, Frege himself observes, at Grundlagen §75 and §78, that he is in a position to obtain proofs of $N^q$ for $0_f$ and $1_f$, respectively.
(Left-to-right) Consider any F such that Nx:Fx = (n+1)_f. By Lemma 51 and the reflection that Nat(n_f), n_f = Nx:[Nat(x) & P*xn_f]. So by the Hypothesis (∃ₙx)(Nat(x) & P*xn_f). But by Lemma 52, ¬P*(n_f, n_f). So (∃ₙx)(Nat(x) & (P*xn_f ∨ x = n_f) & x ≠ n_f). So (∃y)(Nat(y) & (P*yn_f ∨ y = n_f) & (∃ₙx)(Nat(x) & (P*xn_f ∨ x = n_f) & x ≠ y)). So by the recursion for the quantifiers (∃ₙ₊₁x)(Nat(x) & (P*xn_f ∨ x = n_f)). But by Lemma 5121 and since P(n_f, (n+1)_f), we have that (∀x)(Nat(x) & (P*xn_f ∨ x = n_f) ↔ Nat(x) & P*(x, (n+1)_f)). So (∃ₙ₊₁x)(Nat(x) & P*(x, (n+1)_f)).

That establishes the desired result for one concept of which (n+1)_f is the number. But by HP, any G such that (n+1)_f = Nx:Gx will admit a one–one correspondence with that concept. So a lemma to the following effect will now suffice:

(∀F)(∀G)((∃R)(F 1–1_R G) → ((∃ₙ₊₁x)Fx ↔ (∃ₙ₊₁x)Gx))

A proof by induction (strictly, at third order) suggests itself:

Base: It suffices to show (∀F)(∀G)((∃R)(F 1–1_R G) → ((∀x)¬Fx ↔ (∀x)¬Gx)).

Hypothesis: Suppose (∀F)(∀G)((∃R)(F 1–1_R G) → ((∃ₙx)Fx ↔ (∃ₙx)Gx)). Consider any H such that (∃ₙ₊₁x)Hx. Then (∃x)(Hx & (∃ₙy)(Hy & y ≠ x)). Let a be such that Ha & (∃ₙy)(Hy & y ≠ a). Let J be one–one correlated with H by R. Let b be such that Jb & Rab. Then R one–one correlates Hx & x ≠ a with Jx & x ≠ b. So, by the Hypothesis, (∃ₙx)(Jx & x ≠ b). So (∃y)(Jy & (∃ₙx)(Jx & x ≠ y)). So (∃ₙ₊₁x)Jx.

(Right-to-left) Consider any F such that (∃ₙ₊₁x)(Fx). Then there is some a such that Fa & (∃ₙy)(Fy & y ≠ a). So by the Hypothesis Ny:(Fy & y ≠ a) = n_f. So, by HP, there is an R such that (Fy & y ≠ a) 1–1_R (Nat(x) & P*xn_f). Let R# correlate (Fy & y ≠ a) with (Nat(x) & P*xn_f) in just the fashion of R, and let it also correlate a with n_f. Then (Fy) 1–1_R# (Nat(x) & (P*xn_f ∨ x = n_f)). But, as established above, (∀x)(Nat(x) & (P*xn_f ∨ x = n_f) ↔ Nat(x) & P*(x, (n+1)_f)). So Nx:Fx = (n+1)_f.
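The recursion for the numerically definite quantifiers, on which both directions of the proof rely, can be checked mechanically on a small finite domain. The following Python sketch is an illustration only and is no part of the original appendix: the names `exactly` and `equinumerous` are hypothetical, and equinumerosity is decided by counting, which is legitimate only because the domain is finite.

```python
def exactly(n, domain, pred):
    """Numerically definite quantifier: 'there are exactly n x such that pred(x)'.

    Base:      exactly 0 things satisfy pred iff nothing does.
    Recursion: exactly n+1 things satisfy pred iff some x satisfies pred
               and exactly n things other than x satisfy pred.
    """
    if n == 0:
        return not any(pred(x) for x in domain)
    return any(
        pred(x) and exactly(n - 1, domain, lambda y, x=x: pred(y) and y != x)
        for x in domain
    )


def equinumerous(domain, f, g):
    # Assumption: on a finite domain a one-one correspondence exists
    # exactly when the two concepts have the same number of instances.
    return sum(1 for x in domain if f(x)) == sum(1 for x in domain if g(x))


domain = range(6)
is_even = lambda x: x % 2 == 0   # instances: 0, 2, 4
is_odd = lambda x: x % 2 == 1    # instances: 1, 3, 5

# The lemma used in the left-to-right direction: equinumerous concepts
# satisfy the same numerically definite quantifiers.
lemma_holds = equinumerous(domain, is_even, is_odd) and all(
    exactly(n, domain, is_even) == exactly(n, domain, is_odd) for n in range(5)
)
```

The sketch tests the lemma for one pair of concepts only; it is a sanity check on the recursion, not a substitute for the inductive proof.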
References George Boolos [1986] “Saving Frege from Contradiction” in Proceedings of the Aristotelian Society 87, pp. 137–51; reprinted in Demopoulos, ed. [1995], pp. 438–52. George Boolos [1987] “The Consistency of Frege’s Foundations of Arithmetic” in Judith Jarvis Thompson, ed. [1987], pp. 3–20; reprinted in Demopoulos, ed. [1995], pp. 211–33. George Boolos [1990] “The Standard of Equality of Numbers” in Boolos, ed. [1990a], pp. 261– 77; reprinted in Demopoulos, ed. [1995] pp. 234–54. George Boolos, ed. [1990a] Meaning and Method: Essays in Honor of Hilary Putnam, Cambridge: Cambridge University Press. George Boolos [1997] “Is Hume’s Principle Analytic?” in Heck, ed. [1997], pp. 245–61. George Boolos and Richard. G. Heck, Jr. [1998] “Die Grundlagen der Arithmetik §§82–3” in Schirn, ed. [1998], pp. 407–28.
Peter Clark [1998] “Dummett’s Argument for the Indefinite Extensibility of Set and Real Number” in Grazer Philosophische Studien 55, New Essays on the Philosophy of Michael Dummett, eds. J. Brandl and P. Sullivan (Vienna: Rodopi), pp. 51–63. William Demopoulos, ed. [1995] Frege’s Philosophy of Mathematics, Cambridge, Mass.: Harvard University Press. Michael Dummett [1963] “The Philosophical Significance of Gödel’s theorem”, Ratio 5, pp. 140–55. Michael Dummett [1991] Frege: Philosophy of Mathematics, London: Duckworth. Michael Dummett [1993] The Seas of Language, Oxford: The Clarendon Press. Hartry Field [1984] Critical Notice of Crispin Wright Frege’s Conception of Numbers as Objects, Canadian Journal of Philosophy 14, pp. 637–62; reprinted as “Platonism for Cheap? Crispin Wright on Frege’s Context Principle” in Field [1989], pp. 147–70. Hartry Field [1989] Realism, Mathematics and Modality, Oxford: Basil Blackwell. Bob Hale [1997] “Grundlagen §64”, Proceedings of the Aristotelian Society XCV11, pp. 243– 61. Bob Hale [2000] “Reals by Abstraction”, Philosophia Mathematica 8, pp. 100–23. Richard G. Heck, Jr., ed. [1997] Language, Thought and Logic, Oxford: The Clarendon Press. Richard G. Heck, Jr., ed. [1997a] “Finitude and Hume’s Principle”, Journal of Philosophical Logic 26, pp. 589–617. Alex Oliver [1998] “Hazy Totalities and Indefinitely Extensible Concepts: An Exercise in the Interpretation of Dummett’s Philosophy of Mathematics” in Grazer Philosophische Studien 55, New Essays on the Philosophy of Michael Dummett, eds J. Brandl and P. Sullivan (Vienna: Rodopi), pp. 25–50. Charles Parsons [1964] “Frege’s Theory of Number” in Philosophy in America, ed. Max Black, London: Allen and Unwin, pp. 180–203; reprinted in Demopoulos, ed. [1995], pp. 182–210. Matthias Schirn, ed. [1998] Philosophy of Mathematics Today, Oxford: The Clarendon Press. 
Stewart Shapiro [1998] “Induction & Indefinite Extensibility: The Gödel Sentence is True but Did Someone Change the Subject”, Mind 107, pp. 597–624. Stewart Shapiro and Alan Weir [1999] “New V, ZF and Abstraction”, Philosophia Mathematica 7, pp. 293–321. Judith Jarvis Thompson, ed. [1987] On Being and Saying: Essays in Honor of Richard Cartwright, Cambridge, Mass.: MIT Press. Crispin Wright [1983] Frege’s Conception of Numbers as Objects, Aberdeen: Aberdeen University Press. Crispin Wright [1997] “On the Philosophical Significance of Frege’s Theorem” in Heck [1997], pp. 201–44. Crispin Wright [1998] “On the Harmless Impredicativity of N= (‘Hume’s Principle’)” in Schirn [1998], pp. 339–68. Crispin Wright [1998a] “Response to Dummett” in Schirn [1998], pp. 389–405.
FREGE, NEO-LOGICISM AND APPLIED MATHEMATICS 1
Peter Clark
Philosophy Department, School of Philosophical and Anthropological Studies, University of St Andrews, St Andrews, Fife, Scotland, KY16 9AL, UK
1. Introduction: logicism and neo-logicism
A little over one hundred years ago (the letter is dated July 28, 1902) Frege wrote to Russell in the following terms: 2 I myself was long reluctant to recognize ranges of values and hence classes; but I saw no other possibility of placing arithmetic on a logical foundation. But the question is how do we apprehend logical objects? And I have found no other answer to it than this, We apprehend them as extensions of concepts, or more generally, as ranges of values of functions. I have always been aware that there are difficulties connected with this, and your discovery of the contradiction has added to them; but what other way is there?
Frege here poses an extremely good question, a recent answer to which this paper is really devoted. Whatever view one may finally adopt about whether the new answer succeeds one has to recognise that it is surely a very remarkable fact that one hundred years after the discovery of the Zermelo–Russell contradiction which follows from Basic Law Five of Frege’s Grundgesetze (Frege (1893)) we should now be actively discussing not as a purely historical enterprise but as a viable possibility in the foundations of mathematics, the programme more or less explicitly laid out in the earlier work of Frege Die Grundlagen der Arithmetik (Frege (1884)). This is because recent research has highlighted three crucial facts. First and most importantly that in full conformity with the spirit of Frege’s programme the deduction of the axioms of (Second Order) Peano Arithmetic from principles of higher order logic and “definition” 3 does not require appeal to Basic Law Five. Second that 1 This paper first appeared in Induction and Deduction in the Sciences, F. Stadler (ed.) [2004], Dordrecht, Kluwer Academic Publishers. Reprinted by kind permission of Springer Academic Publishers. 2 Frege (1902), pp. 140–41. 3 Actually one has to be careful how this result is stated. Formally the central result is that if a formalisation of the key ‘definition’ is added as an axiom to standard axiomatic second order logic, second
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 45–60. © 2007 Springer.
the principles of logic and “definition” (this definition has become known as Hume’s Principle) which are employed are consistent, 4 and third that even the proofs of the axioms for Peano Arithmetic given by Frege himself in the Grundgesetze der Arithmetik do not depend essentially upon Basic Law Five. 5 These technical facts, of which more later, open the way for the revival of Frege’s programme; they make it possible but they do not determine its form.

What was Frege’s programme and what is its revived form explicitly? Frege’s programme was the result of an answer to the famous question raised in Section 62 of the Grundlagen, viz.: “how then are numbers given to us, if we cannot have any ideas or intuitions of them?” The Fregean answer was “by explaining the senses of identity statements in which number words occur”. That explanation was to be provided at least in part by what has come to be called Hume’s Principle: the claim that the cardinal numbers corresponding to two concepts are identical if and only if the two concepts are equinumerous. I say in part by Hume’s Principle because, as Frege had already argued in another context at Section 56 of the Grundlagen, whatever the merits of Hume’s Principle it can’t explain the senses of identity statements in which number words occur of the form “the number of F’s is n”, where n is not given in the form of “the number of G’s”, for some G. Frege then adopted the explicit definition of number in terms of classes or extensions: “the number of F’s is the class of all concepts G, equinumerous with F”. But this explicit definition together with Basic Law Five, the comprehension axiom for class existence, entails Hume’s Principle. With Basic Law Five in place it looked as if Frege’s programme could be carried out. It was now possible to show that second order logic (Frege (1879)) together with Basic Law Five entails the Peano–Dedekind axioms for arithmetic.
As such the truths of arithmetic could be seen to be analytic, they could all be seen to be consequences of general logical laws together with suitable implicit definitions (like Basic Law Five which implicitly defines the notion of an extension). Further arithmetic could be seen as a body of truths about independently existing objects—the finite cardinals—which were logical objects, order arithmetic (arithmetic with the full second order induction axiom) can be interpreted in the resulting theory, often called Frege Arithmetic. Certainly the result seems to have been known to Geach in the forties, Dummett in the fifties and was first recently explicitly noted by Charles Parsons in his 1965 paper “Frege’s Theory of Number” (Parsons (1965) reprinted in Demopoulos (1995), pp. 182–210). A very closely related result was published by Timothy Smiley in 1981 (Smiley (1981)). The full significance of the result as well as a well developed proof was given by Wright (1983). The result has been systematically investigated by George Boolos (who discovered it independently in the early eighties and Richard Heck (see especially Boolos (1987a), (1998), papers 17, 18 and 19, Heck (1993). The most accessible proof can be found in the Appendix to Boolos (1990a), Boolos (1998) paper 13). 4 In his 1983 Wright conjectured that the system of axiomatic second order logic together with Hume’s Principle is consistent but did not establish it. Burgess (1984), Hazen and Hodes provided elementary consistency proofs with ω and ω + 1 as domains. George Boolos however established the central consistency result which is that the theory is equi-consistent with analysis (see his (1987b), (1990b) and Boolos and Heck, paper 20 of Boolos (1998)). 
5 This was certainly known to Frege (see especially Heck (1995)), but in a letter to Russell he dismissed the possibility of basing his system on Hume’s Principle, saying only that it faced difficulties which were different from those facing the attempt to use Basic Law Five (Frege (1902), letter xxxvi/7, p. 141).
logical in the sense that knowledge of which requires nothing beyond knowledge of logic and definitions. If this could be extended to the Real numbers and other parts of mathematics then a foundation for mathematics would have been established on which mathematics was presented as uncontaminated by empirical notions, presented as a body of truths in its full classical form and shown to be applicable to reality, since it can be shown to be in fact the more and more elaborate drawing of consequences from meaning postulates. But deduction applies to everything that can be thought. In fact it seems to me that the fundamental logicist thought can be put simply as follows: there can be no thought without representation, there can be no representation without concepts and there can be no concepts without number. Of course the serpent had already entered this Eden, with the introduction of Basic Law Five which says (∀F)(∀G)(Ext(F) = Ext(G) ↔ (∀x)(Fx ↔ Gx)) But by the Comprehension Principle for Second Order Logic there is a property corresponding to the formula of Second order logic (∃F)(Ext (F) = x &¬ Fx). Russell’s paradox immediately results from allowing this property to fall under the universal quantifier (∀F) in Basic Law Five. Another way of putting the same point is to note that the Russell reasoning shows that it is a theorem of Second order logic that there is no function from properties to objects such that distinct properties (i.e. non-coextensive properties) are associated with distinct objects. This is just what Basic Law Five read from left to right in contrapositive form asserts there is. 6 So much for logicism. What about neo-logicism, the revived form of Frege’s programme? We should let Wright and Hale, the main proponents of this view speak for themselves. They say: 7 Neo-Fregeanism holds that Frege need not have taken the step which lead to this unhappy conclusion [The appearance of the Russell contradiction]. 
At least as far as the theory of natural number goes, it is possible to accomplish Frege’s central mathematical and philosophical aims by basing the theory on Hume’s Principle, adjoined as a supplementary axiom to a suitable formulation of second order logic. Hume’s Principle cannot, to be sure, be taken as a definition in any strict sense—any sense requiring that it provide for the eliminative paraphrase of its definiendum (the numerical operator, “the number of . . . ”) in every admissible type of occurrence. But this does not preclude its being viewed as an implicit definition, effecting an introduction of a sortal concept of cardinal number and, accordingly, as being analytic of the concept—and this, the neo-Fregean contends, coupled with the fact that Hume’s Principle so conceived requires a prior understanding only of second order logical vocabulary, is enough to sustain an account of the foundations of Arithmetic that deserved to be viewed as a form of logicism which, whilst not quite logicism in the sense of a reduction of arithmetic to logic, preserves the essential core and content of Frege’s two fundamental theses. 6 This is Frege’s own generalisation of the lesson of Russell’s paradox. (See also Boolos (1993)). 7 Hale and Wright (2000), Introduction.
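The observation made earlier, that Russell’s reasoning shows there is no function from properties to objects sending non-coextensive properties to distinct objects, has a finite analogue that can be verified by brute force. The Python sketch below is an illustration under stated assumptions: concepts are identified with subsets of a small finite domain, and `concepts`, `russell_collision` and `every_assignment_collides` are hypothetical names. For any assignment `ext` of objects to concepts, the Russell concept (the objects that are the value of some concept they do not fall under) yields two distinct concepts with the same value.

```python
from itertools import combinations, product


def concepts(domain):
    """All concepts over a finite domain, taken extensionally as subsets."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def russell_collision(domain, ext):
    """Given any ext (a dict from concepts to objects), return two distinct
    concepts with the same value, found by the Russell construction."""
    all_concepts = concepts(domain)
    # Objects that are the value of some concept they do not fall under.
    russell = frozenset(ext[c] for c in all_concepts if ext[c] not in c)
    r = ext[russell]
    # r must fall under `russell`, so some other concept witnesses this.
    for c in all_concepts:
        if ext[c] == r and r not in c:
            return russell, c
    raise AssertionError("unreachable on a non-empty domain")


def every_assignment_collides(domain):
    """Check the construction against every possible assignment."""
    items = list(domain)
    all_concepts = concepts(items)
    for values in product(items, repeat=len(all_concepts)):
        ext = dict(zip(all_concepts, values))
        a, b = russell_collision(items, ext)
        if a == b or ext[a] != ext[b]:
            return False
    return True
```

Calling `every_assignment_collides(range(3))` runs through all 3^8 assignments on a three-element domain. Nothing here bears on infinite domains directly; the point is only that the collision is produced by the Russell concept itself rather than by a counting argument.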
So two conditions will have to hold for this position to be viable. First, what has come to be known as Frege’s theorem (which certainly does hold): the mathematical claim that the Dedekind–Peano postulates for number theory in their second order form can be derived from a combination of Second order logic and Hume’s Principle. Second, it will have to be shown that Hume’s Principle and other so-called abstraction principles which share its form constitute a legitimate means of introducing the names of numbers (of abstract objects in general) by, in effect, stipulation by implicit definition. One needs to be careful to state the claim of neo-logicism properly. It is:
(i) Hume’s Principle is a stipulation which gives the truth conditions of a restricted class of statements of numerical identity.
(ii) The resulting explanation of the concept of number is complete, however, in that it suffices for the second order derivation of the basic laws of arithmetic.
(iii) The existence of numbers is something discovered and not stipulated (the Platonism of Frege’s original theory is preserved).
(iv) Our (a priori) knowledge of number is derived from a principle whose truth is a matter of stipulation. 8
Abstraction principles of which Hume’s Principle is a paradigm example come in two types conceptual abstractions and objectual ones, but all have the following form. There is a domain of entities, denoted say, by α, β, etc., and a relation R defined over them. Then an abstraction principle has the form ((α) = (β)) ↔ R(α, β) Where R( , ) is an equivalence relation among the α and β’s. An abstraction principle may be called a logical abstraction when the relation R( , ) is definable in purely logical vocabulary, e.g. equinumerosity among concepts or ordinal similarity among binary relations. Under the classical canonical interpretation (α) is the equivalence class of α under the relation R and exists (where it does) in virtue of a set existence axiom. That is the existence and uniqueness of (α) has in effect to be guaranteed by a separate principle of set or class existence. Wright and Hale however argue that in certain cases logical abstraction principles can play the role of stipulations and if the relation on the right hand side of the iff is ever satisfied then no further question concerning the existence of the (α) need arise. Conceptual abstraction principles are those in which α’s are concepts (as in the case of Hume’s Principle) and objectual abstraction principles are those in which the field of the equivalence relation comprises objects. In both cases and this is crucially so the abstracta the (α) are objects, so in the case of conceptual abstractions acts as a type down operation, from concepts to objects. 8 See also Demopoulos (1998), (2000).
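The classical reading of an abstraction principle described above, on which the abstract of an item is its equivalence class, can be illustrated with Frege’s familiar example of directions and parallel lines. The Python sketch below is only an illustration: `make_abstraction`, `lines`, `parallel` and `direction` are hypothetical names, lines are represented as (slope, intercept) pairs, and vertical lines are left out for simplicity.

```python
def make_abstraction(domain, related):
    """Classical reading of an abstraction operator: send each item to its
    equivalence class under `related`, restricted to `domain`."""
    def abstract(a):
        return frozenset(b for b in domain if related(a, b))
    return abstract


# Lines in the plane as (slope, intercept) pairs; vertical lines are omitted.
lines = [(0, 0), (0, 3), (1, 0), (1, -2), (2, 5)]
parallel = lambda a, b: a[0] == b[0]

direction = make_abstraction(lines, parallel)

# The abstraction principle: direction(a) = direction(b) iff a is parallel to b.
principle_holds = all(
    (direction(a) == direction(b)) == parallel(a, b)
    for a in lines for b in lines
)
```

The biconditional holds here because `parallel` is an equivalence relation and because the equivalence classes are simply assumed to exist as Python sets. That second assumption is exactly what the classical reading secures by a set existence axiom and what the neo-Fregean proposes to do without.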
Of course Wright and Hale do not argue that it is always legitimate to introduce abstracta in this way. Two examples of conceptual logical abstraction principles which fail to introduce abstracta are Basic Law Five and what might be called Ordinal Hume, which is the claim that

(∀R)(∀S)(Ord R = Ord S ↔ R is similar to S)

This has the form of an abstraction principle since similarity is an equivalence relation among binary relations. But Ordinal Hume leads directly to the Burali–Forti paradox. 9

9 See Hodes (1984) and Fine (1998).

Hume’s Principle in logical abstraction form says:

(∀F)(∀G)(NxFx = NxGx ↔ (∃R)(F 1–1_R G))

where (∃R)(F 1–1_R G) is an abbreviation for the standard formulation, in the vocabulary of second order logic, of the formula expressing that there is a relation R which establishes a one to one correspondence between the things falling under F and those falling under G, that is

(∃R)((∀x)(Fx → (∃!y)(Gy & R(x,y))) & (∀z)(Gz → (∃!w)(Fw & R(w,z))))

and the operator Nx . . . x is a term forming operator. Wright has argued that there are general principles which can distinguish between good and bad abstraction principles, and in any case, as is well known, there is no similar problem about Hume’s Principle, since it is known to be consistent. Like Basic Law Five, Hume’s Principle asserts the existence of a function from concepts to objects, but unlike Basic Law Five it asserts that merely non-equinumerous concepts (not non-co-extensive concepts) can be sent to distinct objects, and this is possible provided that the domain is (Dedekind) infinite. For a domain of k objects there are k + 1 non-equinumerous concepts definable over it, so no finite domain can satisfy Hume’s Principle. The values of the second order variables for Frege are concepts, and objects are denoted by terms that may appear on either side of the identity sign, so terms like NxFx denote objects. As such Nx . . . x will have to be thought of as a term forming operator, and our theory of second order logic plus Hume’s Principle must be sufficiently strong to have as a theorem (∀F)(∃!x)(x = NxFx). In general one would expect that we would have to have an axiom asserting that for each F, NxFx was a term. This fact seems to me to have the profoundest significance for the claim that Hume’s Principle and other abstraction principles can be thought of as analytic stipulations introducing the names of numbers (or other abstract objects). Nor do I see any of the various equivalent methods of adding Hume’s Principle to second order logic in order to derive Frege’s theorem as avoiding this difficulty, for they will all have to guarantee that (∀F)(∃!x)(x = NxFx) holds in some way or other. If it is guaranteed then we may proceed as follows: consider the concept non-self identical. It is a truth of logic that the concept non-self identical is equinumerous with the concept non-self identical, so
the right hand side of Hume’s principle for F and G both non-self identical reduces to a logical truth. So we may detach and assert as a theorem Nx ∼ (x = x) = Nx ∼ (x = x) and so infer (∃!y)(Nx ∼ (x = x) = y) and so introduce the name “zero” to designate that unique object. But that hinges on us being able to read the stipulation, to understand the stipulation, as providing a context into which it is appropriate to quantify in. But what guarantees that is the correct reading of the stipulation? This is my first sceptical argument then: it essentially concerns how the underlying logic is to function. On the one hand to presume NxFx is a term because it appears on one side of the identity sign seems to me to beg the question, while if the question is not begged and say a free logic is employed then I fail to see how the required existential postulates will form any intended contrast with “mere axiomatic postulation”. 10 But I will not dwell further on this matter here.
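The counting argument given earlier in this section, that a domain of k objects carries k + 1 pairwise non-equinumerous concepts and so cannot satisfy Hume’s Principle, can be confirmed for small k by enumeration. The Python sketch below is an illustration under stated assumptions: concepts are identified with subsets of the domain, equinumerosity between finite extensions is taken to be sameness of size, and all function names are hypothetical.

```python
from itertools import combinations


def concepts(domain):
    """All concepts over a finite domain, taken extensionally as subsets."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def equinumerosity_classes(domain):
    """Group concepts by size. Assumption: for finite extensions, having the
    same size is equivalent to standing in one-one correspondence."""
    classes = {}
    for c in concepts(domain):
        classes.setdefault(len(c), []).append(c)
    return classes


def has_enough_objects(domain):
    """Necessary condition for Hume's Principle: a number operator must give
    distinct objects to distinct classes, so the domain needs at least as
    many objects as there are equinumerosity classes."""
    return len(equinumerosity_classes(domain)) <= len(list(domain))


# Number of equinumerosity classes over domains of 0 to 5 objects.
class_counts = {k: len(equinumerosity_classes(range(k))) for k in range(6)}
```

For every k tried, `class_counts[k]` is k + 1 and `has_enough_objects` fails, in line with the pigeonhole reasoning in the text. The check covers only the sizes enumerated; the general claim rests on the argument itself.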
10 Wright and Hale consistently draw attention to the difference in methodology they see between mere axiomatic stipulation and their proposed methodology, whereby abstracta are taken as stipulations. I have not dwelt on what I regard in this respect as a separate issue, namely that of the very strong existential import of second order logic. This is most clearly seen in a point that George Boolos constantly emphasised, that is, the very strong existential commitments embodied in the Second Order Comprehension Principle (∃X)(∀x)(Xx ↔ A(x)), where A(x) is a formula of second order logic not containing X free. The issue of the existential presuppositions of Hume’s Principle has been addressed by Shapiro and Weir (2000) and by Demopoulos in his review of M. Schirn (1998) in the Journal of Symbolic Logic. Shapiro and Weir conclude that “the neologicist has no non-question begging account of how there could be an epistemologically innocent route to the demonstration of platonistically construed mathematical existence claims.” (p. 188).

2. Anti-zero

Following Boolos we can write Hume’s Principle in the form

*F = *G ↔ (F 1–1 G)

and understand it as asserting that there exists a total function from concepts to objects, call it *, such that non-equinumerous concepts are assigned distinct objects (that is the contrapositive of Hume’s Principle read from left to right). Adding this principle to Second Order logic allows us to prove in the Fregean way that, with the successor relation defined in the usual manner, viz.

S(n, m) ↔ (∃F)(∃y)(Fy & *F = n & *(x: Fx & x ≠ y) = m)

(i) the successor relation is functional and one–one, and (ii) with zero defined as *(x: x ≠ x), (∀x)∼S(0,x), where n is Finite iff n is zero or Sˆ(n,0), Sˆ being the ancestral relation of S. But that is just to say that the natural numbers form a Dedekind infinite sequence (that is, (∃f)[(∀x)(∀y)(fx = fy → x = y) & (∃x)(∀y)(fy ≠ x)]). The key step is clearly to be able to prove that every number has a successor, and Frege’s proof works precisely because n is an object that can be proved not to fall under the concept “being less than n”. This was one of Frege’s triumphs in the Grundlagen. It might therefore seem
little more than trite to claim that H.P. is an axiom of infinity, it clearly is, since it forces the domain to be infinite. No doubt it is, but the principle has many other odd entailments too. Some of which are surprisingly and worryingly strong. Since the function * is total we can take it that it is a theorem of Frege Arithmetic that (∀F)(∃!x)(x = *F). Call this theorem (**). Now as Frege noted if we let F be “. . . is a member of the natural sequence of numbers”, it follows that there is a number of finite numbers, Frege’s ∞1 and that since it succeeds itself it is not a finite number (since it is a theorem of the formal system implicit in the Grundlagen that if Finite(n) then ∼S(n,n)). Again it follows immediately from (**) that if we take F to be the concept “is self identical” then (∃!x)(x = *(x = x)), that is, that there is a number of all things that there are. This latter looks like a very strong claim indeed. But we should note immediately, as Boolos pointed out, that we cannot show within the theory (Frege Arithmetic) that the two numbers are distinct. 11 In what we might call the standard model of Hume’s Principle, with domain N, the natural numbers in which all infinite concepts are assigned the object zero, while all finite concepts are assigned the cardinality of the corresponding subset of N plus one, the object associated with *Finite (n) is zero, as is the object associated with *(x = x), since neither are finite but the set of all ordinals less than or equal to Aleph one is also a model for the theory (that is for Frege Arithmetic) and in this case *Finite (n) is assigned Aleph zero while *(x = x) is assigned Aleph one. So we certainly cannot prove in Frege Arithmetic that *Finite (n) = *(x = x). But a worry still remains for as Boolos asked “is there such a number as anti-zero?” 12 That seems to me a very good point indeed. For that, if anything is, seems to be a substantial matter, which cannot be decided by stipulation. 
Indeed this issue seems to generalise as Boolos pointed out into a general issue about the compatibility of two conceptions of cardinal number. One derived from what one might call the pure theory of cardinal number based on Hume’s Principle and one derived from set theory as we shall see below. However it is possible to dissolve this worry and Wright has done so. He points out that it was a prime tenet of Frege’s view that numbers were the numbers of sortal concepts. Clearly the concept “self identical” is not a genuinely sortal concept so it is not an appropriate instance of the theorem above (**). Thus as Wright remarks in his reply to Boolos: Moral: just as not every object is suitable to determine a direction, so we should not assume without further ado that every concept—every entity an expression for which is an admissible substituted for the bound occurrences of the predicate letters in Hume’s Principle—is such as to determine a number. 11 Boolos (1987b), p. 197. Page reference to the reprint in Boolos (1998). 12 Boolos (1997), p. 314. Page reference to the reprint in Boolos (1998).
He goes on to say:

So self-identity is not a sortal concept. If we take it that, save where F is assured an empty extension on purely logical grounds, only sortal concepts, and concepts formed by restricting a mere predicable to a sortal concept, have cardinal numbers, it follows that there is no universal number. 13

So the function * is not to be regarded as having as its domain all concepts; rather there is to be a natural and principled restriction on its domain of interpretation. It is to be confined to sortal concepts. However, it is now unclear whether we can any longer regard Hume’s Principle as a purely logical abstraction principle, since it is not clear that the notion of being a sortal concept can be expressed in purely logical vocabulary. But as Wright was quick to note, if we do accept the need to restrict the domain of the function * to genuinely sortal concepts then a second difficulty looms, and to some of us it looms very large.
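The model of Hume’s Principle over the natural numbers described earlier in this section, in which every infinite concept is assigned zero and every finite concept its cardinality plus one, can be sketched in Python for a fragment of the concepts. The sketch is an illustration under loud assumptions: only concepts with finite or cofinite extension are represented, the names `Concept`, `number_of` and `equinumerous` are hypothetical, and the identification of the model’s finite numbers with the objects 1, 2, 3, ... is an inference from the description of the model rather than something stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A concept over the natural numbers with finite or cofinite extension.

    `members` is the extension when `cofinite` is False, and the (finite)
    complement of the extension when `cofinite` is True.
    """
    members: frozenset
    cofinite: bool = False


def number_of(c):
    # Infinite concepts get 0; a finite concept gets its cardinality plus one.
    return 0 if c.cofinite else len(c.members) + 1


def equinumerous(c, d):
    # Every cofinite subset of N is countably infinite, so any two of them
    # are equinumerous; finite extensions are compared by size.
    if c.cofinite or d.cofinite:
        return c.cofinite and d.cofinite
    return len(c.members) == len(d.members)


empty = Concept(frozenset())                      # x != x
self_identical = Concept(frozenset(), True)       # x = x
finite_numbers = Concept(frozenset({0}), True)    # the objects 1, 2, 3, ...
samples = [empty, self_identical, finite_numbers,
           Concept(frozenset({4})), Concept(frozenset({1, 2})),
           Concept(frozenset({7, 9})), Concept(frozenset({3}), True)]

# Hume's Principle on the sampled concepts.
hume_holds = all(
    (number_of(c) == number_of(d)) == equinumerous(c, d)
    for c in samples for d in samples
)
```

In this fragment the number of the self-identical things and the number of the finite numbers are the same object, which is why the theory cannot prove them distinct. The sketch is silent on infinite, co-infinite concepts and on the second model mentioned in the text.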
3. The good company objection
Let us follow Wright’s lead and consider the sortal concepts “set”, “ordinal” and “cardinal number”. According to Hume’s Principle these concepts too ought to have a number associated with them by (**). But then we immediately invite the observation that there is bound to be conflict with the notion of number as embodied in set theory, say ZFC. According to ZFC there are no numbers associated with the concepts “set”, “ordinal” and “cardinal number” precisely because the extensions of such concepts do not form sets and ZFC embodies the principle: no set no cardinal number. The collections we have been considering are proper classes, not sets, so there is no number associated by ZFC (or set theory) with them. Considerations such as these suggested to Boolos that there were two conflicting conceptions of number in play. He remarks: 14 Two thoughts about the concept of number are incompatible: that any zero or more things have a (cardinal) number, and that any zero or more things have a number (if and) only if they are the members of some one set. It is Russell’s paradox that shows the thoughts incompatible: the sets that are not members of themselves cannot be the members of any one set. The thought that any (zero or more) things have a number is Frege’s; the thought that things have a number only if they are the members of a set may be Cantor’s and is in any case a commonplace of the usual contemporary presentations of the set theory that originated with Cantor and has become ZFC.
In a similar vein he elaborated on the issue: 15 The worry is that the theory of number we have been considering, Frege Arithmetic, is incompatible with Zermelo–Fraenkel set theory plus standard 13 Wright (2000) “Is Hume’s principle Analytic?” in the Notre Dame Journal of Formal Logic. 14 Boolos (1995) “Frege’s Theorem and the Peano Postulates”, in Boolos (1998), p. 291. 15 Boolos (1997), p. 314. Page reference is to Boolos (1998).
definitions, on the usual and natural readings of the non-logical expressions of both theories. To be sure, as Hodes once observed in conversation, if *a is taken to denote the cardinal number of a when a is a set and some favourite object that is not a cardinal number when a is a proper class, then HP will be a theorem of Von Neumann set theory. But on that definition of *, *will not be translatable as ‘the cardinal number of.’ ZF and Frege arithmetic make incompatible assertions concerning what cardinal numbers there are. And of course, the response ‘Well, these are just formalisms; the question of their truth or falsity doesn’t arise or makes no sense’ is hardly available to one claiming that HP is analytic, i.e., an analytic truth. So one who seriously believes that has to be bothered by the incompatibility of the consequence of Frege arithmetic that there is such a number as anti-zero with the claim made by ZF + standard definitions (on the natural reading of its primitives) that there is no such number.
Since to be in the company of set theory is to be in very good company indeed, let us call this objection to the stipulatory nature of Hume’s Principle, the Good Company objection. Now there is a clear response which can be made by someone who wishes to defend the idea that the truth of Hume’s principle can be simply stipulated and that is that the conflict alluded to above is illusory. 16 After all set theory assigns numbers to sets, but Frege arithmetic assigns numbers to concepts. Frege arithmetic assigns a cardinal number to the Russell concept “non-self membered set” but in virtue of the way cardinal numbers are introduced in ZFC, no cardinal number can be assigned to a nonset. (It is worth recalling that in ZFC if X is any set, then there is an ordinal number α and a bijection f: α →X. For any set X the cardinality of X is the least ordinal α such that there is a bijection f: α →X. A cardinal number is an ordinal number α such that for no β < α is there a bijection f: β → α.) The response to the Good Company objection in short then is Frege arithmetic assigns a number to proper classes, ZFC is silent. In any case certainly there is no conflict. But this dismissal of the good company objection is too swift. Let us go back to the first quotation from Boolos. He rightly points out that two conceptions of number are incompatible. The first conception is embodied in Hume’s Principle and says that every sortal concept has a number and the other says that every sortal concept has a number if and only if the extension of that concept is a set. Call these the Fregean and non-Fregean conceptions respectively. Clearly Russell’s paradox does indeed show that these two conceptions are incompatible. “Non-self membered set” is certainly a sortal concept and so has a number, by the Fregean conception. 
But it is provable that the concept “non-self membered set” has no set as its extension, so on the second conception, by the contrapositive of the only-if clause, there can be no number corresponding to the concept. Now imagine someone, whom we had better call “Anti-Hero” or simply “Villain” (in this context), who believes this: the conception of number we have is revealed in mathematical practice, and he holds to the non-Fregean concept of number. Is it conceivable that he holds an analytically false belief, or a belief that a stipulation may show to be false? Our grasp of number is what we do with it in mathematics. Now it is certainly true that the non-Fregean concept of number outlined above is entirely adequate for the whole of mathematics, since it is realised by ZFC (though ZFC does not entail it). The extra content which Anti-Hero in fact believes over ZFC is the claim that the mathematical universe is exhausted by ZFC. That claim certainly seems believable. What it certainly does not seem to be is analytically false, and it seems to have enormous evidence in its favour. It may seem that Anti-Hero is merely legislating on the content of all possible mathematics, and that can’t be right. But Anti-Hero doesn’t believe that his view is analytically true; what he certainly believes is that it is not analytically false. Perhaps it was something like this that Boolos had in mind when he wrote of Anti-Hero’s view that “It is in any case a commonplace of the usual contemporary presentations of the set theory that originated with Cantor and has become ZFC”. Clearly this matter must be connected with Boolos’s rejection of the notion of proper class. He took the view that proper classes are in fact just “a manner of speaking”, in the sense that they really play the role of abbreviations whose use can always be eliminated in any formal theory by replacing them by their defining formulas. This latter view, I think, is directly connected with his support for the iterative conception of set and the claim that ZFC exhausts that iterative hierarchy. Whatever may be the case about this, it does seem to me that Boolos’s Good Company objection is a compelling one, for the set-theoretic conception of number is a perfectly viable one and it surely cannot be rejected as analytically false.
16 This objection was put to me by Stewart Shapiro and Fraser MacBride.
A possible response would be to say that there are two conceptions of cardinal number, one for concepts and one for sets. But then what would have become of the claim of the Fregean foundational programme to be the correct ontological and epistemological account of the nature of, and of our knowledge of, the cardinal numbers? We would then seem to be left with an impenetrable problem about reference: we have an account of Frege numbers and we have an account of “set” numbers, but neither is apparently related to the other or to the numbers of ordinary arithmetic. That response would amount to abandoning the foundational programme and would hardly be acceptable to the neo-Fregean.
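The two conceptions contrasted in this section can be set side by side schematically. What follows is a sketch only: the symbol ≈ for equinumerosity, the predicate Set and the notation |X| are labels assumed here for illustration, while ∗ is the number operator used in this chapter.

```latex
% Fregean conception (Hume's Principle): every sortal concept has a number.
\[ \forall F\,\forall G\,\bigl(\ast F = \ast G \leftrightarrow F \approx G\bigr) \]

% Non-Fregean conception: F has a number iff its extension is a set.
\[ \exists n\,(n = \ast F) \;\leftrightarrow\;
   \exists s\,\bigl(\mathrm{Set}(s) \wedge \forall x\,(x \in s \leftrightarrow Fx)\bigr) \]

% ZFC cardinality of a set X: the least ordinal in bijection with X.
\[ |X| = \min\{\alpha \in \mathrm{Ord} : \text{there is a bijection } f\colon \alpha \to X\} \]

% Russell concept: Rx iff x is a set and x \notin x.
% No set s satisfies \forall x\,(x \in s \leftrightarrow Rx), so the first
% conception assigns R a number and the second assigns it none.
```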
4.
Indefinitely extensible concepts again—proper classes
Perhaps it was some considerations such as these which led Wright to endorse the principle at the heart of the non-Fregean concept of number as characterised above. He writes: 17
Grant the plausible principle that there is a determinate number of F’s just provided the F’s compose a set
and later
. . . it seems natural and well-motivated to suppose that the F’s should have a determinate cardinal number just when they compose a set
17 Wright (2000), reprinted in Hale and Wright (2000), pp. 307–32. Reference is to p. 314.
If this is the line to follow, then what we are looking for is some further principled restriction on the range of the quantifiers in Hume’s Principle, over and above the restriction to sortal concepts. Wright proposes that the concepts falling under the range of the universal quantifiers in Hume’s Principle should be restricted to those which are “definite”. Recall that a concept is definite just when it is not “indefinitely extensible”. One sense of indefinite extensibility goes back to Russell and Poincaré. Russell, in 1903, had thought that the contradiction derivable from Basic Law V of Frege’s Grundgesetze showed that not every property determines a class simpliciter. 18 The fundamental question as he saw it was then “to determine, which propositional functions define classes which are single terms as well as many, and which do not?”. By 1906, after reading Poincaré, he had changed his view. He wrote: “the contradictions result from the fact that, according to current logical assumptions, there are what we may call self-reproductive processes and classes. That is, there are some properties such that, given any class of terms all having such a property, we can always define a new term also having the property in question.” 19 In formulating the Vicious Circle Principle he made a very similar claim: “Thus all our contradictions have in common the assumption of a totality such that, if it were legitimate, it would at once be enlarged by new members defined in terms of itself”. 20 Both the Russell of this period and Poincaré would have agreed that the objects falling under such self-reproductive properties as ordinal or set form no totality. As is very well known, Dummett has recently revived this idea. 21
As Wright remarks of the idea: 22
I do not, myself, know how best to sharpen this idea, still less how its best account might show that Dummett is right both to suggest that the proof-theory of quantification over indefinitely extensible totalities should be uniformly intuitionistic and that the fundamental classical mathematical domains, like those of the natural numbers, or the reals, should also be regarded as indefinitely extensible. But Dummett could be wrong about both those points and still be emphasizing an important insight concerning certain very large totalities: ordinal number, cardinal number, set, and indeed ‘absolutely everything’.
So the idea that Wright is proposing is, I take it, this. We should restrict the range of the quantifiers in Hume’s Principle to definite concepts, that is, those that are not indefinitely extensible. But one has to be careful here, for there are at least two conceptions of indefinite extensibility, if there are any. One is very closely related to the Russell–Wright line of thought, which sees indefinite extensibility as associated with concepts generating or holding of very large totalities, in effect with proper classes (in the presence of strong versions of the axiom of choice all the proper classes we have been considering are equinumerous with the Universe). The other is the one Dummett has developed, on which both N and R are indefinitely extensible. Deploying Dummett’s conception will of course have essentially the same effect as deploying a suggestion due to Heck 23 to the effect that the analytic core of Hume’s Principle is Finite Hume. This is the principle that
(∀F)(∀G)(Fin(F) & Fin(G) → (∗F = ∗G ↔ F 1–1 G))
where the notion of finite can be spelled out in purely logical vocabulary (as the negation of the second-order sentence expressing that F is (Dedekind) infinite). This is of course not in the form of an abstraction principle and so is unacceptable as an implicit definition for the neo-Fregean. The closely related principle
(∀F)(∀G)(∗F = ∗G ↔ ((Not Fin(F) & Not Fin(G)) ∨ F 1–1 G))
although it is an abstraction principle (the right-hand side being an equivalence relation), would not suffice either, since it conflicts with the Cantorian conception of cardinal number by assigning the same number to all infinite concepts. However, what would seem to capture Wright’s restriction is precisely the notion of set as opposed to proper class. I take it that what we want to say is that N and R are definite while “set”, “ordinal” and “cardinal number” are not. Can such a notion of definite be made out? Well, of course it can: the natural candidate is that a concept F is definite iff it has a set as its extension. This will do exactly what Wright wants.
18 Russell (1903).
19 Russell (1906), p. 144.
20 Russell (1908), p. 63.
21 Dummett (1991), (1963) and Clark (1998).
22 Wright (2000), p. 316 of Hale and Wright (2000).
But it would make the understanding of Hume’s Principle parasitic upon the notion of set and our grasp of the set-theoretic Universe. It could hardly then be argued that the truth of Hume’s Principle was guaranteed by stipulation. We have already noted that in the presence of a very strong choice principle the concepts we wish to exclude from the domain of Hume’s Principle are equinumerous with the Universe. We could use this fact to give an independent justification of Wright’s New Hume, which would have the form
(∀F)(∀G)(∗F = ∗G ↔ ((InDef(F) & InDef(G)) ∨ F 1–1 G))
where InDef(F) would be a condition stating that F is equinumerous with the Universe. This would still be a logical abstraction principle, provided that the condition InDef(F) were expressed by the formula:
(∃R)((∀x)(Fx → (∃!y)Rxy) & (∀y)(∃!x)(Fx & Rxy)) 24
However one proceeds here, there looks to be a dilemma: either restrictions on the range of the quantifiers in Hume’s Principle are actually motivated by set-theoretic considerations, or some notion of definiteness can really be made out which does not itself rely upon set existence principles. Either would serve as a principled restriction, but in either case clearly something other than the notion of (sortal) concept and of second-order logic is motivating our understanding of Hume’s Principle and thus of our knowledge of number. Whichever route is the correct one, it seems hardly possible to regard the laying down of such principles as guaranteeing their truth by stipulation.
23 Heck (1997c).
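The remark that New Hume is still an abstraction principle rests on its right-hand side being an equivalence relation on concepts. A brief check follows, offered as a sketch: the symbol ∼ for the right-hand side, V for the universal concept and ≈ for one-one correspondence are abbreviations assumed here, and InDef(F) is read, as in the text, as saying that F is equinumerous with the Universe.

```latex
% Assumed notation: F \approx G for one-one correspondence, V for the universal
% concept, InDef(F) read as F \approx V.
\[ F \sim G \;:\Longleftrightarrow\;
   \bigl(\mathrm{InDef}(F) \wedge \mathrm{InDef}(G)\bigr) \vee F \approx G \]

% Reflexivity:  F \approx F, hence F \sim F.
% Symmetry:     each disjunct is symmetric in F and G.
% Transitivity: suppose F \sim G and G \sim H. Four cases:
%   (a) F \approx G and G \approx H:              then F \approx H.
%   (b) InDef(F), InDef(G) and G \approx H:       then H \approx G \approx V, so InDef(H).
%   (c) F \approx G and InDef(G), InDef(H):       then F \approx G \approx V, so InDef(F).
%   (d) InDef(F), InDef(G), InDef(H):             immediate.
% In every case F \sim H.
```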
5.
Frege and the application of arithmetic
It is certainly true that Frege put the application problem at the heart of his philosophy of mathematics. To the question “why does arithmetic apply to reality?” the logicist provides the clear answer: because it applies to everything that can be thought. It is the most general science possible. The partial contextual definition provided by Hume’s Principle, and the fundamental thought that numerical concepts are second-level concepts, yield Frege’s account of the applicability of mathematics. In the simplest case for which the question arises, the application of the cardinal numbers, the solution is that arithmetic is applicable to reality because the concepts under which things fall themselves fall under numerical concepts. Thus it is possible to prove in second-order logic that ∃ₙxFx (that is, F falls under the numerical property expressed by the numerically definite quantifier ∃ₙx) if and only if the Frege numeral introduced by the partial contextual definition (Hume’s Principle) is indeed n. In other words, the theorem that
∃ₙxFx ≡ n = NxFx
can be obtained from Hume’s Principle in second-order logic. But there is a real difficulty with Frege’s solution to the application problem, and that is that we are provided by Hume’s Principle with at best a partial contextual definition. The principle cannot settle the truth conditions of sentences of the form q = NxFx where q is not given in the form NxHx for some H; this of course is the famous Julius Caesar problem. In the case of pure arithmetic the Julius Caesar problem can properly be regarded as irrelevant, since there would be no other singular terms in the language. But once the language is extended to include empirical singular terms, as it would be in the language of applied arithmetic, Hume’s Principle will no longer settle the sense of numerical identities and the solution to the application problem will fail. Of course this issue does not arise in the formal language of the Grundgesetze, which is a language of pure arithmetic, since in that language all the objects it is possible to refer to are already given as extensions (value ranges), the identity conditions for which are given by Basic Law Five. But as Michael Potter has put it: 25 “a formal language in which Julius Caesar cannot be spoken of is one in which he cannot be counted, and in such a language the applicability of arithmetic remains unexplained. At some stage in the development we shall have to extend the formal language by adding some empirical vocabulary, and we shall then have to address the Julius Caesar problem just as before.” The question naturally arises as to whether the neo-logicist fares any better than Frege with respect to the application problem. This seems very unlikely, since the neo-logicist 26 relies exclusively upon Hume’s Principle, and therein lies the real difficulty, as Frege long ago knew. In the same letter which I quoted at the beginning of this paper he wrote to Russell the following about the idea of letting his programme rest on Hume’s Principle alone: 27
We can also try the following expedient, and I hinted at this in my Foundations of Arithmetic. If we have a relation Φ(ξ, ζ) for which the following propositions hold: (1) from Φ(a, b) we can infer Φ(b, a), and (2) from Φ(a, b) and Φ(b, c) we can infer Φ(a, c); then this relation can be transformed into an equality (identity), and Φ(a, b) can be replaced by writing, e.g., “§a = §b”. If the relation is, e.g., that of geometrical similarity, then “a is similar to b” can be replaced by saying “the shape of a is the same as the shape of b”. This is perhaps what you call “definition by abstraction”. But the difficulties here are not the same as in transforming the generality of an identity into an identity of range of values.
24 Boolos suggested that we might regard the principle
(∀F)(∀G)(EXT(F) = EXT(G) ↔ ((InDef(F) & InDef(G)) ∨ (∀x)(Fx ↔ Gx)))
as a repair of Basic Law Five. This might well then be used as a justification of set theory based upon the Principle of the Limitation of Size. But, as with that justification, the power set axiom might well prove a difficulty when taken together with the separation schema. Knowing, for example, that N is not “too big” does not help us with the claim that P(N) is not too big. Similarly, knowing that X is definite might tell us very little or nothing at all about the definiteness of P(X). The effect of introducing New Five has been very carefully studied by Shapiro and Weir (see Shapiro and Weir (1999)).
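The theorem cited in this section, that ∃ₙxFx holds just in case n = NxFx, presupposes a definition of the numerically definite quantifiers. The text does not spell one out; the recursion below is the standard one and is given here as an assumption about what is intended, with the first instance of the theorem unfolded.

```latex
\begin{align*}
  \exists_0 x\,Fx     &\;\equiv\; \neg\exists x\,Fx \\
  \exists_{n+1} x\,Fx &\;\equiv\; \exists y\,\bigl(Fy \wedge \exists_n x\,(Fx \wedge x \neq y)\bigr)
\end{align*}
% Instance n = 0, taking 0 to be Nx(x \neq x):
%   \exists_0 x\,Fx  iff  nothing falls under F
%                    iff  F is equinumerous with the concept [x : x \neq x]
%                    iff  NxFx = Nx(x \neq x) = 0     (by Hume's Principle).
```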
25 Potter (2000), p. 108.
26 Hale and Wright treat the Julius Caesar problem very seriously indeed; they say of it that it is “one of the hardest the neo-Fregean must solve” (Hale and Wright 2000, pp. 14–16). They devote the whole of essay 14 of their (2000) (pp. 335–396) to the topic.
27 Frege (1902), p. 141.
References
Boolos, G. (1987a), “Saving Frege from Contradiction”, Proceedings of the Aristotelian Society 87 (1987), pp. 137–51.
Boolos, G. (1987b), “The Consistency of Frege’s Foundations of Arithmetic”, in Thomson (1987), pp. 3–20.
Boolos, G. (1990a), “The Standard of Equality of Numbers”, in Boolos (1990b), pp. 261–77.
Boolos, G. (Ed.) (1990b), Meaning and Method: Essays in Honor of Hilary Putnam (Cambridge: Cambridge University Press, 1990).
Boolos, G. (1993), “Whence the Contradiction?”, Proceedings of the Aristotelian Society, Supp. Vol. 67 (1993), pp. 213–33.
Boolos, G. (1997), “Is Hume’s Principle Analytic?”, in Heck (1997a), pp. 245–62.
Boolos, G. (1998), Logic, Logic and Logic (Cambridge MA: Harvard University Press, 1998).
Boolos, G. & Heck, R. (1998), “Die Grundlagen der Arithmetik §§82–3”, in Schirn (1998), pp. 407–28.
Brandl, J. & Sullivan, P. (Eds.) (1998), New Essays on the Philosophy of Michael Dummett (Vienna: Rodopi, 1998).
Burgess, J. P. (1984), Review of Wright [1983], Philosophical Review 93 (1984), pp. 638–40.
Clark, P. (1998), “Dummett’s Argument for the Indefinite Extensibility of Set and Real Number”, in Brandl & Sullivan (1998), pp. 51–63.
Demopoulos, W. (Ed.) (1995), Frege’s Philosophy of Mathematics (Cambridge: Harvard University Press, 1995).
Demopoulos, W. (1998), “The Philosophical Basis of our Knowledge of Number”, Noûs 32 (1998), pp. 481–503.
Demopoulos, W. (2000), “On the Origin and Status of Our Conception of Number”, Notre Dame Journal of Formal Logic 41 (2000), pp. 210–26.
Dummett, M. (1963), “The Philosophical Significance of Gödel’s Theorem”, Ratio 5 (1963), pp. 140–55.
Dummett, M. (1991), Frege: Philosophy of Mathematics (London: Duckworth, 1991).
Fine, K. (1998), “The Limits of Abstraction”, in Schirn (1998), pp. 503–629.
Frege, G. (1879), Begriffsschrift (Halle: L. Nebert, 1879).
Frege, G. (1884), Die Grundlagen der Arithmetik (Breslau: W. Koebner, 1884); reprinted with English translation by J. L. Austin as The Foundations of Arithmetic (Oxford: Blackwell, 1950).
Frege, G. (1893), Die Grundgesetze der Arithmetik vol. 1 (Jena: H. Pohle, 1893), part translated into English by Montgomery Furth in The Basic Laws of Arithmetic (Berkeley: University of California Press, 1964).
Frege, G. (1902), Letter XV7 [xxxvi/7] in Frege (1980), pp. 139–42.
Frege, G. (1980), Philosophical and Mathematical Correspondence, ed. G. Gabriel, 1980.
Hale, R. and Wright, C. (2000), The Reason’s Proper Study (Oxford: Oxford University Press, 2001).
Heck, R. Jr. (1993), “The Development of Arithmetic in Frege’s Grundgesetze der Arithmetik”, Journal of Symbolic Logic 58 (1993), pp. 579–601.
Heck, R. Jr. (1995), “Frege’s Principle”, in J. Hintikka (ed.), From Dedekind to Gödel, 1995, pp. 119–42.
Heck, R. Jr. (1997a), Language, Thought and Logic (Oxford: Oxford University Press, 1997).
Heck, R. Jr. (1997b), “The Julius Caesar Objection”, in Heck (1997a), pp. 273–308.
Heck, R. Jr. (1997c), “Finitude and Hume’s Principle”, Journal of Philosophical Logic 26 (1997), pp. 589–617.
Hodes, H. (1984), “Logicism and the Ontological Commitments of Arithmetic”, Journal of Philosophy 81 (1984), pp. 123–49.
Parsons, C. (1965), “Frege’s Theory of Number”, in Mathematics to Philosophy, pp. 150–75.
Potter, M. (2000), Reason’s Nearest Kin (Oxford: Oxford University Press, 2000).
Russell, B. (1903), The Principles of Mathematics (London: George Allen and Unwin, 1903).
Russell, B. (1906), “On some difficulties in the theory of transfinite numbers and order types”, reprinted in D. Lackey (ed.), Bertrand Russell: Essays in Analysis (London: George Allen and Unwin, 1973), pp. 135–64.
Russell, B. (1908), “Mathematical logic as based on the theory of types”, reprinted in R. C. Marsh (ed.), Logic and Knowledge (London: George Allen and Unwin, 1956), pp. 59–102.
Schirn, M. (1998), Philosophy of Mathematics Today (Oxford: Clarendon Press, 1998).
Shapiro, S. and Weir, A. (1999), “New V, ZF and Abstraction”, Philosophia Mathematica 1999, pp. 293–321.
Shapiro, S. and Weir, A. (2000), “Neo-logicist Logic is not epistemically innocent”, Philosophia Mathematica, pp. 160–89.
Smiley, T. (1981), “Frege and Russell”, Epistemologica, 1981, pp. 51–6.
Wright, C. (1983), Frege’s Conception of Numbers as Objects (Aberdeen: Aberdeen University Press, 1983).
Wright, C. (1997), “The Philosophical Significance of Frege’s Theorem”, in Heck (1997a), pp. 201–45.
Wright, C. (1998a), “On the Harmless Impredicativity of N=”, in Schirn (1998), pp. 339–68.
Wright, C. (1998b), “Response to Dummett”, in Schirn (1998), pp. 389–406.
Wright, C. (2000), “Is Hume’s Principle Analytic”, Notre Dame Journal of Formal Logic, 40 (2000), pp. 6.
FINITUDE AND HUME’S PRINCIPLE 1
Richard G. Heck, Jr
Brown University, Providence RI, U.S.A.
Abstract The paper formulates and proves a strengthening of ‘Frege’s Theorem’, which states that axioms for second-order arithmetic are derivable in second-order logic from Hume’s Principle, which itself says that the number of Fs is the same as the number of Gs just in case the Fs and Gs are equinumerous. The improvement consists in restricting this claim to finite concepts, so that nothing is claimed about the circumstances under which infinite concepts have the same number. ‘Finite Hume’s Principle’ also suffices for the derivation of axioms for arithmetic and, indeed, is equivalent to a version of them, in the presence of Frege’s definitions of the primitive expressions of the language of arithmetic. The philosophical significance of this result is also discussed.
1.
Opening
In recent work, 2 George Boolos has, with an eye towards philosophical issues I shall discuss in Section 3, investigated the relative strengths of two sorts of systems of second-order arithmetic. The more familiar of these originates with the work of Dedekind and Peano; the less familiar, with that of Frege. Dedekind–Peano systems characterize the natural numbers in terms of properties of the sequence of natural numbers; these systems may be thought of as axiomatizations of finite ordinal arithmetic. The Fregean systems, on the other hand, characterize the natural numbers as finite cardinals. 3 Fundamental to such systems is an axiom specifying the condition under which two concepts 4 have the same cardinal number, together with another specifying under what conditions a cardinal number is finite. What is perhaps the most familiar (second-order) Dedekind–Peano system is axiomatized as follows:
1. N0
2. Nx & Pxy → Ny
3. ∀x∀y∀z(Nx & Pxy & Pxz → y = z)
4. ∀x∀y∀z(Nx & Ny & Pxz & Pyz → x = y)
5. ¬∃x(Nx & Px0)
6. ∀x(Nx → ∃y Pxy)
7. ∀F[F0 & ∀x∀y(Fx & Pxy → Fy) → ∀x(Nx → Fx)]
Let us call this system PA2 (for second-order Peano arithmetic). I have here formulated its axioms using a relational expression ‘Pξη’, rather than the more usual functional expression ‘Sξ’, to facilitate comparison with Fregean systems. The most familiar Fregean system has but one ‘non-logical’ axiom, Hume’s Principle, which states that the number of Fs is the same as the number of Gs just in case the Fs and Gs are in one–one correspondence. Taking ‘Eqx(Fx; Gx)’ to abbreviate one of the (many equivalent) second-order formulae which define ‘the Fs correspond one–one with the Gs’ (or, in Frege’s terminology, ‘the Fs are equinumerous with the Gs’), Hume’s Principle (HP) is then:
Nx : Fx = Nx : Gx ≡ Eqx(Fx; Gx)
The second-order theory whose sole non-logical axiom is HP is FA (for ‘Frege arithmetic’). Note that ‘Nx : Φx’ is a unary, second-level, term-forming operator: The result of substituting any formula (possibly containing further occurrences of ‘Nx : Φx’) for ‘Φx’ in ‘Nx : Φx’ is a term. The definition of finite or natural number can be given in different ways. In Frege’s work, 5 zero and the relation of predecession are defined and, famously, the finite numbers are defined as those to which zero stands in the weak ancestral of this relation. The necessary definitions are thus:
0 = Nx : x ≠ x
Pmn ≡ ∃F∃y[Fy & n = Nx : Fx & m = Nx : (Fx & x ≠ y)]
Frege defines the strong ancestral of a relation Rξη as follows:
R∗ab ≡ ∀F[∀z(Raz → Fz) & ∀x∀y(Fx & Rxy → Fy) → Fb]
1 This paper first appeared in the Journal of Philosophical Logic 26 (1997), pp. 589–617. Reprinted by kind permission of the editor and Springer Academic Publishers.
2 G. Boolos, “On the Proof of Frege’s Theorem”, in A. Morton and S. Stich, eds., Benacerraf and His Critics (Oxford: Blackwells, 1996), pp. 143–59.
3 For further discussion of this difference, see my “The Finite and the Infinite in Frege’s Grundgesetze der Arithmetik”, in M. Schirn, ed., Philosophy of Mathematics Today (Oxford: Oxford University Press, 1998), §5, pp. 429–66.
4 I shall use this term to denote whatever are in the range of the second-order variables. Though my choice of terminology certainly suggests a view about what these are, my remarks here do not depend upon it. It is, of course, essential to the logicist project that second-order logic is logic, but this is not at issue among those whose positions we shall be discussing.
5 See, of course, G. Frege, The Foundations of Arithmetic, 2nd ed., trans. by J. L. Austin (Evanston, IL: Northwestern University Press, 1953), §§74, 76, 83.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 61–84. © 2007 Springer.
And he defines the weak ancestral of Rξη thus:
R∗=ab ≡ R∗ab ∨ a = b
Frege’s definition of natural number is then: n is a natural number just in case P∗=0n. There are other ways to proceed, however. In sections K and of Grundgesetze der Arithmetik, Frege formulates a purely second-order definition of finitude, to state which we need an additional definition: 6
Btwxy(Rxy; a; b)(n) ≡ ∀x∀y∀z(Rxy & Rxz → y = z) & ¬R∗bb & R∗=an & R∗=nb
Thus, n is between a and b in the R-series if, and only if, Rξη is a functional relation, in whose strong ancestral b does not stand to itself (i.e., there is no ‘loop’ from b to b), such that a stands in the weak ancestral of Rξη to n, which in turn stands in the weak ancestral of Rξη to b. Frege’s definition of finitude is then: 7
Finitex(Fx) ≡ ∃R∃x∃y∀z[Fz ≡ Btw(R; x; y)(z)]
That is: A concept is finite just in case the objects falling under it may be ordered in a certain way, namely, as the objects between x and y in the R-series, for some R, x and y. That this definition is correct follows from the central theorems of sections K and of Grundgesetze der Arithmetik, which are Theorems 327 and 348 of Grundgesetze:
(327) Finite(F) → P∗=(0, Nx : Fx)
(348) P∗=(0, Nx : Fx) → Finite(F)
Thus, a concept is finite, in Frege’s sense, just in case its number is a natural number. Frege’s definition of natural number could, therefore, be replaced by:
N(n) ≡ ∃F[Finite(F) & n = Nx : Fx]
Of course, this definition will be adequate only in a theory strong enough to prove Theorems 327 and 348. 8 Analogues of these theorems are the crucial lemmas in the proofs of the main result of this paper (see Lemmas 3.1, 3.11, and 3.21). As we shall see, given Frege’s definitions of ‘0’ and ‘Pξη’, Theorem 327 becomes a theorem of second-order logic. The proof of Theorem 348, however, must rely upon additional assumptions, for without additional assumptions, it is consistent that all concepts have the number 0 (and, of course, it is consistent that some of these are not finite).
In investigating the relative strengths of Dedekind–Peano and Fregean systems, there are two sorts of questions one might raise. First, one might inquire about the relative consistency of such theories. To ask whether FA is consistent relative to PA2 is to ask whether the consistency of FA would follow from that of PA2. One familiar sort of proof that it would consists in a demonstration that FA can be relatively interpreted in PA2. Roughly speaking, to interpret FA in PA2 is to give definitions of the primitives of FA in terms of the primitives of PA2, which definitions, when added to PA2, allow one to prove relativizations of the axioms of FA in PA2: By a relativization of a formula is meant, as usual, the result of restricting quantifiers occurring in the formula by means of some formula of PA2. 9 If FA can be interpreted in PA2, it follows immediately that, if there is a proof of a contradiction in FA, that proof can be mimicked in PA2, so that, if PA2 is (syntactically) consistent, so is FA. As it turns out, FA and PA2 are equi-interpretable (each can be interpreted in the other) and so equi-consistent (an inconsistency in either would imply an inconsistency in the other). Still, one might wonder whether FA is not, in some other sense, a stronger theory than PA2. This question is more easily understood when we have two theories formulated in the same language. Consider, for example, the following Dedekind–Peano system, which we shall call PAS (for ‘Strong’ Peano arithmetic):
1. N0
2. Nx & Pxy → Ny
3. ∀x∀y∀z(Pxy & Pxz → y = z)
4. ∀x∀y∀z(Pxz & Pyz → x = y)
5. ¬∃xPx0
6. ∀x(Nx → ∃yPxy)
7. ∀F[F0 & ∀x∀y(Fx & Pxy → Fy) → ∀x(Nx → Fx)]
Clearly, every axiom of PA2 is a theorem of PAS, but the converse does not hold. As far as the axioms of PA2 are concerned, zero could have as its predecessor Julius Caesar, so long as Caesar is not a natural number. Thus, PAS is strictly stronger than PA2. This is perfectly compatible with the fact that PA2 and PAS are equi-interpretable. (To interpret PAS in PA2, no ‘definitions’ are needed: Just restrict all the quantifiers in the axioms of PAS to the natural numbers, i.e., by the formula ‘Nx’.)
6 G. Frege, Grundgesetze der Arithmetik (Hildesheim: Georg Olms Verlagsbuchhandlung, 1966). The definition is given in §158 of volume I. I shall insert the bound variables, such as ‘x’ and ‘y’ on the left-hand side here, into the definitions, but will drop them when doing so causes no confusion.
7 Frege does not explicitly formulate any such definition, but it is clear from the theorems proven in sections K and that this is what he intends. For further discussion, see my “The Finite and the Infinite”, op. cit.
8 Frege proves it in the system FA + FD. As we shall see, it is also provable in FAF + FD (and so in PAF + FD).
9 Of course, in the context of second-order logic, one must restrict not only the first-order, but also the second-order, quantifiers. As we replace ‘∀x A(x)’ by ‘∀x[R(x) → A(x)]’, so we replace ‘∀F A(F)’ by ‘∀F{∀x[Fx → R(x)] → A(F)}’.
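The Julius Caesar example can be made precise with a small model. What follows is a sketch: the extra domain element c (standing in for Caesar) and the labels for the model and its interpretations are introduced here purely for illustration, and the second-order quantifiers are assumed to range over all subsets of the domain.

```latex
% Domain: the natural numbers together with one extra object c.
\[
  |\mathfrak{M}| = \mathbb{N} \cup \{c\}, \quad
  N^{\mathfrak{M}} = \mathbb{N}, \quad
  0^{\mathfrak{M}} = 0, \quad
  P^{\mathfrak{M}} = \{(n, n+1) : n \in \mathbb{N}\} \cup \{(c, 0)\}
\]
% PA2 holds in M:
%   axioms 3, 4 and 5 of PA2 constrain P only on objects satisfying N, and c
%   does not satisfy N; axioms 1, 2, 6 and 7 hold because P restricted to the
%   natural numbers is the usual successor relation.
% PAS fails in M:
%   axiom 5 of PAS is \neg\exists x\,Px0, but (c, 0) \in P^{\mathfrak{M}}.
% Hence axiom 5 of PAS is not a theorem of PA2.
```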
The question under discussion here concerns the proof-theoretic strength of the two systems. This question is harder to raise when the theories under discussion are not formulated in the same language: Obviously, the axioms of FA are not going to be theorems of PA2, since the axioms of FA are not even sentences in the language of PA2. Nor would expanding the language of PA2 to include such formulae help. To consider the relative strength of theories formulated in different languages, what we require is a bridge theory which relates (the referents of) the primitives of PA2 to those of FA. We can then ask whether, with the aid of one or another bridge theory, the theorems of FA can be proven in PA2. 10 One might wonder what difference there is between the question whether FA can be relatively interpreted in PA2, and the question whether the theorems of FA can be proven in PA2, with the aid of some bridge theory. For, one might ask, if FA can be relatively interpreted in PA2, will that not itself guarantee that there is some bridge theory with the aid of which the theorems of FA can be proven in PA2? namely, that theory whose axioms are exactly the definitions used in interpreting FA in PA2? The answer to this question is “No”. One must not overlook the fact that, in relatively interpreting one theory in another, it may be essential to relativize the axioms of the former theory: The usual relative interpretation of FA in PA2, for example, requires that the quantifiers occurring in Hume’s Principle be restricted to the natural numbers. There is no necessity that there should be a way of mimicking this restriction in any bridge theory, and there is certainly no need that any particular bridge theory should impose such a restriction. Our chief interest here is in the relative strength of various Fregean systems and various Dedekind–Peano systems. We thus must make use of a bridge theory which relates (the referents of) their primitives. 
The bridge theory in which we shall be interested is that which has the following three axioms:
0 = Nx : x ≠ x
Pmn ≡ ∃F∃y[Fy & n = Nx : Fx & m = Nx : (Fx & x ≠ y)]
Nn ≡ P∗=0n
This theory we shall call FD (for ‘Frege’s definitions’), since these are the definitions Frege uses in deriving axioms for arithmetic (in particular, those of PAS) in FA.
10 One might well wonder what such a bridge theory must be like, if the provability of the theorems of one system from those of another is to have the kind of interest it is here taken to have. I do not know how this question should be answered. Surely, however, it is sufficient if the axioms of the bridge theory are definitions of the primitives of one of the two theories in terms of the primitives of the other. The bridge theories we shall employ below are of this sort. The question we are considering is thus one of interpretability, rather than relative interpretability.
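To see the definitions in FD at work, it may help to unfold them in the simplest case. In the sketch below the numeral 1 is an abbreviation introduced here, not one of the primitives defined in FD, and the step marked as using extensionality assumes that coextensive concepts have the same number (which follows from Hume’s Principle).

```latex
% Abbreviation assumed for this sketch:
\[ 1 := Nx : x = 0 \]
% Claim: P01. Witnesses for the definition of P: F\xi := (\xi = 0) and y := 0.
\begin{align*}
  & F0                              && \text{since } 0 = 0 \\
  & 1 = Nx : Fx                     && \text{by the abbreviation} \\
  & 0 = Nx : (Fx \,\&\, x \neq 0)   && \text{by extensionality, since }
      \forall x\,[(x = 0 \,\&\, x \neq 0) \equiv x \neq x]
\end{align*}
% Hence P01 by the second axiom of FD. Since Rab implies R^{*}ab, and R^{*}ab
% implies R^{*=}ab, the third axiom then gives N1.
% That 0 \neq 1 needs more than FD: by Hume's Principle,
% Nx : x \neq x = Nx : x = 0 would require a one-one correspondence between
% no objects and exactly one object.
```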
2. The systems
We will here investigate the relative strengths of five different systems of arithmetic. The Dedekind–Peano systems at which we shall look are PA2 and PAS, mentioned above, and a third system, PAF, whose axioms are those of PA2 plus:

∀x∀y(Nx & Pyx → Ny)

This axiom, which we also call PAF (for ‘predecessors are finite’), states that any predecessor of a natural number is a natural number. 11 As we shall see, PAF is stronger than PA2 and weaker than PAS.

Before discussing the variations on FA at which we shall look, let me make a remark about the background logic in which we shall be working. In the case of the Dedekind–Peano systems, the logic is usually taken to be standard (axiomatic) second-order logic, with full, impredicative comprehension. In discussing FA and its relations to the Dedekind–Peano systems, however, it is convenient to take the logic also to contain the axiom Boolos calls FE, for ‘functional equivalence’:

∀x(Fx ≡ Gx) → Nx : Fx = Nx : Gx

This axiom is clearly valid on any extensional semantics for second-order logic and so should itself be regarded as a truth of (extensional, higher-order) logic. 12 The system whose axioms are those of second-order logic, plus FE, Boolos calls Log. We shall suppose our background logic, throughout, to be Log.

The Fregean systems at which we shall look are FA and a variation on it, in which Hume’s Principle has been weakened by restricting its range of application. The axiom is HPF (for Finite Hume’s Principle):

Finite(F) ∨ Finite(G) → [Nx : Fx = Nx : Gx ≡ Eqx(Fx; Gx)]

Here, the formula ‘Finite(F)’ may be defined via any of the equivalent second-order definitions of finitude: We shall take it to be defined as Frege defines it. HPF states that finite concepts have the same number if, and only if, they are equinumerous and that no infinite concept has the same number as any finite one, making no further claim about the conditions under which infinite concepts have the same number.
(For all that HPF says, all infinite concepts could have the same number, so long as no finite concept also has that number.) Call the theory whose sole non-logical axiom is HPF, FAF (for finite Frege arithmetic). 11 G. Boolos remarked to me that, when presenting “On the Proof of Frege’s Theorem”, he has heard it objected that PAF—or, more precisely, its consequence NPZ, to be mentioned below—cannot be true, since −1 surely precedes 0. But the theories in which we are interested here are theories of cardinal or ordinal numbers, and ‘Pξ η’ is defined as a relation between such numbers. Negative numbers are neither ordinals nor cardinals. 12 As should the axiom schema: ∀x(Fx ≡ Gx) → A(F) ≡ A(G).
Finitude and Hume’s Principle
The results of this paper may now be summarized in the following diagram:

FA ⇒ PAS ⇒ {PAF ⇔ FAF} ⇒ PA2

Here, ‘⇒’ means: Is strictly stronger than, relative to the bridge theory FD; that is, ‘A ⇒ B’ means that every theorem of B is a theorem of A + FD, but that not every theorem of A is a theorem of B + FD. That FAF and PAF occur together in the braces indicates that they are equivalent, relative to FD: Every theorem of PAF is a theorem of HPF + FD, and every theorem of FAF is a theorem of PAF + FD. What we need to prove are thus the following:

1. FA ⇒ PAS
2. PAS ⇒ PAF
3. PAF is equivalent to FAF (modulo FD)
4. PAF ⇒ PA2
Some of the required proofs have been discussed in detail by Boolos: I shall merely indicate how those proofs go. The main work of the present paper consists in establishing Theorem 3.
3. On the philosophical significance of these results
Before turning to the proofs, let me make a couple of remarks about the inspiration for the present investigation and about its philosophical implications. In Frege’s Conception of Numbers as Objects, Crispin Wright rediscovered Frege’s Theorem, which states that the axioms of PAS are provable in FA + FD, proved it in some detail, and conjectured that FA is consistent, which it turned out to be. 13 On the basis of this result, Wright not only revived Frege’s logicist project, but claimed that it was substantially vindicated by the proof of Frege’s Theorem. If logicism were to be vindicated completely, of course, HP would have to be shown to be a logical truth, which it certainly cannot be, given our contemporary understanding of ‘logical truth’. Nevertheless, Wright argued, HP is ‘analytic’, whence the truths of arithmetic are logical consequences of an analytic truth and so, presumably, are themselves analytic. The sense in which HP is analytic is that, “even if inadequate as a definition, it nevertheless succeeds as an explanation; . . . it contrives to fix the meaning of the sorts of occurrence of [‘Nx : x’] which it fails to eliminate”. 14 In the paper mentioned at the outset, Boolos shows that FA is strictly stronger than PAS (and so PA2), relative to the bridge theory FD. His purpose is not primarily technical: He intends this to be one consideration in favor 13 C. Wright, Frege’s Conception of Numbers as Objects (Aberdeen: Aberdeen University Press, 1983). See especially Ch. 4. The consistency of the system was noted by Burgess, Hazen, and Hodes. Boolos later showed that FA and PA2 are equiinterpretable. For a proof, see Boolos and Heck, Jr., “Die Grundgesetze der Arithmetik §§82–3”, in Schirn, ed., pp. 407–28. 14 Wright, Frege’s Conception, p. 140. See also the statement of Number-theoretic Logicism (III) on p. 153.
of the view that, contra Wright, HP is neither ‘analytic’, nor a ‘conceptual truth’, nor any such thing. 15 Boolos does not explain in detail why his result should trouble Wright, but his point seems to be that, since FA is significantly stronger than PAS (which is itself a stronger theory even than PA2, which is itself a very strong theory), it is implausible to claim that HP is a conceptual truth. It is difficult, however, to evaluate the force of this consideration: As Boolos recognizes, Wright would likely reply that since his view is the view that arithmetic is analytic—and, indeed, that the general theory of cardinality which FA embodies is analytic—he is simply being accused of holding that very view. Still, there is a stronger consideration in the vicinity. For the additional proof-theoretic strength of FA, as compared to PA2, reflects a very real, and very large, conceptual gap between second-order arithmetic and the general theory of cardinality. W. W. Tait has pointed out that ‘Hume’s Principle’ is something of a misnomer: In the passage Frege cites when introducing it, Hume is speaking not of cardinality in general but only of the cardinality of finite concepts (or sets, or whatever). 16 As of course he was. Prior to Cantor’s work on transfinite numbers, the view that all equinumerous concepts have the same number, whether they are finite or infinite, was almost universally rejected, because it gives rise to antinomies: For example, it implies that the number of natural numbers is the same as the number of even numbers, and that can seem absurd, because there are lots of natural numbers which are not even— indeed, according to Cantor, as many numbers as there are natural numbers. Cantor’s realization that one can coherently suppose, even in the infinite case, that all and only equinumerous sets have the same cardinality constituted as enormous a conceptual advance as his introduction of transfinite numbers was a mathematical advance. 
It is easy to forget this, so at home are we initiates with Cantor’s ideas. But it is just as easy to be reminded of it: One has an opportunity every time a student wanders into one’s office puzzled about these very antinomies. Indeed, my own work on this very paper was fundamentally altered by just such an experience. A friend of mine—a professional philosopher, and so no fool— was telling me about an objection one of his students had raised in lecture. The student had insisted that there is only one ‘kind’ of infinity, and my friend had been tempted to reply (but wanted to check with me first) that of course there was more than one kind of infinity, since both the natural numbers and the even numbers are infinite, and the infinities in question certainly cannot be of the same kind. He was troubled by my response. Not just philosophically troubled, mind you, but really bothered: As I conveyed Cantor’s ideas to him, 15 See G. Boolos, “Is Hume’s Principle Analytic?”, in R. Heck, ed., Language, Thought and Logic: Essays in Honor of Michael Dummett (Oxford: Oxford University Press, 1997), pp. 245–62. 16 Private communication. The term ‘Hume’s Principle’ should not confuse, however: It came into common use not because anyone thought Hume scooped Cantor, but because Frege introduced HP in §63 of Grundlagen by quoting from Treatise I iii 1.
he kept saying, “That’s very worrying”, over and over again. I had to do a lot of explaining before he was again at ease. He made the leap, but my experience served to remind me how great a conceptual leap he made at that point—and so how great a conceptual leap Cantor himself had made. I am not going to argue that HP isn’t a conceptual truth: On that question I regard myself, to steal a phrase of David Wiggins’s, as a militant agnostic. My point is that, once one has recognized just how great a conceptual advance is required if one is to acknowledge the truth of Hume’s Principle, one can no longer accept that Frege’s Theorem has the sort of epistemological interest Wright and others have wanted it to have. What is required if logicism is to be vindicated is not just that there is some conceptual truth or other from which what look like axioms for arithmetic follow, given certain definitions: That would not show that the truths of arithmetic, as we ordinarily understand them, are analytic, but only that arithmetic can be interpreted in some analytically true theory. 17 To put the point differently, if we are so much as to evaluate logicism, we must first uncover the ‘basic laws of arithmetic’, laws which are not just sufficient to allow us to prove translations of arithmetical truths, but laws from which arithmetical truths themselves can be proven. (The distinction is not a mathematical one, but a philosophical one.) But, if these ‘basic laws’ are to be the basic laws of arithmetic, they had better be ones upon which ordinary arithmetical reasoning relies. If Frege’s Theorem is to have the kind of interest Wright suggests, it must be possible to recognize the truth of HP by reflecting on fundamental features of arithmetical reasoning—by which I mean reasoning about, and with, finite numbers, since the epistemological status of arithmetic is what is at issue. 
For what the logicist must establish is something like this: That there is, implicit in the most basic features of arithmetical thought, a commitment to certain principles, the (tacit) recognition of whose truth is a necessary precondition of arithmetical reasoning, and from which all axioms of arithmetic follow. Having identified these basic laws, we will then be in a position to discuss the question whether they are analytic, or conceptual truths, or what have you. What used to be my favorite argument for the analyticity of HP went roughly like this: HP is a conceptual truth, because it is part of the very concept of cardinality that equinumerous concepts have the same cardinal number. 18 Perhaps, but the argument overlooks the fact that, though this may be true of our present concept of cardinality, ‘we’ did not even have this concept of cardinality until about a 120 years ago. A recognition of the very coherence 17 If analysis were analytic, as Frege thought it was, then Euclidean geometry would be interpretable in an analytically true theory, via Cartesian co-ordinates. Are we to conclude that Frege’s position was inconsistent, since he held that geometry is not analytic, but synthetic a priori? Surely not. 18 Compare C. Wright, “On the Philosophical Significance of Frege’s Theorem”, in Heck, ed, pp. 201– 44. A somewhat different version of this claim is that, even if HP is not analytic of any preexisting concept of cardinality, it is perfectly in order to introduce such a concept by means of HP. This version of the claim becomes important in certain contexts.
of our present concept of cardinality requires the conceptual leap I discussed above, whence, even if HP is analytic of our present concept of cardinality, it is extremely odd to attempt to ground our knowledge of arithmetic, of all things, upon it. Moreover, there is demonstrably no way in which a recognition of the truth of HP can arise simply from reflection on the nature of ordinary arithmetical thought—not, that is, if the principles governing ‘ordinary arithmetical thought’ are captured by the axioms of PA2 (or even of PAS) and the outcome of ‘reflection’ is something that could be written down as a proof. That is what follows from the fact that HP is proof theoretically stronger than PAS (and so PAF and PA2). The disparity of strength parallels the conceptual disparity remarkably well—so well as to remind one why well-conceived technical investigations can be so philosophically fruitful. To summarize and emphasize: HP, conceptual truth or not, cannot be what underlies our knowledge of arithmetic. For no amount of reflection on the nature of arithmetical thought could ever convince one of HP, nor even of the coherence of the concept of cardinality of which it is purportedly analytic. Granted, any rationalist project of this sort will have to invoke a distinction between the ‘order of discovery’ and the ‘order of justification’. But the objection is not that Hume’s Principle is not known by ordinary speakers, nor that there was a time when the truths of arithmetic were known, but HP was not. It is that, even if HP is thought of as ‘defining’ or ‘introducing’ or ‘explaining’ our present concept of cardinality, the conceptual resources required if one is so much as to recognize the coherence of this concept (let alone HP’s truth) vastly outstrip the conceptual resources employed in arithmetical reasoning. Wright’s version of logicism is therefore untenable. Of course, this does not imply that no form of logicism is defensible. 
And careful examination of Boolos’s proofs itself reveals a way forward. The important observation is that the distinction between finitude and infinitude plays a major role in these proofs. Consider, for example, the sort of model Boolos uses to show that FA is stronger than PAS. Take the domain of the model to be the natural numbers, together with Caesar and Brutus. Given any term of the form ‘Nx : Fx’, assign it a value according to the following scheme:

Caesar, if there are infinitely many Fs and infinitely many non-Fs
Brutus, if there are infinitely many Fs, but only finitely many non-Fs
n, if there are exactly n Fs, for some natural number n

Interpret the primitives of PAS according to the ‘definitions’ of the bridge theory FD (thus guaranteeing that all axioms of the bridge theory are true in the model): Thus, ‘0’ denotes the number 0; ‘Nξ’ is true of x iff x is a natural number; and ‘Pξη’ is true of the pair <x, y> just in case either x = y = Caesar, or x = y = Brutus, or y = x + 1. It should be clear that the axioms of PAS are all true in this model. But HP is not: For example, ‘Even(ξ)’ having
been appropriately defined, ‘Nx : [Nx & Even(x)] = Nx : Nx’ will be false in the model—the former term denoting Caesar, the latter, Brutus—even though the evens are equinumerous with the natural numbers. The important point is that, in this model, Hume’s Principle fails only for infinite concepts. Indeed, as Boolos essentially observes, HPF holds in every model of PAS + FD: So, in any model for PAS + FD, Hume’s Principle will fail to hold, if it does, only because there are some equinumerous infinite concepts which are assigned different numbers. The natural technical question is then: Is there a reasonable Dedekind–Peano system which, in the presence of FD, is equivalent to FAF? The answer is that there is: Relative to FD, FAF is equivalent to PAF. Now, in Grundgesetze, Frege actually derives the axioms of PAS in FA + FD, and these proofs do exploit the full power of HP (since the axioms of PAS are not provable in FAF + FD). But Frege’s proofs can easily be adapted to yield proofs, in FAF + FD, of the axioms of PAF: One need only relativize certain of the formulae appearing in those proofs to the natural numbers. Frege’s development of arithmetic thus does not depend essentially upon (though it may have been psychologically impossible without) the conceptual advance of which I have been speaking. This is striking enough, but it is all the more so since my objections to Wright’s attempt to ground arithmetic on Hume’s Principle simply cannot be raised against an attempt to ground it on HPF. For HPF’s weakness, as compared to HP, reflects the conceptual distance between them, too. There are two points to be made here: First, that recognizing the truth of HPF does not require making the conceptual advance made by Cantor; and, secondly, that one can be convinced of the truth of HPF merely by reflection on ordinary arithmetical thought. 
To take the first point: Just as HP may be thought of as the sole axiom of a general theory of cardinal numbers, HPF may be thought of as the sole axiom of a theory of finite cardinals. And since HPF makes no claims whatsoever about the conditions under which infinite concepts have the same cardinality, 19 it will not give rise to any of the antinomies generated by HP, whence one does not need to make Cantor’s leap before one can accept the truth of HPF. Indeed, not only could HPF have been recognized as true prior to Cantor’s work, it almost universally was. Bolzano, who was famously skeptical about HP in the infinite case, accepted HPF, 20 as did just about everyone else who considered the matter. For all that HPF says is that, in the finite case, all and only equinumerous concepts have the 19 This would be all the more clear were HPF formulated in a logic which allowed partial functions, so that it was defined only for finite concepts. But working in such a logic would complicate matters quite unnecessarily. Such a formulation would also answer the objection that, in its present form, HPF does not have the form of a Fregean abstraction. 20 See B. Bolzano, Paradoxes of the Infinite, trans., by F. Prihonsky (London: Routledge and Kegan Paul, 1950), §§21–2. In §22, Bolzano gives an argument for HPF similar to the one to be given in the next two paragraphs.
same number—and who knows what we should say about the infinite ones, other than that none of them have got the same number as any of the finite ones. The second point is that this claim really is implicit in arithmetical reasoning and that one can convince oneself of its truth, come to understand why it is true, by (and perhaps only by) reflecting on basic aspects of arithmetical thought. Now, it is not initially obvious to what notion of finitude we might appeal in reflecting on our arithmetical thought and investigating whether a commitment to HPF is implicit in it. Nor is it clear whether that notion is itself a logical one. But I submit that the intuitive notion of a finite concept is that of one the objects falling under which can be counted, i.e., enumerated by means of some process which eventually terminates. Frege’s definition of finitude directly reflects this intuitive notion: For what the definition says is precisely that a concept is finite if, and only if, the objects falling under it can be ordered as a discrete sequence which has a beginning and an end. 21 Our intuitive notion of finitude can thus be straightforwardly transcribed into second-order logic—thereby showing, modulo the status of second-order logic itself, that this intuitive notion is a logical one. How then can one convince oneself of the truth of HPF? It suffices to realize that the process of counting, which lies at the root of our assignment of numbers to finite concepts, already involves the notion of a one–one correspondence: As Frege frequently points out, 22 to count is to establish a one–one correspondence between certain objects and an initial segment of a sequence of numerals, starting with ‘1’; the process ends with a numeral which names the number of objects counted. 
By the transitivity of ‘is equinumerous with’, concepts the objects falling under which are themselves equinumerous must be equinumerous with the same initial segments; 23 conversely, any concepts the objects falling under which can be put in one–one correspondence with the same initial segment must be equinumerous. So any two concepts the objects falling under which can be counted—i.e., any two finite concepts— will be assigned the same numeral by the process of counting—i.e., will have the same number—if, and only if, they are equinumerous. And, of course, no infinite concept will get assigned any number by the process of counting. That is enough to establish HPF. 21 That Frege intended his definition to correspond to this intuitive notion is, furthermore, clear from the way he proves Theorems 327 and 348 of Grundgesetze. See my discussion of this point in “The Finite and the Infinite”, op. cit. 22 See, e.g., G. Frege, “Review of E. G. Husserl, Philosophie der Arithmetik I”, in his Collected Papers, ed. B. McGuinness, trans., by H. Kaal (Oxford: Blackwell, 1984), p. 199, original page 319. 23 That there is only one such initial segment will follow from the finitude of the segments themselves, given Frege’s definition. For we shall be able to show that no distinct initial segments are in one– one correspondence. The proof will depend upon certain claims about the numerals themselves, claims corresponding to the axioms of PAF. See here Frege’s discussion of counting in Grundgesetze, Vol. I, §108. I should emphasize that the argument being given here can be formalized: Indeed, the proof of Theorem 3.1, below, can be read as a very rough formalization of it.
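The counting argument just given can be illustrated on a small scale. The following is a minimal sketch, not a formalization: it assumes that concepts are represented by finite Python sets, and the names `count`, `equinumerous`, `finite_concepts` and `hpf_holds_on` are hypothetical helpers introduced only for this illustration. Counting pairs objects with an initial segment of the numerals, equinumerosity is tested separately by pairing objects off without any appeal to numerals, and the final check compares the two for every pair of concepts over a four-element domain.

```python
from itertools import combinations

def count(objects):
    """Count a finite collection by pairing its members one-one with the
    initial segment 1, 2, ..., n of the numerals.
    Returns the pairing and the last numeral reached (0 if nothing was counted)."""
    pairing = {}
    last = 0
    for obj in set(objects):
        last += 1
        pairing[last] = obj
    return pairing, last

def equinumerous(f, g):
    """Decide one-one correspondence directly, without numerals:
    pair off one F with one G until at least one side runs out."""
    f, g = set(f), set(g)
    while f and g:
        f.pop()
        g.pop()
    return not f and not g

def finite_concepts(domain):
    """All concepts (subsets) over a finite domain."""
    items = list(domain)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def hpf_holds_on(domain):
    """Check: same numeral reached by counting iff equinumerous."""
    concepts = list(finite_concepts(domain))
    return all((count(f)[1] == count(g)[1]) == equinumerous(f, g)
               for f in concepts for g in concepts)

print(hpf_holds_on({"Caesar", "Brutus", 0, 1}))
```

The sketch covers only the finite case, which is all that HPF speaks to; nothing in it bears on what should be said about infinite concepts.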
Of course, one might yet have all kinds of worries about the claim that HPF is a conceptual truth. There are two broad classes of such worries: Those which arise from its impredicativity, and those which rest upon the fact that it implies the existence of a lot of objects (infinitely many). I am not going to say anything here about questions of the former sort. 24 But, with regard to the latter, let me say that one needs to be very careful with such objections. Any principle sufficient to ‘ground’ arithmetic in the relevant sense obviously has to imply the existence of infinitely many objects: So one cannot object to someone who is trying to establish that the truths of arithmetic are conceptual truths, or logical consequences of such, by saying that the principle on which he proposes to base arithmetic cannot be a conceptual truth, because no conceptual truth can imply that there are infinitely many objects. One might as well object that the principle yields arithmetic, that his premises imply his conclusion, i.e., accuse him of holding his view. Or, better, one should just say, flatfootedly, that arithmetic can’t be ‘analytic’, in any reasonable sense, since it implies the existence of lots of objects. But that is not so much an objection as a refusal even to discuss the matter. For no one interested in the question whether arithmetic is ‘analytic’ is likely to be moved by that thought. But, having said all of that, let me emphasize that the importance of the question whether HPF is ‘analytic’, in the context of discussions of logicism, should not be allowed to obscure the fact that how we answer it does not affect the philosophical interest of the modification of Frege’s Theorem to be presented below. If HPF really is the ‘basic law of arithmetic’, in the relevant sense, that is philosophically important, whatever its epistemological status might turn out to be.
4. The relative strengths of the systems
We turn now to the proofs of the four results mentioned at the end of Section 2. In this section, we prove Theorems 1, 2, and 4. We will prove Theorem 3 in the following section. Theorem 1: FA ⇒ PAS. Proof: In the last section, we saw a countermodel which establishes that HP is not a theorem of PAS + FD. That all the axioms of PAS are theorems of FA + FD is the content of Frege’s Theorem, first proven by Frege in Grundgesetze der Arithmetik (though very nearly proven in Die Grundlagen der Arithmetik). 24 I mean to include so-called ‘bad company’ objections. For discussion of these, see the papers of Wright and Boolos in Heck, ed., op. cit. For discussion of general concerns about the impredicativity of HP, see M. Dummett, Frege: Philosophy of Mathematics (Cambridge, MA: Harvard University Press, 1991), pp. 187–89 and 217–22; C. Wright, “The Harmless Impredicativity of Hume’s Principle”, in Schirn, ed., op. cit.; Dummett’s reply, in the same volume.
As I have discussed Frege’s proof of Frege’s Theorem in detail elsewhere, and as adaptations of Frege’s proofs will be employed below, we need not dwell on it here. 25

Theorem 2: PAS ⇒ PAF.

Proof: Clearly, every axiom of PAF other than PAF itself is a theorem of PAS + FD (indeed, of PAS by itself). That PAF itself is a theorem can be proven by induction. If a = 0, then all of its predecessors are finite, since it has none, by Axiom 5 of PAS. Suppose, then, that Na, that, if Na and Pya, then Ny, and that Pab. We must show that, if Pxb, then Nx. So suppose Pxb. By Axiom 4 of PAS, x = a, so Nx. Done. That not every theorem of PAS is a theorem of PAF + FD should be obvious: The axioms of PAF make no claims whatsoever about what the predecessors of objects which are not natural numbers might be, whereas the axioms of PAS state that predecession is one–one, not just on the natural numbers, but universally. Construction of a model is left to the reader.

Theorem 4: PAF ⇒ PA2.

Proof: Clearly, every theorem of PA2 is a theorem of PAF + FD. Again, that the converse (roughly speaking) is not true should be obvious: PA2 is completely silent on the question whether zero, or any other natural number, has predecessors which are not natural numbers. To construct a model, let the domain consist of the natural numbers and Julius Caesar. Assign denotations to terms of the form ‘Nx : Fx’ according to the following scheme:

0, if there are no Fs or if everything is F
n, if there are n Fs, for some finite n > 0
JC, if there are infinitely many Fs, but not everything is F

Interpret the primitives of PA2 and PAF according to the axioms of FD. The axioms of PA2 may be verified, but PAF fails: The sentence ‘P(Nx : x ≠ 0, 0)’ is true in the model. Now, since, by Theorem 3, to be proven in the next section, PAF is equivalent to FAF (in the presence of FD), it follows that FAF is strictly stronger than PA2. The model just given also shows this directly.
For the sentence ‘Finitex(x ≠ x) & Nx : x ≠ x = Nx : x = x & ¬Eqx(x ≠ x; x = x)’ is true in the model, whence HPF is false in the model.
25 See my “The Development of Arithmetic in Frege’s Grundgesetze der Arithmetik”, Journal of Symbolic Logic 58 (1993), pp. 579–601, reprinted, with a Postscript, in W. Demopoulos, ed., Frege’s Philosophy of Mathematics (Cambridge, MA: Harvard University Press, 1995), pp. 257–94. See also, of course, Wright, Frege’s Conception, Ch. 4.
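The model used in the proof of Theorem 4 can be explored mechanically, at least in part. The sketch below rests on a simplifying assumption that is not in the text: a concept over the domain (the natural numbers plus Julius Caesar) is represented only if it is finite or has a finite complement, which is enough for the two sentences at issue but leaves out concepts such as the even numbers. The class `Concept` and the functions `number_of` and `precedes_via` are hypothetical names chosen for illustration; `precedes_via` simply reads off the pair (m, n) that a given F and y witness under the second axiom of FD.

```python
from dataclasses import dataclass

JC = "Julius Caesar"

@dataclass(frozen=True)
class Concept:
    """A concept over the naturals plus JC (assumed representation).
    cofinite=False: exactly the listed objects fall under it.
    cofinite=True:  everything except the listed objects falls under it."""
    listed: frozenset
    cofinite: bool = False

    def holds_of(self, x):
        return (x in self.listed) != self.cofinite

    def without(self, y):
        """The concept 'Fx & x ≠ y'."""
        if self.cofinite:
            return Concept(self.listed | {y}, True)
        return Concept(self.listed - {y}, False)

EMPTY = Concept(frozenset())                      # x ≠ x
EVERYTHING = Concept(frozenset(), cofinite=True)  # x = x

def number_of(f):
    """Denotation of 'Nx : Fx' under the scheme in the proof of Theorem 4."""
    if not f.cofinite:
        return len(f.listed)   # 0 if there are no Fs, n if there are n Fs
    if not f.listed:
        return 0               # everything is F
    return JC                  # infinitely many Fs, but not everything is F

def precedes_via(f, y):
    """If Fy, return the pair (m, n) with m = Nx : (Fx & x ≠ y) and
    n = Nx : Fx, so that Pmn holds by FD; otherwise return None."""
    if not f.holds_of(y):
        return None
    return number_of(f.without(y)), number_of(f)

# Taking F to be 'x = x' and y to be 0 witnesses 'P(Nx : x ≠ 0, 0)'.
print(precedes_via(EVERYTHING, 0))
# The empty concept and the universal concept receive the same number.
print(number_of(EMPTY) == number_of(EVERYTHING))
```

The first check exhibits a predecessor of zero that is not a natural number, which is why PAF fails in the model. The second shows a finite concept sharing its number with a concept it is plainly not equinumerous with, which is why HPF fails there too.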
5. Theorem 3: PAF is equivalent to FAF
In the proofs to be given below, we shall appeal frequently to the following easy consequence of the second axiom of FD:

Fa → P[Nx : (Fx & x ≠ a), Nx : Fx]

This is Theorem 102 of Grundgesetze, and I shall cite it as such below. The equivalence of PAF and FAF is, in essence, a consequence of the fact that Theorems 327 and 348 of Grundgesetze, mentioned above, can be proven both in PAF and in FAF, with the aid of the bridge theory FD. That is to say,

Finite(F) ≡ P*=(0, Nx : Fx)

can be proven in both theories. This fact will allow us to work back and forth between the condition of finitude, as it appears in HPF, and the claims about the natural numbers made in the axioms of PAF. The proof to be given here of the left-to-right direction (which is Theorem 327 of Grundgesetze) shows it to be a consequence simply of Frege’s definitions and the axiom FE of Log. ‘The number of a finite concept is a natural number’ may thus be added to the list of arithmetical facts which are, modulo the status of second-order logic itself, undeniably logical truths. (Others are ‘0P1’ and ‘1P2’, which Boolos shows to be provable in FD.)

We begin by noting that we can, without loss of generality, assume the relation which orders a finite set to be one–one (not just functional), and such that no object follows itself in the relevant series. We define:

Betw(Q; a, b)(n) ≡ ∀x∀y∀z∀w(Qxy & Qzw → x = z ≡ y = w) & ¬∃x Q*(x, x) & Q*=(a, n) & Q*=(n, b)

Proposition: Log ⊢ Finite(F) ≡ ∃Q∃a∃b∀x[Fx ≡ Betw(Q; a, b)(x)].

Proof: Right-to-left: Trivial, since if Betw(Q; a, b)(x), then Btw(Q; a, b)(x). Left-to-right: Assume that Fξ is finite, i.e., that for some Rξη, a, and b: ∀x[Fx ≡ Btw(R; a, b)(x)]. If ¬∃xFx, let Qξη be the universal relation; let a and b be whatever you like. Then for no x Betw(Q; a, b)(x). If ∃xFx, say x, then we have that Btw(R; a, b)(x), and so:

∀x∀y∀z(Rxy & Rxz → y = z) & ¬R*(b, b) & ∀x[Fx ≡ R*=(a, x) & R*=(x, b)]

Define: Qxy ≡ Rxy & Fx & Fy. Then ∀x[Fx ≡ Betw(Q; a, b)(x)]. The proof is straightforward; I shall not present the details here. 26

26 The plan of the proof is as follows. First, Qξη is functional, since Rξη is. Second, if Q*(x, y), then R*(x, y); so, if Q*=(a, x) & Q*=(x, b), then R*=(a, x) & R*=(x, b), so Fx. Third, if Q*(x, x), then R*(x, x) and Fx, so R*=(x, b); so it is enough to prove that, if R*=(x, b), then ¬Q*(x, x); do this by induction on the converse of Rξη. Fourth, prove that the converse of Qξη is functional, using Theorem 133 of Begriffsschrift, the roll-forward theorem, and the fact that ¬∃x Q*(x, x). Then prove that, if Fx, then Q*=(a, x), by noting that, if Fx, then R*=(a, x), so it is enough to prove that, if R*=(a, x), then, if Fx, then Q*=(a, x), which can be done by induction. Finally, prove that, if Fx, then Q*=(x, b), similarly.
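The construction in the proof of the Proposition can be tried out on a small finite relation. What follows is only a sketch under stated assumptions: relations are finite sets of pairs, `btw` unpacks Btw(R; a, b) as the passage above does (R functional, b not in the strong ancestral of itself, and x between a and b in the weak ancestral), and all function names, along with the sample relation `R` and the set `FIELD`, are hypothetical. It shows a functional R that is not one–one and contains a cycle, and the restricted relation Q, defined as in the proof, which is one–one, cycle-free, and orders the same objects.

```python
def strong_ancestral(rel):
    """Strong ancestral (transitive closure) of a finite set of pairs."""
    closure = set(rel)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(closure):
            for (u, v) in rel:
                if y == u and (x, v) not in closure:
                    closure.add((x, v))
                    changed = True
    return closure

def weak(star, x, y):
    """Weak ancestral: the strong ancestral or identity."""
    return x == y or (x, y) in star

def is_functional(rel):
    return all(y == w for (x, y) in rel for (z, w) in rel if x == z)

def is_one_one(rel):
    # Qxy & Qzw -> (x = z iff y = w)
    return all((x == z) == (y == w) for (x, y) in rel for (z, w) in rel)

def btw(rel, a, b, field):
    """Objects x in field such that Btw(rel; a, b)(x)."""
    star = strong_ancestral(rel)
    if not is_functional(rel) or (b, b) in star:
        return set()
    return {x for x in field if weak(star, a, x) and weak(star, x, b)}

def betw(rel, a, b, field):
    """Objects x in field such that Betw(rel; a, b)(x)."""
    star = strong_ancestral(rel)
    if not is_one_one(rel) or any((x, x) in star for x in field):
        return set()
    return {x for x in field if weak(star, a, x) and weak(star, x, b)}

def restrict(rel, f):
    """Qxy iff Rxy & Fx & Fy."""
    return {(x, y) for (x, y) in rel if x in f and y in f}

FIELD = {0, 1, 2, 3, 4, "p", "q"}
# Functional, but not one-one (0 and "p" both precede 1), and "q" follows itself.
R = {(0, 1), (1, 2), (2, 3), (3, 4), ("p", 1), ("q", "q")}

F = btw(R, 0, 3, FIELD)
Q = restrict(R, F)
print(F == betw(Q, 0, 3, FIELD))
```

A single example of this kind establishes nothing general, of course; it merely makes vivid why restricting R to the Fs removes the stray predecessors and cycles that prevent R itself from satisfying the stronger condition.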
Lemma 3.1 (Theorem 327): FD ⊢ Finite(F) → P*=(0, Nx : Fx).

Proof: If ¬∃xFx, then Nx : Fx = 0 (by FE and the first axiom of FD), whence certainly P*=(0, Nx : Fx). So we suppose throughout that ∃xFx. Suppose F is finite; by the proposition, there are objects a and b and a relation, Rξη, which is one–one, in whose (strong) ancestral no object stands to itself, and which is such that x is F iff R*=(a, x) and R*=(x, b). It will thus suffice to prove that P*=[0, Nx : (R*=(a, x) & R*=(x, b))]. Since ∃xFx, for some x, R*=(a, x) and R*=(x, b), so R*=(a, b). So it will be enough to prove that

R*=(a, y) → P*=[0, Nx : (R*=(a, x) & R*=(x, y))]

which we can prove by (logical) induction. For whenever R*=(m, n), we can prove that Fn by showing that Fm and:

∀x∀y(R*=(m, x) & Fx & Rxy → Fy)

(This fact is an easy consequence of the definition of the weak ancestral.) By comprehension, we may take Fξ to be: P*=[0, Nx : (R*=(a, x) & R*=(x, ξ))]. It will thus suffice to prove:

(i) P*=[0, Nx : (R*=(a, x) & R*=(x, a))]
(ii) ∀y∀z{R*=(a, y) & P*=[0, Nx : (R*=(a, x) & R*=(x, y))] & Ryz → P*=[0, Nx : (R*=(a, x) & R*=(x, z))]}

We are assuming, of course, that Rξη satisfies the conditions mentioned above.

For (i): By (102), P[Nx : (x = a & x ≠ a), Nx : x = a]. Since x = a & x ≠ a iff x ≠ x, Nx : (x = a & x ≠ a) = 0, by FE. Hence, P(0, Nx : x = a) and so P*=(0, Nx : x = a). So it will suffice to show that Nx : (R*=(a, x) & R*=(x, a)) = Nx : x = a, for which, by FE, it suffices to show that R*=(a, x) & R*=(x, a) ≡ x = a. From right-to-left, this is obvious. For the other direction, suppose that R*=(a, x) and R*=(x, a) and x ≠ a. Then R*(a, x) and R*(x, a), so, by the transitivity of the ancestral, R*(a, a). Contradiction.

For (ii): Suppose the antecedent. By comprehension, we may take Fξ and a in (102) to be, respectively, R*=(a, ξ) & R*=(ξ, z) and z, whence:

P{Nx : [(R*=(a, x) & R*=(x, z)) & x ≠ z], Nx : (R*=(a, x) & R*=(x, z))}

Since P*=[0, Nx : (R*=(a, x) & R*=(x, y))], it will be enough to show that

Nx : (R*=(a, x) & R*=(x, y)) = Nx : [(R*=(a, x) & R*=(x, z)) & x ≠ z]

for which, by FE, it is enough to show that:

∀x[(R*=(a, x) & R*=(x, y)) ≡ (R*=(a, x) & R*=(x, z)) & x ≠ z]

Left-to-right: If R*=(a, x) and R*=(x, y), R*=(a, y); since Ryz, then R*=(a, z). And if x = z, then R*=(z, y) and Ryz, so R*(z, z), contradiction. Right-to-left: Since R*=(x, z) and
x ≠ z, R∗ xz. We then have the following theorem of second-order logic, the roll-back theorem: 27
Q∗ xy → ∃z(Qzy & Q ∗= xz)
By the roll-back theorem, there is some w such that Rwz and R ∗= xw. Since Rξη is one–one and Ryz, w = y, so R ∗= xy.
Remark: The following are theorems of PAF: 28
NPZ: ¬∃xPx0
P1MF: ∀x∀y∀z(P ∗= 0z & Pxz & Pyz → x = y)
ZE: Nx : Fx = 0 ≡ ¬∃xFx
Proof: Zero has no predecessor which is a natural number and, by PAF, only natural numbers precede natural numbers. So since zero is a natural number, it can have no predecessor at all. P1MF will follow immediately from Axiom 4 of PAF if we can show that x and y are themselves natural numbers. But this follows from PAF, since z is a natural number and x and y both precede it. For ZE: If ¬∃xFx, then ∀x(Fx ≡ x ≠ x). So, by FE, Nx : Fx = Nx : x ≠ x = 0. Suppose, then, that Nx : Fx = 0 and ∃xFx, say, a. By (102), P[Nx : (Fx & x ≠ a), Nx : Fx], so, by NPZ, Nx : Fx ≠ 0. Contradiction.
Lemma 3.11 (Theorem 348): PAF + FD ⊢ P ∗= (0, Nx : Fx) → Finite(F).
Proof: We prove the equivalent:
P ∗= 0n → ∀F[n = Nx : Fx → Finite(F)]
The proof is by (logical) induction. We must thus establish that:
(i) ∀F[0 = Nx : Fx → Finite(F)]
(ii) P ∗= 0n & ∀F[n = Nx : Fx → Finite(F)] & Pnm → ∀F[m = Nx : Fx → Finite(F)]
For (i): Suppose 0 = Nx : Fx. By ZE, ¬∃xFx, so F is finite. For (ii): Suppose the antecedent, and suppose further that m = Nx : Fx. We must show that F is finite. Suppose ¬∃xFx. Then, by ZE, m = Nx : Fx = 0, so
27 The roll-back theorem is proved by induction. We must show
(i) Qxw → ∃z(Qzw & Q ∗= xz)
(ii) ∃z(Qzw & Q ∗= xz) & Qwv → ∃z(Qzv & Q ∗= xz)
The proof of (i) is trivial: Take z to be x. For (ii), assume the antecedent. Take z in the consequent to be w. By hypothesis, Qwv. And since, by the antecedent, Q ∗= xz & Qzw, certainly Q ∗= xw.
28 In fact, PA2 + NPZ + P1MF is deductively equivalent to PAF. The proof of PAF given above, in the proof of Theorem 2, depends only upon NPZ and P1MF, and not on the full force of Axioms 4 and 5 of PA2.
Pn0, contradicting NPZ. So ∃xFx, say a, and P[Nx : (Fx & x ≠ a), Nx : Fx]. But also, by hypothesis, P(n, Nx : Fx), and, since P ∗= 0n, P ∗= (0, Nx : Fx). So, by P1MF, n = Nx : (Fx & x ≠ a). Hence, by the induction hypothesis, Finitex (Fx & x ≠ a). It is then a simple matter to show that F too must be finite.
Corollary 3.12: PAF + FD ⊢ Finite(F) ≡ P ∗= (0, Nx : Fx).
Theorem 3.1: PAF + FD ⊢ HPF.
Proof: By Corollary 3.12, it suffices to show that
P ∗= (0, Nx : Fx) → Nx : Fx = Nx : Gx ≡ Eqx (Fx, Gx)
We prove the equivalent:
P ∗= 0n → ∀F{n = Nx : Fx → ∀G[Nx : Fx = Nx : Gx ≡ Eqx (Fx, Gx)]}
The proof is by induction. We must show that:
(i) 0 = Nx : Fx → ∀G[Nx : Fx = Nx : Gx ≡ Eqx (Fx, Gx)]
(ii) P ∗= 0n & ∀F{n = Nx : Fx → ∀G[Nx : Fx = Nx : Gx ≡ Eqx (Fx, Gx)]} & Pnm → ∀F{m = Nx : Fx → ∀G[Nx : Fx = Nx : Gx ≡ Eqx (Fx, Gx)]}
For (i): Suppose that 0 = Nx : Fx. By ZE, ¬∃xFx. Now, if 0 = Nx : Gx, by ZE, ¬∃xGx, so Eq(F, G). Conversely, if Eq(F, G), then ¬∃xGx, so, by FE, Nx : Fx = Nx : Gx.
For (ii): Suppose the antecedent, and suppose further that m = Nx : Fx. We must show that, for every G, Nx : Fx = Nx : Gx iff Eqx (Fx, Gx). Since P ∗= 0n and Pnm, P ∗= 0m and so P ∗= (0, Nx : Fx).
Left-to-right: Suppose Nx : Fx = Nx : Gx. Since P(n, Nx : Fx), Nx : Fx ≠ 0, by NPZ, and so, by ZE, ∃xFx, say a; similarly, ∃xGx, say b. By (102):
P[Nx : (Fx & x ≠ a), Nx : Fx]
P[Nx : (Gx & x ≠ b), Nx : Gx]
Since Nx : Fx = Nx : Gx, Nx : (Fx & x ≠ a) = Nx : (Gx & x ≠ b), by P1MF. Moreover, since P(n, Nx : Fx), by P1MF, again, n = Nx : (Fx & x ≠ a). So, by the induction hypothesis:
Eqx (Fx & x ≠ a, Gx & x ≠ b)
But then Eqx (Fx, Gx), since Fa and Gb.
Right-to-left: Suppose Eqx (Fx, Gx). Once again, ∃xFx, say a, and ∃xGx, say b, and:
P[Nx : (Fx & x ≠ a), Nx : Fx]
P[Nx : (Gx & x ≠ b), Nx : Gx]
Since P(n, Nx : Fx), n = Nx : (Fx & x ≠ a), by P1MF. But, if Eqx (Fx, Gx), Fa, and Gb, certainly:
Eqx (Fx & x ≠ a, Gx & x ≠ b)
So, by the induction hypothesis, Nx : (Fx & x ≠ a) = Nx : (Gx & x ≠ b). But then, since P[Nx : (Fx & x ≠ a), Nx : Fx] and P[Nx : (Gx & x ≠ b), Nx : Gx], Nx : Fx = Nx : Gx, by Axiom 3 of PAF.
That, then, establishes that HPF is a theorem of PAF + FD. We now turn to the proof that all axioms of PAF are theorems of FAF + FD. Our plan is simply to mimic Frege’s proofs of the axioms of arithmetic, relativized in the appropriate way to the natural numbers. To make these proofs work, we need to establish an analogue of Corollary 3.12. From this it will follow that, when talking about natural numbers, we are dealing only with finite concepts, so HPF will do the work HP does in Frege’s proofs. We divide the proof of Theorem 3.2 into two parts: The proof that all axioms other than Axiom 6 hold is relatively easy, and we prove this first; the proof that Axiom 6 holds is of special interest and so will be considered separately. First, we establish the corollary, by establishing an analogue of Lemma 3.11.
Lemma 3.21 (Theorem 348, again): FAF + FD ⊢ P ∗= (0, Nx : Fx) → Finite(F).
Proof: It will suffice to show that
(∗) P ∗= 0n → ∃F[Finite(F) & n = Nx : Fx]
For then, suppose that P ∗= (0, Nx : Fx). Then, for some finite G, Nx : Fx = Nx : Gx. By HPF, Eqx (Fx, Gx), so F is finite. The proof of (∗) itself is by induction. We must show that:
(i) ∃F[Finite(F) & 0 = Nx : Fx]
(ii) P ∗= 0n & ∃F[Finite(F) & n = Nx : Fx] & Pnm → ∃F[Finite(F) & m = Nx : Fx]
For (i): 0 = Nx : x ≠ x and Finitex (x ≠ x). For (ii): Suppose the antecedent, so that Finite(F) and n = Nx : Fx. Since Pnm, P(Nx : Fx, m), so by Axiom 2 of FD, for some G and b:
Gb & m = Nx : Gx & Nx : Fx = Nx : (Gx & x ≠ b)
Since Finite(F), by HPF, Eqx [Fx, Gx & x ≠ b], so Finitex (Gx & x ≠ b). But then G too is finite.
Corollary 3.22: FAF + FD ⊢ Finite(F) ≡ P ∗= (0, Nx : Fx).
Lemma 3.23: FAF + FD ⊢ All axioms of PAF other than Axiom 6.
Proof: Axioms 1, 2, and 7 do not require any special attention: Each of them is an immediate consequence of FD itself (indeed, of just the third axiom of FD). We thus need to prove Axioms 3, 4, and 5 and the Axiom PAF itself.
Axiom 5 is: ¬∃x(Nx & Px0). Suppose that Pn0. By the second axiom of FD, there are F and y such that:
Fy & 0 = Nx : Fx & n = Nx : (Fx & x ≠ y)
But since 0 = Nx : x ≠ x and Finitex (x ≠ x), HPF yields that Eqx (x ≠ x, Fx). But ∃yFy. Contradiction. (Note that this actually establishes NPZ.)
Axiom 3 is: ∀x∀y∀z(Nx & Pxy & Pxz → y = z). So suppose that Na, i.e., that P ∗= 0a, and that Pab and Pac. By the second axiom of FD, there are F and G, and y and z, such that:
Fy & b = Nx : Fx & a = Nx : (Fx & x ≠ y)
Gz & c = Nx : Gx & a = Nx : (Gx & x ≠ z)
Since P ∗= 0a and a = Nx : (Fx & x ≠ y), by Corollary 3.22, Finitex (Fx & x ≠ y). Since Nx : (Fx & x ≠ y) = a = Nx : (Gx & x ≠ z), by HPF, Eqx (Fx & x ≠ y, Gx & x ≠ z). But then Eqx (Fx, Gx), since Fy and Gz, and certainly Finite(F). So, by HPF again, Nx : Fx = Nx : Gx and so b = c.
Axiom 4 is: ∀x∀y∀z(Nx & Ny & Pxz & Pyz → x = y). So suppose that Na and Nb, and that Pac and Pbc. Note that P ∗= 0c. We shall make no further appeal to the assumptions that Na and Nb. Once again, there are F and G, and y and z, such that:
Fy & c = Nx : Fx & a = Nx : (Fx & x ≠ y)
Gz & c = Nx : Gx & b = Nx : (Gx & x ≠ z)
Since c = Nx : Fx and P ∗= 0c, by Corollary 3.22, Finite(F). And Nx : Fx = c = Nx : Gx, so by HPF, Eqx (Fx, Gx). But then, since Fy and Gz, Eqx (Fx & x ≠ y, Gx & x ≠ z); these are finite, since F and G are, so by HPF again, Nx : (Fx & x ≠ y) = Nx : (Gx & x ≠ z) and so a = b. (Note that this actually establishes P1MF.)
PAF is: Nx & Pyx → Ny. As noted parenthetically above, the proofs of Axioms 4 and 5 in fact suffice to prove NPZ and P1MF, from which PAF follows. But it can also be proven directly. Suppose that P ∗= 0n and that Pmn. Since Pmn, there are F and a such that:
Fa & n = Nx : Fx & m = Nx : (Fx & x ≠ a)
Since P ∗= (0, Nx : Fx), F is finite; so Finitex (Fx & x ≠ a), and so P ∗= [0, Nx : (Fx & x ≠ a)]. But then P ∗= 0m.
In the paper mentioned earlier, Boolos proves the surprising result that, in the bridge theory FD, Axiom 6, that is, ∀x(Nx → ∃yPxy), of PA2 is redundant, since it follows from Axioms 3, 4, and 5. More recently, he has observed that, in FD, Axiom 6 in fact follows from Axiom 3 alone. Boolos’s original proof
of this extraordinary result is somewhat indirect: 29 I shall take the opportunity to give a direct proof here. Of course, since FAF + FD ⊢ Axiom 3, it follows that FAF + FD ⊢ Axiom 6.
To prove Axiom 6, Frege proves
P ∗= 0n → P[n, Nx : (P ∗= 0x & P ∗= xn)]
to establish which he needs the crucial lemma:
P ∗= 0n → ¬P∗ nn
It is for the proof of this lemma that Axiom 5 is needed. Our proof will differ from his, in the first instance, in that we do not make use of this lemma, but instead pack the necessary condition into the antecedent and prove the weaker:
P ∗= 0n & ¬P∗ nn → P[n, Nx : (P ∗= 0x & P ∗= xn)]
We complete our argument by also proving
P ∗= 0n & P∗ nn → ∃yPny
whence Axiom 6 follows by dilemma. Our proof will also differ in another way. Frege appeals to Axiom 4 at a crucial point, but we shall see that the necessary inference does not require it, even in his proof. What we shall use instead is the logicized version of the Law of Trichotomy:
P ∗= 0b & P ∗= 0c → P∗ bc ∨ P∗ cb ∨ b = c
The Law follows from Axiom 3 and the following strengthening of the famous Proposition 133 of the Begriffsschrift: 30
∀x∀y∀z[R ∗= ax & Rxy & Rxz → y = z] & R ∗= ab & R ∗= ac → R∗ bc ∨ R∗ cb ∨ b = c
Instantiating ‘R’ with ‘P’, ‘a’ with ‘0’, and noting that the first conjunct then follows from Axiom 3, the Law of Trichotomy follows immediately.
29 We have that: FD, 3, 4, 5 ⊢ 6. Boolos then observed that also: FD, 3, ¬6 ⊢ 4 & 5. But then, by truth-functional logic: FD, 3, ¬6 ⊢ 6. And so: FD, 3 ⊢ 6. For a proof of a related result, see G. Boolos, “Frege’s Theorem and the Peano Postulates”, Bulletin of Symbolic Logic 1 (1995), pp. 317–26.
30 The strengthening lies in our assuming not that Rξη is functional, but just that it is functional on the members of the R-series beginning with a.
Proof: We assume that ∀x∀y∀z[R ∗= ax & Rxy & Rxz → y = z] and R ∗= ab and prove that R ∗= ac → R∗ bc ∨ R∗ cb ∨ b = c by induction on c. We must thus prove:
(i) R ∗= aa → R∗ ba ∨ R∗ ab ∨ b = a
(ii) R ∗= ax & (R ∗= ax → R∗ bx ∨ R∗ xb ∨ b = x) & Rxy → (R ∗= ay → R∗ by ∨ R∗ yb ∨ b = y)
Since R ∗= ab, (i) follows from the definition of the weak ancestral. So suppose the antecedent in (ii). Note that R ∗= ay. Moreover, R∗ bx or R∗ xb or b = x. If R∗ bx, then since Rxy, R∗ by. Moreover, if b = x, then Rby, so R∗ by. So suppose that R∗ xb. By the roll-forward theorem, to be mentioned shortly, for some z, Rxz and R ∗= zb. But then Rxz and Rxy: And since Rξη is functional on the R-series beginning with a and R ∗= ax, we have z = y. So R ∗= yb, hence R∗ yb or b = y.
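The strengthened Proposition 133 can be checked mechanically on small finite domains, where the strong ancestral coincides with the transitive closure. The following Python sketch is an illustration only and no substitute for the inductive proof; the function names (`strong_ancestral`, `trichotomy_holds`, `check_all_relations`) are hypothetical and do not come from the source. It enumerates every binary relation on a small domain and confirms that, whenever R is functional on the R-series beginning with a, any two members of that series are comparable.

```python
from itertools import product


def strong_ancestral(rel):
    """Strong ancestral R* of a finite relation: its transitive closure."""
    closure = set(rel)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(closure):
            for (u, v) in rel:
                if y == u and (x, v) not in closure:
                    closure.add((x, v))
                    changed = True
    return closure


def trichotomy_holds(rel, a, domain):
    """Check the strengthened Proposition 133 for one relation and one start a.

    Returns True if the hypothesis (R functional on the R-series beginning
    with a) fails, or if all b, c in that series satisfy
    R*bc or R*cb or b = c.
    """
    star = strong_ancestral(rel)

    def weak(x, y):
        # Weak ancestral R*=: the strong ancestral or identity.
        return x == y or (x, y) in star

    functional = all(
        y == z
        for (x, y) in rel
        for (w, z) in rel
        if x == w and weak(a, x)
    )
    if not functional:
        return True  # hypothesis fails, so the conditional holds vacuously
    series = [b for b in domain if weak(a, b)]
    return all(
        (b, c) in star or (c, b) in star or b == c
        for b in series
        for c in series
    )


def check_all_relations(size):
    """Test every relation on {0, ..., size-1} and every starting point a."""
    domain = list(range(size))
    pairs = [(x, y) for x in domain for y in domain]
    for bits in product([False, True], repeat=len(pairs)):
        rel = {p for p, keep in zip(pairs, bits) if keep}
        if not all(trichotomy_holds(rel, a, domain) for a in domain):
            return False
    return True
```

Calling `check_all_relations(3)` runs through all 512 relations on a three-element domain. A branching relation such as {(0, 1), (0, 2)} is not functional at 0, so it falls outside the hypothesis rather than counting as a counterexample.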
Lemma 3.24: FD + Axiom 3 ⊢ Axiom 6 of PAF.
Proof: We must establish ∀x[Nx → ∃yPxy] for which, in light of FD, it will suffice to establish:
P ∗= 0n → ∃yPny
We proceed by dilemma, proving each of:
P ∗= 0n & P∗ nn → ∃yPny
P ∗= 0n & ¬P∗ nn → ∃yPny
The former follows immediately from the following theorem of second-order logic, the roll-forward theorem: 31
R∗ ab → ∃y(Ray & R ∗= yb)
For, then, if P∗ nn, for some y, Pny & P ∗= yn, so certainly ∃yPny. For the latter, we prove:
P ∗= 0n & ¬P∗ nn → P[n, Nx : (P ∗= 0x & P ∗= xn)]
The proof is by induction. We thus need to establish:
(i) FD ⊢ ¬P∗ 00 → P[0, Nx : (P ∗= 0x & P ∗= x0)]
(ii) FD + Axiom 3 ⊢ P ∗= 0a & {¬P∗ aa → P[a, Nx : (P ∗= 0x & P ∗= xa)]} & Pab → {¬P∗ bb → P[b, Nx : (P ∗= 0x & P ∗= xb)]}
For (i): Suppose ¬P∗ 00. Since P ∗= 00, by (102):
P[Nx : (P ∗= 0x & P ∗= x0 & x ≠ 0), Nx : (P ∗= 0x & P ∗= x0)]
Now suppose that P ∗= 0x & P ∗= x0 & x ≠ 0. Then, since P ∗= x0 & x ≠ 0, P∗ x0. But then P∗ x0 and P ∗= 0x, so P∗ 00, contradicting our supposition. Thus ¬∃x(P ∗= 0x & P ∗= x0 & x ≠ 0). By FE, Nx : (P ∗= 0x & P ∗= x0 & x ≠ 0) = 0 and so P[0, Nx : (P ∗= 0x & P ∗= x0)].
For (ii): Suppose the antecedent and suppose further that ¬P∗ bb. Suppose, for reductio, that P∗ aa. By the roll-forward theorem, for some y, Pay and P ∗= ya. Since Pab, Axiom 3 implies that y = b. But then P ∗= ba and Pab, so P∗ bb. Contradiction. Hence ¬P∗ aa. By the induction hypothesis, then, P[a, Nx : (P ∗= 0x & P ∗= xa)]. Now, we need to show that P[b, Nx : (P ∗= 0x & P ∗= xb)]. Since P ∗= 0b and P ∗= bb, by (102):
P[Nx : (P ∗= 0x & P ∗= xb & x ≠ b), Nx : (P ∗= 0x & P ∗= xb)]
So it is enough to show that b = Nx : (P ∗= 0x & P ∗= xb & x ≠ b). And since Pab and P[a, Nx : (P ∗= 0x & P ∗= xa)], we have, by Axiom 3, that b =
31 The proof of this result is similar to that of the roll-back theorem, mentioned earlier.
Nx : (P ∗= 0x & P ∗= xa). So we need only show that Nx : (P ∗= 0x & P ∗= xa) = Nx : (P ∗= 0x & P ∗= xb & x ≠ b). By FE, this will follow from:
∀x[(P ∗= 0x & P ∗= xa) ≡ (P ∗= 0x & P ∗= xb & x ≠ b)]
Left-to-right: Suppose P ∗= 0x and P ∗= xa. Then since Pab, certainly P ∗= xb and, further, P ∗= 0b. Suppose x = b. Then P ∗= ba and Pab, so P∗ bb. Contradiction.
Right-to-left: Suppose P ∗= 0x & P ∗= xb & x ≠ b. Then P∗ xb. By the roll-back theorem, for some y, P ∗= xy and Pyb. Up to this point, we have been following Frege’s proof closely. Here, he uses Axiom 4 to conclude that, since Pab, a = y, whence P ∗= xa, and he is done. But we can in fact establish that a = y without appeal to Axiom 4. We have that P ∗= xy and Pyb. Since P ∗= 0x, certainly P ∗= 0y. Since, by the initial hypotheses of the inductive step, P ∗= 0a, the Law of Trichotomy yields that either P∗ ay or P∗ ya or a = y. Suppose that P∗ ay. By the roll-forward theorem, for some z, Paz & P ∗= zy. But since Pab, Axiom 3 implies that z = b. So P ∗= by & Pyb, so P∗ bb, contradiction. Similarly, if P∗ ya, then for some z, Pyz & P ∗= za. But since Pyb, Axiom 3 implies that z = b, so P ∗= ba & Pab, so P∗ bb, once again. Hence a = y, and we are done.
Theorem 3.2: FAF + FD ⊢ All axioms of PAF.
Proof: By Lemmas 3.23 and 3.24.
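The roll-back and roll-forward theorems used in these proofs can likewise be sanity-checked on small finite domains. The sketch below is a hedged illustration in Python, not part of the source argument: it assumes that on a finite domain the strong ancestral is the transitive closure, and the function names are hypothetical. It tests both theorems against every binary relation on a small domain.

```python
from itertools import product


def strong_ancestral(rel):
    """Strong ancestral of a finite relation, computed as transitive closure."""
    closure = set(rel)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(closure):
            for (u, v) in rel:
                if y == u and (x, v) not in closure:
                    closure.add((x, v))
                    changed = True
    return closure


def roll_back_holds(rel, domain):
    """Roll-back: Q*xy implies some z with Qzy and Q*=xz."""
    star = strong_ancestral(rel)
    return all(
        any((z, y) in rel and (x == z or (x, z) in star) for z in domain)
        for (x, y) in star
    )


def roll_forward_holds(rel, domain):
    """Roll-forward: R*ab implies some y with Ray and R*=yb."""
    star = strong_ancestral(rel)
    return all(
        any((a, y) in rel and (y == b or (y, b) in star) for y in domain)
        for (a, b) in star
    )


def check_all_relations(size):
    """Run both checks on every relation over {0, ..., size-1}."""
    domain = list(range(size))
    pairs = [(x, y) for x in domain for y in domain]
    for bits in product([False, True], repeat=len(pairs)):
        rel = {p for p, keep in zip(pairs, bits) if keep}
        if not (roll_back_holds(rel, domain) and roll_forward_holds(rel, domain)):
            return False
    return True
```

A finite check of this kind only confirms that no small counterexample exists; the general results rest on the inductions sketched in the footnotes.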
6. Closing
We have thus seen that FAF is equivalent, in the presence of the bridge theory FD, to PAF. By Theorem 4, then, FAF is strictly stronger than PA2. The following two questions now raise themselves: Whether there is some further weakening of HP which is provable in PA2 + FD and, if so, whether some such principle is equivalent, in FD, to the conjunction of the axioms of PA2. A natural axiom at which to look would be WHP (for Weak Hume’s Principle): Finitex (Fx) & Finitex (Gx) → [Nx : Fx = Nx : Gx ≡ Eqx (Fx, Gx)] WHP states only that finite concepts have the same number if, and only if, they are equinumerous and makes no claim whatsoever about the conditions under which infinite concepts have the same number as any concept, finite or otherwise. (As far as WHP is concerned, some infinite concepts could have the number zero, others one, and so forth.) Call the theory whose sole nonlogical axiom is WHP, WHP. It can be shown that, though WHP is provable in PA2 + FD, none of the axioms of PA2 which are not already theorems of FD itself are theorems of WHP + FD: Not even the disjunction of these axioms— that is, of Axioms 3, 4, 5, and 6—is a theorem of WHP + FD.
Still, it is easy to see that there are no finite models of WHP + FD. For, in any model, there must be a number Nx : x ≠ x; there must be a number Ny : (y = Nx : x ≠ x), which, by WHP, will differ from Nx : x ≠ x; there will be a number Nz : [z = Nx : x ≠ x ∨ z = Ny : (y = Nx : x ≠ x)], which again must differ from the first two, and so forth. Indeed, it is not terribly difficult to prove that PA2 can be relatively interpreted in WHP, and so that PA2 and WHP are equi-interpretable and therefore equi-consistent. As we saw earlier, however, the fact that a theory A is interpretable in another B is no guarantee that there is any reasonable bridge theory by using which one can prove the axioms of A in B. One might therefore wonder whether, in this case, there is some bridge theory other than FD, by appeal to which one could prove the axioms of PA2 in WHP. In fact there is, the necessary modification to the axioms of FD not being drastic. The bridge theory FDF has the same first and third axioms as FD, but we change the second axiom to:
∃F[Finite(F) & m = Nx : Fx] ∨ ∃F[Finite(F) & n = Nx : Fx] → Pmn ≡ ∃F∃y[Finite(F) & Fy & n = Nx : Fx & m = Nx : (Fx & x ≠ y)]
This axiom now states nothing at all about when numbers which are not the numbers of finite concepts precede one another. It requires, however, that, if a number is the number of a finite concept, then it will precede or be preceded by another number only if there is some finite concept which does the trick. The axiom, though complicated as stated, seems intuitive enough and is certainly true, since it is a theorem of FAF + FD, as can easily be seen. It may thus be considered a partial definition of one of the primitives of PA2 in terms of those of WHP. And it can be shown that, relative to FDF, WHP, PA2, and PAF are all equivalent. What philosophical interest this result might have for a logicist is a question I shall not pursue. 32
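The earlier observation that WHP has no finite models can be illustrated by brute force. The Python sketch below is a minimal illustration under stated assumptions, not a proof: it treats every subset of a k-element domain as a concept (full second-order semantics), notes that on a finite domain every concept is finite so WHP applies to all of them, and checks WHP alone, without FD. The function names are hypothetical. For each small k it searches every possible assignment of objects to concepts and finds none that satisfies WHP.

```python
from itertools import combinations, product


def subsets(domain):
    """All subsets of a finite domain, as frozensets (the 'concepts')."""
    items = list(domain)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


def satisfies_whp(number_of, concepts):
    """WHP on a finite domain: N(F) = N(G) iff F and G are equinumerous.

    For finite sets, equinumerosity is agreement in size.
    """
    return all(
        (number_of[f] == number_of[g]) == (len(f) == len(g))
        for f in concepts
        for g in concepts
    )


def has_finite_model(k):
    """Search every assignment of 'numbers' to concepts over a k-element domain."""
    domain = range(k)
    concepts = subsets(domain)
    for values in product(domain, repeat=len(concepts)):
        number_of = dict(zip(concepts, values))
        if satisfies_whp(number_of, concepts):
            return True
    return False
```

The search fails for the reason the text gives: a k-element domain has concepts of k + 1 different sizes, each of which needs its own number, but only k objects are available to serve as numbers.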
32 Thanks here to Charles Parsons, Alison Simmons, Jason Stanley, Jamie Tappenden, and Crispin Wright for discussion. The paper also benefitted from the comments of the Journal of Philosophical Logic’s referees. I owe a special debt to George Boolos, discussions with whom led to my work on this topic, as many others. George was a teacher, a colleague, a mentor, and a source of inspiration and courage—but most of all, he was a friend. This paper is dedicated to his memory.
ON FINITE HUME 1 Fraser MacBride†
Neo-Fregeanism declares there to be an a priori route that we may follow (guided by proofs and definitions) from an understanding of analytic truths to a grasp of the fundamental laws of arithmetic (see Wright [1997], pp. 202–11). The purportedly analytic principle from which the neo-Fregean claims we may set out is Hume’s Principle, a principle that specifies the conditions under which concepts have the same cardinal number:
(HP) ∀F∀G[(Nx : Fx = Nx : Gx) ↔ F 1–1 G]
When this principle is adjoined to second-order logic, the system that results is Frege arithmetic. What makes it plausible to suppose that Hume’s principle provides a departure point from which we may successfully go on to grasp arithmetic a priori is the result called Frege’s theorem. For according to Frege’s theorem, the Peano postulates can be interpreted in Frege arithmetic and their interpretations proved in that system (see Boolos [1996]). There are, however, alternative departure points, other purportedly analytic principles, from which we may just as plausibly set out. As Richard Heck has shown (Heck [1997]), finite Hume’s principle is one such principle, a principle that states the conditions under which finite concepts have the same cardinal number:
(HPF) ∀F∀G((Finite(F) ∨ Finite(G)) → [(Nx : Fx = Nx : Gx) ↔ F 1–1 G])
The system that results from adjoining this principle to second-order logic may be called finite Frege arithmetic. Heck demonstrates that the Peano postulates have provable interpretations in finite Frege arithmetic just as they do in Frege arithmetic.
1 This paper first appeared in Philosophia Mathematica 8, [2000], pp. 150–9. Reprinted by kind permission of the editor and Oxford University Press.
† I wish to thank Patrick Greenough, Katherine Hawley, Richard Heck, Alex Oliver, Stewart Shapiro, Crispin Wright and, especially, Peter Clark for discussion of this paper. I am also grateful to the audience at an Arché conference on abstraction held at the University of St. Andrews for their helpful comments.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 85–93. © 2007 Springer.
To these technical results Heck adds the philosophical contention that it is from an understanding of finite Hume’s principle, rather than Hume’s principle, that the neo-Fregean should guide us to an a priori grasp of arithmetic. I will argue that Heck’s philosophical arguments are flawed. They do not give the neo-Fregean reason to lose nerve. An abstraction is a principle of the form:
(Σ(αj) = Σ(αk)) ↔ αj ≈ αk
Abstraction principles tie the conditions under which entities of one kind (Σs) are identical to the obtaining of an equivalence relation (≈) amongst another kind of entity (αs). Such a principle may be read as offering a substantial, synthetic claim to the effect that facts about one kind of entity are necessarily connected to facts about another entirely distinct kind. But the neo-Fregean counsels that we need not always read abstractions in this way (Wright [1997], pp. 205–08). He suggests instead that an abstraction principle may be read (in certain circumstances) as offering an analytic claim. For where an abstraction contains an occurrence of a novel term-forming operator (“Σ”) on expressions (“α1” … “αk” …) that are already understood, the abstraction may be read as embodying a stipulation that introduces the novel term into language. It may be read as stipulating that the truth conditions of identity statements featuring the novel operator (“Σ(αj) = Σ(αk)”) coincide with the truth conditions of another familiar form of statement (“αj ≈ αk”). Since one way to be analytic is to be stipulated, the neo-Fregean concludes that, under these circumstances, abstraction principles are analytic. According to neo-Fregean doctrine, Hume’s principle is an analytic abstraction. It introduces a novel cardinality operator by stipulating that the truth conditions of identity statements concerning cardinal numbers coincide with the conditions under which an equivalence relation amongst concepts may be familiarly said to obtain.
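The logical form of an abstraction principle can be illustrated with a small Python sketch. It is an illustration only: the names `abstraction_operator` and `equinumerous` are hypothetical, finite sets stand in for concepts, and the sketch takes no stand on whether a stipulation of this kind is analytic. It builds an operator on a fixed stock of items so that two items receive the same abstract exactly when they stand in the given equivalence relation.

```python
def abstraction_operator(items, equivalent):
    """Assign each item an abstract so that equal abstracts mirror `equivalent`.

    Assumes `equivalent` is an equivalence relation on `items`. Each abstract
    is the index of the first item encountered in the same equivalence class.
    """
    representatives = []
    operator = {}
    for item in items:
        for index, representative in enumerate(representatives):
            if equivalent(item, representative):
                operator[item] = index
                break
        else:
            # No earlier item is equivalent: this item starts a new class.
            representatives.append(item)
            operator[item] = len(representatives) - 1
    return operator


def equinumerous(f, g):
    """For finite sets, a one-one correspondence exists iff the sizes agree."""
    return len(f) == len(g)


# Finite sets standing in for concepts (an assumption made for illustration).
concepts = [
    frozenset(),
    frozenset({"a"}),
    frozenset({"b"}),
    frozenset({"a", "b"}),
    frozenset({"c", "d"}),
]
number_of = abstraction_operator(concepts, equinumerous)
```

Here `number_of` plays the role of the cardinality operator restricted to the listed concepts: `number_of[f] == number_of[g]` holds exactly when `equinumerous(f, g)` does, which is the biconditional that Hume’s principle lays down.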
Imagine a faultlessly rational character—call him ‘Hero’—who has mastered second-order logic but has yet to be introduced to any characteristically mathematical notions (see Wright [1998], p. 359). Despite his mathematical ignorance, Hero is able to grasp Hume’s principle because the familiar vocabulary (bound concept variables, the notion of one–one correspondence) which the principle uses to introduce the cardinality operator is second-order expressible. Hero is therefore able to appreciate a priori the conditions under which—according to the stipulation Hume’s principle effects—cardinal numbers are identical. Having come to appreciate Hume’s principle, Hero is then able to appreciate the truth of Peano’s postulates. For as Frege’s theorem shows, these postulates may be interpreted and their interpretations proved in a system generated from Hume’s Principle and the second-order logic that Hero has already mastered. The possible case of Hero—the neo-Fregean
claims—shows how it is possible for mathematical knowledge to be acquired a priori. There are many challenges the neo-Fregean must overcome in order to substantiate the claim that Frege’s theorem may be invested with such epistemological significance. Amongst the most basic of these challenges are two. First, the neo-Fregean must establish that Hume’s Principle is nothing more than a stipulation even though in conjunction with second-order logic it generates a theory—Frege arithmetic—committed to the existence of infinitely many objects. Second, the neo-Fregean must establish—contrary to Quinean suspicions—that an understanding of second-order logic does not presuppose prior knowledge of mathematics. Nevertheless, provided such challenges can be met, it appears the neo-Fregean may legitimately claim that it is possible to travel a priori from an understanding of Hume’s Principle to a grasp of arithmetic. But the legitimacy of this final claim remains open to question. For even if it is granted that Hume’s principle may be employed to access some array or other of a priori truths, it remains to be established that these really are arithmetical truths and that the a priori knowledge acquired by employing Hume’s principle is genuinely arithmetical in character. Frege’s theorem, on its own, does not secure this result. All that Frege’s theorem strictly shows—given the assumption that abstraction principles can be used to generate a priori knowledge—is that there is a system of a priori truths (Frege arithmetic) that is capable of modelling arithmetic. It does not show that any of these a priori truths are arithmetical in character. According to Heck, the a priori truths that flow from Hume’s principle can only be arithmetical if Hume’s principle itself is a genuinely arithmetical principle. To establish that Hume’s principle is arithmetical, Heck claims, it must be shown to be a basic law “upon which ordinary arithmetical reasoning relies” (Heck [1997], p. 596). 
But, Heck goes on to argue, Hume’s principle does not underlie ordinary arithmetical reasoning and so the a priori knowledge which a grasp of Hume’s principle grounds cannot be arithmetical. Heck concludes that Frege’s theorem fails to map an a priori route to arithmetical knowledge. Heck bases his argument that Hume’s principle does not inform ordinary arithmetical reasoning on the contention that “no amount of reflection on the nature of arithmetical thought could ever convince one of HP” (Heck [1997], p. 597). Hume’s Principle says that all and only equinumerous concepts have the same cardinality. It follows that the number of natural numbers and the number of even numbers are the same (since the concepts natural number and even number can be put in one–one correspondence). But no amount of reflection on ordinary arithmetical reasoning (that is, “reasoning about, and with, finite numbers”) could ever convince one that these infinite cardinals are the same. Indeed it comes as something of a conceptual shock to discover that the concepts natural number and even number have the same cardinality.
Since the evens form only a portion of the naturals, our naive inclination is to say that there are fewer evens than naturals. That is why it took not just an ordinary arithmetical thinker, but an intellect of great genius—Georg Cantor— to make the conceptual leap required to recognise the truth of what Hume’s principle tells us, that the concepts in question share the same cardinal. That is why it cannot be Hume’s principle that is implicit in ordinary arithmetical reasoning. By contrast, Heck continues, it is plausible to hold that Finite Hume’s principle informs habitual arithmetical practice. Finite Hume’s principle does not make any claim concerning the conditions under which infinite cardinals are the same or different. It is possible to accept Finite Hume without taking the conceptual leap that Cantor made. In fact, Heck claims, prior to the receipt of Cantor’s work, reflection on numerical practice led almost all thinkers to endorse finite Hume’s principle. Heck cites Bolzano as an example of such a thinker, a thinker who was sceptical about Hume’s principle (since he thought it possible for there to be infinite totalities that, even though equinumerous, nevertheless differed in multiplicity) but who endorsed Finite Hume’s principle. Heck goes so far as to claim that Finite Hume “really is implicit in arithmetical reasoning and that one can convince oneself of its truth, come to understand why it is true, by (and perhaps only by) reflecting on basic aspects of arithmetical thought”. We count by establishing a one–one correspondence between an initial segment of the sequence of numerals and the objects counted: we begin with “1” and end with a numeral n that stands for the number of objects. Consequently, if two finite concepts are equinumerous, then, by the transitivity of equinumerosity, the objects falling under one of those concepts will be one– one correspondent with the same initial segment of numerals as the objects falling under the other concept. 
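The counting procedure just described can be sketched in Python. This is a hedged illustration rather than anything found in Heck or Bolzano: the function names are hypothetical, and each collection is assumed to be given as a list without repetitions. Counting pairs the objects with an initial segment of the numerals; two collections that reach the same final numeral are thereby put in one–one correspondence with each other.

```python
def count(objects):
    """Pair each object with the next numeral, starting from 1.

    Returns the pairing (numeral -> object) and the last numeral used,
    which is 0 when there is nothing to count.
    """
    pairing = {}
    numeral = 0
    for obj in objects:
        numeral += 1
        pairing[numeral] = obj
    return pairing, numeral


def correspondence_via_numerals(first, second):
    """Compose two countings into a one-one correspondence, if one exists.

    Returns a dict mapping each object of `first` to an object of `second`
    when both countings end on the same numeral, and None otherwise.
    """
    first_pairing, first_total = count(first)
    second_pairing, second_total = count(second)
    if first_total != second_total:
        return None
    return {
        first_pairing[k]: second_pairing[k]
        for k in range(1, first_total + 1)
    }
```

For example, `correspondence_via_numerals(["a", "b", "c"], [10, 20, 30])` pairs each letter with a number by way of the numerals 1 to 3, while collections that end on different numerals yield `None`.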
In other words, those two finite concepts will have the same number. Conversely, if the objects falling under one concept are one–one correspondent with the same initial segment of the numerals as the objects falling under another concept—that is, if those two finite concepts share the same number—then, by the transitivity of equinumerosity, those concepts will be equinumerous. Reflection on the nature of counting thereby establishes that finite concepts have the same number if, and only if, they are equinumerous. Indeed—Heck claims—it is just such an argument that convinced Bolzano of the truth of finite Hume. Heck concludes that if the neo-Fregean claim to provide an epistemology of arithmetic is to have any legitimacy, the neo-Fregean should adopt finite Hume’s principle rather than Hume’s principle. The neo-Fregean account of how a priori knowledge may be generated applies just as well to the former principle as the latter. Just like its less modest cousin, finite Hume’s principle may be understood as an analytic abstraction, only a restricted one, stipulating the conditions under which finite cardinal numbers are identical. The additional vocabulary finite Hume’s principle uses to fix these conditions
(the extra notion of finite) is also expressible in second-order logic. And, furthermore, interpretations of the Peano postulates are, as Heck shows, provable in the system that results from uniting finite Hume’s principle and second-order logic. But, unlike Hume’s principle, finite Hume is implicit in ordinary arithmetical practice. So, Heck claims, there is every reason to suppose that the a priori knowledge that finite Hume delivers, by contrast to the knowledge that flows from Hume’s principle, is arithmetical. Heck’s arguments are, however, flawed. To begin with, Heck’s claim that prior to the receipt of Cantor’s work it was finite Hume, rather than Hume’s principle, that informed arithmetical thinking is highly contentious. Prior to the receipt of Cantor’s work it seemed paradoxical to think that the number of natural numbers was the same as the number of even numbers even though the latter constitute only a portion of the former. But in order for this thought to appear a paradox, it is necessary not only to have an intellectual inclination to deny that those numbers are the same, it is also necessary to have an intellectual inclination to affirm that they are the same. Yet if, as Heck supposes, pre-Cantorian thinkers only endorsed finite Hume, then they would have had no reason to affirm the identity of the number of natural numbers and the number of even numbers. They would have had no reason because finite Hume’s principle is entirely silent about identities amongst infinite numbers. So, it would not have seemed, as it did seem to them, a paradox that those numbers were distinct. Indeed, such thinkers as Heck describes would have been quite unable to frame a thought about the identity and distinctness of infinite numbers. If, prior to the reception of Cantor’s work, ordinary arithmetical reasoning had only been informed by finite Hume then Bolzano’s Paradoxes of the Infinite would have been a shorter book.
Heck sketches an historical Cantor whose conceptual contribution was— through his work on the infinite—to clear the way for the introduction of a novel criterion of numerical identity that applied not only to finite but also to infinite numbers. According to Heck history, we fail to register the significance of Cantor’s contribution if we suppose Hume’s principle already informed ordinary arithmetical reasoning. But Heck’s sketch makes no sense of the fact that prior to Cantor it appeared paradoxical to affirm that equinumerosity amongst concepts sufficed for the identity of infinite numbers. In fact it makes better sense of history to describe the significance of Cantor’s contribution in a quite different way. According to this historical reconstruction, ordinary arithmetical thinking, prior to Cantor, was guided not only by Hume’s principle but also by the intuition that the cardinality of a collection of entities is always greater than the cardinality of any of its proper parts. The clash of principle and intuition made the identities of infinite numbers appear paradoxical to earlier thinkers. Hume’s principle, according to this reconstruction, led these thinkers to affirm that the natural numbers and the even numbers had the same cardinality, whereas intuition led them to suppose their cardinalities were different since the even numbers form only
a proper part of the collection of natural numbers. Cantor’s contribution was to recognise this intuition was founded only on a parochial acquaintance with finite wholes and parts, and to enable us to resolve the paradox by persuading us to abandon the intuition. Cantor’s contribution was not to clear the way for the introduction of a novel principle of numerical identity, Hume’s principle. Rather—like a great moral leader—he persuaded us to see that a familiar principle should be applied in an unfamiliar case. Heck’s argument that finite Hume’s principle does inform ordinary arithmetical practice (because it may be arrived at by reflection on the process of counting) must also be questioned. First, if any abstraction principle can be arrived at by reflection on the process of counting, then that principle is not finite Hume. Finite Hume presupposes the intelligibility of the notion of an infinite cardinal. Suppose that concepts F and G fail to be equinumerous because F is infinite whereas G is finite. Then, by finite Hume, there is an infinite number belonging to the concept F that is distinct from the number belonging to the concept G. But, if the identity conditions for cardinals flow from reflection on the process of counting, then the notion of an infinite cardinal cannot make sense. For an infinite cardinal is a number that belongs to a concept the objects falling under which, by definition, cannot be counted; there is no initial segment of the numerals with which the objects falling under such a concept can be put in one–one correspondence. So finite Hume cannot be arrived at by the reflective route recommended by Heck. 
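Two ideas are in play here: counting as the pairing of objects with an initial segment of the numerals, and the one–one pairing of the natural numbers with the even numbers. Both can be illustrated with a minimal Python sketch. The function names are assumptions introduced purely for illustration; nothing of the kind appears in the texts under discussion.

```python
from itertools import count, islice


def count_objects(objects):
    """Count by pairing objects with the numerals 1, 2, 3, ... in order.

    Returns the last numeral used (0 if there is nothing to count).
    The loop ends only when `objects` runs out, i.e. only for finite collections.
    """
    last = 0
    for last, _ in zip(count(1), objects):
        pass
    return last


def pair_naturals_with_evens(k):
    """Return the first k pairs of the correspondence n -> 2n."""
    return [(n, 2 * n) for n in islice(count(0), k)]


if __name__ == "__main__":
    print(count_objects(["a", "b", "c"]))
    print(pair_naturals_with_evens(4))
```

The first function returns only when the supply of objects is exhausted, so it never assigns a number to an infinite collection. The rule n -> 2n, by contrast, pairs each natural number with exactly one even number however far it is carried. The sketch thus separates what reflection on counting can reach from what one–one correspondence in general can reach.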
This suggests that if any abstraction principle arises from simple reflection on the counting process, it is another principle, weak Hume’s principle, that, unlike finite Hume, concerns only the identity conditions of finite cardinals:

(WHP) ∀F∀G((Finite(F) & Finite(G)) → [(Nx : Fx = Nx : Gx) ↔ F 1 − 1 G])

The suggestion receives historical support. Heck mentions Bolzano as an example of a thinker who endorsed finite Hume on the basis of reflection on the counting process (Bolzano [1950], §22). In fact, in the passage cited by Heck, it is weak Hume, rather than finite Hume, which Bolzano sanctions on that basis. 2

2 Bolzano [1950], pp. 98–9 endorses the following principle: “Whenever, in fact, two finite sets are constituted so that every object a in the one corresponds to another object b in the other which can be paired off with it, no object in either set being without a partner in the other, and no object occurring in more than one pair: then indeed are the two finite sets always equal in respect of multiplicity”.

Second, it is far from evident that there is any abstraction even resembling finite Hume that is implicit in ordinary arithmetical practice. A character can readily be imagined who is capable of counting yet lacks the conceptual wherewithal to grasp finite Hume. He might, for instance, lack the notion of a relation that would be required to grasp the second-order definition of one–one correspondence embodied in finite Hume. Or he might be unable to comprehend the notion of an arbitrary property required to grasp the significance
of its second-order quantifiers. Alternatively, instead of having any thought of numerical identity, he might have a concern only for the employment of numerals. More generally, it is contentious to suppose that any theoretical principle is implicit in ordinary arithmetical practice. For not only is the notion of a theoretical principle implicit in practice a very murky and difficult notion to apply (as Kripke’s rule-following paradox makes clear), the more radical possibility remains that our arithmetical practice should not be described in theoretical terms at all. Perhaps ordinary arithmetical practice, rather than being informed by a ghostly inner theory, is better understood as the exercise of a repertory of arithmetical techniques.

Of course, if ordinary arithmetical reasoning is not informed by a second-order abstraction, then a fortiori it is not informed by Hume’s principle. Heck assumes that the neo-Fregean can only succeed in providing an epistemology of arithmetic if the basic laws employed for that purpose actually inform ordinary arithmetical reasoning. So if Heck is correct to make this assumption, it appears the epistemological project the neo-Fregean undertakes cannot succeed. Unfortunately, Heck makes no attempt to justify the assumption that an arithmetical epistemology must be derived from principles that we can retrieve by reflection on ordinary arithmetical reasoning. Heck’s remarks suggest the following argument. It is constitutive of arithmetical truths that they are derived from the basic laws that actually inform ordinary arithmetic. We might call these laws the ‘canonical sources’ of arithmetical truths. Since Hume’s principle is not a canonical source, the a priori truths that may be derived from it cannot be arithmetical in character. But this argument is far from convincing.
The premise that an arithmetical truth can only be derived from canonical sources is unmotivated, and generalised it leads to the absurd conclusion that a stronger principle cannot be employed (perhaps for reasons of elegance) to prove the consequences of a weaker principle.

Heck’s objections to neo-Fregeanism reflect a presupposition common amongst its critics. According to this presupposition, the neo-Fregean project is a hermeneutic one: it aims to show that what we ‘had in mind’ all along, when we reasoned arithmetically, is a priori. It is now generally recognised that Frege had no concern to determine that ordinary arithmetical reasoning is a priori (see Benacerraf [1981], Weiner [1984], and Dummett [1991], pp. 176–79). Regrettably, it is not generally recognised that neo-Fregeans need have no hermeneutic concern either. Benacerraf exhibits such a lack of recognition in his discussion of Hempel:

According to Hempel the Frege-Russell definitions of number, successor, and related concepts have shown the propositions of arithmetic to be analytic because they follow by stipulative definitions from logical principles. What Hempel has in mind here is clearly that in a constructed formal system of logic one may introduce by stipulative definition the expressions ‘Number’, ‘Zero’. . . in such
a way that sentences of such a formal system using these abbreviations and which are formally the same (i.e. spelled the same way as) certain sentences of arithmetic appear as theorems of the system. He concludes from that undeniable fact that these definitions show the theorems of arithmetic to be mere notational extensions of theorems of logic, and thus analytic. (Demopoulos [1995], p. 46)
Having outlined Hempel’s logicist project, Benacerraf proceeds to remark (next sentence) that the project fails because Hempel does not establish that the arithmetical truths we express in ordinary language are analytic: He [Hempel] is not entitled to that conclusion. Nor would he be even if the theorems of logic in their primitive notations were themselves analytic. For the only things that have been shown to follow from theorems of logic by stipulation are the abbreviated theorems of the logistic system. To parlay that into an argument about the propositions of arithmetic, one needs an argument that the sentences of arithmetic, in their preanalytic senses, mean the same (or approximately the same) as their homonyms in the logicistic system. That requires a separate and longer argument. I bring this up here not to berate Hempel but to use his views as an illustration of the epistemological motivation that drives twentieth century logicists.
But Benacerraf’s observation is misplaced. For neither Hempel nor any other neo-Fregean ever claimed to be putting forward a thesis about the ordinary senses of arithmetical expressions. Hempel expresses the very different nature of his epistemological project with exemplary clarity:

The assertion that the definitions given above state the “customary” meaning of arithmetical terms involved is to be understood in the logical, not the psychological sense of the term “meaning”. It would obviously be absurd to claim that the above definitions express “what everybody has in mind” when talking about numbers and the various operations that can be performed with them. What is achieved by those definitions is rather a “logical reconstruction” of the concepts of arithmetic in the sense that if the definitions are accepted, then those statements in science and everyday discourse which involve arithmetical terms can be interpreted coherently and systematically in such a manner that they are capable of objective validation. (Hempel [1945], p. 387)
The objections that Heck (and Benacerraf) voice are ineffective because they misconstrue the nature of the neo-Fregean project. That project never was to uncover a priori truth in what we ordinarily think, but to demonstrate how a priori truth could flow from a logical reconstruction of arithmetical practice. By failing to recognise the nature of the beast, Heck fails to articulate a convincing objection to the neo-Fregean doctrine that there is an a priori route from Hume’s principle to knowledge of arithmetic. 3

3 Of course, there are a host of other difficulties the neo-Fregean must confront in order to make good their claims, difficulties concerning the ability of Hume’s principle to introduce objects and the tenability of the minimalist metaphysic that principle assumes. I explore these issues further in MacBride [2003].
References

Benacerraf, P. [1981]: “Frege: The Last Logicist” in P. French et al. (eds) Midwest Studies in Philosophy VI, pp. 17–35. Minneapolis: University of Minnesota Press. Reprinted in Demopoulos [1995], pp. 41–67.
Benacerraf, P. and Putnam, H. (eds) [1983]: The Philosophy of Mathematics: Selected Readings. Cambridge: Cambridge University Press.
Bolzano, B. [1950]: Paradoxes of the Infinite trans. F. Prinhonsky. London: Routledge and Kegan Paul.
Boolos, G. [1996]: “On the Proof of Frege’s Theorem” in A. Morton and P. Stich (eds), Benacerraf and His Critics, pp. 143–59. Oxford: Blackwells.
Demopoulos, W. (ed.) [1995]: Frege’s Philosophy of Mathematics. Harvard: Harvard University Press.
Dummett, M. [1991]: Frege: Philosophy of Mathematics. London: Duckworth.
Frege, G. [1950]: Foundations of Arithmetic, trans. J.L. Austin. Oxford: Basil Blackwell.
Heck, R. [1997]: “Finitude and Hume’s Principle”, Journal of Philosophical Logic, 26, pp. 589–617.
Hempel, C. [1945]: “On the Nature of Mathematical Truth”, American Mathematical Monthly, 52, pp. 543–56. Reprinted in Benacerraf and Putnam [1983], pp. 377–93.
MacBride, F. [2003]: “Speaking with Shadows: A Study of Neo-Fregeanism”, British Journal for the Philosophy of Science, 54, pp. 103–63.
Weiner, J. [1984]: “The Philosopher behind the Last Logicist” in C. Wright (ed.) Frege: Tradition & Influence. Oxford: Basil Blackwell.
Wright, C. [1997]: “On the Philosophical Significance of Frege’s Theorem” in R. Heck (ed.), Language, Thought and Logic: Essays in Honour of Michael Dummett, pp. 201–44. Oxford: Oxford University Press.
Wright, C. [1998]: “On the Harmless Impredicativity of N= (‘Hume’s Principle’)” in M. Schirn (ed.), The Philosophy of Mathematics Today, pp. 339–68. Oxford: Clarendon Press.
COULD NOTHING MATTER? 1

Fraser MacBride
According to the neo-Fregean we may acquire a priori knowledge of arithmetic’s fundamental laws by reflecting upon the (recognisable) second-order logical consequences of an a priori principle (Hume’s Principle) that specifies identity conditions for cardinal numbers:

(HP) (∀F)(∀G)(Nx : Fx = Nx : Gx ↔ F 1 − 1 G)

This epistemological contention receives mathematical support from Frege’s Theorem, the result that Peano’s axioms can be interpreted and their interpretations proved in the system (Frege arithmetic) that results from adjoining (HP) to second-order logic (see Wright 1997: 202–11). 2 Black (2000) brands this contention “implausible” and argues that (HP) provides the wrong sort of reason for believing in the infinity of the number series. I will argue that a central argument in Black’s paper is ineffective, relying upon a popular misconception of the epistemological character of the neo-Fregean project (see Black 2000: 233–36 and also Heck 1997: 597–98 and Lowe 1998: 49–50). It is important that this misconception is corrected and the concern Black voices assigned its proper place. Otherwise an accurate assessment of the relative merits and demerits of the neo-Fregean philosophy of mathematics, compared to any other, will continue to evade us.
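For readers who want the right-hand side of (HP) spelled out, the following LaTeX sketch unfolds “F 1 − 1 G” in the usual second-order way. The symbol ≈ for equinumerosity is an assumed abbreviation introduced here, not notation used in the paper, and the unfolding is the standard textbook definition rather than a formulation taken from the text.

```latex
% (HP): the number of Fs is the number of Gs iff F and G are equinumerous.
\[
\text{(HP)}\quad \forall F\,\forall G\,
  \bigl( Nx{:}Fx = Nx{:}Gx \;\leftrightarrow\; F \approx G \bigr)
\]
% "F 1-1 G" (here written F \approx G, an assumed abbreviation):
% some relation R pairs each F with exactly one G and each G with exactly one F.
\[
F \approx G \;:\equiv\; \exists R\,\Bigl[
  \forall x\,\bigl(Fx \rightarrow \exists! y\,(Gy \wedge Rxy)\bigr)
  \;\wedge\;
  \forall y\,\bigl(Gy \rightarrow \exists! x\,(Fx \wedge Rxy)\bigr)
\Bigr]
\]
```

The quantifier over relations R is what makes the right-hand side second-order while leaving it free of any number-theoretic vocabulary.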
1 This paper first appeared in Analysis 62 [2002], pp. 125–135. Reprinted by kind permission of the editor and Blackwell Publishing.
2 All references to Wright, and Hale & Wright are to the reprints of their papers in Hale & Wright 2001.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 95–104. © 2007 Springer.

1. Black’s thought experiment

Black asks us to imagine a tribe of arithmeticians whose basic notion of number is that of finite ordinal. They arrive at this notion by reflecting upon their practice of counting. The tribe counts a totally ordered collection of objects by linking its members one by one with some other objects (the ‘numbers’) taken in a privileged order. The last number so assigned is the
ordinal number of the collection. Black also asks us to imagine that the tribe does not recognise the number zero. Let “W(R)” stand for the second-order statement that the relation R well-orders its field and let “R ≅ S” stand for the statement that the orderings R and S are isomorphic. Then, according to Black (2000: 234), the notion of ordinal number that informs the tribe’s practice may be encapsulated by a principle (let’s call it the Tribe’s Principle) that specifies identity conditions for ordinal numbers:

(TP) (∀R)(∀S)[(W(R) & W(S) & ∃xRxx & ∃xSxx) → (oR = oS ↔ R ≅ S)]

Continuing the fantasy, the tribe progress to the notion of cardinal number by establishing that all orderings of a given non-empty set have the same ordinal number. Their notion of cardinal number may then be captured by the restriction of (HP) to non-zero cardinals:

(RHP) (∀X)(∀Y)[(∃yXy & ∃zYz) → (Nx : Xx = Nx : Yx ↔ X 1 − 1 Y)]

On this basis the tribe develops finitary arithmetic (or at least that portion of finite arithmetic required for counting the objects that the tribe actually encounters). But despite the considerable advances made by the tribe they fail to take Cantor’s leap and only envisage the application of (RHP) to finite totalities.

Black’s thought experiment is designed to reveal that it is possible to have a “coherent understanding” of finite arithmetic and its applications which is not epistemologically founded on (HP) (2000: 235–36). For even though the tribe possesses such an understanding, the principle for cardinal identity (RHP) they employ eschews, by contrast to (HP), a commitment to zero. Moreover, (HP) is inconsistent with the principle of ordinal identity (TP) upon which the tribe’s understanding is ultimately founded. For whereas (HP) may be satisfied only in infinite domains, (TP) may be satisfied only in domains of finite size (inducing the Burali-Forti paradox otherwise) (Hodes 1984: fn 16).
Since a coherent understanding of finite arithmetic can be achieved without recourse to (HP), Black concludes that (HP) cannot perform the foundational epistemological role the neo-Fregean proposes for it.

There are a number of distinct issues here that require to be disentangled. First, Black’s assumption that (TP) is a mathematical principle (a principle of ordinal identity) may be questioned. For (TP) fails to exhibit an arguably constitutive feature of any genuinely mathematical principle, namely the feature of being conservative with respect to non-mathematical theories (cf. Field 1980: 9–16). Roughly, a mathematical principle should not allow the derivation of any non-mathematical conclusions from non-mathematical premises that could not have been drawn from those premises already. (TP) fails to be conservative (in this sense) because it cannot be satisfied in any infinite domain, even if the domain in question is composed solely of non-mathematical objects. Consequently (TP) allows us to draw a conclusion that we might not otherwise have drawn, the conclusion that there are only
finitely many non-mathematical objects. 3 If it is indeed a characteristic mark of mathematical principles that they are conservative, it is correspondingly doubtful whether the understanding of the tribe that is based solely upon (TP) is genuinely mathematical in character. 4 Of course, Black may reject the general assumption that mathematical principles are (by constitution) conservative, although Black will then have to undertake the burden of explaining away the implausibility of supposing that a purely general mathematical proposition rules out the possibility of so many non-mathematical objects. Alternatively, he may reject the more specific assumption that (TP) need be labelled ‘mathematical’ in the first place, although he will then have to explain why this manoeuvre is anything more than ad hoc. But in either case it is open to the neo-Fregean to question whether (TP) is capable of serving as any sort of epistemological foundation, never mind an alternative one to (HP). For they may argue (to take one plausible example) that it is an open epistemic possibility that there are infinitely many spacetime points. Since the truth of (TP) is incompatible with any such infinitary hypothesis (therein lies (TP)’s failure to be conservative) it is also an open epistemic possibility that (TP) is false. It then becomes a mystery how grasp of a principle that, without Cartesian excess, may legitimately be doubted could serve as a foundation for acquiring knowledge of arithmetical truths.

These difficulties may be avoided by weakening (TP) so as to render it relevantly conservative. This result may be achieved by strengthening the antecedent of (TP) to characterise only concepts of finite extension:

(WTP) (∀R)(∀S)[(W(R) & W(S) & ∃xRxx & ∃xSxx & Finite(S) & Finite(R)) → (oR = oS ↔ R ≅ S)]

It may (in any case) be argued that it is (WTP) rather than (TP) that underwrites the counting practice of the tribe in Black’s thought experiment.
If, as Black suggests, the tribe is chary of the infinite then it is just as plausible to suppose that their practice is described by a version of (TP) restricted to the finite. But (WTP) may be satisfied even in infinite domains and so, by contrast to (TP), appears not only conservative (in the relevant sense) but also consistent with (HP).

3 Less roughly, let T be any non-mathematical theory and, for any sentence A, let A* be the result of restricting the quantifiers in A to non-mathematical objects. Similarly, let T* be the result of restricting all the quantifiers in the theory T to non-mathematicals. Then a mathematical principle N is conservative iff for any sentence A, if T* + N implies A* then T implies A (cf. Wright 2000: 319). Now let S be the sentence stating that the non-mathematical universe is finite. Since (TP) implies S, (TP) + T* implies S even if T does not. Therefore, (TP) fails to be conservative.
4 By failing to be conservative (TP) also breaches a (plausible) constraint on abstraction principles that, like (HP), are intended to introduce novel concepts (Wright 1997: 297). However, it is unclear from Black’s discussion whether this constraint should apply to (TP). For Black does not distinguish between the case where (TP) is laid down as a piece of conceptual innovation (in which case it may be obliged to be conservative) and the case where (TP) is intended as a codification of existing numerical practice (that may already fail to be conservative).
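The conservativeness condition, and the short argument that (TP) fails it, can be set out schematically. The LaTeX sketch below is one way of doing so, under two stated assumptions: the turnstile is used as a stand-in for the unspecified notion of “implies”, and S is treated as already restricted to non-mathematical objects, so that it serves as both A and A*. The block needs the amsmath and amssymb packages.

```latex
% Assumed notation: \vdash stands in for "implies"; T ranges over
% non-mathematical theories; A^{*} and T^{*} restrict all quantifiers
% to non-mathematical objects.
\[
N \text{ is conservative} \iff
\text{for every } T \text{ and every sentence } A:\quad
\bigl(T^{*} + N \vdash A^{*}\bigr) \;\Rightarrow\; \bigl(T \vdash A\bigr)
\]
% S: "the non-mathematical universe is finite".
\begin{align*}
&\text{(1)}\quad \text{(TP)} \vdash S
  && \text{(TP) cannot be satisfied in an infinite domain}\\
&\text{(2)}\quad T^{*} + \text{(TP)} \vdash S
  && \text{from (1), for any } T\\
&\text{(3)}\quad T \nvdash S
  && \text{for some suitably chosen } T\\
&\text{(4)}\quad \text{(TP) is not conservative}
  && \text{from (2) and (3), with } S \text{ in the role of } A \text{ and } A^{*}
\end{align*}
```

Step (3) only requires that some non-mathematical theory leaves the size of the universe open, which is the epistemic possibility appealed to in the discussion of spacetime points.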
2. The modal-re-constructive character of neo-Fregean epistemology
The fact remains, however, that (HP) incorporates a commitment to zero not even (WTP) incurs. And, as Black remarks, it is not only fantastical tribes whose arithmetical understanding may be zero free. The Greeks did not recognise zero nor did Dedekind or Peano in their original formulations of the axioms of arithmetic. But, according to Black, if (HP) is to perform the foundational role the neo-Fregean intends then “we must say that the reason there are infinitely many numbers is that 0 counts as one of them”. Since none of the aforementioned thinkers would have endorsed such a claim Black reasons that (HP) “can no longer be regarded as making explicit the ideas which already underlay our use of the natural number system” (Black 2000: 236). Black concludes that, contrary to neo-Fregean intention, (HP) cannot perform any foundational role in the epistemology of arithmetic.

Black’s criticism is an instance of a general form of objection to the neo-Fregean programme that bemoans (HP) for its strength. According to objections of this form, (HP) cannot provide an analysis of the concept number because it incorporates existential commitments that no ordinary arithmetical reasoner needs to countenance. Usually such objections focus upon the commitment of (HP) to infinite numbers, numbers that prior to Cantor went almost entirely unnoticed (Heck 1997: 597–98). Black’s criticism reveals that (HP) might also be faulted as an analysis because it is committed to zero and, although Black does not mention the possibility, perhaps other finite numbers too (after all (HP) is also committed to 1, another number that has not always been recognised as such). However, objections of this form are in general misguided. 5 This is because the epistemological success of the neo-Fregean programme need not rely upon the effectiveness of (HP) as an analysis of ordinary arithmetical notions.
5 There are, in addition, particular reasons why objections to (HP) that complain about the commitment of (HP) to infinite numbers are also misguided. I discuss some of these reasons elsewhere (MacBride 2000).

Of course, neo-Fregeans do sometimes speak of (HP) as an “analysis” of the ordinary notion of number or “analytic of” that concept (see, for example, Wright 1983: 106–07 and Hale 1997: 99). Nevertheless, an alternative epistemology may be gleaned (and extrapolated) from what the neo-Fregeans have to say that makes no relevant play with the notion of analysis and obviates Black’s criticism. Black’s criticisms fail to take proper account of the modal character of this epistemology. Neo-Fregean epistemology (so envisaged) offers an account of how it is possible to acquire knowledge of the fundamental laws of arithmetic (by deriving them from (HP)). It thereby undertakes to describe “an a priori route” (that goes via the recognition of zero) to knowledge of the laws of arithmetic (Wright 1997: 279–80). But it is not thereby committed, as Black
assumes, to saying this route is “the” only one available. It is consistent with our coming to recognise arithmetical truths in one way that we could have, and perhaps do, come to recognise their truth by different means. So the mere fact that certain figures, historical or imaginary, could have achieved a coherent understanding of arithmetic from which a recognition of zero was absent does nothing to compromise the neo-Fregean contention that it is possible for (HP) to serve as a basis for acquiring arithmetical knowledge.

Black’s criticisms also fail to take proper account of the re-constructive character of the neo-Fregean epistemology. According to Black, (HP) can only discharge a foundational role if it makes explicit the principles that actually underlie established arithmetical usage. But, according to the neo-Fregean epistemology, (HP) can only discharge its intended role because, in the first instance, it does not answer to existing usage. Rather, the neo-Fregean claims, (HP), properly understood, is nothing more than a stipulation that serves to introduce a novel operator (“Nx”) into our language (Wright 1997: 278, Wright 2000: 317–18, Hale & Wright 2000: 142, Hale & Wright 2001: 14). The introduction is achieved by implicit definition: the meaning of the novel operator is fixed by stipulating that the truth conditions of identity statements (“Nx : Fx = Nx : Gx”) in which it occurs coincide with the conditions under which an equivalence relation amongst concepts may be said to obtain (“F 1 − 1 G”). And it is because (HP) is intended merely as a stipulation that the neo-Fregean feels able to legitimately claim that (HP) is a priori. Nevertheless, the neo-Fregean continues, (HP) provides a basis for grasping arithmetical truths a priori because (as Frege’s theorem demonstrates) the system that results from (HP) and second-order logic allows for a reconstruction of ordinary arithmetical practice in the following sense.
It (Frege arithmetic) suffices for the interpretation of the laws of ordinary arithmetic and the proof of their interpretations. It is in virtue of the interpretative powers of the system (HP) engenders that the neo-Fregean takes himself to be retrospectively entitled to characterise (HP) as an arithmetical principle, a principle of cardinal (in the usual sense) identity. To simply complain that (HP) fails to “make explicit the ideas which already underlay our use of the natural number system” (Black 2000: 236) is to fail to take into account the re-constructive character of the epistemology proposed and the crucial role Frege’s theorem performs in the envisaged reconstruction.
3. The counter-Caesar problem
Whilst this characterisation of the neo-Fregean programme obviates the criticisms of Black (and others) it also brings into focus a potentially critical epistemological difficulty quite peculiar to that programme. For—strictly speaking—Frege’s theorem does not establish that the truths of ordinary arithmetic are themselves a priori. It only establishes that there is a system of truths (Frege arithmetic) capable of modelling (interpreting) the laws of arithmetic.
Assume the best case scenario for the neo-Fregean and suppose the truths that comprise Frege arithmetic are a priori in character. An additional argument is still required to show that the truths of arithmetic inherit the epistemological status of their Fregean models (see Benacerraf 1981: 20, Heck 1997: 596). There appear to be at least two styles of strategy—re-constructive and hermeneutic—whereby the neo-Fregean might endeavour to address this issue. According to the re-constructive strategy, the neo-Fregean may accept that the truths expressed in Frege arithmetic concern an entirely novel subject matter and merely model the truths of ordinary arithmetic. Nevertheless, the neo-Fregean may still maintain that Frege’s theorem bears epistemological significance for ordinary arithmetic. For, the neo-Fregean may argue, the mappings that the theorem establishes a priori between the a priori truths of Frege arithmetic and ordinary arithmetic suffice to demonstrate the operational effectiveness (reliability), if not the truth, of ordinary arithmetical claims. 6 Alternatively, the neo-Fregean may adopt the hermeneutic strategy according to which all the truths of the ordinary arithmetic are expressed by truths of Frege arithmetic. Ordinary arithmetic will then automatically inherit the a priori status of the latter system. The difficulty attendant upon this second strategy is, in a sense, the reverse of the more familiar Caesar problem (or at least one member of that family of problems). The latter difficulty concerns our capacity (or lack of it) to establish that the terms and sentences occurring in two different theories (concerning, for example, persons and numbers respectively) are, as we might pre-theoretically suppose, terms for quite different entities (Julius Caesar, 2) and sentences expressive of very different truths. The former difficulty may appropriately be dubbed the ‘counter-Caesar’ problem. 
It concerns our capacity to establish that the terms and sentences figuring in two different theories (ordinary arithmetic, Frege arithmetic) are, as the neo-Fregean would have it, terms for the same entities (2, Nz:[z = Nx: x ≠ x ∨ z = Ny:(y = Nx: x ≠ x)]) and sentences expressive of the same truths.

In fact, the neo-Fregean appears to adopt the second strategy and is therefore obliged (if he is to realise the epistemological pretensions of his programme) to take on the counter-Caesar problem with all the seriousness usually reserved for the Caesar problem itself. The neo-Fregean proceeds upon the assumption “that to define the distinctively arithmetical concepts is to so define a range of expressions that the use thereby laid down for those expressions is indistinguishable from that of expressions which do indeed express those concepts” (Wright 2000: 322). He then claims that the stipulation of (HP) gives rise to a pattern of linguistic use that is ensured by the interpretability of Peano’s axioms in Frege arithmetic to be equivalent to ordinary arithmetic.

6 The neo-Fregean may also add that the existence of these mappings is in no way compromised by the commitment of Frege arithmetic to zero (and other numbers ordinary arithmetic fails to recognise). So, the neo-Fregean may conclude, a commitment to zero can hardly provide a reason for denying (HP) the status of an epistemological foundation.
So, relying on the aforementioned assumption, the neo-Fregean concludes that regardless of the different underlying principles ((HP), Peano’s axioms or some other source) that gave rise to that same pattern of use the very same arithmetical truths are thereby expressed. 7 An analogy may help bring into relief the epistemological character now envisaged for the neo-Fregean project. According to Davidson, we may achieve insight into the nature of language by reflecting upon the possibility of a radical interpreter who (entirely ignorant of a given language L) constructs a theory knowledge of which suffices for interpreting a speaker of L (Davidson 1973). The radical interpreter does not, however, make any attempt to describe the inner psychological mechanisms that in fact account for the speaker’s mastery of L. Nonetheless, Davidson claims, the theory supplied by a radical interpreter provides insight into the character of the complex linguistic abilities displayed by ordinary speakers of L. This is because (very roughly) the theory is empirically adequate to their linguistic performances. In an analogous way, the neo-Fregean claims, we may achieve insight into the nature of arithmetical knowledge. This time we are asked to reflect upon the possibility of a character (call him “Hero”) who (entirely ignorant of arithmetic) seeks to construct a theory knowledge of which suffices for the competent use of ordinary arithmetical language. 8 Hero makes no efforts to uncover the psycho-genetic origins of our arithmetical abilities. Nonetheless, the neo-Fregean claims, the a priori theory provided by Hero provides insight into the character of arithmetical knowledge. This is because grasp of his theory provides Hero with the ability to engage in a practice of use equivalent to the arithmetical performance displayed by ordinary speakers. The a priori theory Hero supplies is Frege arithmetic. 
The assurance that knowledge of this theory engenders competence in arithmetic flows (very roughly) from an appreciation of Frege’s theorem.
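The way Frege arithmetic builds its numerals from the number operator alone, and the link between those numerals and ordinary counting, can be illustrated with a toy finite model. The Python sketch below is only that: it assumes that the number of a finite concept may be modelled by the size of its extension (so numbers are represented by Python ints), and the names number_of, equinumerous and frege_numeral are hypothetical helpers introduced for illustration. It has no bearing on whether the Fregean terms and the ordinary numerals co-refer.

```python
def number_of(extension):
    """Toy stand-in for the operator "Nx: ...x...".

    Assumption: the number of a finite concept is modelled by the size
    of its extension, so cardinal numbers are represented by ints.
    """
    return len(frozenset(extension))


def equinumerous(f, g):
    """Pair off members of f and g one by one; True iff nothing is left over."""
    f, g = list(frozenset(f)), list(frozenset(g))
    while f and g:
        f.pop()
        g.pop()
    return not f and not g


def frege_numeral(n):
    """0 is the number of an empty concept; k + 1 is the number of the
    concept "x is one of the numerals 0, ..., k"."""
    numerals = []
    for _ in range(n + 1):
        numerals.append(number_of(numerals))
    return numerals[-1]


if __name__ == "__main__":
    two = number_of({frege_numeral(0), frege_numeral(1)})
    print(frege_numeral(2) == two)
    # Applied use: the numeral for 3 numbers any concept with exactly three instances.
    print(frege_numeral(3) == number_of({"x", "y", "z"}))
    # On finite samples, sameness of number tracks one-one correspondence.
    print((number_of("ab") == number_of("xy")) == equinumerous("ab", "xy"))
```

The construction mirrors the Fregean term for 2: the number of the concept under which exactly the numerals for 0 and 1 fall. Because the model simply identifies numbers with sizes, the agreement it displays is built in by assumption rather than established.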
7 More generally, the neo-Fregean must establish that the stipulation of (HP) provides for two distinct patterns of use that are respectively equivalent to the distinct uses of pure and applied arithmetical language. The neo-Fregean takes the former result to be established by demonstrating the interpretability of Peano’s axioms within Frege arithmetic. The latter result is secured by deriving from (HP) the principle (Nq) that relates (in the intuitive manner) pure occurrences of the numerals “n_f” of Frege arithmetic with appropriate applied occurrences of the numerals “n” of ordinary arithmetic (Wright 2000: 322, 330–32):
(Nq) n_f = Nx : Fx ↔ there are exactly n Fs.
8 The character Hero was introduced by Wright (1997: 247) for the heuristic purpose of showing how Frege’s definitions of zero and its successors might be grasped upon the basis of (HP) and second-order logic. Here I extend Hero’s role to show how ordinary numerals might similarly be grasped.

4. Meaning-theoretic foundations of neo-Fregeanism

How then should the neo-Fregean programme be assessed once it is liberated from the popular misconception that it was ever intended to characterise whatever principles in fact underlie ordinary arithmetical reasoning? If it is to carry conviction, the neo-Fregean assumption, that ‘arithmetical’ systems which exhibit the same pattern of use refer to the same objects and express
the same truths about them-must be underwritten by a general and principled conception of how meaning and use relate. That conception must include a meaning-theoretic doctrine of (at least) the following strength: (MSU) Meaning (truth and reference) supervenes on use.
But by relying upon (MSU) the neo-Fregean undertakes a distinctive theoretical commitment that is far from trivial. First, if the neo-Fregean employs (MSU) then he will confront a substantial explanatory challenge. On the one hand, the notion of ‘use’ figuring in (MSU) cannot be so behaviouristically conceived that it becomes implausible to suppose intentional facts could ever supervene on facts about use (a difficulty familiar to us from Kripke’s rule-following paradox). On the other hand, the relevant notion of ‘use’ cannot be so intensionally conceived that it becomes impossible for ordinary arithmetic and Frege arithmetic to exhibit the same pattern of use. To vindicate his stance it is incumbent upon the neo-Fregean to supply an account of use that navigates between these undesirable consequences. Second, a version of Black’s original concern resurfaces—this time in its proper place. For even if such a notion of ‘use’ can be supplied, the neo-Fregean faces the further difficulty that ordinary arithmetic and Frege arithmetic exhibit different patterns of use (in whatever sense of use that might turn out to be). This is because Frege arithmetic is a far richer language than ordinary arithmetic. It includes expressions putatively referring to numerical objects for which there may be no corresponding ordinary arithmetical terms: for example, the number of identical things, the number of natural numbers, the number of non-self-identical things, and so on. To accommodate this point, the neo-Fregean must claim that the meaning of a given expression does not supervene upon the global pattern of use associated with it, but rather upon some relevant local holism. More specifically, the neo-Fregean must claim that the meanings of expressions in Frege arithmetic that do correspond to terms in ordinary arithmetic are determined by a local pattern of use from which the more colourful, unfamiliar expressions of Frege arithmetic are excluded.
The neo-Fregean will then seek to harness Frege’s theorem to show that ordinary arithmetical terms exhibit the very same local pattern of use, and so mean the same, as their Fregean counterparts. But to make good such claims the neo-Fregean is obliged to undertake a further explanatory task: to explain the significance of the notion ‘relevant local holism’ and to provide some principled account of why meaning should be taken to supervene on that sort of use. Part of this task will be to systematically specify—in the face of widespread Quinean scepticism that it cannot be done (see Quine 1976 and, more recently, Fodor & LePore 1992: 163–83)—those aspects of the use of an expression that confer meaning (and belong to the relevant local holism) from those aspects that do not confer meaning (and
belong only to a wider pattern of use). There is, however, reason to be sanguine about the neo-Fregean’s prospects of executing this task. After all, if some such contrast could not (at some level) be made out then the possibility of translating from a richer language into a more impoverished one, or acquiring greater knowledge of a language that we have already partially learnt, would appear to be foreclosed. Moreover, if it turns out that the explanatory task cannot be discharged it is always open to the neo-Fregean to avoid the counter-Caesar problem by adopting the first strategy proposed and seeking only to model ordinary arithmetical usage.
Black concludes with the suggestion that his real objection to neo-Fregeanism is not so much to do with the commitment of (HP) to zero (or the existence of some other object). He writes: “rather the problem is the way in which (HP) generates an infinity of numbers, generating new numbers to count the numbers already there with a tail biting circularity” (2000: 237). Black here raises an objection to the impredicative character of (HP), a form of objection familiar from Dummett’s critique of the neo-Fregean programme (Dummett 1991: 226–29). But this sort of objection must surely be distinguished from Black’s earlier concern that (HP) fails to make explicit the ideas that underlie ordinary arithmetic. For even supposing a 21st century Boolos were to uncover a predicative version of (HP) with sufficient strength to generate an ‘arithmetical’ system, the principle in question might still fail to underlie ordinary practice. Moreover, even if (HP) did underlie ordinary practice this would in no way address the objections to (HP) based upon its impredicative character. Finally, it is worth reflecting that impredicativity (specified in such general terms) appears to be a ubiquitous phenomenon.
In the linguistic environment into which we are thrown the meaning of a given word can never—or so it seems—be determined independently of some significant portion of the sentential contexts in which that word occurs. It may well be that the terms introduced by (HP) are impredicative in a stronger and more objectionable sense than this. But until we have equipped ourselves with a more discerning means of saying just why this is so we—Black and Dummett included—should be more reticent to dismiss (HP) on such generic grounds. 9
9 The neo-Fregean programme remains, however, beset by many other challenges, not least that of justifying the metaphysical assumption it presupposes of an intimate communion between language and reality. I explore these issues further in MacBride [2003]. For comments and discussion of the present paper thanks to Robert Black, Peter Clark, Roy Cook, Bill Demopoulos, Philip Ebert, Patrick Greenough, Jonathan Hesk, Stephanie Schlitt, Stewart Shapiro, Crispin Wright, and an anonymous referee for this journal.
References
Benacerraf, P. 1981. Frege: The Last Logicist. In Midwest Studies in Philosophy VI, eds. P. French et al., 17–35. Minneapolis: University of Minnesota Press.
Black, R. 2000. Nothing matters too much, or Wright is wrong. Analysis 60: 229–37.
Boolos, G. 1997. Is Hume’s Principle Analytic? In Heck 1997a: 245–62.
Davidson, D. 1973. Radical Interpretation. Dialectica 27: 313–28.
Dummett, M. 1991. Frege: Philosophy of Mathematics. London: Gerald Duckworth & Co. Ltd.
Field, H. 1980. Science without Numbers. Oxford: Basil Blackwell.
Fodor, J. & LePore, E. 1992. Holism: A Shopper’s Guide. Oxford: Basil Blackwell.
Hale, B. 1997. Grundlagen §64. Proceedings of the Aristotelian Society 97: 243–61. Reprinted in Hale & Wright 2001: 90–116.
Hale, B. & Wright, C. 2000. Implicit Definition and the a priori. In New Essays on the A Priori, eds. P. Boghossian and C. Peacocke, 286–319. Oxford: Clarendon Press. Reprinted in Hale & Wright 2001: 117–50.
Hale, B. & Wright, C. 2001. The Reason’s Proper Study: Essays towards a Neo-Fregean Philosophy of Mathematics. Oxford: Clarendon Press.
Heck, R. 1997. Finitude and Hume’s Principle. Journal of Philosophical Logic 26: 598–617.
Heck, R. 1997a. Language, Thought and Logic: Essays in Honour of Michael Dummett. Oxford: Oxford University Press.
Hodes, H. 1984. Logicism and the Ontological Commitments of Arithmetic. Journal of Philosophy 81: 123–49.
Lowe, E.J. 1998. The Possibility of Metaphysics: Substance, Identity and Time. Oxford: Clarendon Press.
MacBride, F. 2000. On Finite Hume. Philosophia Mathematica 8: 150–59.
MacBride, F. 2003. Speaking with Shadows: A Study of Neo-Logicism. British Journal for the Philosophy of Science 54: 103–63.
Quine, W.V.O. 1976. Carnap and Logical Truth. In his Ways of Paradox and other essays, 107–32. Cambridge, Mass.: Harvard University Press.
Wright, C. 1983. Frege’s Conception of Numbers as Objects. Aberdeen: Aberdeen University Press.
Wright, C. 1997. On the Philosophical Significance of Frege’s Theorem. In Heck 1997a: 201–44. Reprinted in Hale & Wright 2001: 272–306.
Wright, C. 1998. On the Harmless Impredicativity of N= (‘Hume’s Principle’). In The Philosophy of Mathematics Today, ed. M. Schirn, 339–68. Oxford: Clarendon Press. Reprinted in Hale & Wright 2001: 229–55.
Wright, C. 2000. Is Hume’s Principle Analytic? Notre Dame Journal of Formal Logic 40. Reprinted in Hale & Wright 2001: 307–22.
ON THE PHILOSOPHICAL INTEREST OF FREGE ARITHMETIC 1 William Demopoulos
1. Fregean logicism: the laws of logic have an arithmetical content
Traditional “Fregean” logicism held that arithmetic could be shown free of any dependence on Kantian intuition if its basic laws were shown to follow from logic together with explicit definitions. It would then follow that our knowledge of arithmetic is knowledge of the same character as our knowledge of logic, since an extension of a theory (in this case the “theory” of second-order logic) by mere definitions cannot have a different epistemic status from the theory of which it is an extension. If the original theory consists of analytic truths, so also must the extension; if our knowledge of the truths of the original theory is for this reason a priori, so also must be our knowledge of the truths of its definitional extension. The uncontroversial point for traditional formulations of the doctrine is that a reduction of this kind secures the sameness of the epistemic character of arithmetic and logic, while allowing for some flexibility as to the nature of that epistemic character. Thus, it is worth remembering that in Principles (p. 457), Russell concluded that a reduction of mathematics to logic would show, contrary to Kant, that logic is just as synthetic as mathematics. Nevertheless, the methodology underlying this approach to securing the aprioricity of arithmetic by a traditional logicist reduction has been challenged. For example, Paul Benacerraf, 2 who focuses on Hempel’s 3 classic exposition, tells us that
. . . logicism was . . . heralded by Carnap, Hempel . . . and others as the answer to Kant’s doctrine that the propositions of arithmetic were synthetic a priori . . . in reply to Kant, logicists claimed that these propositions are a priori because they are analytic—because they are true (or false) merely “in virtue of” the meanings of the terms in which they are cast. . . . According to Hempel, the Frege-Russell definitions . . . have shown the propositions of arithmetic to be analytic because they follow by stipulative definitions from logical principles. What Hempel has in mind here is clearly that in a constructed formal system of logic (set theory or second-order logic plus an axiom of infinity), one may introduce by stipulative definition the expressions ‘Number,’ ‘Zero,’ ‘Successor’ in such a way that sentences of such a formal system using these introduced abbreviations and which are formally the same as (i.e., spelled the same way as) certain sentences of arithmetic—e.g., ‘Zero is a Number’—appear as theorems of the system. He concludes . . . that these definitions show the theorems of arithmetic to be mere notational extensions of theorems of logic, and thus analytic. He is not entitled to that conclusion. Nor would he be even if the theorems of logic in their primitive notations were themselves analytic. For the only things that have been shown to follow from the theorems of logic by [stipulative definitions] are the abbreviated theorems of the logistic system. To parlay that into an argument about the propositions of arithmetic, one needs an argument that the sentences of arithmetic, in their preanalytic senses, mean the same (or approximately the same) as their homonyms in the logistic system. That requires a separate and longer argument.
1 This paper first appeared in Philosophical Books 44, [2003], pp. 220–228. It is reprinted by kind permission of the editor and Blackwell Publishing. With the exception of section headings and a small number of minor stylistic changes, it is unaltered. A Postscript addresses Hale and Wright’s response to the original paper.
2 “Frege: The last logicist,” in William Demopoulos (ed.) Frege’s philosophy of mathematics (Harvard University Press: 1995), pp. 42 and 46.
3 C. G. Hempel, “On the nature of mathematical truth,” Hilary Putnam and Paul Benacerraf (eds.) The philosophy of mathematics: selected readings, second ed. (Cambridge University Press: 1983), 377–393.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 105–115. © 2007 Springer.
Benacerraf is questioning whether the logicist can claim to have established any truth of arithmetic on the basis of a successful reduction. What is required, according to Benacerraf, is a supplementary argument showing that the logicist theorems have the preanalytic meanings of their ordinary arithmetical analogues. But Benacerraf’s demand for a further argument is not justified. The philosophical interest elicited by traditional logicism derived from the fact that it was thought implausible that the concepts and laws of logic could have an “arithmetical content.” To have successfully dispelled this belief it would have been sufficient to have shown that the concepts of logic allow for the explicit definition of notions which, on the basis of logical laws alone, demonstrably satisfy the basic laws of arithmetic. The philosophical impact of the discovery that the concepts and laws of logic have an arithmetical content in this sense would not have been in any way diminished by the observation that the preanalytic meanings of the primitives of arithmetic were not the same as their logicist reconstructions. The sense in which the logicist thesis must be understood in order to be judged successful cannot therefore be the one for which Benacerraf claims Hempel must argue. Notice also that independently of one’s view of meaning and truth in virtue of meaning, it must be conceded that traditional logicism would have provided a viable answer to Kant had it succeeded in showing that arithmetical knowledge requires only an extension of logic by explicit definitions. Hempel’s appeal to these notions addresses a different issue: Frege left the
problem of securing the epistemic basis of the laws of logic largely untouched. Benacerraf’s Hempel should be understood as proposing to fill this gap by suggesting that the laws of logic are true in virtue of the meanings of the logical constants they contain. Like Frege, Hempel seeks to secure the aprioricity of arithmetic by an argument that proceeds from its analyticity. But Hempel’s version of logicism differs from Frege’s, for whom “analytic” merely meant belonging to logic or a definitional extension of logic, by providing a justification for the analyticity of logical laws: logical laws are analytic, not by fiat as on Frege’s account, but because they are true in virtue of the meanings of the logical terms they contain. From this it would follow that if logical laws are true in virtue of meaning, so also is any proposition established solely on their basis, where “established solely on their basis” is intended to encompass the use of explicit definitions. The clarity of the thesis that the laws of logic are true in virtue of meaning is therefore central to Hempel’s presentation of the view. Also central is the substantive and further claim that the basic laws of arithmetic can be recovered within a definitional extension of logic. The implied criticism of Hempel’s appeal to truth in virtue of meaning gains its force from the difficulties that stand in the way of establishing the traditional logicist thesis that arithmetic is reducible to logic in the original sense of the doctrine. Certainly, the failure to sustain this thesis led to more ambitious applications of the notion of truth in virtue of meaning. But if the basic laws of arithmetic had been recovered as a part of logic—not merely shown to have analogues that are part of some formal system or other, but to be part of logic—what more would be needed to infer that they share the epistemic status of logical laws? 
Once Hempel is not represented as seeking to secure the truth of the basic laws of arithmetic by an appeal to the derivability of mere formal analogues or a blanket appeal to the notion of truth in virtue of meaning, it is clear that he simply doesn’t owe us the argument Benacerraf claims he does. The difficulties that attend traditional logicism are therefore not the methodological difficulties Benacerraf advances, but the simple failure to achieve the stated aim of showing arithmetic to be a definitional extension of logic. This point is obscured by Benacerraf’s suggestion that the reduction might proceed from second-order logic with an axiom of infinity or from some version of set theory. Neither theory supports the truth in virtue of meaning account that underlies Hempel’s formulation of logicism. A reduction to second-order logic with infinity would mean a reduction to a system augmented with an axiom like Whitehead and Russell’s; but no one ever thought such a system was true in virtue of meaning. As for a reduction to set theory, set theory is properly regarded as the arithmetic of the transfinite. Why should a reduction of the natural numbers to such a generalized arithmetic be regarded as a means of establishing its aprioricity on a less synthetic footing? The only coherent logicist methodology would therefore seem to be the one just outlined: to reduce arithmetic to a theory like Begriffsschrift’s. Unfortunately, such a theory is either too weak, or in the presence of Frege’s theory of classes, inconsistent.
2. Wright and Hale’s neo-Fregean alternative
The renewed interest in logicism is based on the fact that the second-order theory having Hume’s principle 4 as its only non-logical axiom—“Frege arithmetic”—has a definitional extension which contains the Dedekind–Peano axioms. The neo-Fregean program of Crispin Wright and Bob Hale seeks to embed this logical discovery into a philosophically interesting account of our knowledge of arithmetic by subsuming Hume’s principle under a general method for introducing a concept by an “abstraction principle.” 5 This program explains the epistemological interest of the discovery that arithmetic is a part, not of second-order logic, but of Frege arithmetic, by the program’s account of concept introduction. The key to achieving this goal is the idea that abstraction principles have a distinguished status: they are a special kind of stipulation. Their stipulative character shows them to be importantly like explicit definitions even if their creativeness suggests an affinity with axioms; and it is a central tenet of neo-Fregean logicism that abstraction principles are sufficiently like definitions to yield an elegant explanation of why arithmetical knowledge is knowledge a priori. The neo-Fregean program has a methodological dimension that parallels the role of the theory of definition in traditional logicism. Frege accords a statement the status of a proper definition if it meets conditions of eliminability and conservativeness. The classical theory of definition is supplemented by the neo-Fregean methodology of good abstractions. Thus, the theory of definition mandates that a definitional extension must be conservative in the familiar sense of not allowing the proof of sentences formulated in the unextended vocabulary which are not already provable without the addition of the definitions which comprise the extension.
But “extensions by abstraction” need not be conservative in this sense; indeed interesting extensions are interesting precisely because they are not conservative in the sense of the theory of definition. The neo-Fregean theory of good abstractions allows for classically non-conservative extensions—extensions which properly extend the class of provable sentences—while imposing a constraint on the consequences an extension by good abstraction principles can have for the ontology of the theory to which they are added. This methodology is constrained and principled; it is just not constrained in the same way as the classical theory of definition. We can, perhaps, put the difference by saying that the constraints on definition have a more purely epistemic motivation than do the constraints the neo-Fregean imposes on good abstractions.
4 Hume’s principle tells us that for any concepts F and G, the number of Fs is identical with the number of Gs if, and only if, the Fs and the Gs are in one–one correspondence.
5 By an abstraction principle, Hale and Wright mean the universal closure of an expression of the form Σ(X) = Σ(Y) ↔ X ≈ Y, where ≈ is an equivalence relation, the variables X and Y may be of any type, and the function Σ may be of mixed type. In the case of Hume’s principle, the equivalence relation is the (second-order definable) relation on concepts of one–one correspondence, and the “cardinality function” is a type-lowering map from Fregean concepts to objects.
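The principles described in words in this section can be set out symbolically. What follows is only a sketch in standard textbook notation, not the author's own formulation: the symbols $\Sigma$ (abstraction operator), $\approx$ (equivalence relation), $\#$ (cardinality operator) and $R$ (a witnessing relation), and the formal gloss on conservativeness, are assumptions introduced here for illustration. The `\text` command assumes the amsmath package.

```latex
% General form of an abstraction principle (assumed notation):
\[
  \forall X\,\forall Y\;\bigl(\Sigma(X) = \Sigma(Y) \;\leftrightarrow\; X \approx Y\bigr)
\]
% Hume's principle as the instance governing cardinal number:
\[
  \forall F\,\forall G\;\bigl(\#F = \#G \;\leftrightarrow\; F \approx G\bigr)
\]
% One-one correspondence of concepts, defined in second-order terms:
\[
  F \approx G \;:\Leftrightarrow\;
  \exists R\,\bigl[\,\forall x\,\bigl(Fx \to \exists! y\,(Gy \wedge Rxy)\bigr)
  \;\wedge\; \forall y\,\bigl(Gy \to \exists! x\,(Fx \wedge Rxy)\bigr)\bigr]
\]
% Conservativeness in the sense of the classical theory of definition:
\[
  T' \text{ is conservative over } T \;:\Leftrightarrow\;
  \text{for every sentence } \varphi \text{ in the language of } T:\quad
  T' \vdash \varphi \;\Rightarrow\; T \vdash \varphi
\]
```

On this gloss, adding Hume's principle to pure second-order logic is not a conservative extension: the extended theory proves, for example, that there are at least two objects, a sentence of the unextended language that second-order logic alone does not prove.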
In my view, the reticence of the classical theory of definition to allow a mere definition to properly extend the theory to which it is added is well-founded, and should also inform the epistemic basis of a principle as rich as Hume’s. My goal here is to consider whether the neo-Fregean account of Hume’s principle as a kind of stipulation can support the epistemological claim of neo-Fregeanism to have secured the aprioricity, if not the analyticity, in one or another traditional sense of the notion, of our arithmetical knowledge. The matter is taken up by Hale and Wright in their paper “Implicit definition and a priori knowledge”—and by Wright in his “Is Hume’s principle analytic?” which, notwithstanding its title, is not concerned to secure the analyticity of Hume’s principle but to address the question of its epistemic status within the neo-Fregean program and the light it sheds on our arithmetical knowledge. 6 Wright and Hale use the stipulative character of Hume’s principle as a premise in an argument for the aprioricity of our arithmetical knowledge. This becomes clear when we reflect on the fact that they are concerned to show that our knowledge of arithmetic can be represented as resting on a principle that introduces the concept of number. In acquiring the concept of number, we acquire a criterion of identity for number—a criterion for saying when the same number has been given to us in two different ways as the number of one or another concept. This criterion of identity—Hume’s principle—affords the only non-logical premise needed to derive the basic laws of arithmetic. Our arithmetical knowledge is secured, therefore, with our grasp of the concept of number and is based on nothing more than what we acquire when we are introduced to the concept. But since this knowledge rests on a stipulation, it is unproblematically knowledge a priori.
3. Recovering the epistemic status of ordinary arithmetic by modeling
This is essentially the same account of the philosophical interest of Frege arithmetic that is elaborated by Fraser MacBride in two thoughtful papers 7 that address this issue. For MacBride the neo-Fregean explanation of the aprioricity of our knowledge of arithmetic runs as follows: We first stipulate a criterion of identity for a special kind of objects; call them cardinal numbers. That certain fundamental truths about these objects are established on the basis of a stipulation guarantees that our knowledge of those truths is knowledge a priori. This is to be contrasted with an account which would seek to infer the aprioricity of our knowledge of arithmetic from theses about meaning or truth in virtue of meaning. The neo-Fregean account does not depend on a traditional notion of analyticity: since neo-Fregeanism demands only the relatively uncontentious concession of the aprioricity of a stipulation, it can claim that its explanation of the aprioricity of arithmetic need not address the difficulties associated with defending traditional conceptions of analyticity.
6 Both reprinted in their collection of their papers, The reason’s proper study (Oxford University Press: 2002), as chapters 5 and 13, respectively.
7 “Finite Hume,” Philosophia mathematica 8 (2000) 150–159, and “Can nothing matter?,” Analysis 62 (2002) 125–134.
The fact that the reduction to Frege arithmetic requires more than a merely definitional extension of second-order logic suggests an objection to neo-Fregean logicism that is closely related to the one we saw Benacerraf urge against Hempel: How, one might ask, does our knowledge of the truths that hold of the objects the neo-Fregean has singled out—the Frege-numbers—bear on our knowledge of the numbers, on the subject matter of ordinary arithmetic? In so far as the epistemological issues are issues concerning ordinary arithmetic, have they even been addressed by the neo-Fregean? In this form, the objection presupposes only preservation of subject matter—a minimal requirement that it would be difficult to justify not meeting—and says nothing about preservation of meaning.
The first of two neo-Fregean responses to this objection that I wish to review holds that it is because ordinary arithmetic can be “modeled” in Frege arithmetic that the epistemological status of the truths of Frege arithmetic is shared by the truths of ordinary arithmetic. Wright remarks (p. 322) that this answer is too weak. And although MacBride does not endorse this response, neither does he reject it as altogether unlikely. Nevertheless, I think it is worth recording exactly why such a straightforward answer, couched in terms of the relatively unproblematic relation of modeling, can’t be right. It is clearly possible to stipulate the conditions that must obtain for the properties of a purely hypothetical and imaginary “abstract” physical system to hold without in any way committing ourselves to the existence—or even the dynamical possibility—of such a system.
Our knowledge that such abstract systems are configured in accordance with our stipulations is no less a priori than our knowledge that, for example, the four element Boolean algebra has a free set of generators of cardinality one. But it sometimes happens that abstract configurations “model” actual configurations, in the sense that there is a correspondence between the elements of the two systems that preserves fundamental properties. It is clear that in such circumstances we take ourselves to know more than that an imaginary example has the properties we stipulate it to have: if the example is properly constructed, we know the dynamical behavior of a part of the physical world. But of course the fact that an actual system is “modeled” by our imaginary system, together with the fact that our knowledge of the properties of our imaginary system is a priori knowledge because it depends only on our free stipulations, are completely compatible with the claim, obvious to preanalytic intuition, that our knowledge of the dynamical behavior of the actual system is a posteriori. Whatever role stipulation may have in fixing the properties of the abstract system by which the behavior of some real process is modeled, it lends no support to the idea—and
would never be regarded as lending support to the idea—that our knowledge of the real process is knowledge a priori.
There is a disanalogy between the number-theoretic case and our example that might seem to undermine its effectiveness as a criticism. In the number-theoretic case the existence of the correspondence between ordinary numbers and the “Frege-numbers” that model them is known a priori. But the correspondence between the abstract system of our example and the actual system is not known a priori; it depends on the a posteriori knowledge that there are in reality configurations of particles having the postulated characteristics. This is of course entirely correct. However it is of no use to the neo-Fregean, since to know a priori that there is a mapping between the ordinary numbers and the Frege-numbers it is necessary to have a priori knowledge of the existence of the domain and co-domain of the mapping. To be of any use to the neo-Fregean, the fact that the Frege-numbers model the ordinary numbers therefore requires that our knowledge of the ordinary numbers be a priori. But if the modeling of the ordinary numbers by the Frege-numbers presupposes that our knowledge of the ordinary numbers is a priori, it cannot be part of a noncircular account of why ordinary arithmetic is known a priori. The general point may be put as follows: The fact that M models N, so that for any sentence s, s is true in M if and only if s is true in N, does not entitle us to infer that because the sentences true in M are known a priori, the sentences true in N are known a priori. Indeed it is perfectly possible that (with the obvious exception of the logical truths) our knowledge of sentences true in N is wholly a posteriori.
So even if we grant that an assumption rich enough to secure an infinity of objects is correctly represented as a stipulation, it remains unclear how the neo-Fregean can use this fact to answer the question which motivates his account of arithmetical knowledge—it remains unclear how it yields an account of our knowledge of the numbers, knowledge that we have independently of the neo-Fregean analysis. Notice that this objection depends only on an observation about the modeling of one domain by another, and that, in particular, it does not require the resolution of various difficult issues in the theory of meaning.
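The Boolean-algebra fact cited earlier in this section, as a paradigm of knowledge that depends on nothing beyond what has been laid down, can be checked mechanically. The sketch below is illustrative only: the class `PowersetAlgebra`, the helper names, and the bitmask encoding are hypothetical choices made here, and the freeness check is run against three sample target algebras rather than proved for all Boolean algebras.

```python
from itertools import product


class PowersetAlgebra:
    """Boolean algebra of all subsets of an n-element set, encoded as bitmasks."""

    def __init__(self, n):
        self.top = (1 << n) - 1              # unit element: the whole set
        self.elements = range(self.top + 1)  # zero element is the bitmask 0

    def meet(self, a, b):
        return a & b

    def join(self, a, b):
        return a | b

    def comp(self, a):
        return self.top ^ a


def generated(alg, gens):
    """Smallest subset of alg containing gens, closed under meet, join, complement."""
    closed = set(gens)
    while True:
        bigger = set(closed)
        bigger |= {alg.comp(a) for a in closed}
        bigger |= {alg.meet(a, b) for a, b in product(closed, repeat=2)}
        bigger |= {alg.join(a, b) for a, b in product(closed, repeat=2)}
        if bigger == closed:
            return closed
        closed = bigger


def extends_to_homomorphism(src, g, tgt, b):
    """Does sending the atom g of the four-element algebra src to b preserve all operations?"""
    # The only possible extension: 0 -> 0, 1 -> 1, g -> b, not-g -> not-b.
    h = {0: 0, src.top: tgt.top, g: b, src.comp(g): tgt.comp(b)}
    return all(
        h[src.meet(x, y)] == tgt.meet(h[x], h[y])
        and h[src.join(x, y)] == tgt.join(h[x], h[y])
        and h[src.comp(x)] == tgt.comp(h[x])
        for x, y in product(h, repeat=2)
    )


four = PowersetAlgebra(2)   # the four-element Boolean algebra
g = 0b01                    # candidate free generator (an atom)

# {g} generates the whole algebra ...
generates_everything = generated(four, {g}) == set(four.elements)
# ... and every assignment of g into the sample targets extends to a homomorphism.
is_free_on_samples = all(
    extends_to_homomorphism(four, g, tgt, b)
    for tgt in (PowersetAlgebra(1), PowersetAlgebra(2), PowersetAlgebra(3))
    for b in tgt.elements
)
print(generates_everything, is_free_on_samples)
```

Nothing in the check consults the physical world, which is the feature the example trades on; whether a structure verified in this way also describes an actual system is a further question that the check does not settle.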
4. Recovering the epistemic status of ordinary arithmetic by preserving a pattern of use
There is an alternative to the response based on modeling. The idea is that since Frege arithmetic captures the “patterns of use” exhibited by our ordinary number-theoretic vocabulary, both in pure cases and in applications, we are justified in inferring not merely that the Frege-numbers model the ordinary numbers but that the Frege-numbers are the ordinary numbers. Showing that Frege arithmetic captures the patterns of use of our ordinary number-theoretic vocabulary constitutes a considerable strengthening of the claim that ordinary arithmetic is merely modeled by Frege arithmetic.
Suppose we grant both that Frege arithmetic captures the patterns of use of our ordinary number-theoretic vocabulary and that because of this, ordinary arithmetic and Frege arithmetic share the same subject matter. Vindicating the claim that ordinary arithmetic and Frege arithmetic share a subject matter is only one of the difficulties that the weaker understanding of the view in terms of modeling fails to address. If we are only modeling ordinary arithmetic, it is unproblematic to hold that as a statement of the modeling theory Hume’s principle is known a priori because it is a mere stipulation. The difficulty, as we saw, is that this fails to transfer to the aprioricity of the truths we are modeling—to the truths of ordinary arithmetic. Is this difficulty removed when the neo-Fregean account is extended to one which claims to capture the patterns of use of our ordinary number-theoretic vocabulary? And does it illuminate the epistemic status of the basic laws of arithmetic to observe that, in the neo-Fregean reconstruction of our patterns of use, Hume’s principle has the status of a stipulation? When neo-Fregeanism is understood to preserve our patterns of use, it becomes virtually indistinguishable from the traditional idea that the account of the numbers given by Frege arithmetic is analytic of the ordinary notion of number, so that a major burden of the account now falls on establishing the adequacy of an analysis in something very much like the traditional sense. This is a task the neo-Fregean had sought to avoid, since once the neo-Fregean has to defend the idea that the patterns of use of ordinary numerical expressions have been captured, the simplicity of urging the stipulational character of Hume’s principle, and then basing the aprioricity of arithmetic on this footing, has been lost: the principle no longer governs the introduction of a new concept but is constrained to capture an existing one.
But let us grant both that sameness of pattern of use implies sameness of reference and that Frege arithmetic does in fact capture the pattern of use of our ordinary arithmetical vocabulary and, therefore, articulates a successful reconstruction of our arithmetical knowledge. Since we have given up the idea that Hume’s principle is being used simply to introduce a new concept, but forms part of an attempt to articulate principles that capture our numerical concepts as they are given by the patterns of use of our ordinary number-theoretic vocabulary, its justification does not consist merely in its being laid down as a stipulation. Rather, Frege arithmetic is justified because it captures the fundamental features of the judgements—pure and applied—that we make about the numbers. For the neo-Fregean, the reconstruction must not only capture an existing concept by recovering the patterns of use to which our arithmetical vocabulary conforms, it must also illuminate the epistemic status of our pure arithmetic knowledge. Having the status of a stipulation is not, of course, a characteristic of Hume’s principle that is recoverable from our use of our arithmetical vocabulary, but is something the reconstruction imposes on the principle in order to illuminate the basis for our knowledge of the propositions derivable
from it. But it is unclear what is achieved if one has captured the pattern of use of an expression by a principle that—in the reconstruction of the knowledge claims in which that expression figures—is regarded as a stipulation. Does this confer the epistemological characteristics that the notion of a stipulation is supposed to enjoy on the knowledge claims that have been reconstructed? To establish that the epistemic basis for the knowledge these judgements express resides in the stipulative character the neo-Fregean analysis assigns to Hume’s principle, it is not enough to show that Frege arithmetic captures patterns of use. The essential point is not all that different from what we have already noted when discussing the response based on modeling, and it can be seen by an example that is not all that different from the one cited in that connection. Suppose the world were Newtonian. We could then give a reconstruction of our knowledge of the mechanical behavior of bodies by laying down Newton’s laws as stipulations governing our use of the concepts of force, mass and motion. 8 But the fact that in our reconstruction the Newtonian laws have the status of stipulations would never be taken to show that they are in any interesting sense examples of a priori knowledge. Why then should the fact that the neo-Fregean represents Hume’s principle as a stipulation be taken to show that arithmetic is known a priori? The neo-Fregean reconstruction of the patterns of use of expressions of arithmetic leaves the epistemic status of the basic laws of arithmetic as unsettled as it was on the suggestion that Frege arithmetic merely models ordinary arithmetic. Neither reconstruction supports the epistemological claim of the neo-Fregean to have accounted for the aprioricity of our knowledge of arithmetic. 
Whether that account is put forward as a theory within which ordinary arithmetic can be modeled, or whether it is said to capture the patterns of use of our number-theoretic vocabulary, it fails to have the direct bearing on the epistemic basis of our arithmetical knowledge that the neo-Fregean supposes it to have. Showing that Hume’s principle is correctly represented as a stipulation may be one route to securing it as a truth known a priori, but it is questionable whether, proceeding in this way, the task of revealing the proper basis for the aprioricity of arithmetic is made any easier than it would be by general reflection on why its basic laws are plausibly represented as known truths.
8 There are of course well-known historical examples along these lines. Cp. Ernst Mach’s The science of mechanics (Open Court: 1960, sixth American edition, translated by Thomas J. McCormack), whose famous definition of mass (p. 266) even has the form of an abstraction principle. Thanks to Peter Clark for calling my attention to Mach’s rational reconstruction of Newtonian mechanics.
5. Frege arithmetic and the analysis of number
Putting to one side the problem of establishing the aprioricity of arithmetic on a correct basis, a compelling argument that Frege arithmetic captures preanalytic intuitions about the numbers can be extracted from the neo-Fregean
corpus: Since the Dedekind–Peano axioms codify our pure arithmetical knowledge, their derivability constitutes a condition of adequacy which any account of our knowledge of number should fulfill. By Frege’s theorem, Frege arithmetic satisfies this condition of adequacy. But what makes Frege arithmetic an interesting analysis of the concept of number is that it not only yields the Dedekind–Peano axioms, but derives them from an account of the role of the numbers in our judgements of cardinality—from our foremost application of the numbers. As such, it is arguably a compelling philosophical analysis of the concept of number since, as Wright has observed, one can show that the Frege-number of Fs = n if, and only if, there are, in the intuitive sense of the numerically definite quantifier, exactly n Fs. 9 But once the project of securing a correct analysis is divorced from the project of securing a body of truths as analytic or a priori, neither the fact that Frege arithmetic satisfies our condition of adequacy nor the fact that it connects the pure theory of arithmetic with its applications—essential as each is to securing it as a correct analysis of number—addresses the question of the epistemic status of our knowledge of arithmetic. This conclusion is not particularly surprising. Both neo-Fregean strategies we have been considering are variants on the methodology of reconstruction associated with Carnap. For Carnap the thesis that arithmetical knowledge is non-factual, and therefore, a priori, was not in serious doubt. And since the aim of a reconstruction is simply to delimit more precisely the extension of a predicate, we should never have expected that a Carnapian reconstruction of arithmetical knowledge would in any way justify the claim that our knowledge of arithmetic is a species of a priori knowledge. It is precisely in respect of their epistemological significance that Fregean logicism and neo-Fregean logicism—reduction to logic by explicit definition vs. 
reconstruction by Frege arithmetic—come apart.
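The equivalence credited to Wright above can be made concrete with the usual recursive definition of the numerically definite quantifiers. What follows is a standard textbook rendering offered only as an illustration, not a quotation from the neo-Fregean corpus; the symbols $\exists_n$, $\#F$ (for the Frege-number of the Fs) and $\bar{n}$ (for the Frege-numeral of n) are notational assumptions of this sketch.

```latex
\begin{align*}
\exists_0 x\, Fx     &\;:\equiv\; \neg\exists x\, Fx \\
\exists_{n+1} x\, Fx &\;:\equiv\; \exists x\,\bigl(Fx \wedge \exists_n y\,(Fy \wedge y \neq x)\bigr) \\[4pt]
\text{so, for instance,}\quad
\exists_1 x\, Fx     &\;\equiv\; \exists x\,\bigl(Fx \wedge \neg\exists y\,(Fy \wedge y \neq x)\bigr) \\[4pt]
\text{claimed schema:}\quad
\#F = \bar{n}        &\;\leftrightarrow\; \exists_n x\, Fx
\end{align*}
```

The left-hand sides use only first-order logic with identity, which is why the last line can be read as connecting the pure theory of the Frege-numbers with ordinary judgements of cardinality.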
9 See The reason’s proper study, p. 251 and pp. 330ff.
Postscript (added 2004)
Hale and Wright have replied to the criticism raised in Section 4:
Let ‘Newton’ denote the conjunction of Newton’s laws as ordinarily understood, and ‘NewStip’ denote the (perhaps typographically indistinguishable) conjunction of the corresponding stipulations taken as introducing certain concepts of force, mass and motion. Then Demopoulos’s claim is—or ought to be, if the parallel is to be damaging—that while we may, by laying down NewStip, acquire some a priori knowledge (in some sense, knowledge about (some things we are calling) force, mass and motion), we obviously do not thereby acquire a priori knowledge of Newton—as we ought to do, if we can, in just or essentially the same fashion, acquire a priori knowledge of truths of ordinary arithmetic by stipulating Hume’s Principle, etc. . . . [But clearly,] the mere possibility of regarding (the sentences which formulate) Newton’s laws as stipulations introducing concepts of force, mass and motion (as distinct from generalisations to which bodies conform) does not, and cannot, by itself justify the claim that NewStip ‘captures a pattern of use’ exhibited by ‘ordinary’ statements of Newtonian dynamics. 10
But it was never claimed that regarding Newton’s laws as stipulations is what justifies the contention that NewStip captures a pattern of use. Rather, what justifies the contention that Frege arithmetic preserves a pattern of use is that it recovers the deductive structure of a body of pure and applied unreconstructed knowledge claims. The point at issue is whether, by representing certain principles as recoverable from a stipulation, a reconstruction sheds any light on their epistemic status. The comparison with the Newtonian case makes it transparent that from the fact that we can recover a pattern of use from a stipulation, nothing follows regarding the aprioricity or otherwise of our knowledge of the principles being reconstructed. The situation would, of course, be entirely different if, in accordance with the methodology of Fregean logicism, it had proved possible to recover arithmetic from logic plus explicit definitions.
10 “Responses to commentators,” Philosophical books 44 (2003) 245–263, pp. 248–249.
II
THE LOGIC OF ABSTRACTION
“NEO-LOGICIST” LOGIC IS NOT EPISTEMICALLY INNOCENT 1
Stewart Shapiro, The Ohio State University
Alan Weir, University of Glasgow
1. A number of philosophers in recent years, most notably Crispin Wright and Bob Hale, 2 have tried to revive something recognisably akin to the logicist programme of showing that mathematical truths are in some sense analytic, a priori or at any rate “epistemically innocent”. They do so in the face of widespread scepticism with respect to the “a priori” and especially with regard to the idea of a priori proofs of existence. Their neo-logicism seems to involve two main tenets: firstly, that mathematical truths are not known a posteriori, in the way empirical truths are known, but neither are they known via some Kantian form of intuition; rather our knowledge of mathematics arises from our ability to derive mathematical truths from rules or principles which are “analytic” or “meaning-constitutive” or in some sense explanatory of key mathematical notions such as that of natural number; secondly, the realist thesis that this mathematical knowledge is knowledge of a world which is in some sense mind-independent or objective.
1 This paper first appeared in Philosophia Mathematica 8, [2000], pp. 160–189. Reprinted by kind permission of the editor and Oxford University Press.
2 See for example Crispin Wright, Frege’s Conception of Numbers as Objects (Aberdeen University Press, 1983); ‘The Philosophical Significance of Frege’s Theorem’ in Richard Heck Jr. (ed.) Language, Thought and Logic (Oxford: Oxford University Press, 1997), pp. 201–244; ‘On the Harmless Impredicativity of N=’ in Matthias Schirn (ed.) The Philosophy of Mathematics Today (Oxford: Clarendon, 1998), pp. 339–368. Bob Hale, ‘Dummett’s critique of Wright’s Attempt to Resuscitate Frege’, Philosophia Mathematica 2 (1994), pp. 122–147.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 119–146. © 2007 Springer.
Clearly neo-logicism is a very attractive position for anyone sympathetic to a fairly traditional view of mathematics as a body of objective truths knowable a priori but who is worried by the standard epistemological problems faced by platonistic mathematics: how could we gain knowledge of a world of causally inert abstract objects, and so forth? The neo-Fregean answers, roughly speaking: by virtue of our knowledge of what we mean when we use mathematical expressions. More fully, we understand mathematical concepts 3 by following, in some sense, rules and following these rules is constitutive of understanding the expressions which express those concepts. These rules are entities which, like mathematical objects, are not part of the concrete, physical world; by tracing out the consequences of these rules we can find out truths concerning abstract objects. Even those such as Quine hostile to the notion of analyticity as glossed by the positivists recognise that our knowledge of the truth of sentences such as ‘if John is tall then either John is tall or Mary is short’ and ‘ “John is tall and Mary is short” entails that “John is tall” ’ may well proceed not through ordinary empirical means, nor through any special faculty of intuition but arise rather out of our understanding of operators such as the conditional, disjunction and conjunction, understanding which Quine claims is encapsulated, in the latter two cases, in his verdict matrices for the sentential connectives. 4 We will call such knowledge “epistemically innocent”. The phrase is deliberately somewhat vague, pending fuller amplification by neo-logicists. Certainly if there are truths which are analytic in the positivists’ sense, these will count as epistemically innocent; conversely any truths which require empirical verification (including holistic verification in Quinean fashion) or verification via some sort of Kantian intuition do not count as epistemically innocent. 
But we allow that the term may apply even if the positivist account of analyticity as truth by virtue of meaning fails. The neo-logicist claim we take then to be that there are principles which are epistemically innocent in something like the same way in which “if John is tall then either John is tall or Mary is short” is but which are strong enough to generate at least a sizeable body of standard mathematics, enough for the needs of the physical sciences, perhaps. The simple principles involved in elementary fragments of propositional logic are of course insufficient for the derivation of, for example, arithmetic. Neo-logicists have appealed instead to second-order abstraction principles of the form:
$\alpha x(\varphi x) = \alpha x(\psi x) \leftrightarrow \Sigma(\varphi, \psi)$
where $\alpha$ is some term-forming variable-binding operator which forms singular terms from open sentences ($nx\,\varphi x$ and $\{x : \varphi x\}$, the number of $\varphi$’s and the class of $\varphi$’s, are classic examples) and $\Sigma$ is an equivalence relation over properties. For the first of those two cases, the corresponding abstraction principle is “Hume’s Principle”: 5
$\forall F\,\forall G\,(nx\,Fx = nx\,Gx \leftrightarrow F\ 1\text{-}1\ G)$
with $F\ 1\text{-}1\ G$ the second-order sentence which expresses the existence of a one–one correspondence between the F’s and the G’s. From this principle (plus suitable “bridging” definitions) one can derive, in standard second-order logic, all the theorems—including of course theorems expressing the infinitude of the natural numbers—of the usual Peano-Dedekind formulation of the theory of second-order arithmetic. This result—the derivability of second-order Peano arithmetic from Hume’s Principle—has become known as “Frege’s Theorem”. 6 For the second case, abstraction using the class operators, we get, much more problematically of course, Frege’s notorious Axiom V, which for the extensions of (Fregean) concepts takes the form
$\forall F\,\forall G\,(\{x : Fx\} = \{x : Gx\} \leftrightarrow \forall z(Fz \leftrightarrow Gz))$.
In the present paper, we accept for the sake of argument that abstraction principles such as Hume’s Principle are, indeed, epistemically innocent, at least on some natural readings (but see ahead Section 5). 7 We accept furthermore that not only are simple logical principles, such as ∨I and &E, or the related conditional theorems, epistemically innocent; so, too, are at least certain of the quantifier rules, for instance the natural deduction rules of universal generalisation and existential elimination. Moreover we include here these rules applied to second-order variables; that is, whilst some such as Quine would block derivation of Frege’s Theorem at the outset by refusing to accept the legitimacy of second-order logic, we do not object to second-order logic per se. Our claim will be, nonetheless, that Frege’s Theorem requires use of first- and second-order logical principles which are not epistemically innocent. More exactly, certain of the logical principles which are essential to the derivation of a theorem of infinity, when this is construed as expressing the existence of infinitely many mind-independent entities, are at least as problematic epistemologically as axioms of infinity laid down simply as postulates. Our supposed knowledge of these principles is, we will argue, every bit as mysterious as Kantian intuition of an infinity of numbers. We will look at two main cases in turn: the second-order axiom of comprehension applied to non-instantiated properties (Sections 2 and 3) and the first-order existential instantiation and universal elimination principles as applied in standard non-free classical logic (Sections 4 and 5). We finish in the sixth section with a summary of our overall conclusions.
3 We use “concept” in the everyday, non-Fregean sense, unless otherwise indicated.
4 W. V. O. Quine, Word and Object (Cambridge, Mass.: MIT Press, 1960) §13 and The Roots of Reference (La Salle, Illinois: Open Court, 1974) §§20–21.
5 The term is George Boolos’, following Frege, Grundlagen §63. See George Boolos, ‘The Standard of Equality of Numbers’ in George Boolos (ed.) Meaning and Method: essays in honour of Hilary Putnam (Cambridge Eng.: Cambridge University Press, 1990), pp. 261–277, see p. 267; the article is reprinted in the collection of Boolos’ papers Logic, Logic and Logic (Cambridge, Mass.: Harvard University Press, 1998), pp. 202–219. For the Grundlagen see The Foundations of Arithmetic translated by J. L. Austin, second edition (Oxford: Blackwell, 1980), p. 73. Frege’s rather honorific reference to the Treatise Book I, III.i is garnered from Baumann’s Die Lehren von Raum, Zeit und Mathematik (Berlin, 1868).
6 The phrase is Wright’s from a suggestion by Boolos—see ‘On The Philosophical Significance of Frege’s Theorem’, p. 203.
7 Boolos, Field, and Dummett have argued that the fact that Axiom V is classically inconsistent but is formally very similar to, e.g. Hume’s Principle, irredeemably vitiates the claim of abstraction principles to be analytic truths or at any rate to be epistemically unproblematic. See George Boolos, ‘The Standard of Equality of Numbers’, p. 273, ‘Whence the Contradiction?’ in Proceedings of the Aristotelian Society, Supplementary Volume LXVII (1993), pp. 213–233 (reprinted in Logic, Logic and Logic, pp. 220–236), Hartry Field, Realism, Mathematics and Modality (Oxford: Blackwell, 1989), p. 158, Michael Dummett, Frege: Philosophy of Mathematics (London: Duckworth, 1991), pp. 188–189, p. 208. For a more nuanced development (we claim!) of this objection see Stewart Shapiro and Alan Weir, ‘New V, ZF and Abstraction’, Philosophia Mathematica, 7 (1999), pp. 293–321. Wright’s ‘The Philosophical Significance of Frege’s Theorem’ is in large part a response to such objections. One radical response is to reject the logic (significantly weaker than classical logic) used in the derivation of antinomies and hold that Axiom V is not in fact inconsistent or at any rate not trivial. Cf. Alan Weir ‘Naïve Set Theory is Innocent!’, Mind 107 (1998), pp. 763–798.
2. The standard second-order logic needed in the derivation of Frege’s Theorem includes straightforward generalisation of the first-order quantifier rules plus an Axiom Scheme of Comprehension. 8 Thus, for example, universal elimination and existential introduction become
$\dfrac{\forall F\,\Phi F}{\Phi[F/P]} \qquad \dfrac{\Phi[F/P]}{\exists F\,\Phi F}$
where $P$ is any simple predicate constant or parameter and $\Phi$ is any open sentence with $F$ free. The (impredicative) Comprehension Scheme consists in all instances of:
$\exists R\,\forall x_1, \ldots, \forall x_n\,(R x_1, \ldots, x_n \leftrightarrow \varphi x_1, \ldots, x_n)$
where $R$ is an n-place relation variable and $\varphi$ is any formula of the language in which $R$ does not occur free. Finally we add a Substitution rule permitting substitution of co-extensional predicates: 9
$\dfrac{\forall x(\varphi x \leftrightarrow \psi x), \quad \theta}{\theta[\varphi/\psi]}$
Now on the face of it, our neo-logicist, in taking second-order logic as characterised above as a body of epistemically innocent truths, is committed by the Axiom Scheme of Comprehension to a strong realism about properties, committed moreover to the a priori demonstrability of this strong realism.
8 It would be neater to dispense with Comprehension in favour of use of λ terms and λ conversion. But since most treatments of second-order logic use the comprehension scheme rather than λ abstracts we will follow suit. The points we will make in connection with Comprehension can be rephrased to apply to the λ term version of second-order logic.
9 This rule is a derivable rule in pure second-order logic but we will be concerned with expansions of the language to include complex singular terms formed by binding open formulae in one free variable x with operators such as the numerical operator nx(. . . x . . . ) or the class operator {x: . . . x . . . } and to handle these we need either a device such as λ abstraction or else a rule of the above sort. Since it could be dispensed with in a treatment with λ abstracts we will not question the epistemic status of this rule.
At any rate, the neo-logicist seems committed to it being demonstrable that whatever it is that second-order variables range over exists, indeed exists in a mind-independent fashion. For the neo-logicist views mathematical theories such as number theory as arising (at least on a “rational reconstruction”) from the extension of an “empirical” or non-mathematical second-order language by the addition of new operators such as $nx\,Px$, thereby generating existential assertions with the same (or similar) degree of objectivity as pertains to the original “empirical” language. This has the advantage of treating the mathematical and empirical subfragments of language as semantically homogeneous thereby easing, it is hoped, the problem of explaining the applicability of the allegedly “epistemically innocent” realm of pure mathematics to the epistemologically guilt-stained empirical realm. The downside to this for the neo-logicist is that not only must the impredicative comprehension axiom be assumed to be true in the same objective sense as other sentences of the original empirical sector, we must also be able to know or verify its truth in an epistemically innocent fashion. If comprehension is known via intuition or known a posteriori then no demonstration of a mathematical truth can be innocent if it relies essentially on comprehension, just as no demonstration is innocent if it relies on an axiom of set theory conjecturally justified holistically in terms of the fruitfulness of its empirical consequences. We have agreed that our knowledge of the validity of simple rules such as &E and ∨I is epistemically innocent; we may well expect that there can be other less simple cases of epistemic innocence which do not share all the features of &E and ∨I. But one cannot help but notice the enormous leap from simple rules of the above type to complex principles such as:
$\exists R\,\forall x_1, \ldots, \forall x_n\,(R x_1, \ldots, x_n \leftrightarrow \varphi x_1, \ldots, x_n)$
(especially when one reflects on those impredicative cases where $\varphi$ can contain bound second-order variables). We can give a fairly plausible account of how we know &E is sound, grounded perhaps in nothing much more than the Quinean verdict matrices for &; the neo-logicist needs to come up with something along similar lines (more complex no doubt) which will explain how we know, of each instance of the above, that it is true. A Quinean might dismiss neo-logicism on these grounds alone: if one assumes that it is a priori (or innocently) demonstrable that a realm of properties exists independently of the mind, such a person may say, then maybe you can derive the existence of abstract objects. But anyone sympathetic to the “anti-Anselmian” intuition that one cannot derive existence from concepts alone will refuse to accept the above axiom scheme of comprehension as epistemically innocent. For each instance entails the existence of a property and, (in)famously, Quine argued that second-order logic is not logic at all, but is set theory in disguise. As early as 1941, he claimed that properties are too obscure to serve logic, and should be replaced with items like classes. 10 Once we invoke classes, however, we have crossed the border out of logic and into mathematics proper. Later, Quine wrote: 11
Set theory’s staggering existential assumptions are . . . hidden . . . in the tacit shift from schematic predicate letter to quantifiable set variable.
For Quine, then, second-order logic is a wolf in sheep’s clothing. It is set theory made to look like logic, by having variables ranging over properties/sets. But it is important to note that the Quinean is only the most extreme of the opponents the neo-logicist faces and even if the neo-logicist can confute the Quinean position, he or she is far from home and dry. To see this, let us assume for the sake of argument that properties exist in as mind-independent a fashion as scientific entities do but assume as little else as we can about properties so that we can be neutral as to what exactly the nature of properties is (perhaps they are just classes of objects, for example). Consider now a philosopher—Macari, let us say—who accepts second-order logic as correct subject to one, rather minor-looking amendment. She accepts only the following form of the Comprehension Schema as logically valid:
$\exists x_1, \ldots, \exists x_n\,\varphi x_1, \ldots, x_n \rightarrow \exists R\,\forall x_1, \ldots, \forall x_n\,(R x_1, \ldots, x_n \leftrightarrow \varphi x_1, \ldots, x_n)$.
That is, focusing on one-place open sentences for simplicity, Macari agrees that it is a logical truth that to every such sentence which is instantiated by something or other, there corresponds a co-extensional property. But she refuses to accept that logic alone tells us that there are uninstantiated properties, so refuses to conclude that to predicates such as $x \neq x$ there corresponds a property. Macari’s attitude is not all that odd or lacking in motivation. Macari may have what one might loosely call “Aristotelian” reasons for being sceptical about uninstantiated properties. 12 Macari accepts, let us suppose, that it is just as sound to infer wisdom exists, because Socrates is wise, as it is to infer that Socrates exists, because that same proposition is true. But it is a far more contentious step to assume that if a sentence involving a predicate P is true, then a mind-independent property corresponds to P, even when P is the predicate $x \neq x$. For Macari, this is as substantive an assumption as the assumption of the existence of an abstract object. It is certainly hard to see how this assumption can be known in as innocent a fashion as the soundness of &E and ∨I can be known. Macari’s views, we submit, err, if anything, on the side of generosity to neo-logicism, in allowing that to any arbitrary instantiated predicate there corresponds a property. Neo-logicists, if they are to convince Macari that abstract objects can be shown to exist in something like an a priori fashion, must therefore do so by adding abstraction principles to a logic which does not already embody the assumption that uninstantiated properties exist, i.e. a logic which is no stronger than Macari’s “aristotelian” second-order logic (let’s call it “A2L”), that is, standard second-order logic but with Comprehension restricted as above. But here they face a huge problem: arithmetic, in particular a theorem of infinity, is not derivable from Hume’s Principle in A2L. The stumbling block is the number zero, defined by the neo-logicist as $nx\,Px$ for some predicate $Px$ with $\forall x\,{\sim}Px$, e.g. with $x \neq x$ for $Px$. On the face of it, Macari ought to hold that $nx(x \neq x)$ is, or could in some situations be, an empty term standing for nothing rather than standing for something, namely the number zero. For the intended interpretation of the numerical operator $nx\,\varphi x$ is as a function which maps properties to objects in a certain way; and where $\varphi x$ is $x \neq x$ there is no property available as argument for the function. But in order to keep separate a number of distinct problems with neo-logicism we will leave discussion of free logics till later (Sections 4 and 5) and assume that all singular terms, simple or complex, have a reference. We will therefore ensure that there is at least a “dummy” referent assigned to all numerical terms of the form $nx\,\varphi x$, even where nothing satisfies $\varphi x$. The aristotelian interpretation of the numerical operator is as follows. For simplicity let properties, the range of the monadic second-order quantifiers, just be non-empty subsets of the domain of individuals. In each model, the semantics, via the recursion theorem, will ensure that each open sentence ends up getting assigned a subset (empty or not) of the domain as its extension. To do this for the language containing the numerical operator, we first partition all the non-empty subsets into equivalence classes under the relation of equinumerosity and map these equivalence classes via a one:one function f into the domain of individuals. We then select some arbitrary member d of the individual domain as the image of the equivalence class whose sole member is the empty extension Ø and thus extend f to a function N by mapping {Ø} to d. Since d is arbitrary, there is no requirement that d not occur in the range of f; it might, for example, turn out to be the image of f applied to the equivalence class A of all unit sets. The numerical operator is then interpreted as a function which maps the extension E of each open sentence $\varphi x$ to the N-image of the equivalence class in our partition to which E belongs. In the model envisaged, then, N(A), that is d, is the object which is, in the model, the number one so that d may number both the empty set and every unit set.
10 W. V. O. Quine, ‘Whitehead and the rise of modern logic’, in P. A. Schilpp, The Philosophy of Alfred North Whitehead (New York: Tudor Publishing Company, 1941), pp. 127–163.
11 W. V. O. Quine, Philosophy of Logic (Englewood Cliffs, New Jersey: Prentice-Hall, 1970), p. 68.
12 There seems to be enough textual support for the idea that Aristotle was opposed to the idea of ‘ante rem’ universals to justify use of his name; but since Macari’s view of properties is doubtless not all that close to Aristotle’s we will refer to the position as “aristotelian”, with lower case “a” in analogy with “platonism” in philosophy of mathematics. The key tenet of Macari’s view is only that there is no epistemically innocent proof that uninstantiated properties exist; she need not believe, for example, in ‘concrete universals’ which are somehow both properties and extended physical objects but which, unlike mereological fusions, are ‘wholly present’—whatever that means—wherever their instances are present.
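The aristotelian interpretation of the numerical operator described above can be sketched on a small finite domain. The following Python fragment is only an illustration under stated assumptions: the three-element domain, the individual names, and the helper `aristotelian_operator` are hypothetical, and each equinumerosity class of finite sets is represented simply by the common size of its members.

```python
def aristotelian_operator(f, d):
    """Return N, an interpretation of the numerical operator.

    f: dict sending each positive size k (standing in for the class of
       all k-membered subsets) to an individual; it must be one:one.
    d: the arbitrary "dummy" individual assigned to the empty extension.
    """
    if len(set(f.values())) != len(f):
        raise ValueError("f must be one:one")

    def N(extension):
        # Non-empty extensions go through f; the empty one goes to d.
        return f[len(extension)] if extension else d

    return N


# Hypothetical three-element domain {a, b, c}; names are illustrative only.
f = {1: "a", 2: "b", 3: "c"}
N = aristotelian_operator(f, d="a")  # d coincides with the image of the unit sets

zero = N(frozenset())        # referent of nx(x ≠ x): nothing satisfies x ≠ x
one = N(frozenset({zero}))   # referent of nx(x = 0): the extension is a unit set
print(zero == one)           # True
```

Because d was chosen to be the individual that f already assigns to the unit sets, the terms for zero and one co-refer in this model, which is the situation the text envisages.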
On this extension of A2L to the language including the numerical operator, we can, of course, prove that zero (nx(x ≠ x) on the Fregean definition) exists. For ∃x(x = nx(x ≠ x)) is a simple consequence of the appropriate instance of the axiom schema of identity which holds in A2L. 13 But the rub is that we cannot prove the standard facts about zero which we need in order to prove a theorem of infinity. In particular, we cannot prove ∃x Suc(x, 0), where Suc(x, y) abbreviates the standard Fregean definition of x succeeds y:
Suc(x, y) ≡df ∃F∃G(x = nwFw & y = nwGw & ∃z(Fz & nw(Fw & w ≠ z) = nwGw));
and we cannot rule out the possibility that for some positive number j, 0 = S^j 0, where S^j 0 is the standard Fregean numeral for the number j. 14 In particular, as in the case set out above, we can have as true in an aristotelian model 0 = S0, i.e. 0 = 1. The assumption 0 = S^j 0 does not lead to conflict with Hume’s Principle since in A2L, with its restriction on the Comprehension scheme, we cannot derive:
nx(x ≠ x) = nxJx ↔ 1-1((x ≠ x), Jx)
from Hume’s Principle, where Jx is (x = 0 ∨ · · · ∨ x = S^(j−1) 0) (x = 0 in the example given). Since Jx is provably instantiated, we will be able to prove from the relevant instance of the Comprehension scheme:
(i) ∃G∀x(Gx ↔ Jx)
If we had available full Comprehension we would also have for the instance (x ≠ x):
(ii) ∃F∀x(Fx ↔ x ≠ x)
from which, after excising the initial second-order existential quantifiers for application of ∃E (or else after existential instantiation), we get:
(iii) ∀x(Gx ↔ Jx)
(iv) ∀x(Fx ↔ x ≠ x).
∀E on Hume’s Principle yields
(v) nxFx = nxGx ↔ F 1-1 G
S^j 0 abbreviates nxJx so that two applications of the Substitution rule take us from (iii), (iv), and (v) to:
0 = S^j 0 ↔ 1-1((x ≠ x), Jx).
13 If t = u is defined as ∀F(Ft → Fu) (equivalently ∀F(Ft ↔ Fu)) the derivation of each instance is fairly trivial. If it is primitive, we add each instance of t = t as an axiom.
14 i.e. takes the form nx(x = 0 ∨ . . . ∨ x = S^(j−1) 0).
“Neo-Logicist” Logic is not Epistemically Innocent
Since we can disprove the right-hand side, (ii) along with the rest of our logical apparatus would yield a reductio ad absurdum of 0 = S^j 0. But since (ii) is not forthcoming in aristotelian second-order logic, neither is this standard proof. As a consequence, the Fregean proof of infinity fails in A2L because we cannot prove there are n + 1 numbers less than or equal to n; for all the aristotelian knows, zero happens to be identical to some number S^k 0 between zero and n. As we have seen, it is possible for 0 = 1 to hold in a model in the aristotelian framework in which case the number of numbers less than or equal to one is just one. Nor can the neo-logicist hope that it is merely the standard proofs of the theorem of infinity that fail, that the aristotelian may be able to find some more complex ways of arriving at infinity. For there are aristotelian models of any finite size of Hume’s Principle. To see this, take any finite domain of individuals D = {d_1, . . . , d_n}. As above, we interpret the numerical operator by partitioning the subsets of D into classes of equinumerous subsets: there will be n + 1 of these, including the class containing the single zero-sized subset of D, namely Ø, and with the range of the second-order quantifiers the non-empty subsets of D. 15 Interpret nxϕx by the map N which takes X ⊆ D to d_k, where X is the extension of ϕ and is of size k ≠ 0, and which takes the empty extension to d_j for some j which is doubling up as “zero”. Then Hume’s Principle:
∀F∀G(nxFx = nxGx ↔ F 1-1 G)
is satisfied because for any non-empty subsets S1 and S2 of D assigned to F and G (remember assignment of Ø is not possible in this model) the left-hand side of the biconditional is true iff M assigns to F and G the same element d_i iff F and G belong to the same partition class under the equinumerosity equivalence relation (since F and G are non-empty) iff the right-hand side is true. 16
15 Though the argument goes through just as well for a property realist who distinguishes the extension of a predicate from the property it stands for, so long as they allow some interpretations in which no property has an empty extension.
16 The aristotelian interpretation also yields a (not very exciting!) one-element model of Axiom V. If d is the sole individual, there are two subsets of domain D: Ø and {d} and so only one non-empty subset falling in the range of the predicate variables. We assign {d} to d and Axiom V comes out true for every assignment of predicate extensions to the two variables F and G of the embedded biconditional, since every assignment assigns {d} to both variables. This, however, is the only aristotelian model (up to isomorphism) of Axiom V.
In the model given, all the other principles of second-order logic, including all the instances of aristotelian Comprehension and the Substitution rule, are sound. Hence Hume’s Principle (HP) does not semantically entail, in models sound for A2L, a theorem of infinity and so such theorems are not derivable in A2L from HP. The neo-logicist cannot claim, then, that the existence of infinitely many numbers is demonstrable in an epistemically innocent fashion unless he or she can show that it is demonstrable, in such an epistemically innocent fashion,
that there is an uninstantiated property, unless, in other words, they can show that aristotelianism on properties can be shown to be wrong by an epistemically innocent argument. The “aristotelian neo-logicist” will, it is true, be able to prove
(∃F∀y ∼Fy) → Inf
where Inf is a theorem of infinity; that is, if an empty property exists then there are infinitely many numbers. Even if that conditional is demonstrable in an epistemically innocent fashion, this is of little use to the neo-logicist if our knowledge of the existence of an empty property is no better placed or explicable than our knowledge of the existence of infinitely many numbers. Certainly the neo-logicist cannot argue that the hypothesis that there is an empty property is holistically confirmed by the fact that it enables us to develop the mathematics needed by science. If we take this Quinean course, we may as well help ourselves to the standard axiom systems of number theory, analysis etc. Perhaps it will be felt that to credit us with an intuitive knowledge of the existence of one empty property is more plausible than crediting us with an intuitive grasp of infinitely many numbers, or even an intuitive grasp of the fact that infinitely many numbers exist. But the neo-Kantian need not appeal to direct intuition of every number, nor does there seem any relevant difference between admitting direct intuition of one non-concrete entity and admitting direct intuition of many; 17 any such appeal to intuition represents abandonment of neo-logicism. One patch the neo-logicist might try out is to add to Hume’s Principle each instance of the following “Zero” axiom scheme:
∼∃xϕx ↔ (nxϕx = 0)
This ensures that if ϕ and ψ are both unsatisfied then nxϕx = nxψx = 0, whereas if θ is satisfied then nxθx ≠ 0. Call the theory consisting of HP plus all instances of the above schema HPP. HPP entails that the universe is infinite.
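The contrast between HP and HPP on finite domains can be checked by brute force for very small domains. The Python sketch below is an illustration under stated assumptions, not part of the original argument: the function names are hypothetical, individuals are represented by the integers 0, ..., n−1, the numerical operator is represented as an arbitrary map N from extensions (subsets of the domain, empty or not) to individuals, HP is evaluated only over non-empty extensions (the aristotelian range of F and G), and the Zero scheme is evaluated once for every extension an open sentence might have.

```python
from itertools import combinations, product


def subsets(domain):
    """All subsets of the domain, the empty one included."""
    items = sorted(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def satisfies_hp(N, nonempty):
    # HP over the aristotelian range: F and G take only non-empty values,
    # and finite sets are equinumerous iff they have the same size.
    return all((N[s] == N[t]) == (len(s) == len(t))
               for s in nonempty for t in nonempty)


def satisfies_zero_scheme(N, extensions):
    # Zero scheme: an extension is empty iff its number is the number zero,
    # where zero is the number assigned to the empty extension.
    zero = N[frozenset()]
    return all((len(x) == 0) == (N[x] == zero) for x in extensions)


def count_models(n):
    """Count interpretations N on an n-element domain satisfying HP, and HP plus Zero."""
    domain = range(n)
    extensions = subsets(domain)
    nonempty = [s for s in extensions if s]
    hp_count = hpp_count = 0
    for values in product(domain, repeat=len(extensions)):
        N = dict(zip(extensions, values))
        if satisfies_hp(N, nonempty):
            hp_count += 1
            if satisfies_zero_scheme(N, extensions):
                hpp_count += 1
    return hp_count, hpp_count


results = {n: count_models(n) for n in (1, 2, 3)}
```

For each of the sizes tried, some interpretations satisfy HP but none satisfies HP together with the Zero scheme, in line with the claim that HP has finite aristotelian models while HPP has none.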
17 Compare Russell’s view that if one accepts that one universal (of resemblance) exists, one may as well accept a plurality: Bertrand Russell, The Problems of Philosophy, London: Williams and Norgate, 1912, pp. 150–151.
Model-theoretically we can show this as follows: if there are only finitely many individuals k in a domain then there are k equivalence classes under equinumerosity on the domain of non-empty sets so we need all k distinct individuals to number the non-empty properties so as to satisfy HP. But we also need an individual which is the referent of nxθx, where θ has an empty extension, and this individual, the referent of 0, is constrained by the Zero axiom scheme to be different from the k individuals which number the instantiated properties; hence there can be no finite model of HPP. A salient point to note here is that we run into problems if we replace the schematic variable ϕ with a second-order variable because in our aristotelian theory such
variables range over non-empty properties and the case we wish to capture here is one in which ϕ is uninstantiated. This brings to the fore how ungainly the theory HPP is; how different in form it is from an abstraction principle, formulable as a single sentence, such as HP, not to mention simple inference rules such as &E. What reason is there to suppose that the infinitary theory HPP is epistemically innocent? Since HPP contains infinitely many sentences, there is no way we can think of it as grounded in inferential practice in the way that &E is grounded in the verdict matrices for &. However, there is a single formula related to the theory HPP which will yield, even in aristotelian second-order logic, Frege’s theorem and that is HPP*:
∀F∀G((nxFx = nxGx ↔ F 1-1 G) & (∼∃xFx ↔ (nxFx = 0)))
In other words, we conjoin to HP an instance of the Zero axiom scheme but with an ordinary variable in place of a schematic one, a variable bound by one of our initial quantifiers. It follows from the right conjunct of HPP* that no property has number zero; since all properties are instantiated, in our aristotelian models, all falsify the left-hand side ∼∃xFx of the right conjunct. Hence, as with HPP, there must be infinitely many numbers. Starting with our definition of zero as nx(x ≠ x) we can then prove the Peano Postulates using the other “bridging” definitions of successor or predecessor and so on. What we cannot prove, though, and here is a key difference with HPP, is that there is exactly one “zero”, in the sense of one number which all uninstantiated properties have. This can be seen by adding to our interpretations “virtual” properties in addition to the “real” properties which exhaust the range of the second-order quantifiers and letting formulae in one individual variable whose extension is empty be assigned an arbitrary such virtual property.
The numerical operator is then to be interpreted as a function on properties, real or virtual in which equinumerous real properties are assigned the same value but distinct virtual properties can be assigned distinct values. Suppose, for example, that we can prove ∼ ∃xϕx. The formula ϕx might, for example, be x < 0. We cannot go on to prove that nx(x < 0) = 0 because we cannot apply Hume’s Principle to empty properties. If we had full Comprehension we could prove ∃F∀x(Fx ↔ x < 0) and so assume for existential elimination (i) ∀x(Fx ↔ x < 0). We instantiate HPP* and derive the right conjunct by & E. Our generalised substitution rule yields, from (i) (∼ ∃xϕx ↔ (nxϕx = 0)).
Having already proved that ∼ ∃x x < 0, we would then have been able to conclude nx(x < 0) = 0. But we do not have full Comprehension! We only have, for the instance in question: ∃x x < 0 → ∃F∀x(Fx ↔ x < 0) and we cannot discharge the antecedent. This plurality of “zeros” is a problem for the neo-logicist, even though Frege’s Theorem follows from HPP∗ . For PA is only one possible formulation of arithmetic (albeit a very important one) where we understand by “arithmetic” a theory investigated (in natural language plus some notation) by mathematicians; moreover mathematicians do not solely (or even mainly) deal with formalised theories of arithmetic, but also use arithmetic principles in order to prove theorems in various different areas, pure and applied. It is by their ability to account for and explain real mathematical theories that philosophies of mathematics, and the formal systems they utilise, must be judged. In “real arithmetic” we say things like “the number of numbers less than zero is itself zero”; such applications of “the number of Ps is zero” are also very commonplace in applications of arithmetic in real situations. If a philosophy of mathematics cannot account for the truth of such assertions, that is a serious defect. All this on the assumption that HPP∗ (and the aristotelian version of the Comprehension scheme) are epistemically innocent. But once again we can ask the question what grounds are there for thinking that HPP* is innocent? It is not even an abstraction principle. We will raise this question again with respect to a related principle which arises when neo-logicism is set in the framework of plural quantification.
3. George Boolos, the most influential advocate of plural quantification, proposed that one of its values is that we can use it to interpret second-order quantification without invoking either properties or classes. 18 Perhaps the neo-logicist can use this to bypass the issues concerning which properties exist. If Hume’s Principle can also be interpreted “pluralistically” in such a way as to retain its innocence (we have granted it innocence on the usual way of reading it) then the neo-logicist would seem to be home and dry. Boolos suggested that a monadic, existential second-order quantifier be considered a counterpart of a plural quantifier, “there are (objects)”, in natural language. The following illustration is called the Geach-Kaplan sentence:
Some critics admire only one another.
18 George Boolos, ‘To Be is to Be a Value of a Variable (or to Be Some Values of Some Variables)’, Journal of Philosophy 81 (1984), pp. 430–449; ‘Nominalist Platonism’, Philosophical Review 94 (1985), pp. 327–344; ‘Reading the Begriffsschrift’, Mind 94 (1985), pp. 331–344. These essays are reprinted in Logic, Logic and Logic at pages 54–72, 73–87, and 155–170, respectively.
It has a (more or less) straightforward second-order rendering, taking the class of critics to be the domain of discourse:
∃F(∃xFx & ∀x∀y((Fx & Axy) → (x ≠ y & Fy))).
According to the usual reading, the formula would correspond to “there is a non-empty class (or a non-empty property) F of critics such that for any x in F and any y, if x admires y, then x ≠ y and y is in F”. But this implies the existence of a class or property, while the original “some critics admire only one another” does not, at least prima facie. Boolos developed a rigorous, model-theoretic semantics for monadic, second-order quantification in these terms. Some philosophers with nominalist tendencies have invoked the Boolos semantics in order to obtain the benefits of second-order quantification without encumbering oneself with a second-order ontology. A good deal. Our neo-logicist might attempt a similar maneuver, in order to make the logic more tractable. In this case, however, there are troubles at every turn. Second-order logic enters into Hume’s Principle (and Frege’s theorem) in two places. Like any second-order abstraction, Hume’s Principle has prenex universal quantifiers binding monadic property variables. The Boolos plural construction originally was limited to existential second-order quantifiers but Boolos extended it to universal quantifiers. He glosses a second-order universal quantification of the form ∀FΦ(F) along the lines:
no matter which things the Fs are, Φ holds of the Fs.
We will give an example later illustrating how he glosses the schematic phrase “Φ holds of the Fs”. However, on the right-hand side of Hume’s Principle we find the notion of equinumerosity and the definition of equinumerosity invokes a variable over (binary) relations on the domain: two (monadic) properties are equinumerous if there is a one-to-one relation from the extension of one of them onto the other. So our first problem is that the Boolos plural construction is limited to monadic property variables, while the second-order definition of equinumerosity has a binary relation variable. However, if there is a (definable) pair function in the language, then relations can be introduced in the usual manner. Variables over binary relations are replaced with monadic variables ranging over pairs: ∃RΦ is rendered ∃FΦ′, where Φ′ is obtained from Φ by replacing each occurrence of Rtu with F<t, u>, where <t, u> is the ordered pair of t and u. This move is not available here, at least not without begging the crucial question. There can be no pair function on a finite domain with more than one member. If the domain has size n, then there are n² ordered pairs. So our neo-logicist cannot introduce a pair function until she has shown that the universe has either at most one object in it or is infinite. This throws a monkey wrench into the works. The neo-logicist wants to use Hume’s Principle and
therewith Frege’s Theorem to establish that the universe is infinite, by showing that the natural numbers exist (and are distinct). But on the present plan, she cannot even formulate Hume’s Principle (via plural quantification with pairing) without first showing that the universe is either non-plural or infinite. Since we non-Hegelians know the first disjunct is false, this means showing that the universe is infinite. The plan is frustrated before it can even get started. The existence of a pair function on each infinite domain is equivalent to the full axiom of choice, since it amounts to κ 2 = κ, for each cardinality κ. How can our “pluralist” neo-logicist claim that the existence of pairs is epistemically innocent? This amounts to the epistemic innocence of full choice. Perhaps we can be helpful here. The neo-logicist might introduce pairs via an abstraction principle: ∀x∀y∀z∀w( (x, y) = (z, w) ≡ (x = z & y = w)). Call this the Pair Principle. Strictly speaking, it is not in the same form as other abstraction principles, since the right-hand side (symbolised as (x = z & y = w)) represents a four-place relation and so not an equivalence relation. But it has at least the flavor of an equivalence relation (symmetry, transitivity, reflexivity), and so is in the spirit of other abstraction principles. 19 The Pair Principle lies between first-order abstractions and second-order abstractions. If our neo-logicist can maintain that it is epistemically innocent, then an acceptable “plural” formulation of equinumerosity can be produced. The antecedent represents quite a big “if” since the principle will in effect give the neo-logicist an infinite universe. But for the rest of this section, we will make that concession and assume that the right-hand side of Hume’s Principle is kosher. What, next, of the left-hand side of the principle in which we find, crucially, the abstraction operator itself—“the number of”. 
19 Reflexivity: (x = x & y = y); symmetry: (x = z & y = w) entails (z = x & w = y); transitivity: (x = z & y = w) and (z = t & w = u) entail (x = t & y = u). Bob Hale invokes a pair abstraction in “Reals by abstraction” (Philosophia Mathematica (3) 8, 100–123; reprinted in Hale and Wright [2001], 399–420). Frege’s theorem only needs to invoke instances of Hume’s principle applied to finite properties with finite extensions and so it may be possible to formulate a “predicative” version of the Pair Principle in which no occurrences of occur (at least untyped) inside terms of the form (x, y). The (set-theoretic) statement that the Pair Principle is satisfiable on every infinite domain is equivalent to choice. However, as far as we know, the Pair Principle does not entail the axiom of choice, since it only produces a pair function on the universe.
On the standard reading, this is a function from properties (or classes) to objects, but we are considering here a philosopher who is trying to get by without acknowledging properties or classes at all. We thus cannot think of the abstraction term as denoting a function, for there is (or may be) nothing for it to operate on. This is not an insuperable stumbling block: English and other natural languages with a definite article construction allow that construction with respect to plurals. We speak, for example, of “the dogs in the room” or “the numbers less than 12”. So it seems to make sense to use the plural definite article with a second-order variable, to produce the locution, the F’s or, here, the number of F’s
(we will attempt to be as neutral as possible on the right way or ways to approach the syntax and semantics of such locutions). Certainly Boolos found application of the number operator to plural phrases perfectly sensible. 20 Here is one of his glosses of Hume’s Principle, illustrating also his way of reading second-order universal quantifiers plurally: Hume’s Principle is the statement that no matter which things the Fs and Gs may be, the number of Fs is the same as the number of Gs just in case the Fs and Gs are in one–one correspondence. 21
20 See, for example, Boolos [1993].
21 Op. cit., p. 223. Perhaps “no matter which things the Fs and Gs are” is better than “no matter which things the Fs and Gs may be” as the latter may seem to import a modal element foreign to straightforward universal quantification.
22 Boolos notes that the locution there are F’s might entail that there are at least two F’s, but one F is enough for present purposes. See, for example, Boolos [1984], p. 443.
23 If there is no way of reading universal second-order quantification without commitment to properties or classes, including empty ones, then someone sticking to Boolos’ original motivation for plural quantification would have to drop ∀F as a primitive and use the ∼∃F∼ translation.
So we finally have a plural-quantifier version of Hume’s Principle. Call this Plural Hume. Let us assume that, as written, Plural Hume is epistemically innocent. What does it say? In particular, what of the initial plural quantifiers? Macari, it will be recalled, agrees that instantiated properties exist, and so instantiated properties can be objects in the range of second-order variables; but Macari denies that empty properties exist (or at least that we can prove innocently that they do). We saw in the previous section that under Macari’s assumption, Frege’s Theorem does not go through using Hume’s Principle alone. In Macari’s system, Hume’s Principle is satisfied on finite domains. But if the neo-logicist goes the route of plurals, and formulates Plural Hume, then Frege’s Theorem does not go through for the same reason: if our neo-logicist relies on plurals, she has played into Macari’s hands. The reason for this is fairly simple. The ordinary locution, there are F’s such that . . . entails that there is at least one F. Indeed, this is what the locution says. 22 In traditional terms, the plural quantifier declares that a certain predicate is instantiated. For example, I cannot announce that there are vicious elephants in our backyard, unless I have reason to believe that there are some. The empty property (if there is one) cannot instantiate a plural existential quantifier. Similarly if I say that no matter what things the Fs are, if everything which is an F is a G, then Hamish is an F, i.e. no matter what the F’s are: (∀x(Fx → Gx) → Fh) then I do not seem to be committed, absurdly, to Hamish being non-self-identical and what I say seems perfectly coherent if, for example, only Hamish is a G. 23 Thus the plural quantifiers are analogues only of Macari’s aristotelian quantifiers, not the standard second-order quantifiers. If, therefore, the Boolos program is successful, it eliminates a commitment to instantiated properties
or to non-empty sets. For that reason we cannot validate, in the way required by neo-logicists, the full axiom scheme of Comprehension; only the restricted form available to Macari is sound for plural quantifiers. And so Frege’s Theorem is blocked from the start. By reasoning parallel to that in the Macari case we can show that there are models in which the numbers form a finite subset of the infinite domain, with 0 = S^n 0, for some number n. Indeed there will be a model of size one, one in which the pairing axiom holds. Plural Hume does not entail the Peano postulates. However, Boolos’ pluralist agenda included the program of rendering standard “non-aristotelian” second-order languages into the language of plural quantification. (This language is akin to lawyer’s English, with indexed pronouns and related constructions playing a role similar to “the party of the third part” and so on.) He does this by including in his translations a clause to handle explicitly what would be empty properties. Let Φ(F) be a formula with the monadic, second-order variable F free. Let Φ* be the result of replacing each occurrence of Ft in Φ(F) with t ≠ t. In standard terms, Φ* states that Φ holds of the empty property. Then Boolos renders the second-order ∃FΦ(F) as something like:
Either there are some F’s such that Φ(F), or Φ*
24 Φ* here is ∀x(x ≠ x ↔ x ∉ x) which is provably equivalent to ∀x(x ∈ x).
For example, he “translates” the second-order set-theoretic truth ∃F∀x(Fx ≡ x ∉ x) as (after some simplification) “Either there are some sets that are such that every set is one of them iff it is not a member of itself or every set is a member of itself” (ibid.). 24 In this case, the second disjunct does no work (being provably false), but it is part of the “translation”. Applying the same idea to ∀FΦ(F) we get:
no matter what the Fs are, Φ(F) and Φ*
with conjunction in place of disjunction. Since plural English can get rather complex we will carry out some translations into a formalised version of it but in order that the plural reading be borne in mind we will use (F) for no matter what the F’s are and (EF) for there are some F’s such that. So we translate, e.g. the sentence ∀F Ft, which is false in standard second-order logic, into: ((F)Ft & t ≠ t) which is also false even in a one-element universe in which no uninstantiated property exists. Can we amend Hume’s Principle along those lines? The matter is complicated a bit in the present context. Since we have introduced a higher-order operator the number of F’s, we cannot just replace the “F” with something like t ≠ t to handle the uninstantiated case. The phrase The
number of t ≠ t is not grammatical. One way round this is to introduce λ terms and thereby a translation *λ in which
[∀FΦ(F)]*λ = ((F)Φ(F)) & Φ(λx(x ≠ x))
with a dual clause for ∃F; thus ∀F Ft becomes (F)Ft & λx(x ≠ x)t. But since we can always eliminate λ terms by λ conversion, starting from λ terms with narrowest scope and working outwards substituting equivalents for equivalents, we can in this way arrive at a translation ** which is the result of applying λ conversion to the *λ translation. Hence applying *λ to ∃F(nxFx ≠ nx(x ≠ x)) (i.e. there is some non-zero number) gives us:
(EF)(nxFx ≠ nx(x ≠ x)) ∨ n(λx(x ≠ x)) ≠ n(λx(x ≠ x)) 25
and then λ conversion yields the ** translation
(EF)(nxFx ≠ nx(x ≠ x)) ∨ nx(x ≠ x) ≠ nx(x ≠ x)
in which the second disjunct is a logical falsehood. Monadic Comprehension:
∃F∀y(Fy ↔ ϕy) (where F does not occur in ϕ)
becomes, after λ conversion:
(EF)(∀y)(Fy ↔ ϕy) ∨ (∀y)(y ≠ y ↔ ϕy)
which is equivalent to
(∃y)ϕy → (EF)(∀y)(Fy ↔ ϕy),
i.e. If there is a ϕ then there are some Fs such that anything is an F if it is a ϕ and is a ϕ if it is an F.
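The claimed equivalence between the translated Comprehension scheme and its conditional, aristotelian form can be checked semantically on a small domain. The following Python sketch is an illustration only: the function names and the three-element domain are hypothetical, the plural quantifier (EF) is modelled as ranging over non-empty subsets of the domain, and ϕ is represented directly by its extension.

```python
from itertools import combinations


def nonempty_subsets(domain):
    items = sorted(domain)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def plural_exists(domain, condition):
    # (EF): "there are some Fs such that ..."; the Fs are at least one thing.
    return any(condition(F) for F in nonempty_subsets(domain))


def coextensive(domain, F, phi):
    # (forall y)(Fy <-> phi y)
    return all((y in F) == (y in phi) for y in domain)


def boolos_translation(domain, phi):
    # (EF)(forall y)(Fy <-> phi y)  or  (forall y)(y != y <-> phi y)
    first = plural_exists(domain, lambda F: coextensive(domain, F, phi))
    second = all((y != y) == (y in phi) for y in domain)
    return first or second


def aristotelian_comprehension(domain, phi):
    # (exists y) phi y  ->  (EF)(forall y)(Fy <-> phi y)
    antecedent = any(y in phi for y in domain)
    return (not antecedent) or plural_exists(
        domain, lambda F: coextensive(domain, F, phi))


def unamended_plural_comprehension(domain, phi):
    # (EF)(forall y)(Fy <-> phi y), with no clause for the empty case
    return plural_exists(domain, lambda F: coextensive(domain, F, phi))


domain = {0, 1, 2}
extensions = [frozenset()] + nonempty_subsets(domain)
table = {phi: (boolos_translation(domain, phi),
               aristotelian_comprehension(domain, phi),
               unamended_plural_comprehension(domain, phi))
         for phi in extensions}
```

On this domain the first two columns agree and are true for every extension of ϕ, while the plural reading without the extra disjunct fails exactly when ϕ is empty.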
25 If we have λ terms, we can do all variable-binding using them, formalise the numerical operator as n(λxϕx) and abbreviate the latter as nxϕx.
26 As remarked above, we leave aside the complication that the plural reading is indeed plural, that is, seems to require more than one ϕ; even ‘at least two’ is arguably not quite right as a reading either.
We submit that if this is formalised so as to be put to use in a derivational system, it must be formalised as Macari’s aristotelian (monadic) axiom of Comprehension, with the antecedent requiring the existence of a ϕ; 26 it ought not to be formalised as standard comprehension. In one sense, the Boolos program validates full second-order comprehension: each instance is translated into a truth of the plural quantification framework. But never a truth with which Macari would quibble. For example, where ϕ in the Comprehension scheme is instantiated by y = y, the translation yields (∃y)y = y → (EF)(∀y)(Fy ↔ y = y). What happens to Plural Hume under the Boolos translation? Here things are complicated by the fact that we have two initial universal quantifiers to
work on. Plural Hume, using our conventions about “pluralese” (and leaving unpacked the definition of one:one correspondence) is:
(F)(G)(nxFx = nxGx ↔ F 1-1 G).
Applying ** to (F) we get:
((F)[(G)(nxFx = nxGx ↔ F 1-1 G)]) & [(G)(nx(x ≠ x) = nxGx ↔ (x ≠ x) 1-1 G)].
We must now apply ** to the two (G) formulae in square brackets. For the first, it yields (using the standard definition of 0 as nx(x ≠ x)):
((G)(nxFx = nxGx ↔ F 1-1 G)) & (nxFx = 0 ↔ F 1-1 (x ≠ x)).
For the second we get
((G)(0 = nxGx ↔ (x ≠ x) 1-1 G)) & (0 = 0 ↔ (x ≠ x) 1-1 (x ≠ x)).
Putting all this together the result is
((F)((G)(nxFx = nxGx ↔ F 1-1 G) & (nxFx = 0 ↔ F 1-1 (x ≠ x)))) & (G)((0 = nxGx ↔ (x ≠ x) 1-1 G) & (0 = 0 ↔ (x ≠ x) 1-1 (x ≠ x))).
Call this Amended Plural Hume. Now the last conjunct is a logical truth and the second and third are symmetrical so that we can delete one of these which gives, with some further simplification:
(F)(G)((nxFx = nxGx ↔ F 1-1 G) & (∼∃xFx ↔ nxFx = 0)).
That is: No matter what the Fs are and no matter what the Gs are, the number of Fs = the number of Gs just in case there is a one–one correspondence between them and there are no Fs iff the number of Fs = zero.
But this is none other than the non-schematic HPP*. We conjoin to Plural Hume a clause specifying that the number of Fs is zero iff there are no Fs. HPP* yields, we have seen, Frege’s theorem but fails to capture many applications of the numerical operator by allowing a plethora of zeros. This is a form of plurality which supporters of plural quantification will be less keen on. In sum, the plural version of HP does not give us Frege’s theorem. Plural HPP* does but does not give us the applications we want. Moreover we can ask again why we should think that plural HPP*, which is not an abstraction principle, is innocently true. After all, none of the formal versions of HP can be presumed to be explicitly part of the concept of number possessed by everyday speakers (or even most mathematicians); indeed it is doubtful whether any are even implicitly held by hoi polloi. The best case for the epistemic innocence of HP is surely that something like:
the number of apples equals the number of oranges iff the apples and oranges can be paired off one:one
is part of competent users’ tacit understanding of number (and so on for other sortal concepts). If Boolos’ plural reading of quantifiers is on the right lines, the most faithful reading of these “platitudes” which jointly are constitutive of our notion of number (compare the conceptual functionalists’ idea that our folk psychological notions arise from psychological platitudes) is the original Plural Hume not Amended Plural Hume, and the original is certainly too weak to do the work the logicist requires of it. The neo-logicist might abandon the project of reconstructing arithmetic using Hume’s Principle and consider other abstraction principles, perhaps principles which are not constitutive of any ready-to-hand notion. 27 But given the equivalence in power of the plural quantification and aristotelian second-order logic, and the existence of one-element models of Axiom V, this strategy will not work either. To conclude these two sections and the examination of logicism’s use of second-order logic: we have considered the effect of weakening second-order logic by allowing as epistemically innocent only the assumption of arbitrary instantiated properties, or equivalently, allowing only pluralist interpretations of second-order quantification. But the result, combined with Hume’s Principle, does not yield Frege’s Theorem. Moreover thus far we have been very generous to the neo-logicist. Many who are realist about properties, for example, assume only a “sparse” rather than an “abundant” theory of properties: that is, they do not assume that to each arbitrary predicate, such as (x is an electron or x is a baseball or x is a pleasant dream), there corresponds a property. (And even those who favour an abundant theory do not always claim they know a priori that it is true.) This suggests that no form of the axiom scheme of Comprehension is epistemically innocent.
27 Cf. Wright’s response to Boolos on the question of what notion “New V”, an abstraction principle which is a weakened form of Axiom V, is constitutive of: “The Philosophical Significance of Frege’s Theorem”, pp. 239–240.
28 Non-standard ‘unfaithful Henkin’ models, of course: cf. Stewart Shapiro, Foundations without Foundationalism (Oxford: Clarendon, 1991), p. 89.
But if we drop it completely, then it is easy to produce finite models of HP (and indeed of Axiom V), models 28 in which any number can be a “rogue” number in the same way that zero is a rogue number in the aristotelian version of Fregean arithmetic. Take any finite set D of n individuals and any subset S of the power set of D such that there are at most n equivalence classes over S under the equivalence relation of equinumerosity. There is thus a one:one function f from this set of equivalence classes into D. We let S be the range of the monadic predicate quantifiers and select some member d of D as our rogue number; we then interpret the numerical operator as follows: where the extension s of ϕx belongs to S, the referent of nxϕx is f(E(s)), where E(s) is the equivalence class s belongs to; otherwise the referent of nxϕx is d. As before, this gives us a model of
second-order logic (minus Comprehension) plus Hume’s Principle, this time one in which (∃F∀y ∼ Fy) → Inf can fail, if the empty set belongs to S. Our overall conclusion, then, is that the neo-logicist program, as standardly developed, requires the use of the strongest form of the impredicative axiom scheme of Comprehension if it is to achieve its goals (e.g. the derivation of Frege’s theorem) but that this principle is on a par with outright stipulation of the Peano-Dedekind or ZF axioms in terms of epistemic innocence.
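The finite-model construction just described can be checked mechanically on a small case. The following Python sketch is an illustration only: the names `build_model` and `satisfies_hp`, the three-element domain and the particular choice of S are assumptions made for the example, not part of the original argument. It uses the fact that finite sets are equinumerous exactly when they have the same cardinality.

```python
def build_model(domain, S, rogue):
    """Return the numerical operator for the model (D, S, rogue).

    Hypothetical helper: S is the range of the monadic predicate
    quantifiers, rogue is the value given to predicates whose
    extension lies outside S.
    """
    items = sorted(domain)
    # Equinumerosity classes over S are indexed by cardinality.
    sizes = sorted({len(s) for s in S})
    if len(sizes) > len(items):
        raise ValueError("more equivalence classes than individuals")
    # f: a one:one map from the equivalence classes into D.
    f = {size: items[i] for i, size in enumerate(sizes)}

    def num(extension):
        ext = frozenset(extension)
        return f[len(ext)] if ext in S else rogue

    return num


def satisfies_hp(S, num):
    # Hume's Principle with F and G ranging over S only.
    return all((num(F) == num(G)) == (len(F) == len(G))
               for F in S for G in S)


D = {0, 1, 2}
S = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]
num = build_model(D, S, rogue=2)

print(satisfies_hp(S, num))                        # True
# The extension {2} is outside S, so its "number" is the rogue value,
# which here coincides with the number of the two-membered {0, 1}.
print(num(frozenset({2})) == num(frozenset({0, 1})))  # True
```

Since the empty set belongs to this S, the model contains an uninstantiated property while the domain remains finite, which is the situation in which the conditional (∃F∀y ∼Fy) → Inf fails.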
4. We turn now from the presuppositions the neo-logicist makes with respect to second-order logic to those made with respect to first-order logic. If one accepts that standard classical first-order logic is epistemically innocent then one already accepts that one can prove existence claims innocently, for one can prove ∃x(Fx ∨ ∼Fx) or, in first-order logic plus identity, 29 ∃x(x = x). Even though this is not to prove very much about whatever it is which exists, the neo-logicist might take these standard theorems as revealing at least an ad hominem problem for those opponents who reject the idea of a priori or epistemically innocent existence proofs but nonetheless accept first-order logic. However, we conjecture that most contemporary logicians would respond by denying that standard non-free logic is epistemically innocent. The theorem ∃x(x = x) is harmless, they might say, since we know some things exist. And the quantifier rules ∀E and ∃I:

    ∀E: from ∀xϕx infer ϕx/t
    ∃I: from ϕx/t infer ∃xϕx

are harmless if applied in a language in which one can reasonably assume or suppose that no singular term is empty, that is non-denoting. Making these assumptions certainly simplifies the proof theory and model theory. However, if one were interested in what could be derived by pure logic alone, or if one were working in a language containing complex singular terms which may not denote (for example, description terms or terms constructed using function terms standing for partial functions) then the above simple rules must be rejected in favour of more complex rules such as

    ∀E: from ∀xϕx and E(t) infer ϕx/t
    ∃I: from ϕx/t and E(t) infer ∃xϕx
where E(t) represents some way of expressing the claim that t exists. Whether or not this would be the usual response of an “anti-Anselmian” logician, we submit it is the right response to make to the neo-logicist. Moreover Wright himself seems to presuppose some sort of free logic background at least in the sense of allowing for empty domains. 30 Furthermore, to hold both that numerical terms such as nx(x = x) are genuine singular terms and that ∃I is unrestrictedly valid for all complex terms gives the neo-logicist a very easy victory but a Pyrrhic one, since the key question is so obviously begged.

29 From now on, we include the standard theory of identity in “first-order logic”.

Let us look, then, at Hume’s Principle and theorems of infinity against a free logic background. The restrictions on second-order ∀E and ∃I which blocked the proof of infinity in an unfree first-order background will, of course, a fortiori, do so in a free logic background. In order to isolate the problem with free logic, we grant the neo-logicist for these purposes standard second-order rules and axiom schemes, our sole amendment being altering first-order ∀E and ∃I as above, that is to

    ∀E: from ∀xϕx and E(t) infer ϕx/t
    ∃I: from ϕx/t and E(t) infer ∃xϕx
Different policies on, and semantics for, “E” (it may, for example, be some complex formula containing t) will yield different frameworks for free logic. The problem for the neo-logicist is that in some of these the crucial proof of the infinity of the natural numbers is not forthcoming from Hume’s Principle. We cannot look at all possible free logics of course; rather we will look at one framework under which it is plausible that Hume’s Principle is epistemically innocent but under which Frege’s Theorem fails, and at a rather different one on which the latter holds but it is very implausible that Hume’s Principle is epistemically innocent. It is then up to the neo-logicist to refute our contention that these two approaches are representative in the sense that any free logic will fall on one or other horn of the dilemma.

For the first example, we take the inner domain/outer domain framework (cf. Read, Thinking about Logic (Oxford: Oxford University Press, 1994), pp. 134–7, 146–7). One divides the domain of individuals, of referents of individual constants, into two classes, the inner domain of “real” individuals, this domain being the range of the individual quantifiers, and an outer domain of “dummy” or “virtual” individuals, which do not belong in that range, so intuitively do not “exist”, though they can be assigned to nonvariable terms as referents. One can then adopt a number of policies for the value of an atomic sentence Pt in an interpretation in which t is assigned a merely virtual referent. One can always set such sentences as gappy, neither determinately true nor determinately false; or else always set them false, or (this option is rather unmotivated) always true, or let them vary freely as for sentences in which there is no failure of “real” reference. Predicate extensions then are assigned positive and negative extensions, each being divided into real and virtual components which together exhaust the particular extension.

30 Op. cit., pp. 235–6, where he writes, concerning the question whether the property λx(x = x) is too big, “But it will, presumably, be too big, however exactly that notion is defined, if the universe is empty”.
Since we are working in a second-order framework, additional complications can arise if we want to mark failure of predicates to stand for properties. In a general treatment we would have domains of individuals, of properties and of n-ary relations. We then ask whether, e.g., the assignment to P in Pt of a merely virtual property induces truth-gaps or not; the semantics will proceed by assigning extensions: either of individual (positive and negative) satisfiers to properties or, dually, to each individual a set of properties which the individual has, and a negative set of properties which it lacks.

To keep things as close as possible to the standard case, let us assume bivalence (so we can drop negative extensions) and confine the free logic component to the first-order case alone. So the domain of individuals D divides into exclusive and exhaustive subdomains I and O, the inner and outer domains; let us also take identity to be a primitive, its extension being all pairs ⟨α, α⟩, for α in D, i.e. α virtual or real. So if terms t and u stand for the same individual, real or virtual, t = u is true; if they stand for distinct individuals, real or virtual, the sentence is false. This yields pretty much the standard theory of identity: t = t is valid for all t, and Leibniz’s law (from ϕx/t and t = u conclude ϕx/u) is sound, and so these two can be added as axiom scheme and inference rule respectively. This yields the standard theorems constraining identity to be an equivalence relation:

    ∀x x = x
    ∀x∀y(x = y → y = x)
    ∀x∀y∀z((x = y & y = z) → x = z).

The main difference is that, since we are working in a free logic background, we cannot move from universal to existential generalisations as freely as before, i.e. cannot conclude ∃x x = x from ∀x x = x. As remarked, we are assuming the second-order quantifiers are interpreted as closely as possible to the standard way.
Hence we can take the range of these quantifiers to be the power set of D thus allowing the “platonic” case of “properties” uninstantiated by a real individual. In fact, if the virtual domain is not empty we get a number of distinct empty properties even where we simply identify (which we do here, purely for convenience) properties with subsets of the domain of individuals; if v is in O, then Ø and {v} are two distinct properties both satisfying ∼ ∃yFy. Now in this framework we cannot prove, in any system sound for that semantics, ∃x(x = x), or indeed ∃x(ϕx) for any ϕ, since there are interpretations in which I is the empty set and all individuals are virtual (though we will be able to prove ∃F∀y ∼ Fy). This framework, then, does not build the neo-logicist assumption that there are a priori existence proofs, at least of the existence of individuals, into the underlying logic. Can we then show that by augmenting it with abstraction principles we can nonetheless prove, from
epistemically innocent principles alone, that some things exist, indeed that an infinity of different numbers exist? The answer is no, for reasons analogous to the aristotelian case. There are finite, indeed empty, models of Hume’s Principle in this framework, that is, models in which I is finite or empty. For the latter case, take a “universally free” model in which D = O = {v} and I = Ø, so that not even ∃x(x = x) is true, far less a theorem of infinity. The Fregean proof of the infinity of the natural numbers breaks down right at the outset: we do not have ∃x(x = nx(x ≠ x)). To see what happens to Hume’s Principle in this model, note that there are only two properties: {v} and Ø, both of them empty, that is, both have empty real subextensions. The two properties, therefore, are equinumerous: with {v} assigned to F and Ø to G, the formula which expresses the condition that a one:one correspondence holds between F and G, namely:

    ∃R∀x((Fx → ∃!y(Rxy & Gy)) & (Gx → ∃!y(Ryx & Fy)))

is true in the model. For remember that the quantifier ∀x and the uniqueness quantifier ∃!y 31 on the right-hand side both range over the inner domain I, a domain which is empty in this case. Hence assigning to R any relation over D one likes satisfies the right-hand side of the biconditional. Every assignment to x trivially renders Fx → ∃!y(Rxy & Gy) true; for none renders it false, since there are no assignments from I to x; likewise for the second conditional. By assigning v as the number of each property with an empty real extension, thus to nxϕx, for every ϕ, we ensure the left-hand side is true so that the instance:

    nxFx = nxGx ↔ ∃R∀x((Fx → ∃!y(Rxy & Gy)) & (Gx → ∃!y(Ryx & Fy)))

of Hume’s Principle is true in the model; and the same holds for the other three possible assignments of pairs of properties to F and G.
More generally, ensure there is at least one virtual individual and partition the properties of the universe, the subsets of D, into equivalence classes under equinumerosity (of the real extensions of the properties). If the inner domain is of size k, there will be k + 1 of these; 32 since D is of size at least k + 1, there is a function N which maps these equivalence classes one:one into the total domain, inner plus outer. If we then interpret nxϕx via N, i.e. assign to it the value of N applied to the element of the partition to which the extension of ϕx belongs, then it is easy to see Hume’s Principle comes out as true in the model. Note that we could add as an additional requirement on admissible models that all simple singular terms have real referents. If there is some finite number n of simple singular terms, then we will be able to construct finite models of Hume’s Principle in which standard ∀E and ∃I are sound where instantiation is for simple terms, finite models which can be of any size ≥ n.

31 i.e. ∃!yϕy is ∃y(ϕy & ∀x(ϕx → x = y)).
32 Since k can be infinite, we assume the Axiom of Choice in the metatheory.
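For small finite cases the claim that Hume’s Principle holds in such inner/outer models can be verified by brute force. The sketch below is a hypothetical illustration in Python: `hume_holds`, `subsets` and the sample domains are names and choices introduced here, not taken from the text. It evaluates the equinumerosity formula literally, with first-order quantifiers ranging over the inner domain only and R ranging over all binary relations on D, and interprets the numerical operator through a map N indexed by the size of a property’s real subextension.

```python
from itertools import chain, combinations, product


def subsets(items):
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def hume_holds(inner, outer, N):
    """Check every instance of Hume's Principle in the model.

    inner: real individuals (range of the first-order quantifiers)
    outer: virtual individuals
    N:     maps the size of a real subextension to a member of D
           (assumed one:one by the caller)
    """
    inner, outer = tuple(inner), tuple(outer)
    domain = inner + outer
    properties = subsets(domain)                    # power set of D
    relations = subsets(product(domain, repeat=2))  # binary relations on D

    def exactly_one(pred):
        # uniqueness quantifier, ranging over the inner domain only
        return sum(1 for y in inner if pred(y)) == 1

    def equinumerous(F, G):
        def is_witness(R):
            return all(
                (x not in F
                 or exactly_one(lambda y: (x, y) in R and y in G))
                and (x not in G
                     or exactly_one(lambda y: (y, x) in R and y in F))
                for x in inner)
        return any(is_witness(R) for R in relations)

    def num(F):
        return N[len(F & frozenset(inner))]

    return all((num(F) == num(G)) == equinumerous(F, G)
               for F in properties for G in properties)


# The "universally free" model: D = O = {v}, I empty.
print(hume_holds((), ("v",), {0: "v"}))                              # True
# Inner domain of size k = 2, one virtual individual, k + 1 classes.
print(hume_holds(("a", "b"), ("v",), {0: "v", 1: "a", 2: "b"}))      # True
```

The search over relations is exponential in the size of D, so the check is only practical for domains of three or four individuals; it is meant as a sanity check on the construction, not as a proof of the general claim.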
One might also note that Axiom V is satisfiable in this framework too; in fact, there are models of every cardinality of Axiom V and indeed of any abstraction principle. 33 This tolerance of arbitrary sets of abstraction principles affords no comfort to the neo-logicist since we still cannot prove any set, any non-proper class, exists. The outer domain of “virtual” objects, in other words, is best seen as a technical device. It gives us a semantics in which the free logic versions of first-order ∀E and ∃I are sound (with E(t) true just when t has a referent in the inner domain). And even if we were prepared to tolerate a baroque ontological slum in which virtual objects have some type of being, this will still not help the neo-logicist since there are models where the total domain D, including virtual individuals as well, is finite.
5. However, there are other frameworks for free logic than the inner/outer framework above. For instance, there are free logics in which the existence premiss E(t) is taken to be self-identity: t = t is true only if the term t refers. 34 This framework seems better suited to the neo-logicist position. The neo-logicist can hold that t = t entails ∃x x = t but deny that t = t is valid for every t. In some special cases, however, such an identity may be provable. For example, nx(x = x) = nx(x = x) is provable from the instance of Hume’s Principle in which we instantiate both predicate variables with x = x, for the right-hand side of this instance is a logical truth. Let us take E(t), then, to be t = t. But in order to prevent confusion with the identity relation on which “t is identical with t” is always true, we will retain “=” with the interpretation given above and expand our language to include in addition this second “existential” identity relation, which we will express as “t ≅ t”, evaluating such sentences as true in a model iff the referent of t belongs to the inner domain of the model. 35 The amendments to standard rules needed are then:

    ∀E: from ∀xϕx and t ≅ t infer ϕx/t
    ∃I: from ϕx/t and t ≅ t infer ∃xϕx

33 Where I is of cardinality κ, let O be a disjoint set of cardinality 2^κ. There is thus a one:one function B from the power set of I into D. Every property contains a real subextension (possibly empty). Interpret {x: ϕx} by a map which takes property P to B(X), where X ⊆ I is the real subextension of P. Given our interpretation of =, this is easily seen to satisfy Axiom V (for courses of values):

    ∀F∀G(({x : Fx} = {x : Gx}) ↔ ∀z(Fz ↔ Gz)).

For, given any assignment of properties P1 and P2 to F and G, the left-hand side of the embedded biconditional is true iff B(rP1) = B(rP2), rPi being the real subextension of Pi, iff the real subextensions of P1 and P2 are identical iff (since ∀z ranges only over I) ∀z(Fz ↔ Gz) is true. The argument generalises to arbitrary abstraction principles.
34 See Dana Scott et al., Notes on the Formalization of Logic, Part II, Sub-Faculty of Philosophy Study Notes, University of Oxford.
35 Though when we leave wide identity = out of consideration, it does not matter whether we divide the individual domain into real and virtual components or simply let empty terms denote nothing at all.
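The construction in note 33 can likewise be spot-checked for small inner domains. The following Python sketch is illustrative only; `axiom_v_holds` and the tagged tuples used for virtual individuals are assumptions made for the example. It builds an outer domain with one virtual individual per subset of I, takes B to be the resulting one:one map from the power set of I into D, and tests every instance of Axiom V with ∀z ranging over I alone.

```python
from itertools import chain, combinations


def subsets(items):
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def axiom_v_holds(inner):
    inner = tuple(inner)
    real_parts = subsets(inner)
    # One fresh virtual individual per subset of I, so |O| = 2^|I|.
    outer = tuple(("virtual", i) for i in range(len(real_parts)))
    domain = inner + outer
    # B: one:one from the power set of I into D (here, into O).
    B = dict(zip(real_parts, outer))
    inner_set = frozenset(inner)

    def ext(P):
        # The extension term denotes B(real subextension of P).
        return B[P & inner_set]

    properties = subsets(domain)
    return all(
        (ext(F) == ext(G)) == all((z in F) == (z in G) for z in inner)
        for F in properties for G in properties)


print(axiom_v_holds(("a", "b")))   # True
print(axiom_v_holds(()))           # True: empty inner domain
```

In both runs every extension term denotes a virtual individual, which matches the observation in the main text that satisfying Axiom V in this framework does not yield a proof that any set exists.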
We can also lay down in the semantics that individual parameters can only be assigned real referents in I. 36 This ensures the soundness of standard ∀I and ∃E and also the logical truth of a ≅ a, where a is any parameter (though t ≅ t is not true in general). Since the ≅ version of Leibniz’ law is also sound in this semantics (the premiss t ≅ u can only be true when both t and u are nonempty and stand for the same referent), reflexivity, symmetry and transitivity of identity still hold in the generalised form above (i.e. ∀x x ≅ x holds, even though t ≅ t may fail). So we still have a fairly standard-looking theory of identity. The neo-logicist can then argue that the mere introduction of numerical and class terms alone does not prove that numbers and classes exist so that there is no question-begging assumption from the outset that “cheap” proofs of existence theorems are available.

However, it is still true, in this framework, that the identity relation is a one:one map from non-self-identity onto itself, reading identity as “wide” non-existential identity, i.e. (using λ abstracts for clarity here) (λx(x ≠ x) 1−1 λx(x ≠ x)) holds by dint of the relation λx(λy(x = y)). More fully this is:

    ∀x((x ≠ x → ∃!y(x = y & y ≠ y)) & (x ≠ x → ∃!y(y = x & y ≠ y))).

Indeed we also have (λx∼(x ≅ x) 1−1* λx∼(x ≅ x)), where the * indicates one:one functions expressed in terms of ≅ rather than =. For although we do not have t ≅ t, for arbitrary t, we do have a ≅ a, where a is a parameter; hence we have both

    a ≠ a → ∃!y(a = y & y ≠ y)

and

    ∼(a ≅ a) → ∃!*y(a ≅ y & ∼(y ≅ y)) 37

and in general can prove anything from a ≠ a (e.g. ∃!*y(a ≅ y & ∼(y ≅ y))) and anything from ∼(a ≅ a) (e.g. ∃!y(a = y & y ≠ y)) by ex contradictione quodlibet. Thus (λx∼(x ≅ x) 1−1* λx∼(x ≅ x)) also holds by dint of the relation λx(λy(x ≅ y)).

Now that we have two different notions of identity we have a number of different versions of Hume’s Principle, for instance HP*:

    ∀F∀G(nxFx ≅ nxGx ↔ ∃R∀x((Fx → ∃!*y(Rxy & Gy)) & (Gx → ∃!*y(Ryx & Fy))))

where ∃!*y(Rxy & Gy) is ∃y((Rxy & Gy) & ∀z((Rxz & Gz) → z ≅ y)).

36 In models with empty I, atomic formulae containing parameters are all to be interpreted as true since there are no admissible assignments under which they are false.
37 With ∃!*yϕy abbreviating ∃y(ϕy & ∀x(ϕx → x ≅ y)).
Our original HP but with = interpreted in the wide non-existential fashion in our free logic setting is, for comparison: 38

    ∀F∀G(nxFx = nxGx ↔ ∃R∀x((Fx → ∃!y(Rxy & Gy)) & (Gx → ∃!y(Ryx & Fy))))

∃!y(Rxy & Gy) being ∃y((Rxy & Gy) & ∀z((Rxz & Gz) → z = y)). Focusing on HP*, we can prove by instantiating F and G by λx(∼(x ≅ x)) (and using the Substitutivity Rule) the following:

    nx∼(x ≅ x) ≅ nx∼(x ≅ x) ↔ ∃R∀x((∼(x ≅ x) → ∃!*y(Rxy & ∼(y ≅ y))) & (∼(x ≅ x) → ∃!*y(Ryx & ∼(y ≅ y)))).

Since the right-hand side of the biconditional is, as we have seen, provable, we can deduce (nx∼(x ≅ x) ≅ nx∼(x ≅ x)) (i.e. 0 ≅ 0 with zero defined using ≅) and so prove ∃x(x ≅ nx∼(x ≅ x)) using our latest version of free ∃I where E(t) =df. t ≅ t. And from here the Fregean proof of the infinity of the natural numbers can proceed as before. So HP* amounts to an axiom of infinity even in the context of free logic with existential identity.

However, this is another place at which we must depart from our policy of not contesting the neo-logicist claim that abstraction principles such as Hume’s Principle are epistemically innocent. For the plea of innocence is more plausible with respect to HP than HP*. One key notion which neo-logicists have used to argue for the innocence of Hume’s Principle and against the idea that they beg any questions in teasing out ontological commitments from such abstraction principles is the idea of “reconceptualisation”. Taking as their text Grundlagen §64, the neo-logicists do not claim that in general one can generate objects from concepts; rather one shows how to “reconceptualise” some thoughts so as to generate new concepts which carve up the “state of affairs” represented in the thoughts in a different way (as involving an identity between directions rather than a parallelism between lines, for instance).
More generally, the neo-logicist argues that one can “recarve” the concepts involved in thinking of a state of affairs in which an equivalence relation obtains in such a way as to generate new concepts, pertaining to abstracts. Moreover there is to be, using Frege’s directions example: 39

    absolutely no gap between the existence of directions and the instantiation of properties and relations among lines.

38 And of course there are a number of other versions of Hume’s Principle using permutations of = and ≅; the further variants produce no philosophically relevant new cases.
39 C. Wright, “On the philosophical significance of Frege’s theorem”, Language, thought, and logic, edited by Richard Heck, Jr., Oxford, Oxford University Press, 201–244.
But what Wright must mean here is that there is absolutely no epistemic gap between the existence of directions (or the truth of identities between abstracts) and the instantiation of properties and relations among the original domain of objects. Reconceptualisation cannot do the job it is intended to do unless in addition to referring to a process of conceptual innovation or creation it also carries an epistemic punch. For suppose numbers do exist of necessity. Then “0 exists” and “0 ≅ 0” will be semantically equivalent to (various readings of) “there is a one:one map from non-self-identity onto itself”. Indeed “0 exists” will be semantically equivalent to P → P, for any P. But the neo-logicist needs more than this, if neo-logicism is to answer the epistemological worries usually directed against platonism. The neo-logicist needs there to be an epistemic equivalence so that we can know that 0 ≅ 0, know the truth of the left-hand side of the relevant instance of HP*, on the basis of our knowledge of the truth of the right-hand side (though we do not know it on the basis of our knowledge that P → P). So to say that we “reconceptualise” some “state of affairs” described in terms of relations among properties into one involving objects such as numbers is just to assert that we can know the objects exist on the basis of our knowledge of the relations among properties. It is not to show how we can know this.

In the case of wide identities such as 0 = 0, the epistemic gap may indeed be small. But x = x is not, of course, an existence predicate and in knowing the truth of 0 = 0 we are not coming to know anything about the existence of objects. The identity t ≅ t, on the other hand, has the sense or informational content of [t exists] or a sense very close to it: competent speakers, after all, are to use such identities as the existential premisses of the ∃I and ∀E rules.
So in this case we are being asked to accept that the state of affairs consisting of there being a one:one mapping of the existential identity relation λx(x ≅ x) onto itself can be reconceptualised into the thought [nx(∼ x ≅ x) ≅ nx(∼ x ≅ x)] in such a way that there is not even the slightest epistemic gap between the latter proposition, which is tantamount to “the number zero exists”, and the former, which is held to be a (second-order) logical truth. But this is to ask us to accept right at the outset that some epistemically innocent truths 40 are equivalent to existence claims. This is precisely the point on which we remain to be convinced: it is no clearer how we can know zero exists on the basis of our knowledge of the one:one map on non-self-identity than we can know that it exists on the basis of our knowledge that if the moon is made of green cheese then it is made of green cheese.

40 Let us for the sake of argument grant epistemic innocence to (λx∼(x ≅ x) 1−1* λx∼(x ≅ x)) though it is far from innocent on the aristotelian understanding of second-order logic.

It may seem that there is a closer link between “0 ≅ 0” and “there is a one:one map from non-self-identity onto itself” than there is between “0 ≅ 0” and P → P. The existence of zero is an “ingredient” of the state of affairs of the one:one map on the property, and not of the conditional state of affairs, if there is such a thing. Stripping away the metaphor, the claim of an especially close epistemic link between the two sides of the biconditional double instantiation of Hume’s Principle surely rests on the idea that the principle is analytic or in some sense constitutive of our notion of number. While this may be plausible for HP, applied to HP* it entails the claim that there is a purely conceptual link between a truth of (standard) second-order logic and an existence claim, i.e. it entails that one can get existence out of meanings alone, the very thing which has to be demonstrated, not taken as a premiss. Far from there being no epistemic gap between the left and right sides of instances of HP*, there is a large chasm bridged, according to the neo-logicist, by “reconceptualisation”. But to anyone not already convinced that meaning-constitutive principles can generate mind-independent existence claims, this is a bridge too far.
6. Overall we have seen that it is not enough for the neo-logicist to show that some second-order principles are epistemically innocent: if the full neo-logicist programme is to be successfully accomplished, neo-logicism requires the full axiom scheme of Comprehension which, we argued, embodies substantive non-innocent ontological commitments (at least if the semantics for the pure mathematical sector is taken to be homogeneous with a broadly realist semantics for the non-mathematical sector). Moreover if the neo-logicist assumes the innocence of standard non-free first-order logic then he or she begs the question against opponents of neo-logicism. If not, then if identity does not have existential import, Frege’s Theorem fails, whereas if it does have existential import, then Frege’s Theorem holds but the interpretation of the required abstraction principles, such as HP*, will beg the question in much the same way. Our conclusion, therefore, is that the neo-logicist has no non-question-begging account of how there could be an epistemically innocent route to the demonstration of platonistically construed mathematical existence claims. 41
41 Thanks to the participants at the Abstraction Day conference, St. Andrews, Scotland, 14th November 1998 for discussion and to Michelle Friend for comments on an earlier draft.
ARISTOTELIAN LOGIC, AXIOMS, AND ABSTRACTION 1
Roy T. Cook

1. Introduction
Neo-logicism is the view that various branches of mathematics can be reformulated in terms of abstraction principles that we can stipulate, and thus come to know the truth of, a priori. The main success story of neo-logicism so far is the derivation of arithmetic from Hume’s Principle:

    HP: (∀P)(∀Q)[Num(P) = Num(Q) ↔ P ≈ Q]

where P ≈ Q is the second-order formula asserting that there is a one-to-one correspondence between the P’s and the Q’s. Recently Stewart Shapiro and Alan Weir have criticized this view, arguing in ‘Neo-logicist logic is not epistemically innocent’ [2000] that abstraction principles do not provide us with a priori access to the objects necessary for mathematics:

    Frege’s theorem requires use of first- and second-order logical principles which are not epistemically innocent. More exactly, certain of the logical principles which are essential to the derivation of a theorem of infinity, when this is construed as expressing the existence of infinitely many mind-independent entities, are at least as problematic epistemologically as axioms of infinity laid down simply as postulates. Our supposed knowledge of these principles is, we will argue, every bit as mysterious as Kantian intuition of an infinity of numbers. (p. 162)

Shapiro and Weir suggest that someone skeptical of the strength of full second-order logic could accept that:

    . . . it is a logical truth that to every . . . sentence which is instantiated by something or other, there corresponds a co-extensional property. But she refuses to accept that logic alone tells us that there are uninstantiated properties, so refuses to conclude that to predicates such as x ≠ x there corresponds a property. (p. 165)

1 This paper first appeared in Philosophia Mathematica 11, [2003], pp. 195–202. Reprinted by kind permission of the editor and Oxford University Press.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 147–153. © 2007 Springer.
To avoid such problematic assumptions, Shapiro and Weir suggest that the neo-logicist abandon standard second-order logic with the full comprehension schema and replace it with the following more restricted version of comprehension, guaranteeing the existence of a property for every instantiated predicate (Φ any (possibly complex) predicate):

    AristComp: (∃x1, x2, . . . , xn)(Φ(x1, x2, . . . , xn)) → (∃R)(∀x1, x2, . . . , xn)(R(x1, x2, . . . , xn) ↔ Φ(x1, x2, . . . , xn)).

Shapiro and Weir call the resulting logic Aristotelian. For the neo-logicist, the important difference between Aristotelian logic and standard second-order logic is that one cannot derive the existence of infinitely many numbers from HP in Aristotelian logic:

    . . . the Fregean proof of infinity fails . . . because we cannot prove that there are n + 1 numbers less than or equal to n; for all the Aristotelian knows, zero happens to be identical to some number S^k 0 between zero and n. As we have seen, it is possible for 0 = 1 to hold in a model in the Aristotelian framework in which case the number of numbers less than or equal to one is just one. (p. 168)
Shapiro and Weir’s result generalizes to any abstraction principle as long as the equivalence relation on the right-hand side of the biconditional is formulated in purely logical terminology: any such neo-logicist abstraction principle (including Frege’s notorious Basic Law V) will have a one-element model. Given an abstraction principle AP:

    AP: (∀P)(∀Q)[@(P) = @(Q) ↔ E(P, Q)]

we can construct an Aristotelian model of AP by letting the domain consist of a single object, call it a. @ is then the function that maps any property or predicate onto a. Since there is only one property, {a}, AP is trivially satisfied. Thus, on an Aristotelian conception of logic, every second-order abstraction principle is consistent, but no second-order abstraction principle will imply that there are infinitely many objects. Shapiro and Weir conclude from all this that:

    . . . the neo-logicist has no non-question-begging account of how there could be an epistemically innocent route to the demonstration of platonistically construed mathematical existence claims. (p. 188)
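The one-element model can be made concrete with a few lines of Python. This is a minimal sketch under stated assumptions: the function names (`satisfies_ap`, `constant_a`, and so on) are invented for the illustration, Aristotelian properties are modelled as the non-empty subsets of the domain, and the equivalence relation of an abstraction principle is passed in as an ordinary Python predicate on pairs of sets.

```python
from itertools import combinations


def subsets(domain, include_empty):
    items = list(domain)
    start = 0 if include_empty else 1
    return [frozenset(c) for r in range(start, len(items) + 1)
            for c in combinations(items, r)]


def satisfies_ap(domain, abstraction, equivalence, aristotelian=True):
    # Aristotelian reading: second-order variables range over
    # instantiated (non-empty) properties only.
    props = subsets(domain, include_empty=not aristotelian)
    return all((abstraction(P) == abstraction(Q)) == equivalence(P, Q)
               for P in props for Q in props)


def constant_a(P):
    # The operator @ maps every property to the single object a.
    return "a"


def equinumerous(P, Q):
    # Right-hand side of HP, for finite sets.
    return len(P) == len(Q)


def coextensive(P, Q):
    # Right-hand side of Basic Law V.
    return P == Q


domain = ["a"]
print(satisfies_ap(domain, constant_a, equinumerous))   # True
print(satisfies_ap(domain, constant_a, coextensive))    # True
# With the empty property admitted, the same interpretation fails.
print(satisfies_ap(domain, constant_a, equinumerous, aristotelian=False))  # False
print(satisfies_ap(domain, constant_a, coextensive, aristotelian=False))   # False
```

On a one-element domain the constant map is the only available interpretation of @, so the last two lines reflect the fact that, once the empty property is admitted, neither principle has a one-element model.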
Just because the neo-logicist project does not meet one of its goals (or even perhaps its most important goal) does not mean that it meets none of its goals, however. Even on the Aristotelian conception of logic, neo-logicism does provide us with a great deal, as the following two case studies demonstrate.
2. Case study 1: Hume’s Principle
The first step in deriving second-order arithmetic from HP is the definition of the finite (or ‘natural’) numbers, the constant ‘0’, the binary successor relation ‘S’, and the ternary relations addition ‘+’ and multiplication ‘×’. In standard second-order logic HP implies that the definitions of S, + , and × define total functions, i.e. given these definitions, all of the Peano axioms for arithmetic follow from HP (for details see Wright [1983]). On the Aristotelian conception of logic we do not get all of this, but even so HP implies a significant chunk of Peano arithmetic. The successor and addition relations need not be total, but we can prove that they are partial functions. Of the seven Peano axioms, four are not implied by HP trivially since they contain ‘0’, the name of the number of the empty property, which does not (or might not) exist on the Aristotelian conception of logic. 2 The other three do follow. Of course, just because 0 does not exist does not mean that no numbers exist. We can define the number 1 as: 3 1 =df Num(x is a finite number ∧ (∀y)(S(x, y) ↔ + (x, x, y))) Assuming that we do not countenance empty models, 4 the existence and uniqueness of 1 follow from HP in Aristotelian logic. As a result we can formulate alternatives to each of the problematic Peano axioms in terms of ‘1’. For example, the problematic successor axiom can be expressed as: 5 (∃x)(x = 1) ∧ (∃x)(∀y)(¬(Sy, x)) Along similar lines, we can replace the axioms for addition, multiplication, and induction containing ‘0’ with: (∀x)(∀y)(S(x, y) → + (y, 1, x)) (∀x)(×(x, 1, x)) (∀P)[((∀w)((∀z)(¬S(z, w)) → P(w))) ∧ (∀x)(∀y)(S(x, y) → (P(x) → P(y)))) → (∀x)P(x)] Each of these follows from HP on the Aristotelian conception of logic. Interestingly, on the Aristotelian picture of logic, although we cannot prove that there are infinitely many numbers, we do get a substantial description of the behavior of the numbers that do exist. 
There are two main differences between the consequences of HP in standard second-order logic and its consequences on the Aristotelian conception. First, the relations of successor and addition can turn out to be partial functions on Aristotelian logic. Second, we were forced to reformulate some of the axioms so that they did not rely explicitly on the existence (or, more accurately, on the good behavior) of 0. The axioms we obtain in Aristotelian second-order logic, however, are relatively natural⁶ axioms for the non-zero natural numbers, and are defective only in the sense that we are not assured that successor, addition, and multiplication are total functions.

2 Actually, as Shapiro and Weir point out, 0 is in a sense guaranteed to exist, since ‘Num(x ≠ x)’ is guaranteed to have a referent. If there is no property P such that (Px ↔ x ≠ x), however, then HP does not apply to 0. As a result, 0 could be identical to any object, including any other number. Thus, the real problem here is that, in Aristotelian logic, 0 may be very badly behaved.
3 In this formula and those below the quantifiers should be understood to be restricted to finite numbers.
4 In later sections of their paper Shapiro and Weir challenge this very assumption, examining the prospects for neo-logicist arithmetic in a free logic. Here, however, our concern is with Aristotelian logic and its motivations, such as Boolos’ plural reading of second-order quantification. As a result, assuming that the domain is non-empty seems both unproblematic and, since there might be no property to pick out the supposed empty domain, entirely natural.
5 Note that the number without a predecessor need not be identical to 1.
3. Case study 2: NewV
A second case study is useful to convince us that the phenomenon in question is quite general. In this section we will investigate the neo-logicist treatment of set theory from the perspective of Aristotelian logic, based on Boolos’ [1989] NewV:⁷

$$\text{NewV: } (\forall P)(\forall Q)[\mathrm{Ext}(P) = \mathrm{Ext}(Q) \leftrightarrow ((P \text{ is ‘Big’} \land Q \text{ is ‘Big’}) \lor (\forall x)(Px \leftrightarrow Qx))]$$

where ‘P is ‘Big’’ is an abbreviation for the second-order formula asserting that the P’s are equinumerous with the entire domain. In standard second-order logic, with ‘is a set’ and ‘∈’ defined in the standard way (see Boolos [1989] for details), NewV entails the extensionality, empty set, pairing, union, separation, and replacement axioms, but not the powerset axiom or the axiom of infinity. On the Aristotelian account of second-order logical consequence, however, NewV implies extensionality, pairing, union, and replacement, but fails to imply empty set or separation. The fact that the empty-set axiom does not follow on the Aristotelian conception of logic is unsurprising, since the empty-set axiom is equivalent to the claim that there is a property that is not ‘Big’ and has no instances. The failure of the axiom of separation seems more surprising until one realizes that it (plus the claim that some set exists) implies the empty-set axiom. A revised version of separation that does not imply the existence of an empty set does follow from NewV in Aristotelian logic:

$$\text{Arist.Separation: } (\forall P)(\forall x)((x \text{ is a set} \land (\exists w)(w \in x \land Pw)) \to (\exists y)(y \text{ is a set} \land (\forall z)(z \in y \leftrightarrow (z \in x \land Pz))))$$

6 Historically, of course, 0 was not accepted as a legitimate number by many groups that were otherwise mathematically quite sophisticated, including the Greeks. In addition, both Peano and Dedekind formulated their original arithmetical axioms with 1 as the initial number. Thus, there is some reason for thinking that the axioms that do follow from HP on Aristotelian logic capture the (or an) intuitive conception of the natural numbers. Thanks are owed to Fraser MacBride for pointing this out.
7 I am ignoring the well-documented problems with NewV (see Shapiro and Weir [1999] and Boolos [1989]). Even if NewV is an inadequate foundation for a neo-logicist theory of sets, it nevertheless provides us with another nice example of how abstraction principles behave in Aristotelian contexts.
Thus, with suitable reformulations, the only axiom that follows from NewV on the standard conception of logic but not on the Aristotelian conception is the empty-set axiom. Again, we see a division between existential principles such as the empty-set axiom and principles that just govern the behavior of and interactions between sets without implying the existence of any particular sets. On the Aristotelian conception of logic NewV does not guarantee that any sets exist (since it has a one-element model), but it does guarantee that any sets that do exist behave exactly as we would expect them to behave, i.e. they satisfy (many of) the standard axioms of set theory.
4. What does neo-Fregean abstraction actually give us?
There are two initial reactions that one might have to these results. The first, optimistic, reading was suggested at the beginning of this essay. Even if the Aristotelian challenge is correct, the neo-logicist can retrench, arguing that he has still given us something useful. First, even in the Aristotelian context, HP and NewV provide us with enough for many of the basic applications of arithmetic and set theory. In Aristotelian logic HP plus the claim that there are n distinct objects implies that there are (at least) n distinct numbers, and HP plus the claim that there is a non-numerical object implies that there are infinitely many finite numbers.⁸ Similarly, NewV plus the claim that there are two distinct objects implies that there are infinitely many sets. Thus, adding HP or NewV to a suitably robust physical theory (i.e. one that contained one or more of these additional claims) would surely allow us to carry out much of the mathematics necessary for science. Second, and perhaps more importantly, abstraction principles such as HP and NewV imply strong constraints on the behavior of the concepts that they purport to define, even if they do not (on their own) imply the existence of any (or many) of the objects supposedly falling under the scope of these concepts. If the neo-logicists can still defend the claim that Hume’s Principle in some sense defines the concept of cardinal number (or NewV defines the concept of set), independent of any ontological implications, then they will have provided the philosophy of mathematics with something valuable. Replacing a collection of axioms haphazardly compiled over decades or centuries with a single principle that tells us exactly what a mathematical concept means and how the objects falling under it must behave is certainly a step in the right direction. Shapiro and Weir’s objections do not in any way affect this part of the neo-logicist project, since it is perfectly conceivable that we could provide a suitable account of the meaning of a concept without thereby judging one way or another what objects fall under this definition, if in fact any do.

8 None of this is of help to the neo-logicist, however, unless ‘there are n objects’ or ‘there is a non-numerical object’ is knowable a priori.
It is this last point, however, that brings us to the pessimistic reading. On this interpretation, neo-logicism was doomed from the start, since we should not expect any definition of a concept, even an implicit one, to imply the existence of infinitely many objects. George Boolos has expressed something like this worry:

Despite the Gödel incompleteness theorems and Russell’s protestations that the axiom of infinity was no logical truth, it was a central tenet of logical positivism that the truths of arithmetic were analytic. Positivism was dead by 1960 and the more traditional view, that analytic truths cannot entail the existence of either particular objects or of too many objects, has held sway since . . . it should be asked how a statement that cannot hold if there are only finitely many objects can possibly be thought to be analytic, a matter of meanings or ‘conceptual containment’. ([1997], pp. 249–250)

According to this line of thinking, what we should expect from our definitions, at best, is constraints on when the concept being defined is applicable. Thus, there must have been something wrong with the original formulation of neo-logicism, and the results of Shapiro and Weir’s paper (and of the present essay) finally show us exactly what that flaw was. The pessimistic reading, however, seems a bit harsh. If we accept from the beginning the idea that logic and definitions cannot have existential consequences, then neo-logicism is a non-starter. What the pessimist gets right, however, is the point that, if neo-logicism is to be successful in explaining how we come to know of the existence of infinitely many mathematical objects from definitions and logic alone, then the neo-logicists owe us an explanation of the role of logic and, more specifically, a justification of their particular choice of logic. The results of Shapiro and Weir demonstrate that the choice of logic is more crucial than one might initially think. Neo-logicism does not follow from views about the nature of definition and stipulation alone; it follows from these views together with a view about what the correct account of logic is. Wright and Hale have explicitly pointed out the importance logic has to play in the neo-logicist project and the relative lack of attention it has received:

The logicist theory about a particular mathematical theory is that its fundamental laws are obtainable on the basis just of definitions and logic. It would at the time of writing be a justifiable complaint that while much attention has been paid by neo-Fregeans, and their critics, to the first component in the recipe – issues to do with abstractions in general and Hume’s Principle in particular – comparatively little has been given to the second component: the demands, technical and philosophical, to be made on the logical system which is to provide the medium for the proofs the neo-Fregeans need. ([2001], p. 429, emphasis added)
Shapiro and Weir’s paper represents an important first step in fleshing out the second of these issues, the requirements on the logic of neo-logicism, and the present discussion has, it is hoped, helped to sharpen their insights even further. While they are right to point out that:

. . . the neo-logicist has no non-question-begging account of how there could be an epistemically innocent route to the demonstration of platonistically construed mathematical existence claims ([2000], p. 188, emphasis added)⁹

it would be premature to conclude that no such non-question-begging account is possible. Rather, their arguments serve to point out that work remains to be done, and the direction that this work needs to take.¹⁰
References

Boolos, G. [1989], “Iteration Again”, Philosophical Topics 17: 5–21, reprinted in Boolos [1998], pp. 88–104.
Boolos, G. [1997], “Is Hume’s Principle Analytic?”, in Heck [1997]: 245–261, reprinted in Boolos [1998], pp. 301–314.
Boolos, G. [1998], Logic, Logic, and Logic, Cambridge, Mass., Harvard University Press.
Heck, R. [1997], Language, Thought, and Logic, Oxford, Clarendon Press.
Shapiro, S. and A. Weir [1999], “NewV, ZF, and Abstraction”, Philosophia Mathematica 7: 293–321.
Shapiro, S. and A. Weir [2000], “Neo-logicist Logic Is Not Epistemically Innocent”, Philosophia Mathematica 8: 160–189.
Wright, C. [1983], Frege’s Conception of Numbers as Objects, Scots Philosophical Monographs, vol. 2, Aberdeen, Aberdeen University Press.
Wright, C. and R. Hale [2001], The Reason’s Proper Study, Oxford, Oxford University Press.
9 Actually, they are not quite correct here, since, even in the Aristotelian context, HP implies ‘(∃x)(x = 1)’, which is, one would think, a ‘platonistically construed mathematical existence claim’.
10 A version of this note was presented to members of Arché: The Centre for the Philosophy of Logic, Language, Mathematics, and Mind at the University of St Andrews and benefited considerably from the resulting discussion. Thanks are also owed to Peter Clark, Philip Ebert, Fraser MacBride, Graham Priest, Agustín Rayo, Stewart Shapiro, Crispin Wright, and an anonymous referee for helpful comments and criticism.
FREGE’S UNOFFICIAL ARITHMETIC¹
A. Rayo
In The Foundations of Arithmetic and The Basic Laws of Arithmetic, Frege held the view that number-terms refer to objects.² Later in his life, however, he seems to have been open to other possibilities:

Since a statement of number based on counting contains an assertion about a concept, in a logically perfect language a sentence used to make such a statement must contain two parts, first a sign for the concept about which the statement is made, and secondly a sign for a second-order concept. These second-order concepts form a series and there is a rule in accordance with which, if one of these concepts is given, we can specify the next. But still we do not have in them the numbers of arithmetic; we do not have objects, but concepts. How can we get from these concepts to the numbers of arithmetic in a way that cannot be faulted? Or are there simply no numbers in arithmetic? Could the numbers help to form signs for these second-order concepts, and yet not be signs in their own right?³
To illustrate Frege’s point, let us consider the number–statement ‘there are three cats’. It might be paraphrased in a first-order language as:⁴

$$(\exists_3 x)[\mathrm{CAT}(x)]. \tag{1}$$

If its logical form is to be taken at face value, (1) can be divided into two main logical components: first, the predicate ‘CAT(. . . )’, which for Frege refers to the (first-order) concept cat; and, second, the quantifier-expression ‘$(\exists_3 x)[\ldots(x)]$’, which for Frege refers to a second-order concept (specifically, the second-order concept which is true of the first-order concepts under which precisely three objects fall).⁵ Significantly, Frege would regard neither of these components as referring to an object.

1 This paper first appeared in Journal of Symbolic Logic 67, [2002], pp. 1623–1638. Reprinted by kind permission of the editor and the Association for Symbolic Logic.
2 This is reflected in his definition of number. See, for instance, Frege (1884) §67.
3 Notes for Ludwig Darmstaedter, pp. 366–7. I have substituted ‘second-order’ for ‘second-level’.
4 As usual, ‘$(\exists_1 x)[\varphi(x)]$’ is defined as ‘$\exists x(\varphi(x) \land \forall y(\varphi(y) \to x = y))$’, and (for n > 1) ‘$(\exists_n x)[\varphi(x)]$’ is defined as ‘$\exists x(\varphi(x) \land (\exists_{n-1} y)[\varphi(y) \land y \neq x])$’.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 155–171. © 2007 Springer.

Let us now consider a close cousin of ‘there are three cats’, namely, ‘the number of the cats is three’. This sentence might be paraphrased as:

the number of the cats = 3. (2)
If its logical form is to be taken at face value, (2) cannot be divided into a predicate and a quantifier-expression, like (1). Instead, Frege would take ‘the number of the cats’ and ‘3’ to be names, referring to numbers (which he regarded as objects). Frege saw a deep connection between sentences like (1), in which something is predicated of a concept, and sentences like (2), in which something is predicated of the number associated with that concept. An effort to account for this connection was a main theme in his philosophy of arithmetic. But, after the discovery that Basic Law V leads to inconsistency, he found much reason for dissatisfaction with his original proposal. As evidenced by the quoted passage, he no longer felt confident about the possibility of getting from concepts to their numbers ‘in a way that cannot be faulted’.

Towards the end of the passage, Frege considers an alternative: the view that there really are no numbers in arithmetic, and that, appearances to the contrary, numerals are not names of objects. They do not even instantiate a legitimate logical category; they are merely orthographic components of expressions standing for second-order concepts. The grammatical form of a sentence like (2) is therefore not indicative of its logical form. Presumably, ‘the number of the cats = 3’ is to be divided into two main logical components. First, the expression ‘. . . cats’, which refers to the (first-order) concept cat; and, second, the expression ‘the number of the . . . = 3’, which refers to a second-order concept (specifically, the second-order concept which is true of the first-order concepts under which precisely three objects fall). The numeral ‘3’ is merely an orthographic component of ‘the number of the . . . = 3’, in much the same way that ‘cat’ is an orthographic component of ‘caterpillar’. The outermost logical form of (2) is therefore identical to that of (1). If, in addition, it turns out that the logical form of ‘the number of the . . . = 3’ corresponds to that of ‘$(\exists_3 x)[\ldots(x)]$’, then the logical form of (1) is identical to that of (2).

It is unfortunate that Frege never spelled out his unofficial proposal (as we shall call it) in any detail. In particular, he said nothing about how first-order arithmetic might be understood. Luckily, Harold Hodes has developed and defended a version of the Unofficial Proposal.⁶ On Hodes’s reconstruction, a sentence ‘F(n)’ of the language of first-order arithmetic is to be regarded as abbreviating a higher-order sentence ‘$(FX)((\exists_n x)[Xx])$’, where ‘$(\exists_n x)[\ldots x]$’ refers to a second-order concept, and ‘$(FX)(\ldots X \ldots)$’ refers to a third-order concept. For instance, the first-order sentence ‘PRIME(19)’ abbreviates a certain higher-order sentence ‘$(\mathrm{Prime}X)((\exists_{19} x)[Xx])$’. On Hodes’s version of the Unofficial Proposal, quantified sentences involve quantification over second-order concepts. More specifically, they involve quantification over finite cardinality object-quantifiers: the referents of quantifier-expressions of the form ‘$(\exists_n x)[\ldots x]$’.⁷ Thus, the first-order ‘$\exists z\,\mathrm{PRIME}(z)$’ would abbreviate the result of replacing the position occupied by ‘$(\exists_{19} x)[\ldots x]$’ in ‘$(\mathrm{Prime}X)((\exists_{19} x)[Xx])$’ by a variable ranging over finite cardinality object-quantifiers, and binding the new variable with an initial existential quantifier.

Hodes’s account of first-order arithmetic therefore requires third-order quantification. And the obvious extension to nth-order arithmetic (for n ≥ 2) would call for (n + 2)th-order quantification. Such logical resources are increasingly problematic.⁸ Here we shall see that more modest resources will do. We will develop a version of the Unofficial Proposal within a second-order language, and show that it can be used to account for nth-order arithmetic (for any finite n). This, in itself, is a surprising result. But it is especially important in light of the fact that, although the use of higher-order languages is often considered problematic, recent work has done much to assuage concerns about certain second-order resources.⁹ We will also see that the Unofficial Proposal has important applications in the philosophy of mathematics.

5 For Frege, a first-order concept is a concept that takes objects as arguments, and an (n + 1)th-order concept is a concept that takes nth-order concepts as arguments. See Frege (1893/1903), §21. Unless otherwise noted, we shall use ‘concept’ to mean ‘first-order concept’.
6 See Hodes (1984). See also Wright (1983) pp. 36–40 and Bostock (1979), volume II chapter 1.
1. Transformation
We will see that there is a general method for ‘nominalizing’ arithmetical formulas as second-order formulas containing no mathematical vocabulary. As an example, consider ‘The number of the cats is the number of the dogs’. This sentence might be nominalized as ‘The cats are just as many as the dogs’, or:

$$\hat{x}[\mathrm{CAT}(x)] \approx \hat{x}[\mathrm{DOG}(x)],$$¹⁰

where ‘≈’ expresses one–one correspondence.¹¹

7 See Hodes (1990) §3.
8 Hodes (1990), observation 5, offers a nominalization of second-order arithmetic which does not exceed the resources of second-order logic. But it proceeds by encoding Ramsey sentences, and is therefore not a version of Frege’s Unofficial Proposal.
9 See Boolos (1984, 1985a, 1985b), McGee (2000), and Rayo and Yablo (2001).
10 Syntactically, an expression of the form ‘$\hat{x}[\varphi(x)]$’ takes the place of a monadic second-order variable. But the result of substituting ‘$\hat{x}[\varphi(x)]$’ for ‘Y’ in a formula ‘$\psi(Y)$’ is to be understood as shorthand for: $\forall W(\forall x(Wx \leftrightarrow \varphi(x)) \to \psi(W))$.
11 That is, ‘$X \approx Y$’ abbreviates $\exists R[\forall w(Xw \to \exists! v(Yv \land Rwv)) \land \forall w(Yw \to \exists! v(Xv \land Rvw))]$.
Consider now the sentence ‘the number of the cats is 3’. It can be nominalized as:

$$3_f(\hat{x}[\mathrm{CAT}(x)]);$$

where numeral-predicates are defined in the obvious way:

• $0_f(X) \equiv_{df} \forall v\,\neg X(v)$;
• $1_f(X) \equiv_{df} \exists W \exists v(0_f(W) \land \neg W(v) \land \forall w(X(w) \leftrightarrow (W(w) \lor w = v)))$;
• $2_f(X) \equiv_{df} \exists W \exists v(1_f(W) \land \neg W(v) \land \forall w(X(w) \leftrightarrow (W(w) \lor w = v)))$;
• etc.

This sort of nominalization can easily be generalized. In order to do so, we work within a two-sorted second-order language L containing the following variables: first-order arithmetical variables, ‘$m_1$’, ‘$m_2$’, . . . , monadic second-order arithmetical variables ‘$M_1$’, ‘$M_2$’, . . . , first-order general variables, ‘$x_1$’, ‘$x_2$’, . . . , and, for n a positive integer, n-place second-order general variables ‘$X^n_1$’, ‘$X^n_2$’, . . . .¹² We assume that L has been enriched with a single higher-level predicate ‘N’ taking a monadic second-order general variable in its first argument-place and a first-order arithmetical variable in its second argument-place.¹³ The well-formed formulas of L are defined in the usual way, with the proviso that an atomic formula can contain arithmetical variables only if it is of the form $m_i = m_j$, $M_i m_j$ or $N(X^1_i, m_j)$.¹⁴

12 As a precaution against variable clashes, we divide monadic second-order general variables in two: the $X^1_{2i}$ (which we abbreviate $Z_i$) will be paired with first-order arithmetical variables; the $X^1_{2i+1}$ (which we abbreviate $X_i$) will be used for more general purposes. Also to avoid variable clashes, we divide dyadic second-order general variables in two: the $X^2_{2i}$ (which we abbreviate $R_i$) will be paired with second-order arithmetical variables; the $X^2_{2i+1}$ (which we abbreviate $R^2_i$) will be used for more general purposes. Finally, we divide triadic second-order general variables in two: the $X^3_{2i}$ (which we abbreviate $S_i$) will be paired with third-order arithmetical variables; the $X^3_{2i+1}$ (which we abbreviate $R^3_i$) will be used for more general purposes. For n > 3, we use $R^n_i$ as a terminological variant of $X^n_i$. We will sometimes appeal to the introduction of unused variables. We employ ‘m’ as an unused first-order arithmetical variable, ‘w’, ‘v’, and ‘u’ as unused first-order general variables, ‘M’ as an unused second-order arithmetical variable, ‘W’, ‘V’, and ‘U’ as unused monadic second-order general variables, and, for each n > 1 (to be determined by context), we employ ‘R’ as an unused n-place second-order general variable. (It is worth noting that appeal to unused variables could be avoided by renumbering subscripts.) It will often be convenient to regard ‘x’, ‘y’, and ‘z’ as arbitrary first-order general variables and ‘X’, ‘Y’, and ‘Z’ as arbitrary (monadic) second-order general variables.
13 For a discussion of higher-order predicates see Rayo, A. “Word and Objects.” Noûs 36, 436–464 (2002).
14 Formally, the well-formed formulas of L can be characterized as follows: (a) $N(X^1_i, m_j)$ and $m_i = m_j$ are formulas; (b) for any n-place atomic predicate P other than ‘N’, $P(x_{i_1}, \ldots, x_{i_n})$ is a formula; (c) $M_i m_j$ and $X^n_i(x_{j_1}, \ldots, x_{j_n})$ are formulas; (d) if φ and ψ are formulas, then $\neg\varphi$, $(\varphi \land \psi)$, $\exists m_i\,\varphi$, $\exists M_i\,\varphi$, $\exists x_i\,\varphi$, and $\exists X^n_i\,\varphi$ are formulas; and (e) nothing else is a formula.

On the intended interpretation, arithmetical variables are taken to range over the natural numbers, and general variables are taken to have an unrestricted
range.¹⁵ In addition, ‘$N(X^1_i, m_j)$’ is true just in case the number of the $X^1_i$ is $m_j$. Consider ‘The number of the cats is three’ as an example. It can be formalized in L as:

$$\exists m_1(N(\hat{x}_1[\mathrm{CAT}(x_1)], m_1) \land 3(m_1)); \tag{3}$$

where, again, the number predicates are defined in the obvious way:

• $0(m) \equiv_{df} \exists W(0_f(W) \land N(W, m))$;
• $1(m) \equiv_{df} \exists W(1_f(W) \land N(W, m))$;
• $2(m) \equiv_{df} \exists W(2_f(W) \land N(W, m))$;
• etc.¹⁶

Arithmetical predicates such as ‘SUCCESSOR’, ‘SUM’ and ‘PRODUCT’ can easily be defined in terms of ‘N’ and purely logical vocabulary.¹⁷ So, without appealing to arithmetical primitives beyond ‘N’, the whole of pure and applied second-order arithmetic can be expressed within L. It will be convenient to introduce the following definitions, which are couched in purely logical vocabulary:

Definition 1: $F(X) \equiv_{df} \neg\exists W(\exists w(\neg Ww \land \forall v(Xv \leftrightarrow (Wv \lor v = w))) \land W \approx X)$ (there are at most finitely many Xs).

Definition 2: $\exists^f X\,\varphi(X) \equiv_{df} \exists X(F(X) \land \varphi(X))$.

15 More precisely, first-order arithmetical variables are taken to range over the natural numbers, and first-order general variables are taken to have an unrestricted range. The range of the second-order variables is to be characterized accordingly. For instance, on a Fregean interpretation of second-order quantification, second-order arithmetical variables are taken to range over first-order concepts under which natural numbers fall, and second-order general variables are taken to range over first-order concepts under which arbitrary objects fall.
16 We use number-predicates rather than numerals for the sake of simplicity, but it is worth noting that our nominalization could be carried out even if L was extended to contain numerals. To see this, note that, using standard techniques, any formula φ of the extended language can be transformed into an equivalent formula φ* of the original language in which numerals have been eliminated in favor of corresponding number-predicates (defined as above). One can then identify the nominalization of φ with that of φ*.
17 The definitions run as follows:
$\mathrm{SUCCESSOR}(m_i, m_j) \equiv_{df} \forall V \forall U[(N(V, m_i) \land N(U, m_j)) \to \exists u(Uu \land \hat{w}[Uw \land w \neq u] \approx V)]$;
$\mathrm{SUM}(m_i, m_j, m_k) \equiv_{df} \forall V \forall U \forall W[(N(V, m_i) \land N(U, m_j) \land N(W, m_k) \land \forall w(Vw \to \neg Uw)) \to \hat{w}[Vw \lor Uw] \approx W]$;
$\mathrm{PRODUCT}(m_i, m_j, m_k) \equiv_{df} \forall V \forall U \forall W[(N(V, m_i) \land N(U, m_j) \land N(W, m_k)) \to \exists R[\forall v \forall u((Vv \land Uu) \to \exists! w(Ww \land Rvuw)) \land \forall w(Ww \to \exists! v \exists! u(Vv \land Uu \land Rvuw))]]$.
Our nominalization method can now be generalized to encompass the whole of first-order arithmetic by way of the following transformation:¹⁸

• $\mathrm{Tr}(\exists m_i(\varphi)) = \exists^f Z_i(\mathrm{Tr}(\varphi))$;
• $\mathrm{Tr}(m_i = m_j) = Z_i \approx Z_j$;
• $\mathrm{Tr}(N(X_i, m_j)) = X_i \approx Z_j$.

Intuitively, the transformation works by replacing talk of the number of the Fs by talk of the Fs themselves. As an example, let us return to ‘the number of the cats is three’. It can be formalized in L as:

$$\exists m_1(N(\hat{x}_1[\mathrm{CAT}(x_1)], m_1) \land 3(m_1));$$

which Tr converts to:

$$\exists^f Z_1(\hat{x}_1[\mathrm{CAT}(x_1)] \approx Z_1 \land 3_f(Z_1));$$

or, equivalently:

$$3_f(\hat{x}_1[\mathrm{CAT}(x_1)]).$$

For further illustration, note that ‘the number of the cats is the number of the dogs’ can be formalized in L as:

$$\exists m_1[N(\hat{x}_1[\mathrm{CAT}(x_1)], m_1) \land N(\hat{x}_1[\mathrm{DOG}(x_1)], m_1)],$$

which Tr converts to:

$$\exists^f Z_1[\hat{x}_1[\mathrm{CAT}(x_1)] \approx Z_1 \land \hat{x}_1[\mathrm{DOG}(x_1)] \approx Z_1],$$

or, equivalently:

$$\hat{x}_1[\mathrm{CAT}(x_1)] \approx \hat{x}_1[\mathrm{DOG}(x_1)].$$

18 The remaining clauses are trivial: $\mathrm{Tr}(\neg\varphi) = \neg\mathrm{Tr}(\varphi)$; $\mathrm{Tr}(\varphi \land \psi) = (\mathrm{Tr}(\varphi) \land \mathrm{Tr}(\psi))$; $\mathrm{Tr}(\exists x_i(\varphi)) = \exists x_i(\mathrm{Tr}(\varphi))$; $\mathrm{Tr}(\exists X_i(\varphi)) = \exists X_i(\mathrm{Tr}(\varphi))$; $\mathrm{Tr}(X_i x_j) = X_i x_j$; $\mathrm{Tr}(\exists R^n_i(\varphi)) = \exists R^n_i(\mathrm{Tr}(\varphi))$; $\mathrm{Tr}(R^n_i(x_{j_1}, \ldots, x_{j_n})) = R^n_i(x_{j_1}, \ldots, x_{j_n})$; $\mathrm{Tr}(x_i = x_j) = x_i = x_j$; $\mathrm{Tr}(P^n_j(x_{i_1}, \ldots, x_{i_n})) = P^n_j(x_{i_1}, \ldots, x_{i_n})$.

It is worth emphasizing that mixed identity statements such as ‘$m_i = x_j$’ are not well-formed formulas of L, so our transformation has not been defined for them. Intuitively, this means that the transformation is undefined for sentences along the lines of ‘The number 2 is Julius Caesar’, which do not express internal properties of a mathematical structure. We call such sentences Caesar sentences. This is as it should be. The view that numbers are objects led Frege to the uncomfortable question of whether the number belonging to the concept cat is, for instance, Julius Caesar. But in the context of our nominalizations, such questions never arise, because number-terms do not refer to objects. ‘The number belonging to the concept cat is the number belonging to the concept dog’ is nominalized as ‘the objects falling under the concept cat are in one–one correspondence with the objects falling under the concept dog’, and ‘the number belonging to the concept cat is 3’ is nominalized as ‘there are three objects falling under the concept cat’.

The question whether Julius Caesar is the number belonging to the concept cat is uncomfortable not only because it appears to be nonsensical. It also underscores a problem Paul Benacerraf made famous: if mathematical terms refer to objects, then nothing in our mathematical practice determines which objects they refer to.¹⁹ A remarkable feature of the Unofficial Proposal is that it avoids Benacerraf’s Problem altogether. It would, however, be a mistake to conclude from this that the Unofficial Proposal is the last word on Benacerraf’s Problem, since the inscrutability of reference extends far beyond arithmetic.
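The nominalizations of this section can be checked mechanically on a small finite domain. The Python sketch below is an illustration only and is not part of the original text: the names `equinumerous`, `numeral_f`, and `subsets`, and the toy domain of cats and dogs, are assumptions introduced here. It evaluates ‘≈’ by searching for a one–one correspondence, evaluates the numeral-predicates by their recursive definition, and compares the output of Tr for ‘the number of the cats is three’ with the direct nominalization. On a finite domain every concept is finite, so the finiteness condition F in the quantifier ∃ᶠ is satisfied automatically.

```python
from itertools import combinations, permutations


def equinumerous(X, Y):
    """X ≈ Y: search for a one-one correspondence between the X's and the Y's."""
    X, Y = frozenset(X), frozenset(Y)
    # Each len(X)-permutation of Y is an injection from X into Y; it is a
    # one-one correspondence exactly when it also exhausts Y.
    return any(frozenset(p) == Y for p in permutations(Y, len(X)))


def numeral_f(n, X):
    """n_f(X): 0_f(X) iff nothing is X; (n+1)_f(X) iff X consists of some W
    with n_f(W) together with one further object v not falling under W."""
    X = frozenset(X)
    if n == 0:
        return not X
    return any(numeral_f(n - 1, X - {v}) for v in X)


def subsets(domain):
    """All concepts over a finite domain, represented as frozensets."""
    items = list(domain)
    return (frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r))


# Hypothetical toy domain (an assumption of this sketch).
cats = {"Tom", "Felix", "Garfield"}
dogs = {"Rex", "Fido", "Lassie"}
domain = cats | dogs

# Direct nominalization: 3_f applied to the concept cat.
direct = numeral_f(3, cats)

# Output of Tr: some (finite) Z is such that the cats ≈ Z and 3_f(Z).
via_tr = any(equinumerous(cats, Z) and numeral_f(3, Z) for Z in subsets(domain))

# 'The number of the cats is the number of the dogs'.
same_number = equinumerous(cats, dogs)
```

The brute-force search is exponential and is meant only to mirror the logical form of the definitions; nothing here bears on the infinite case.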
2. Second-order arithmetic
On the assumption that there are infinitely many objects in the range of the general variables of L, a certain kind of coding can be used to extend Tr so that it encompasses second-order arithmetic (thanks here to . . . ). Intuitively, the coding works by representing each arithmetical concept $M_i$ by a dyadic relation $R_i$. Specifically, we represent the fact that a number $m_j$ falls under $M_i$ by having it be the case that some concept W under which precisely $m_j$ objects fall be such that some individual v bears $R_i$ to all and only the individuals falling under W.²⁰ We implement the coding by enriching our transformation with the following two clauses:²¹

• $\mathrm{Tr}(\exists M_i(\varphi)) = \exists R_i(\mathrm{Tr}(\varphi))$;
• $\mathrm{Tr}(M_i m_j) = \exists v(F(\hat{u}[R_i(v, u)]) \land Z_j \approx \hat{u}[R_i(v, u)])$.
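The coding just described can be pictured with a finite toy model. The following Python sketch is not from the original text; the function names `encode`, `falls_under`, and `decode` are hypothetical, and counting the objects to which v bears R stands in for the equinumerosity condition on $Z_j$. It assumes a finite domain, so it can only code non-empty sets of sufficiently small numbers, which is one way of seeing why the text assumes infinitely many objects.

```python
def encode(M, domain):
    """Build a dyadic relation R (a set of pairs) coding the set of numbers M."""
    domain = list(domain)
    M = sorted(M)
    if not M:
        # Coding the empty concept would need every object to bear R to
        # infinitely many things, which a finite domain cannot supply.
        raise ValueError("cannot code the empty concept on a finite domain")
    if M[-1] > len(domain) or len(M) > len(domain):
        raise ValueError("domain too small for this coding")
    R = set()
    for k, v in enumerate(domain):
        # The first len(M) objects each witness one member of M; spare objects
        # repeat M[0], so that no unintended number (zero in particular) is coded.
        n = M[k] if k < len(M) else M[0]
        R.update((v, u) for u in domain[:n])
    return R


def falls_under(m, R, domain):
    """Counterpart of Tr(M m): some v bears R to exactly m objects."""
    domain = list(domain)
    return any(sum((v, u) in R for u in domain) == m for v in domain)


def decode(R, domain):
    """Recover the set of numbers coded by R."""
    domain = list(domain)
    return {m for m in range(len(domain) + 1) if falls_under(m, R, domain)}


domain = range(6)
M = {0, 2, 5}
R = encode(M, domain)
recovered = decode(R, domain)
```

Having the spare objects repeat a number already in M reflects the point that zero counts as falling under the coded concept as soon as some object bears R to nothing.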
3. Higher-order arithmetic
It is possible to express any (non-Caesar) formula in the language of n-th order arithmetic as a formula of L for which Tr is defined, provided that the range of the general variables contains at least $\beth_{n-2}$ many objects.

19 See Benacerraf (1965).
20 We represent the fact that the number zero falls under $M_i$ by having it be the case that some object bears $R_i$ to nothing. Thus, in order to represent the fact that zero does not fall under $M_i$ we must have it be the case that every object bears $R_i$ either to n objects for some n > 0 falling under $M_i$, or to infinitely many objects.
21 Polyadic second-order quantification can be defined as monadic second-order quantification over sequences, which can be simulated within first-order arithmetic.
Consider the case of third-order arithmetic. Intuitively, we proceed by pairing each second-order concept αi with a triadic relation Si in such a way that a set of numbers M j falls under αi just in case there is some object x with the following property: (*) For any number n, M j n holds just in case there is some object y such that there are exactly n vs. satisfying Si (x, y, v). 22 So that the ‘empty’ second-order concept (i.e. the second-order concept under which no first-order concept falls) may be represented, we let Si represent the fact that M j falls under αi only if there is an object x such that it is both the case that (*) is satisfied, and that there is no y such that Si (x, y, x). The ‘empty’ second-order concept can then be represented by any relation Si such that for every x there is some y such that Si (x, y, x). Formally, if ‘αi ’ is a monadic third-order variable restricted to the natural numbers, 23 we define a transformation C as follows: 24
• $C(\exists \alpha_i\,\varphi) = \exists S_i\,C(\varphi)$;
• $C(\alpha_i(M_j)) = \exists x\,[\forall y\,(\neg S_i(x,y,x)) \wedge \forall m\,(M_j m \leftrightarrow \exists y\,(N(\hat{v}[S_i(x,y,v)], m)))]$.
On the assumption that the range of the general variables contains at least continuum many objects, it is easy to verify that, for any formula of third-order arithmetic, $\varphi$, on which C is defined, $\varphi \leftrightarrow C(\varphi)$. By using n-adic relations instead of triadic ones, this procedure can be extended to n-th order arithmetic. And, on the assumption that the range of the general variables contains at least $\beth_{n-2}$ objects, it will be the case that, for any formula of n-th order arithmetic, $\varphi$, on which C is defined, $\varphi \leftrightarrow C(\varphi)$.
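The triadic coding for third-order arithmetic can be tried out in the same finite setting. This is a hedged sketch, not the paper's construction verbatim: `encode_second_order` and `decode_second_order` are hypothetical names, the domain is a finite list (so only non-empty sets of small numbers can fall under the coded second-order concept), and an object $x$ that codes nothing is flagged by a triple $S(x, y, x)$, following the convention described above for the ‘empty’ second-order concept.

```python
def encode_second_order(alpha, domain):
    """Return a triadic relation S (a set of triples) coding `alpha`,
    a collection of non-empty finite sets of numbers."""
    concepts = sorted({frozenset(m) for m in alpha}, key=sorted)
    size = len(domain)
    if len(concepts) > size:
        raise ValueError("domain too small: one object x is needed per concept")
    relation = set()
    for index, x in enumerate(domain):
        if index >= len(concepts):
            # x codes nothing: flag it with S(x, y, x) for some y.
            relation.add((x, domain[0], x))
            continue
        ms = sorted(concepts[index])
        if not ms or len(ms) > size or ms[-1] > size - 1:
            raise ValueError("concept cannot be coded in this domain")
        others = [v for v in domain if v != x]  # keeps S(x, y, x) false
        for position, y in enumerate(domain):
            # Each y contributes a count that falls under the coded set.
            n = ms[position] if position < len(ms) else ms[0]
            relation.update((x, y, v) for v in others[:n])
    return relation


def decode_second_order(relation, domain):
    """Recover the collection of sets of numbers coded by S."""
    alpha = set()
    for x in domain:
        if any((x, y, x) in relation for y in domain):
            continue  # flagged: x does not code a set of numbers
        counts = {
            sum(1 for v in domain if (x, y, v) in relation) for y in domain
        }
        alpha.add(frozenset(counts))
    return alpha
```

With every object flagged, decoding returns the empty collection, which corresponds to the representation of the ‘empty’ second-order concept by a relation such that for every $x$ there is some $y$ with $S(x, y, x)$.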
4. Numbering numbers
One would like to be able to number cats. But one would also like to be able to number numbers. One would like to say, for example, that the number of primes smaller than ten is four. And, unfortunately, an expression such as ‘$N(\hat{m}_i[\text{PRIME-LESS-THAN-10}(m_i)], m_j)$’ is not a well-formed formula of L because ‘N’ can only admit of a general variable in its first argument-place. 25
22 We represent the fact that the number zero falls under $M_j$ by having it be the case that some object $y$ is such that there are no $v$s satisfying $S_i(x, y, v)$. Thus, in order to represent the fact that zero does not fall under $M_j$ we must have it be the case that every object $y$ is either such that there are $n$ $v$s satisfying $S_i(x, y, v)$ for some $n > 0$ falling under $M_j$, or such that there are infinitely many $v$s satisfying $S_i(x, y, v)$.
23 For instance, on a Fregean interpretation of third-order quantification, ‘$\alpha_i$’ ranges over second-order concepts under which fall first-order concepts under which fall natural numbers.
24 The remaining clauses are trivial.
25 In analogy with the above, we let the result of substituting ‘$\hat{m}_i[\varphi(m_i)]$’ for ‘$M_j$’ in a formula ‘$\psi(M_j)$’ be shorthand for
$$\forall M\,(\forall m_i\,(M m_i \leftrightarrow \varphi(m_i)) \rightarrow \psi(M)).$$
To remedy the situation, we may define a predicate ‘$NN(M_i, m_j)$’, by appealing to the same sort of coding as before. Informally, ‘$NN(M_i, m_j)$’ is to abbreviate a formula of L to the effect that there is a binary relation $R$ with the following properties:
• for any number $n$, $M_i n$ holds just in case some member of the domain of $R$ is paired with exactly $n + 1$ objects; 26
• every member of the domain of $R$ is paired with finitely many objects;
• for any $x$ and $y$ in the domain of $R$, if the objects paired with $x$ are as many as the objects paired with $y$, then $x = y$;
• the domain of $R$ contains exactly $m_j$ objects. 27
The new predicate allows us to say that the number of primes smaller than ten is four. It also allows us to say that the number of primes smaller than six is the number of objects falling under the concept cat:
$$\exists m_2\,(NN(\hat{m}_1[\text{PRIME-LESS-THAN-6}(m_1)], m_2) \wedge N(\hat{x}_1[\text{CAT}(x_1)], m_2)).$$ 28
And, as desired, any expression of the form $NN(M_i, m_j)$ is definitionally equivalent to a well-formed formula of L.
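The four conditions on the witnessing relation $R$ can be checked mechanically for a finite relation. The sketch below is illustrative only: `witness_relation` and `satisfies_nn` are hypothetical names, relations are plain sets of pairs, and `satisfies_nn` tests a given candidate relation, whereas $NN(M_i, m_j)$ itself only asserts that some such relation exists. The finiteness condition holds trivially for a finite set of pairs, so it is not tested separately.

```python
def witness_relation(numbers, objects):
    """Build a relation pairing one object with n + 1 objects for each n
    in `numbers` (the extra object accommodates the number zero)."""
    ms = sorted(set(numbers))
    if len(ms) > len(objects) or (ms and ms[-1] + 1 > len(objects)):
        raise ValueError("not enough objects")
    return {(objects[i], u) for i, n in enumerate(ms) for u in objects[: n + 1]}


def satisfies_nn(relation, numbers, m):
    """Check the conditions on R for the claim that the number of `numbers` is m."""
    images = {}
    for w, u in relation:
        images.setdefault(w, set()).add(u)
    counts = [len(us) for us in images.values()]
    # n falls under the concept iff some member of the domain is paired
    # with exactly n + 1 objects.
    codes_concept = {c - 1 for c in counts} == set(numbers)
    # Distinct members of the domain are paired with different numbers of objects.
    injective = len(counts) == len(set(counts))
    # The domain of R contains exactly m objects.
    return codes_concept and injective and len(images) == m
```

For the primes smaller than ten, a witness built over ten objects satisfies the conditions with m = 4 and fails them with any other value of m.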
5. Formulas of L and their transformations
Our nominalization method is now complete. 29 Caesar sentences aside, any formula in the language of n-th order applied arithmetic can be expressed as a formula of L for which Tr is defined. And the result of applying Tr is always a formula with no mathematical vocabulary. We may now give a general characterization of the relationship between a formula and its transformation. In order to do so, consider the following five principles, all of which hold on the intended interpretation of L:
1. $\forall X\,(\exists m\,(N(X, m)) \rightarrow \exists! m\,(N(X, m)))$ (If m is a number of the Xs, then m is the number of the Xs.)
26 We require that a member of the domain of $R$ be paired with $n + 1$ objects rather than $n$ objects in order to accommodate the fact that the number zero might fall under $M_i$, since every member of the domain of $R$ must be paired with at least one object.
27 More precisely, ‘$NN(M_i, m_j)$’ is to abbreviate:
$$\exists R\,[\,\forall m_k\,(M_i m_k \leftrightarrow \exists w\,\exists W\,\exists u\,(Rwu \wedge \forall v\,(Wv \leftrightarrow (Rwv \wedge v \neq u)) \wedge N(W, m_k)))$$
$$\wedge\ \forall w\,\forall v\,(Rwv \rightarrow \exists^{f} W\,\forall u\,(Wu \leftrightarrow Rwu))$$
$$\wedge\ \forall w\,\forall v\,\forall W\,\forall V\,((\exists u\,(Rwu) \wedge \forall u\,(Wu \leftrightarrow Rwu) \wedge \forall u\,(Vu \leftrightarrow Rvu) \wedge W \approx V) \rightarrow w = v)$$
$$\wedge\ \exists W\,(\forall v\,(Wv \leftrightarrow \exists u\,(Rvu)) \wedge N(W, m_j))\,];$$
for $m_k$ an unused variable.
28 Whereas ‘CAT(. . . )’ may be regarded as an atomic predicate, ‘PRIME-LESS-THAN-6(. . . )’ abbreviates a complex formula constructed using the arithmetical predicates defined in Footnote 17.
29 So far we have only been concerned with the arithmetic of finite cardinals. But it is worth noting that a similar transformation could be applied to the language of infinite cardinal arithmetic.
2. $\forall m\,\exists X\,N(X, m)$ (Given any number m, there are some objects such that m belongs to those objects.)
3. $\forall X\,(\exists m\,(N(X, m)) \leftrightarrow F(X))$ (A number belongs to the Xs just in case they are at most finite in number.)
4. $\forall X\,\forall Y\,[\forall m\,(N(X, m) \rightarrow N(Y, m)) \leftrightarrow X \approx Y]$ (A number belonging to the Xs is also a number belonging to the Ys just in case the Xs are in one–one correspondence with the Ys.)
5. $\exists X\,\neg F(X)$ (There are infinitely many things in the range of the general variables.)
Let $\mathcal{A}$ be the conjunction of these five principles, and let $\varphi^{Tr}$ be a notational variant for $\mathrm{Tr}(\varphi)$. It is possible to show that, for any sentence $\varphi$ of L, 30
$$\mathcal{A} \vdash \varphi \leftrightarrow \varphi^{Tr}$$
where ‘$\vdash$’ expresses derivability in a standard second-order deductive system. In order to prove this result, a few preliminaries are necessary.
Definition 3: $N(R_i, M_j) \equiv_{df} \forall m\,(M_j m \leftrightarrow \exists v\,(N(\hat{u}[R_i(v,u)], m)))$.
Definition 4: If $m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}$ are arithmetical variables, we let $\overline{m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}}$ abbreviate the following:
$$(N(Z_{i_1}, m_{i_1}) \wedge \cdots \wedge N(Z_{i_k}, m_{i_k}) \wedge N(R_{j_1}, M_{j_1}) \wedge \cdots \wedge N(R_{j_l}, M_{j_l})).$$
Definition 5: If $\varphi$ is a formula of L, with free arithmetical variables $m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}$, we let $\varphi \leftrightarrow^{*} \varphi^{Tr}$ abbreviate the universal closure of the following:
$$\overline{m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}} \rightarrow (\varphi \leftrightarrow \varphi^{Tr}).$$
If $\varphi$ contains no free arithmetical variables, we let $\varphi \leftrightarrow^{*} \varphi^{Tr}$ be $\varphi \leftrightarrow \varphi^{Tr}$. Finally, we proceed to our main result:
Theorem 1: If $\varphi$ is a well-formed formula of L, then $\mathcal{A} \vdash \varphi \leftrightarrow^{*} \varphi^{Tr}$.
See appendix for proof. [An interesting feature of the proof is that the fifth conjunct of $\mathcal{A}$ is required only to ensure the adequacy of the coding for second-order variables set forth in Section 2. In particular, the fifth conjunct is not required to prove a version of the theorem restricted to first-order arithmetic. On the other hand, without its fifth conjunct (or, alternatively, without a principle guaranteeing the existence of infinitely many objects in the range of the arithmetical variables) the standard arithmetical axioms do not follow from $\mathcal{A}$.]
Corollary 1: (Completeness of $\mathcal{A}$ with respect to applied arithmetic) If $\varphi$ is a sentence of L and $T$ is the set of true sentences of L which do not contain ‘N’, then either $\mathcal{A} \cup T \vdash \varphi$ or $\mathcal{A} \cup T \vdash \neg\varphi$.
Proof: Let $\varphi$ be a sentence of L. It is easy to verify that $\varphi^{Tr}$ does not contain ‘N’. Therefore, either $T \vdash \varphi^{Tr}$ or $T \vdash \neg\varphi^{Tr}$, since either $\varphi^{Tr} \in T$ or $\neg\varphi^{Tr} \in T$. But, since $\varphi$ contains no free variables, it follows from our theorem that $\mathcal{A} \vdash \varphi \leftrightarrow \varphi^{Tr}$. So, either $\mathcal{A} \cup T \vdash \varphi$ or $\mathcal{A} \cup T \vdash \neg\varphi$.
Corollary 2: Suppose $\mathcal{A}$ holds when ‘$N(X, m)$’ is interpreted as ‘the number of the Xs is m’. Let $\varphi(m_i)$ be a well-formed formula of L, and let $\psi(Z_i)$ be $\mathrm{Tr}(\varphi(m_i))$. If there are at most finitely many Fs, then $\varphi(m_i)$ is true of the number of the Fs just in case $\psi(Z_i)$ is true of the Fs. 31
Proof: Immediate from the theorem.
30 Here and in what follows I assume that, as a precaution against variable clashes, $\varphi$ contains no variables of the form $Z_i$, $R_i$ or $S_i$.
6. Interpreting second-order languages
We have taken care to ensure that the outputs of our transformation are always second-order formulas. So an interpretation for second-order quantifiers is all we need to make sense of our nominalizations. Frege took second-order quantifiers to range over concepts, but Fregean concepts might be considered problematic on the grounds that they constitute ‘items’ which are not objects. Not any alternative will do. On Quine’s interpretation, second-order logic is ‘set-theory in sheep’s clothing’. So we would have succeeded in eliminating number-terms from arithmetic only by making use of set-terms. And, from the perspective of the Unofficial Proposal, set-terms are presumably no less problematic than number-terms. Nor is any progress made by interpreting second-order logic as Boolos has suggested. 32 Some of our definitions make essential use of polyadic second-order quantifiers, which Boolos treats as ranging (plurally) over ordered n-tuples. And, again, from the perspective of the Unofficial Proposal, ordered-pair-terms are presumably no less problematic than number-terms.
31 In fact, the result is slightly more general. Suppose $\varphi(m_{i_1}, \ldots, m_{i_n})$ is a formula of L and let $\psi(Z_{i_1}, \ldots, Z_{i_n})$ be $\mathrm{Tr}(\varphi(m_{i_1}, \ldots, m_{i_n}))$; suppose, moreover, that there are at most finitely many $F_1$s, at most finitely many $F_2$s, . . . , and at most finitely many $F_n$s. Then $\varphi(m_{i_1}, \ldots, m_{i_n})$ is true when $m_{i_1}$ is the number of the $F_1$s, $m_{i_2}$ is the number of the $F_2$s, . . . , and $m_{i_n}$ is the number of the $F_n$s just in case $\psi(Z_{i_1}, \ldots, Z_{i_n})$ is true when the $Z_{i_1}$s are the $F_1$s, the $Z_{i_2}$s are the $F_2$s, . . . , and the $Z_{i_n}$s are the $F_n$s.
32 See Boolos (1984) and Boolos (1985a).
Some deviousness is needed to avoid Fregean concepts without betraying the spirit of the Unofficial Proposal. One way of doing so is by defining second-order quantifiers implicitly, in terms of an open-ended schema, as in McGee’s ‘Everything’. Another is by interpreting second-order logic as in Rayo and Yablo’s ‘Nominalism through De-Nominalization’. Alternatively, one might argue that genuine second-order quantification is to be accepted as a primitive.
7. Applications
Frege’s Unofficial Proposal (the view that number-statements are to be eliminated in favor of their transformations) can take several different forms, depending on the sort of elimination one has in mind. On an approach like Hodes’s, number-statements are taken to abbreviate their transformations. As a result, number-terms do not refer to objects, and there is room for rejecting the existence of numbers altogether. The Unofficial Proposal might therefore provide a basis for a nominalist philosophy of arithmetic. It should be noted, however, that unless the universe is infinite, $\varphi^{Tr}$ will not always have the truth-value that $\varphi$ receives on its standard interpretation. In order to avoid infinity assumptions, a nominalist might claim that a number-statement $\varphi$ abbreviates ‘necessarily, $(\xi \rightarrow \varphi^{Tr})$’, where ‘$\xi$’ is a sentence stating that there are infinitely many objects, such as ‘$\exists X\,\neg F(X)$’. On the plausible condition that it is possible for the universe to be infinite, ‘necessarily, $(\xi \rightarrow \varphi^{Tr})$’ is true if and only if $\varphi$ is true on its standard interpretation. 33 A different approach towards the Unofficial Proposal might serve the purposes of the Neo-Fregean Program, championed by Bob Hale and Crispin Wright. Neo-Fregeans believe that Hume’s Principle allows us to reconceptualize the state of affairs which is described by saying that the Fs are as many as the Gs, and that, on the reconceptualization, that same state of affairs is rightly described by saying that the number of the Fs is the number of the Gs. 34 A version of the Unofficial Proposal might allow Neo-Fregeans to make the more general claim that every number-statement $\varphi$ describes, on the appropriate reconceptualization, the state of affairs which is otherwise described by $\varphi^{Tr}$. Even if the Unofficial Proposal is to be abandoned altogether, it would be a mistake to neglect the connection between number-statements and their transformations described in Section 5.
For non-nominalist accounts of mathematics must yield the result that there is no special mystery about how one might come to know what the truth-values of mathematical sentences are. But, on the assumption that $\mathcal{A}$ can be known to be true, our theorem ensures that this goal can be achieved for the case of pure and applied arithmetic. Let $\varphi$ be an arithmetical sentence of L. When $\mathcal{A}$ is known, it follows from our theorem that one is in a position to derive $\varphi \leftrightarrow \varphi^{Tr}$. So, insofar as one is in a position to know the truth of $\varphi^{Tr}$, which contains no arithmetical vocabulary, one is also in a position to know the truth of $\varphi$. 35 (Of course, one may not be in a position to know the truth of $\varphi^{Tr}$. In that case one is not, for all that has been said, in a position to know $\varphi$. But that cannot be used as an objection against a non-nominalist account of mathematical knowledge. Such an account is required to show that mathematical knowledge is no more mysterious than non-mathematical knowledge, not that all knowledge is unproblematic.)
33 For more on modal strategies, see part II of Burgess and Rosen (1997). Hodes discusses a modal strategy in Section III of Hodes (1984).
34 See Wright (1997), Section I, and Hale (1997).
8. Logicism
Our theorem provides us with a partial vindication of Logicism. For whenever $\varphi$ is a sentence of pure arithmetic (appropriately expressed in L), $\varphi^{Tr}$ is a sentence of pure second-order logic. Moreover, Tr allows us to express formulas of pure arithmetic as formulas of pure second-order logic in a way which preserves compositionality. 36 This would constitute a complete vindication of Logicism if it were true as a matter of pure logic that, for every appropriate $\varphi$, $\mathrm{Tr}(\varphi)$ has the truth-value that $\varphi$ would receive on its standard interpretation. Unfortunately, the general equivalence in truth-value holds only if the universe is big enough, and the size of the universe is not a matter of pure logic. Tr doesn’t reduce arithmetic to logic, but it comes close.
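The compositional character of Tr can be made concrete with a small recursive function over formula trees. This is a sketch under assumptions, not the paper's official definition: the tuple encoding of formulas and the name `tr` are hypothetical, only the clauses recorded in footnote 36 and those that can be read off from the appendix and the second-order coding are covered, and the output is a plain string in which `∃^f` stands for the finite existential quantifier.

```python
def tr(phi):
    """Map a formula, given as a nested tuple, to its transformation (a string).

    Hypothetical encoding of formulas:
      ("N", i, j)        for N(X_i, m_j)
      ("eq", i, j)       for m_i = m_j
      ("falls", j, i)    for M_j m_i
      ("not", p), ("and", p, q)
      ("exists_m", i, p) for first-order arithmetical quantification
      ("exists_M", j, p) for second-order arithmetical quantification
    """
    op = phi[0]
    if op == "N":
        return f"X_{phi[1]} ≈ Z_{phi[2]}"
    if op == "eq":
        return f"Z_{phi[1]} ≈ Z_{phi[2]}"
    if op == "falls":
        j, i = phi[1], phi[2]
        ext = f"û[R_{j}(v, u)]"
        return f"∃v(F({ext}) ∧ Z_{i} ≈ {ext})"
    if op == "not":
        return f"¬({tr(phi[1])})"
    if op == "and":
        return f"({tr(phi[1])} ∧ {tr(phi[2])})"
    if op == "exists_m":
        return f"∃^f Z_{phi[1]} ({tr(phi[2])})"
    if op == "exists_M":
        return f"∃R_{phi[1]} ({tr(phi[2])})"
    raise ValueError(f"no clause for {op!r} in this sketch")
```

Each clause handles one connective or quantifier and recurses on the immediate subformulas, so the output contains no arithmetical variables and no occurrence of the predicate ‘N’.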
Appendix
The theorem is proved by induction on the complexity of $\varphi$. Trivial cases are omitted.
• Assume $\varphi = N(X_i, m_j)$. Then $\varphi \leftrightarrow^{*} \varphi^{Tr}$ is the universal closure of
$$N(Z_j, m_j) \rightarrow (N(X_i, m_j) \leftrightarrow X_i \approx Z_j),$$
which is an immediate consequence of $\mathcal{A}$ (first and fourth conjuncts).
35 For a more detailed treatment of this issue see Rayo, A. (2004) “Frege’s Correlation.” Analysis 64, 119–122. It is worth noting that the completeness of the second-order Dedekind-Peano axioms yields a similar result for the case of pure second-order arithmetic, and that the quasi-categoricity result in McGee (1997) yields a similar result for the case of pure set-theory.
36 Unlike nominalization in terms of Ramsey sentences, Tr respects the logical connectives and quantifiers:
• $\mathrm{Tr}(\neg\varphi)$ = ‘$\neg$’ $\mathrm{Tr}(\varphi)$,
• $\mathrm{Tr}(\varphi \wedge \psi)$ = $\mathrm{Tr}(\varphi)$ ‘$\wedge$’ $\mathrm{Tr}(\psi)$,
• $\mathrm{Tr}(\exists m_i\,\varphi) = \exists^{f} Z_i\,\mathrm{Tr}(\varphi)$,
• $\mathrm{Tr}(\exists M_i\,\varphi) = \exists R_i\,\mathrm{Tr}(\varphi)$.
• Assume $\varphi$ is $m_i = m_j$. Then $\varphi \leftrightarrow^{*} \varphi^{Tr}$ is the universal closure of
$$(N(Z_i, m_i) \wedge N(Z_j, m_j)) \rightarrow (m_i = m_j \leftrightarrow Z_i \approx Z_j),$$
which is an immediate consequence of $\mathcal{A}$ (first and fourth conjuncts).
• Assume $\varphi = M_j m_i$. Then $\varphi \leftrightarrow^{*} \varphi^{Tr}$ is the universal closure of
$$\overline{m_i, M_j} \rightarrow (M_j m_i \leftrightarrow \exists v\,(F(\hat{u}[R_j(v,u)]) \wedge Z_i \approx \hat{u}[R_j(v,u)])).$$
We make the following two assumptions:
$$N(Z_i, m_i), \tag{1}$$
$$N(R_j, M_j). \tag{2}$$
Recall that (2) is shorthand for
$$\forall m\,(M_j m \leftrightarrow \exists v\,(N(\hat{u}[R_j(v,u)], m))), \tag{3}$$
from which it follows immediately that
$$M_j m_i \leftrightarrow \exists v\,(N(\hat{u}[R_j(v,u)], m_i)). \tag{4}$$
From (1) and (4), together with $\mathcal{A}$ (first and fourth conjuncts), it follows that
$$M_j m_i \leftrightarrow \exists v\,(Z_i \approx \hat{u}[R_j(v,u)]). \tag{5}$$
And from (1) and (5), together with $\mathcal{A}$ (first, third and fourth conjuncts), it follows that
$$M_j m_i \leftrightarrow \exists v\,(F(\hat{u}[R_j(v,u)]) \wedge Z_i \approx \hat{u}[R_j(v,u)]). \tag{6}$$
Discharging assumptions (1) and (2) we get:
$$\overline{m_i, M_j} \rightarrow (M_j m_i \leftrightarrow \exists v\,(F(\hat{u}[R_j(v,u)]) \wedge Z_i \approx \hat{u}[R_j(v,u)])). \tag{7}$$
And the desired result follows from (7) by universal generalization.
• Assume $\varphi = \exists m_i\,\psi(m_i)$. Let $\psi$ have free arithmetical variables $m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}$ distinct from $m_i$. 37 Then $\varphi \leftrightarrow^{*} \varphi^{Tr}$ is the universal closure of:
$$\overline{m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}} \rightarrow (\exists m_i\,\psi(m_i) \leftrightarrow \exists^{f} Z_i\,\psi^{Tr}(Z_i)).$$
By inductive hypothesis, the following is provable from $\mathcal{A}$:
$$\overline{m_i, m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}} \rightarrow (\psi(m_i) \leftrightarrow \psi^{Tr}(Z_i)). \tag{1}$$
We make the following two assumptions:
$$\overline{m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}}, \tag{2}$$
$$\exists m_i\,\psi(m_i). \tag{3}$$
By $\mathcal{A}$ (second and third conjuncts), it follows from (3) that
$$\exists m_i\,\exists W\,(F(W) \wedge N(W, m_i) \wedge \psi(m_i)). \tag{4}$$
So, by existential instantiation,
$$F(C) \wedge N(C, c) \wedge \psi(c). \tag{5}$$
But by (1) we have:
$$\overline{c, m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}} \rightarrow (\psi(c) \leftrightarrow \psi^{Tr}(C)). \tag{6}$$
And from (2), (5), and (6) we may conclude
$$\psi^{Tr}(C). \tag{7}$$
Thus, making again use of (5),
$$\exists^{f} Z_i\,\psi^{Tr}(Z_i), \tag{8}$$
and, discharging assumption (3),
$$\exists m_i\,\psi(m_i) \rightarrow \exists^{f} Z_i\,\psi^{Tr}(Z_i). \tag{9}$$
Conversely, assume
$$\exists^{f} Z_i\,\psi^{Tr}(Z_i). \tag{10}$$
By existential instantiation:
$$F(C) \wedge \psi^{Tr}(C). \tag{11}$$
It is a consequence of (11) and $\mathcal{A}$ (third conjunct) that
$$\exists m\,N(C, m). \tag{12}$$
From (12) we obtain the following, by existential instantiation:
$$N(C, c). \tag{13}$$
But by (1) we have:
$$\overline{c, m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}} \rightarrow (\psi(c) \leftrightarrow \psi^{Tr}(C)). \tag{14}$$
And from (2), the second conjunct of (11), (13), and (14) we may conclude
$$\psi(c). \tag{15}$$
Thus,
$$\exists m_i\,\psi(m_i), \tag{16}$$
and, discharging assumption (10),
$$\exists^{f} Z_i\,\psi^{Tr}(Z_i) \rightarrow \exists m_i\,\psi(m_i). \tag{17}$$
Finally, we combine (9) and (17), and discharge assumption (2):
$$\overline{m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}} \rightarrow (\exists m_i\,\psi(m_i) \leftrightarrow \exists^{f} Z_i\,\psi^{Tr}(Z_i)). \tag{18}$$
The desired result is then obtained by universal generalization.
37 The case where $\psi$ has no free arithmetical variables distinct from $m_i$, and the case where $\psi$ does not contain $m_i$ free require trivial differences in terminology. We ignore them for the sake of brevity.
• Assume $\varphi = \exists M_j\,\psi(M_j)$. Let $\psi$ have free arithmetical variables $m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}$ distinct from $M_j$. 38 Then $\varphi \leftrightarrow^{*} \varphi^{Tr}$ is the universal closure of:
$$\overline{m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}} \rightarrow (\exists M_j\,\psi(M_j) \leftrightarrow \exists R_j\,\psi^{Tr}(R_j)).$$
By inductive hypothesis, the following is provable from $\mathcal{A}$:
$$\overline{M_j, m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}} \rightarrow (\psi(M_j) \leftrightarrow \psi^{Tr}(R_j)). \tag{1}$$
We make the following two assumptions:
$$\overline{m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}}, \tag{2}$$
$$\exists M_j\,\psi(M_j). \tag{3}$$
By $\mathcal{A}$ (second, third and fifth conjuncts), it follows from (3) that
$$\exists M_j\,(\exists R\,(N(R, M_j)) \wedge \psi(M_j)). \tag{4}$$
So, by existential instantiation,
$$N(P, C) \wedge \psi(C). \tag{5}$$
But by (1) we have:
$$\overline{C, m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}} \rightarrow (\psi(C) \leftrightarrow \psi^{Tr}(P)). \tag{6}$$
And from (2), (5) and (6) we may conclude
$$\psi^{Tr}(P). \tag{7}$$
Thus,
$$\exists R_j\,\psi^{Tr}(R_j), \tag{8}$$
and, discharging assumption (3),
$$\exists M_j\,\psi(M_j) \rightarrow \exists R_j\,\psi^{Tr}(R_j). \tag{9}$$
Conversely, assume
$$\exists R_j\,\psi^{Tr}(R_j). \tag{10}$$
By existential instantiation,
$$\psi^{Tr}(P). \tag{11}$$
The following is a logical truth:
$$\exists M\,\forall m\,(Mm \leftrightarrow \exists v\,(N(\hat{u}[P(v,u)], m))). \tag{12}$$
But (12) is definitionally equivalent to
$$\exists M\,N(P, M). \tag{13}$$
So, by existential instantiation,
$$N(P, C). \tag{14}$$
But by (1) we have:
$$\overline{C, m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}} \rightarrow (\psi(C) \leftrightarrow \psi^{Tr}(P)). \tag{15}$$
And from (2), (11), (14), and (15) we may conclude
$$\psi(C). \tag{16}$$
Thus,
$$\exists M_j\,\psi(M_j), \tag{17}$$
and, discharging assumption (10),
$$\exists R_j\,\psi^{Tr}(R_j) \rightarrow \exists M_j\,\psi(M_j). \tag{18}$$
Finally, we combine (9) and (18), and discharge assumption (2):
$$\overline{m_{i_1}, \ldots, m_{i_k}, M_{j_1}, \ldots, M_{j_l}} \rightarrow (\exists M_j\,\psi(M_j) \leftrightarrow \exists R_j\,\psi^{Tr}(R_j)). \tag{19}$$
The desired result is then obtained by universal generalization.
38 The case where $\psi$ has no free arithmetical variables distinct from $M_j$, and the case where $\psi$ does not contain $M_j$ free require trivial differences in terminology. We ignore them for the sake of brevity.
References
Beaney, M., ed. (1997) The Frege Reader, Blackwell, Oxford.
Benacerraf, P. (1965) “What Numbers Could not Be,” The Philosophical Review 74, 47–73. Reprinted in Paul Benacerraf and Hilary Putnam, Philosophy of Mathematics.
Benacerraf, P., and H. Putnam, eds. (1983) Philosophy of Mathematics, Cambridge University Press, Cambridge, second edition.
Boolos, G. (1984) “To Be is to Be a Value of a Variable (or to be Some Values of Some Variables),” The Journal of Philosophy 81, 430–49. Reprinted in George Boolos, Logic, Logic and Logic.
Boolos, G. (1985a) “Nominalist Platonism,” Philosophical Review 94, 327–44. Reprinted in George Boolos, Logic, Logic and Logic.
Boolos, G. (1985b) “Reading the Begriffsschrift,” Mind 94, 331–34. Reprinted in George Boolos, Logic, Logic and Logic.
Boolos, G. (1998) Logic, Logic and Logic, Harvard, Cambridge, Massachusetts.
Bostock, D. (1979) Logic and Arithmetic, Clarendon Press, Oxford.
Burgess, J., and G. Rosen (1997) A Subject With No Object, Oxford University Press, New York.
Frege, G. (1884) Die Grundlagen der Arithmetik. English Translation by J.L. Austin, The Foundations of Arithmetic, Northwestern University Press, Evanston, IL, 1980.
Frege, G. (1893/1903) Grundgesetze der Arithmetik. Vol. 1 (1893), Vol. 2 (1903). English Translation by Montgomery Furth, The Basic Laws of Arithmetic, University of California Press, Berkeley and Los Angeles, 1964.
Frege, G. (1919) “Notes for Ludwig Darmstaedter.” Reprinted in Michael Beaney, The Frege Reader.
Hale, B. (1997) “Grundlagen §64,” Proceedings of the Aristotelian Society 97, 243–61. Reprinted in Bob Hale and Crispin Wright, The Reason’s Proper Study.
Hale, B., and C. Wright (2001) The Reason’s Proper Study: Essays towards a Neo-Fregean Philosophy of Mathematics, Clarendon Press.
Heck, R., ed. (1997) Language, Thought and Logic, Clarendon Press, Oxford.
Hodes, H. T. (1984) “Logicism and the Ontological Commitments of Arithmetic,” Journal of Philosophy 81:3, 123–49.
Hodes, H. T. (1990) “Where do Natural Numbers Come From?” Synthese 84, 347–407.
McGee, V. (1997) “How We Learn Mathematical Language,” Philosophical Review 106:1, 35–68.
McGee, V. (2000) “Everything.” In Gila Sher and Richard Tieszen, Between Logic and Intuition.
Rayo, A. (2002) “Word and Objects.” Noûs 36, 436–64.
Rayo, A. (2004) “Frege’s Correlation.” Analysis 64, 119–22.
Rayo, A., and S. Yablo (2001) “Nominalism Through De-Nominalization,” Noûs 35:1.
Sher, G., and R. Tieszen, eds. (2000) Between Logic and Intuition, Cambridge University Press, New York and Cambridge.
Wright, C. (1983) Frege’s Conception of Numbers as Objects, Aberdeen University Press, Aberdeen.
Wright, C. (1997) “The Significance of Frege’s Theorem.” In Richard Heck, Language, Thought and Logic.
III
ABSTRACTION AND THE CONTINUUM
REALS BY ABSTRACTION 1
Bob Hale
1. General aim and basic ideas
1.1 Abstraction
A Fregean abstraction principle is now usually taken to be a principle of the general form:
$$\forall\alpha\,\forall\beta\,(\S\alpha = \S\beta \leftrightarrow \alpha \approx \beta)$$
where ≈ is an equivalence relation on entities denoted by expressions of the type of α and β and § is an operator which forms singular terms when applied to constant expressions of the same type. The most prominent examples in Frege’s own writings are the Direction equivalence:
the direction of line a = the direction of line b iff lines a and b are parallel
together with what is now often called Hume’s principle:
the number of Fs = the number of Gs iff the Fs and the Gs are 1–1 correlated
and his ill-fated Basic Law V:
the extension of F = the extension of G iff F and G are co-extensive
In general, an abstraction principle seeks to give necessary and sufficient conditions for the identity of objects mentioned on its left-hand side in terms of the holding of a suitable equivalence relation between entities of some other sort. The Direction equivalence is a first-order abstraction, because its equivalence relation is a first-level relation on objects, whereas Hume’s principle and Basic Law V are second-order, their equivalence relations being second-level relations on concepts.
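The general form of a first-order abstraction can be illustrated with the Direction equivalence. The sketch below is only a toy model under stated assumptions: lines are represented by integer triples (a, b, c) standing for ax + by = c, the names `parallel` and `direction` are hypothetical, and the abstraction operator is modelled by a canonical representative of each parallelism class, which is one convenient implementation choice and not a claim about what directions are.

```python
from fractions import Fraction


def parallel(line_a, line_b):
    """Equivalence relation: two lines ax + by = c are parallel
    just in case their normal vectors (a, b) are proportional."""
    (a1, b1, _), (a2, b2, _) = line_a, line_b
    return a1 * b2 == a2 * b1


def direction(line):
    """Abstraction operator: return the same value for exactly the parallel lines."""
    a, b, _ = line
    if a == 0 and b == 0:
        raise ValueError("not a line")
    if b != 0:
        return ("slope", Fraction(-a, b))
    return ("vertical",)
```

On this model the instance of the schema holds by construction: `direction(p) == direction(q)` is true exactly when `parallel(p, q)` is.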
1.2 Frege’s logicism
Frege discusses at Grundlagen §§60–67 the suggestion that number might be contextually defined by means of Hume’s principle, but rejects it because he can see no way to solve what is now called the Caesar problem. The problem is that while Hume’s principle provides the means to settle, at least in principle, the truth-values of identity-statements linking terms for numbers when those terms are of the form ‘the number of Fs’ (or definitional abbreviations of such terms), it appears not to enable us to answer questions of numerical identity when one of the terms is not of that form, such as whether the number of Jupiter’s moons = Julius Caesar. Frege then immediately switches to his well-known explicit definition of number in terms of extensions (or classes): the number of Fs = the class of concepts 1–1 correlated with F. This requires him to provide a theory of extensions or classes, which he does by means of Basic Law V. As is well known, Basic Law V is inconsistent. Frege’s own attempt to arrive at a restricted axiom on classes which is both consistent and able to serve in its place as the basis for his hoped-for derivation of arithmetic from logic was unsuccessful and he eventually abandoned his belief that arithmetic could be provided with a purely logical foundation. Further, whilst we now know (or at least think we know) how to formulate a consistent theory of sets, this affords no comfort to anyone in sympathy with Frege’s logicist project, for two reasons. One is that this theory (Zermelo–Fraenkel set theory, say) is not plausibly viewed as a purely logical theory, owing to the very substantial existence assumptions it involves. The other is that Frege’s definition of number cannot be consistently embedded in the theory, because the objects with which it identifies cardinal numbers are too big to be treated as sets.
1 This paper first appeared in Philosophia Mathematica 8, [2000], pp. 100–123. Reprinted by kind permission of the editor and Oxford University Press.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 175–196. © 2007 Springer.
1.3 Neo-Fregean logicism
As far as elementary arithmetic goes, Frege’s only indispensable appeal, in Grundlagen and in Grundgesetze, 2 to his explicit definition of number (and thence to Basic Law V) is in proving Hume’s principle from it. That is, once Hume’s principle has been established as a theorem, no further appeal need be made, either to the explicit definition or to Basic Law V, in deriving as theorems what are, near enough, the Dedekind–Peano axioms for arithmetic. These include, crucially, the axiom asserting that every natural number has another natural number as its successor, which amounts (in the presence of the others) to the assertion that there are infinitely many natural numbers. This fact is now, following a suggestion of the late George Boolos, 3 referred to as Frege’s Theorem. What Frege’s Theorem asserts, in effect, is that if Hume’s principle is added to a standard formulation of second-order logic as a further axiom, the resulting system suffices for the derivation of elementary arithmetic. It is known that this system is consistent, or at least that it is so if second-order arithmetic is. Whether this fact supports any kind of logicism about arithmetic depends, of course, on the status of Hume’s principle. Boolos, along with many others, denies (plausibly, in my view) that it can be regarded as a truth of logic. Further, Hume’s principle cannot be taken as a definition, in any strict sense, because it does not permit the elimination of numerical terms in all contexts. This does not settle the issue, however, since it may be claimed that the principle is analytic, or a conceptual truth, in some sense broader than: either a truth of logic or reducible to one by means of definitions. That it can be so regarded is the view, now often called neo-Fregean logicism, of Crispin Wright and myself. 4 I do not intend, here, to defend this view of arithmetic against the many objections to our claim that Hume’s principle is a conceptual truth about numbers. Nor shall I offer a solution to the Julius Caesar problem, 5 though this must be (and we believe can be) done, if our view is to be viable. Nor, finally, shall I offer a general philosophical defence of the idea, which is again central to our view, that abstraction principles (provided they are consistent and perhaps meet certain other constraints) provide a legitimate means of introducing concepts of various kinds of abstract object in such a way that the existence of those objects depends only upon there being true instances of their right-hand sides. 6 Instead, what I want to do is explain one way in which I think it may be possible to extend our view beyond elementary arithmetic, to encompass the theory of real numbers. I say ‘one way’ because there are, on the face of it, several different ways in which one might try to do this.
2 As far as Grundlagen goes, this is quite clear from a reading of §§68–83 and is emphasised by Crispin Wright in his Frege’s Conception of Numbers as Objects (Aberdeen University Press 1983). That the same is true of Grundgesetze is shown by Richard G. Heck, Jr., in “The Development of Arithmetic in Frege’s Grundgesetze der Arithmetik” The Journal of Symbolic Logic 58 (1993), pp. 579–601.
3 cf. “The Standard of Equality of Numbers” in George Boolos (ed.) Meaning and Method: Essays in Honor of Hilary Putnam (Cambridge University Press 1990), pp. 261–77.
1.4 Reals via Fregean set theory
In some ways, the most obvious approach—the one which has probably received most attention in recent work 7 —is a set-theoretic one. This would involve formulating a consistent Fregean axiom for sets to replace Basic Law V—an axiom which could form the basis of a theory of sets powerful enough 4 cf. Wright Frege’s Conception . . . , “The Philosophical Significance of Frege’s Theorem” in Richard G. Heck, Jr. (ed.) Language, Thought, and Logic: Essays in Honour of Michael Dummett (Oxford 1997) and “Is Hume’s principle analytic?” in Bob Hale & Crispin Wright The Reason’s Proper Study: Essays towards a Neo-Fregean Philosophy of Mathematics, Oxford: Clarendon Press 2001; Hale Abstract Objects (Blackwell 1987), “Dummett’s critique of Wright’s attempt to resuscitate Frege” Philosophia Mathematica (3) Vol. 2 (1994), pp. 122–47 and “Grundlagen §64” Aristotelian Society Proceedings 1997, pp. 243–62. For Boolos’s opposed view, see “Is Hume’s principle analytic?” in Heck (ed) op cit. 5 qv works cited in fn 4. 6 qv works cited in fn 4. 7 cf. George Boolos “Iteration Again” Philosophical Topics XVII, 2 (1989) pp. 5–21, also “Saving Frege from Contradiction” Aristotelian Society Proceedings 1987, pp. 137–51 and “Basic Law V” Aristotelian Society Supplementary Volume 67 (1993), pp. 213–34; Crispin Wright “The Philosophical Signficance of Frege’s Theorem”, Stewart Shapiro and Alan Weir “New V, ZF and Abstraction”, Philosophia Mathematica (3), vol. 7, 293–321.
178
The Arché Papers on the Mathematics of Abstraction
to support one or other of the usual set-theoretic constructions (Dedekind’s or Cantor’s) of the reals. The most obvious way to do this is by means of a suitably restricted version of Basic Law V, and a good deal of work has been done on one particular axiom of this sort, which builds in a restriction on the ‘size’ of concepts which are permitted to have sets corresponding to them which obey the principle of extensionality. 8 I shall not discuss this work here, save to remark that some of it seems to me to show that the prospects for obtaining a satisfactory treatment of the reals along this line are uncertain at best. In particular, as Boolos observed, 9 a theory based on secondorder logic plus this axiom alone, without further comprehension or existence assumptions, will not enable us to prove either an axiom of infinity or a power set axiom. So it will not yield sets large enough for the construction of the reals. This is not conclusive evidence against a broadly set-theoretic approach, of course, since it may be possible to formulate some other more powerful but still consistent Fregean axiom for sets which will give us large enough sets. Or again, it may be possible to justify supplementing this particular restricted version of Basic Law V with other principles to obtain a strong enough theory. I take no stand on that question here. 10 Instead, I want to pursue a quite different approach, which is in some respects much more like that taken by Frege in his incomplete treatment of the reals in Grundgesetze, although it differs from Frege’s in at least one quite fundamental way. This approach can roughly be described by saying that it tries (i) to minimise reliance on set theory and (ii) to obtain the reals very directly by means of abstraction principles, without any form of set-abstraction. In these respects, I think my approach may be seen as the most direct and natural way of extending the neo-Fregean position to the reals. 
Just as basing elementary arithmetic on Hume’s principle minimises (and, indeed, eliminates) reliance on set theory by avoiding a definition of cardinal numbers as certain equivalence classes, introducing them instead via a specifically numerical abstraction—so my approach to the arithmetic of real numbers will minimise (and indeed eliminate) reliance on set theory by avoiding a definition of reals as sets of one kind or another, introducing them instead via abstraction principles which—even if not happily described as purely numerical—are not distinctively set-theoretical.
8 The axiom (New V) is: ∀F∀G[*F = *G ↔ ((Small(F) ∨ Small(G)) → ∀x(Fx ↔ Gx))], where a concept is Small if fewer objects fall under it than fall under the universal concept ξ = ξ, and *F is what Boolos calls the ‘subtension’ of F (the subtensions of Small concepts being sets)—see Boolos “Saving Frege from contradiction”, Proceedings of the Aristotelian Society 87 (1986/87), pp. 137–51; also below p. 190ff.
9 cf. “Iteration again”.
10 For a brief discussion of this possibility, see Crispin Wright “On the Philosophical Significance of Frege’s Theorem”, section XI.
Reals by Abstraction
1.5 Reals as ratios of quantities
Frege’s actual (incomplete) treatment of the reals in Grundgesetze Pt III 11 is, of course, unsatisfactory—if only because it relies, as does his theory of cardinal numbers, on an inconsistent theory of extensions, and cannot be simply relocated within any standard (and plausibly consistent) theory of sets such as ZF or NBG because the objects with which he proposes to identify the reals are too big to be treated as sets. In any case, such a relocation would obviously betray Frege’s philosophical aims, since it would leave our entitlement to the substantial existential commitments of the theory quite unaccounted for. From a philosophical standpoint, the most striking and most important features of Frege’s treatment of the reals are two: (i) the real numbers are to be defined as ratios of quantities [§§73,157] and (ii) in regard to the analysis of the notion of quantity, the fundamental question requiring to be answered is not: What properties must an object have, if it is to be a quantity? but: What properties must a concept have, if the objects falling under it are to constitute quantities of a single kind? [§§160–61]. Briefly and roughly, his insistence that reals be defined as ratios of quantities derives from his belief that the application of reals as measures of quantities is essential to their very nature, and so should be built into an adequate definition of them. It is this, more perhaps than any other single consideration, which underlies his dissatisfaction with the theories of Cantor and Dedekind, on which the applicability of the reals appears, in Frege’s view, merely as an incidental extra. As regards the second point, it is obvious to anyone that there are many different kinds of quantity (lengths, masses, volumes, angles, etc.) and that addition and comparison (as greater or less) make sense only as applied to quantities of the same kind. 
Since we may not simply take the notion of a kind of quantity for granted, as already understood and itself in no need of analysis, we cannot explain what a quantity is by saying that it is something which can be added to, or be greater or less than, (other) quantities of the same kind. If an explanation of quantity is not to be vitiated by circularity in this way, Frege thinks, it must take as its target the notion of a kind of quantity, and say what characteristics a collection of entities must, as a whole, possess if it is to form what he calls a quantitative domain [ein Grössengebiet]. When that has been done, what it is to be a quantity can be easily stated—an object is a quantity if it belongs, together with other objects, to a quantitative domain.
I believe Frege was substantially right on both points. Here I shall simply assume as much, without argument. Where I disagree with him is over the analysis of what he calls quantitative domains. For reasons which I shall not go into, Frege decides that the elements of a quantitative domain should themselves be relations and—heavily influenced by a passage from Gauss [quoted in Grundgesetze §162]—analyses such a domain as an ordered group of permutations on an underlying set, with composition as its additive operation. Since quantities themselves are, on his approach, relations of a certain sort, real numbers, when defined as ratios of quantities, turn out to be relations of relations. One advantage of Frege’s approach is that it provides very easily for negative as well as positive real numbers. I do not have space to discuss Frege’s view properly here. Whilst there is justice in his criticism of earlier writers who simply help themselves to the notion of quantities being of the same kind, I think that the notions of addition and quantitative comparability are central and fundamental to the general notion of quantity in a way Frege fails to acknowledge. Accordingly, I shall propose a different account of quantitative domains—one which gives a central role to the idea that the elements of such a domain may always be added to yield further elements.
11 For expositions see Michael Dummett Frege: Philosophy of Mathematics (Duckworth, 1991), ch. 22 and Peter Simons “Frege’s Theory of Real Numbers” History and Philosophy of Logic 8 (1987), pp. 25–44.
2. Quantities and reals
2.1 Types of quantitative domain
I distinguish between the entities (usually concrete objects) which may stand in various quantitative relations to one another—such as being longer than, or being as long as—and quantities themselves, which I take to be abstract objects introduced by abstraction on quantitative equivalence relations—for example:
the length of a = the length of b ↔ a is as long as b
This way of introducing (terms for) quantities makes no explicit mention of addition. However, a full analysis of the notion of a quantitative relation would, I claim, show that the notion of addition is nevertheless central to that of quantity. I do not have space to go into details here, but the essential idea is this. Among quantitative relations, we may distinguish—as conceptually basic—what may be called relations of simple quantitative comparison (e.g. longer than/as long as, heavier than/as heavy as, etc.) from relations of numerically definite or determinate comparison (e.g. twice as long as, 2.4 kg heavier than, etc.). A necessary condition for φ to denote a kind of quantity is that it be associated with a pair of relations of simple quantitative comparison: more φ than and as φ as. In virtue of this, things which are φ may be partially ordered with respect to φ-ness. However, the existence of an associated pair of such relations—a strict partial ordering relation and a cognate equivalence relation—is insufficient for φ-ness to be a kind of quantity. There are enormously many adjectives in ordinary use which may be substituted without violence to sense or syntax in the schemas: more φ than and as φ as—‘sweet’, ‘elegant’, ‘graceful’, ‘pretty’, ‘clumsy’, ‘ambitious’, ‘impatient’, ‘irascible’, ‘probable’, . . . is clearly no more than the start of a potentially very long list. But in the case of only relatively few of them is it remotely plausible that they denote something properly describable as a quantity. It is therefore necessary to enquire what further condition needs to be satisfied, if such a pair of relations are properly to be viewed as quantitative. I contend that what makes the difference between quantitative ordering relations and others is that in the case of a quantitative ordering relation, but not otherwise, the entities which can significantly be asserted to stand in the relation can (at least in principle) be combined in such a way that compounds must come later in the relevant ordering than their components. In other words, for more φ than to be a quantitative ordering relation, there must be an operation of combination ∘ on items lying in the field of more φ than, analogous to addition, such that for any a, b in more φ than’s field, a∘b is more φ than a and a∘b is more φ than b. 12
Quantitative domains are composed of (abstract) quantities. My aim in this section is to provide an informal axiomatic characterisation of such domains, on the basis of which it will be possible to introduce real numbers by means of an appropriate abstraction principle. Instead of simply laying down a single set of axioms for something to be a quantitative domain, I shall distinguish several—successively richer—types of quantitative domain. This will be helpful later, when I come to consider questions about the existence of quantitative domains.
1.1 A minimal q-domain is a non-empty collection Q of entities closed under an additive operation ⊕, which commutes, associates and satisfies the strong trichotomy law that for any a,b∈Q we have exactly one of: ∃c(a = b ⊕ c), ∃c(b = a ⊕ c) or a = b. Any minimal q-domain is strictly totally ordered by <, defined by: a < b ↔ ∃c(a ⊕ c = b). Multiplication of elements of Q by positive integers is easily defined—inductively—in terms of ⊕.
1.2 A normal q-domain is any minimal q-domain meeting the [Archimedean] comparability condition: ∀a,b∈Q ∃m(ma > b).
Here and subsequently (unless explicitly indicated), m (and later n as well) ranges over positive integers. This requires quantities to be finite, in the sense that no quantity is infinitely greater (or smaller) than any other—it rules out infinitesimal quantities. With his eye on Euclid’s Def.4 of Elements Bk.V, Howard Stein 13 describes it as the condition necessary and sufficient for a and b to have a ratio. It might be compared, in status, to the requirement on concepts presupposed by Hume’s principle, that the concepts through which it quantifies be sortal—which might be described as the condition for a concept to have a (cardinal) number.
12 The basic idea is of course not new. It is, in particular, central to the theory of measurement advanced by N.R. Campbell in a number of works first published in the 1920s, the most important of them being Physics: the Elements (originally published by Cambridge University Press, 1919, and subsequently republished as Foundations of Science (Dover 1957)—see Part 2) and Measurement and Calculation (Longmans, Green & Co, London 1928). A briefer popular statement of his theory is given in What is Science? (Methuen, London 1921—see ch. VI). Whilst there is much in Campbell’s overall theory which I think we neither can nor need accept, I believe that Campbell was right, pace critics such as Brian Ellis (see Basic Concepts of Measurement, Cambridge University Press 1966, ch. IV), to insist upon the importance of a physical analogue of addition, and right too (at least in essentials) in taking there to be an important distinction between fundamental and derived measurement. More recent treatments of measurement—see, for example, the comprehensive text of Krantz, D.H., Luce, R.D., Suppes, P. and Tversky, A. (Foundations of Measurement New York and London: Academic Press 1971 (vol 1), 1989 (vols 2,3))—have not looked kindly on these distinctive features of Campbell’s approach. I need hardly emphasise that the very rough and dogmatic statement of my view, both here and in the text, requires both considerable qualification and further explanation, as well as defence.
Where Q, Q* are any normal q-domains, not necessarily distinct, we introduce ratios of quantities by the abstraction principle:
EM   ∀a,b∈Q ∀c,d∈Q* [a:b = c:d ↔ ∀m,n(ma ⋚ nb ↔ mc ⋚ nd)]
That is, ratios a:b and c:d are the same just if equimultiples of their numerators stand in the same order relations to equimultiples of their denominators. 14 The condition for identity of ratios is framed so as to allow that one and the same ratio may be at the same time a ratio of pairs of quantities of different kinds—belonging to different domains—such as masses and lengths. The operation in terms of which comparability is ultimately defined (i.e. addition of quantities) is, of course, domain specific—no sense is given to adding a length and a mass, for instance. But this does not preclude the introduction of ratios so that the same ratio may be found among, say, both masses and lengths.
1.3 A normal q-domain Q is full if ∀a,b,c∈Q ∃q∈Q(a:b = q:c). This condition, which is a restricted form of the ancient postulate of ‘fourth proportionals’, ensures that, given a pair of ratios a:b and c:d, there is a quantity c′ such that c′:b = c:d, so that we may always, without loss of generality, restrict attention to ratios with common denominators. I shall refer to it as CD. It is easy to see that CD ensures that there is no smallest quantity. 15
1.4 A full q-domain may be incomplete, in the sense that it may include only quantities which are rationally measurable; in consequence, the set of all ratios on a full domain is not guaranteed to include ratios corresponding to any, much less all, (positive) irrational numbers. 16 If ratio-abstraction is to yield all the positive reals, we require a complete domain. Indulging—for convenience, but avoidably—in set-theoretic language, we say that a subset S of quantities belonging to a q-domain Q is bounded above by b iff for every quantity a in S, a ≤ b. A quantity b ∈ Q is a least upper bound of S ⊆ Q iff b bounds S above & ∀c(c bounds S above → b ≤ c), and finally that a q-domain Q is complete iff Q is full and every bounded above non-empty S ⊆ Q has a least upper bound.
13 “Eudoxos and Dedekind: On the Ancient Greek Theory of Ratios and its Relation to Modern Mathematics” Synthese 84 (1990), pp. 163–82. Whilst the approach I pursue here differs quite radically from anything suggested by Stein, I have derived much benefit from this excellent paper.
14 This is, of course, the central principle in the ancient theory of proportion presented in Euclid’s Elements Book V (cf. Def.5) and standardly attributed to Eudoxos. I should perhaps emphasise that EM is not an abstraction principle of the form characterised at the outset. On the other hand, it should be clear that it is intended to work in essentially the same way as paradigm abstractions like the Direction equivalence and Hume’s principle and that it is reasonable to regard it as one. We might bring EM into line with the characterisation of abstraction principles with which I began by first defining an equivalence relation on ordered pairs of quantities: E[(a,b), (c,d)] ↔ ∀m,n(ma ⋚ nb ↔ mc ⋚ nd), and then setting: Ratio(a,b) = Ratio(c,d) ↔ E[(a,b), (c,d)]. Alternatively, if it were felt desirable to avoid reliance on the notion of an ordered pair, we could introduce an extension of the notion of an equivalence relation so as to allow relations of arity greater than 2 to qualify as equivalence relations. Later we shall meet another abstraction principle which does not, as it stands, conform to the usual characterisation, but which may readily be brought into line in one or other of these ways.
15 Although I am not identifying quantities, as such, with numbers of any kind, it should be fairly clear that a full domain, and likewise the domain of ratios on it, is dense, and that we can develop an ‘arithmetic’ of ratios structurally analogous to that of the positive rationals.
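The identity condition EM can be tried out on a small case. The following is only a sketch under stated assumptions: the quantities are taken to be positive integers with ordinary addition as ⊕ (a choice the text itself makes only later, for N+), and the function names order_sign, same_ratio_em and archimedean_witness are illustrative, not part of the text. For positive integers the quantification over all m, n can be cut off at max(a, b), since the pair m = b, n = a already separates unequal ratios.

```python
from itertools import product


def order_sign(x, y):
    """Return -1, 0 or 1 according as x < y, x == y or x > y."""
    return (x > y) - (x < y)


def same_ratio_em(a, b, c, d):
    """Test a:b = c:d by comparing equimultiples, as EM requires.

    Assumption: a, b, c, d are positive integers. It then suffices to let
    m and n run up to max(a, b): the pair m = b, n = a gives ma = nb, and
    mc = nd holds for that pair only if the two ratios coincide.
    """
    bound = max(a, b)
    return all(
        order_sign(m * a, n * b) == order_sign(m * c, n * d)
        for m, n in product(range(1, bound + 1), repeat=2)
    )


def archimedean_witness(a, b):
    """Return a positive integer m with m*a > b (the comparability condition)."""
    return b // a + 1


if __name__ == "__main__":
    print(same_ratio_em(3, 4, 6, 8))   # 3:4 against 6:8
    print(same_ratio_em(3, 4, 2, 3))   # 3:4 against 2:3
    print(archimedean_witness(2, 7))   # some m with 2m > 7
```

On this toy domain the EM test agrees with the familiar cross-multiplication test ad = bc, which is one way of seeing that nothing beyond order comparisons of equimultiples is needed to fix the identity of a ratio.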
2.2 Real numbers
We may straightforwardly define ‘bounded above’, ‘lub’ and ‘order-complete’ for ratios in a way that parallels our definitions of these notions for quantities and then prove, as an easy consequence of the completeness of the underlying domain, that where Q is any complete q-domain, the set R^Q of ratios on Q is order-complete. 17 It can be shown that if Q and Q* are any complete q-domains, they are isomorphic, so that R^Q = R^{Q*}, i.e. the set of ratios on Q is identical with the set of ratios on Q*. Thus provided there exists at least one complete q-domain, we can introduce the positive real numbers, by abstraction, as the ratios on that domain.
In standard constructions of the various number systems, negative numbers make their entry at an early stage. The method by which this is accomplished—introducing a new, enlarged domain including negative numbers as certain ordered pairs (difference pairs) of numbers belonging to an underlying domain—is, however, perfectly general, in the sense that it is quite inessential to it that the numbers in the underlying domain should be natural numbers. Of course, we must start with the natural numbers if we want to get just the integers—but in general, all that is required for the application of the method itself is that the objects belonging to the underlying domain have the requisite arithmetic properties. There is, so far as I can see, no reason, either technical or philosophical, why this step may not just as well be taken at a (much) later stage. In particular, essentially the same construction can be used to get negative reals, starting from positive ones, as difference pairs of positive reals. Letting x, y, z, . . . range over, and ⊕ stand for addition of, positive reals, we obtain difference pairs of positive reals by the abstraction:
D   (x, y) = (z, w) ↔ x ⊕ w = y ⊕ z
16 Of course, since quantitative domains, as I have characterised them, do not include either a zero quantity or negative quantities, the ratios on such domains will not, in any case, have elements corresponding to all the reals.
17 Proof: Let S be any bounded above subset of R^Q. By CD, each ratio in S can be expressed with a single common denominator, so that the members of S are: a1:b, a2:b, . . . , ai:b, . . . The set of numerators of these ratios is a non-empty subset of Q, and so—by the completeness of Q—has a least upper bound a◦. Since every ai ≤ a◦, ai:b ≤ a◦:b for every ratio ai:b in S. And if some ratio p:q is less than a◦:b, it follows [by CD] that p:q = p′:b for some p′, with p′ < a◦. But then by the completeness of Q, there is some ak among the numerators of the ratios ai:b so that p′ < ak, and hence a ratio ak:b in S such that p′:b < ak:b. So a◦:b is a least upper bound of S.
Defining <, >, addition, subtraction, multiplication and zero for d-pairs in the obvious way, it can be shown that the collection R of d-pairs forms a field with the operations + and ×. Further, there is a subset P of R, namely the set of all pairs (x, y) such that (z, z) < (x, y), meeting the conditions:
(i) if (x, y), (z, w) ∈ P then (x, y) + (z, w) ∈ P ∧ (x, y) × (z, w) ∈ P, and
(ii) if (x, y) ∈ R, then exactly one of (x, y) ∈ P, (y, x) ∈ P or (x, y) = (z, z) holds.
Thus R is an ordered field. There is an obvious isomorphism between the strictly positive subset P of R and the positive reals as previously defined. Using this, it can be shown without too much difficulty that R is complete.
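The difference-pair construction can be illustrated concretely. What follows is a hedged sketch, not the construction itself: positive rationals (Python's Fraction) stand in for the positive reals, the class name DPair is hypothetical, and the definitions of addition, multiplication and order are one natural reading of ‘the obvious way’, on which the pair (x, y) plays the role of x minus y.

```python
from fractions import Fraction


class DPair:
    """A difference pair (x, y) of positive numbers, read as 'x minus y'.

    Assumption: positive rationals stand in for the positive reals.
    """

    def __init__(self, x, y):
        x, y = Fraction(x), Fraction(y)
        if x <= 0 or y <= 0:
            raise ValueError("both components must be positive")
        self.x, self.y = x, y

    def __eq__(self, other):
        # Principle D: (x, y) = (z, w) iff x + w = y + z.
        return self.x + other.y == self.y + other.x

    def __lt__(self, other):
        # (x, y) < (z, w) iff x + w < z + y.
        return self.x + other.y < other.x + self.y

    def __add__(self, other):
        return DPair(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        # (x - y)(z - w) = (xz + yw) - (xw + yz); both parts stay positive.
        return DPair(self.x * other.x + self.y * other.y,
                     self.x * other.y + self.y * other.x)

    def in_P(self):
        """Membership in P: (z, z) < (x, y), which reduces to y < x."""
        return self.y < self.x


if __name__ == "__main__":
    zero = DPair(1, 1)
    minus_two = DPair(1, 3)
    print(minus_two + DPair(3, 1) == zero)
    print(minus_two * minus_two == DPair(5, 1))
```

Condition (ii) shows up here as the fact that, for any pair, exactly one of in_P() on the pair, in_P() on its reversal, and equality with a pair of the form (z, z) holds.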
3. The existence of quantitative domains
Our result thus far is conditional: real numbers may be obtained by abstraction on quantities, if there exists at least one complete q-domain. Even if this were the best result that could be obtained, it is not completely obvious that this would signal the collapse of the neo-Fregean abstractionist approach to foundations. It might be possible to provide principled reasons for adopting different attitudes towards the question of the existence of reals and that of the natural numbers, holding that while the latter admits of resolution, a priori, in the affirmative, the existence of the reals is a matter on which no similar a priori assurance is to be expected. According to such a view, the existence of (at least) finite cardinal numbers would be a matter of necessity—whatever the universe might be like, its ingredient objects would be assignable to distinguishable sorts or kinds; there would be some sortal concepts or other, under which the objects fell, so that for various concepts F and cardinal numbers n, there would be facts of the form: the number of Fs = n. More importantly, for any such sortal concept F, there will be a sortal—F-and-not-F—logically guaranteed to have no objects falling under it, in terms of which 0 may be defined, thus giving the necessary toe-hold for a Fregean proof of the existence of an infinite collection of finite cardinals. But there can be no similar a priori guarantee that the physical universe comprises quantities which are real-valued—it is perfectly conceivable, even if in fact false, that the physical world should be discontinuous. So a result which says, in effect, that if it does exhibit continuity, the real numbers are available to measure it, might not appear utterly outrageous.
Defending this position would, naturally, require speaking to the contrary intuition, that while it may be in some way an empirical question whether the physical universe is continuous, and so an empirical question whether the reals have ‘objective’ application [in the sense that there actually are real-valued quantities—contrast the idea that using the reals simply affords a useful simplification of applied mathematics], the existence of the reals should not itself be an empirical, a posteriori matter.
Clearly, however, it is important to enquire whether a neo-Fregean can secure a stronger result. Evidently, the question of greatest interest is whether there can be proved to exist a complete q-domain. But it is worth emphasising that the question arises, not only for the case of complete q-domains, but equally for q-domains of the more modest kinds described—thus far, nothing has been done to establish the existence of a full q-domain, or even that of a normal, or even minimal, one. Even the question of the existence of a minimal domain is anything but trivial. A minimal domain is, by definition, non-empty. Since such a domain is closed under its addition operation and satisfies the additive trichotomy condition, it must comprise arbitrarily large quantities, and thus be at least countably infinite. To anyone who thinks of quantities as physical entities of some sort, the existence of such a domain must, for this reason, appear open to serious question. On my own view, quantities such as lengths, masses, angles, etc., should not be thought of as physical entities; they are, rather, abstract objects, ‘introduced’ via abstraction principles employing appropriate equivalence relations on the concrete objects whose lengths, masses, etc., they are. But this makes no essential difference, so far as the present question is concerned. At least, it will make no difference if the existence of a given length, say, is taken to be contingent upon the existence of a suitable concrete entity of which it is the length; for in that case, the ground for doubt about the existence of arbitrarily large quantities of any given kind remains. Clearly there must be an analogous doubt about the existence of arbitrarily small quantities, and hence about the existence of a full q-domain. However, it seems to me that these doubts may be assuaged and that we can actually prove the existence of at least one domain of each of the kinds I have distinguished, including complete domains.
The crucial point here is to notice that whilst quantities as such are not identified, in my approach, with numbers, nothing in the characterisation of q-domains precludes such domains being composed of numbers. As previously remarked, Hume’s principle suffices for a derivation of the Dedekind–Peano axioms for elementary arithmetic, and hence for a proof of the existence of an infinite sequence of natural numbers—0, 1, 2, . . . . Omitting 0 to obtain the strictly positive naturals, N+, and adjusting the usual recursive definitions of + and × to suit, we can easily show that N+ constitutes a minimal—and indeed a normal—q-domain. It is clear that N+ is not itself a full domain, i.e. it does not satisfy CD. However, the collection R^{N+} of ratios on N+ does constitute a full domain. To see this, note first that since N+ is normal, there exists a ratio a:b for every a and b in N+. Let a, b, c, d, e, f be any elements of N+. Then what we must show is that there is a ratio g:h such that [a:b]:[c:d] = [g:h]:[e:f]. It is quite straightforward to verify that
[a:b]:[c:d] = ad:bc = ade:bce = [ade:bcf]:[bce:bcf] = [ade:bcf]:[e:f]
so that [ade:bcf] is our required ratio. 18 In the presence of CD, satisfaction by R^{N+} of the minimality and normality conditions follows easily from their satisfaction by the underlying domain N+. Thus R^{N+} is a full domain. What we have, in effect, is a quite natural way of obtaining the positive rationals by abstraction on the positive natural numbers—each and every positive rational is simply a ratio of positive natural numbers. Thus 3/4 just is the ratio 3:4. Of course, it is also the ratio 6:8 and the ratio 9:12, etc., but that is no problem, since these are all simply one and the same ratio in our sense (i.e. by the lights of EM).
It is clear that iteration of the abstractive procedure which yields R^{N+} from N+ will not yield any new kind of q-domain. The crucial point emerges above, in the observation that [a:b]:[c:d] = ad:bc. This holds quite generally—any ratio of ratios of positive natural numbers is simply a ratio of positive natural numbers. In the same way, ratios of ratios of ratios of positive natural numbers collapse to ratios of positive natural numbers. Iteration of the abstraction to ratios of higher order thus merely gives us the positive rationals all over again. Thus the operation by which we obtained a full domain from an underlying normal one cannot, when re-applied to a full domain, yield a complete one. This is a special case of a quite general fact about first-order abstraction: no first-order abstraction on an infinite domain can generate a ‘new’ domain of greater cardinal size than that abstracted on. It follows that if a complete domain is to be obtained by abstraction, we must invoke a second-order abstraction. In this way—and only in this way—we may advance from a domain of objects of given cardinality to a strictly larger domain of abstracts. Given an initial domain comprising κ objects, there will be 2^κ properties of those objects.
By taking these properties, rather than the objects which have them, as our underlying domain for an abstraction, we may obtain a strictly larger collection of abstracts—up to (but not more than) 2^κ of them. 19
We take as our initial domain the (at least countably infinite) full domain R^{N+} of ratios on N+. Our goal is to obtain a complete domain Q# by cut-abstraction, so-called because of its obvious correspondence to Dedekind’s construction. 20 As anticipated, cut-abstraction operates, not directly upon R^{N+} itself, but upon properties of a certain kind defined over its elements, which I shall call cut-properties. These are defined by reference to the ordering on R^{N+}. Informally, a cut-property is a non-empty property whose extension is a proper subset of R^{N+} and which is downwards closed [i.e. ∀a∀b(Fa → (b < a → Fb)) 21] and has no greatest instance [i.e. ∀a(Fa → ∃b(b > a ∧ Fb))]. We now introduce objects—cuts—corresponding to cut-properties by the abstraction principle:
Cut:   #F = #G ↔ ∀a(Fa ↔ Ga)
where F, G are any cut-properties on R^{N+} and a ranges over R^{N+}.
Q# is the collection of all cuts, #F, for cut-properties F on R^{N+}. It may be shown that Q# constitutes a complete domain, in the sense previously explained. Obviously the main thing here is to verify that Q# has the least upper bound property, i.e. where φ varies over properties of cuts on R^{N+}, and bounds above and lub are defined in an obvious way, that if ∃Fφ(#F) and φ is bounded above then φ has a least upper bound. This can be done, mimicking the usual proof, by defining the property H by: Ha ↔ ∃F(φ(#F) ∧ Fa)—we can then show that H is a cut-property and that #H is a lub of φ. We may define #F + #G to be #H, where Ha ↔ ∃b∃c(Fb ∧ Gc ∧ a = b⊕c), and #F × #G to be #P, where Pa ↔ ∃b∃c(Fb ∧ Gc ∧ a = b⊗c). With the aid of these and some supplementary definitions, it can then be proved that Q# is full, i.e. that it is a minimal q-domain which also meets the normality and common denominator conditions.
18 Recall that a, b, c, d, e, f are all positive integers. A ratio is unchanged by multiplying its numerator and denominator by the same positive integer. Hence a:b = ad:bd. Similarly, c:d = bc:bd. But the ratio to one another of ratios with a common denominator is simply the ratio of their numerators, so [ad:bd]:[bc:bd] = ad:bc, whence [a:b]:[c:d] = ad:bc. Further e:f = bce:bcf and ad:bc = ade:bce. Hence, since [ade:bcf]:[bce:bcf] = ade:bce, we have: [a:b]:[c:d] = ad:bc = ade:bce = [ade:bcf]:[bce:bcf] = [ade:bcf]:[e:f]
19 If κ is infinite and CH holds, then we shall, of course, get more than κ abstracts only if we get exactly 2^κ of them; but I am not assuming CH, much less GCH.
20 cf. Richard Dedekind Stetigkeit und Irrationale Zahlen (1872), translated by Wooster Woodruff Beman as “Continuity and Irrational Numbers” in Richard Dedekind Essays on the theory of numbers, New York: reprint, Dover Publications (1963), pp. 1–27.
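The defining conditions on a cut-property can be spot-checked on a familiar example. The sketch below rests on several assumptions not found in the text: positive rationals (Python's Fraction) stand in for the ratios on N+, the property chosen is ‘a·a < 2’, the helper names are illustrative, and the checks run only over a finite sample, so they illustrate the conditions rather than prove them. The witness b = (2a + 2)/(a + 2) is a standard choice: whenever a·a < 2 it satisfies b > a and b·b < 2.

```python
from fractions import Fraction


def sqrt2_cut(a):
    """Candidate cut-property F on positive rationals: Fa iff a*a < 2."""
    return a > 0 and a * a < 2


def larger_instance(a):
    """For a with a*a < 2, return some b > a that still satisfies b*b < 2."""
    return (2 * a + 2) / (a + 2)


def looks_like_cut_property(F, witness, samples):
    """Check the four cut-property conditions on a finite sample only."""
    instances = [a for a in samples if F(a)]
    non_empty = bool(instances)
    proper = any(not F(a) for a in samples)
    downward_closed = all(F(b) for a in instances for b in samples if b < a)
    no_greatest = all(witness(a) > a and F(witness(a)) for a in instances)
    return non_empty and proper and downward_closed and no_greatest


# A small grid of positive rationals p/q used as the sample.
SAMPLES = [Fraction(p, q) for p in range(1, 21) for q in range(1, 11)]

if __name__ == "__main__":
    print(looks_like_cut_property(sqrt2_cut, larger_instance, SAMPLES))
```

No positive rational has square 2, so this property has no least non-instance either; the cut it determines is not the cut of any ratio on N+, which is the sense in which cut-abstraction yields quantities beyond the rationally measurable ones.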
4. Safe abstractions and safe sets
Are the abstraction principles which I have employed all in good standing? The question is urgent, since we know that not all abstraction principles are acceptable, if only because some—Basic Law V being the obvious example—are inconsistent. And there may be other constraints, besides consistency, with which good abstractions must comply. A thorough examination of the question lies well beyond the scope of this paper, but I should like to conclude by saying a little about it. Of the abstraction principles I have used, two—ratio-abstraction (EM) and difference abstraction—are first-order, while the other two—Hume’s principle and Cut—are second-order. In the case of first-order abstraction, we abstract upon a domain of objects of some kind, and thereby come to recognise objects of another kind; with a second-order abstraction, by contrast, we abstract upon a domain of concepts, themselves defined on some underlying domain of objects, and come to recognise ‘new’ objects, i.e. objects of a kind other than those belonging to this underlying domain. I shall call the field of an abstraction’s equivalence relation the domain for the abstraction, and in the case where this is a domain of (first-level) concepts, I shall call the domain of objects on which these concepts are defined the underlying domain.
21 Here and subsequently a, b, . . . range over elements of R^{N+}.
In the case of second-order abstractions, the underlying domain – if it has a determinate size at all – is much smaller than the domain for the abstraction; if the underlying domain has cardinality κ, then the domain for the abstraction (assuming it to comprise all the concepts defined on the underlying domain, and assuming concepts to be individuated extensionally) has cardinality 2^κ. In consequence, the abstraction may ‘generate’ up to 2^κ abstracts – and so many more abstracts than there are objects in the underlying domain. It is this feature of second-order abstractions which has led some writers to think that it is these abstractions – in contrast with first-order abstractions – which pose the greatest worry, as far as the risk of inconsistency is concerned. I think that is correct, and I shall therefore focus on the second-order abstractions. In fact, since Hume’s principle is known to be consistent, I shall concentrate upon the other second-order abstraction I have used – cut-abstraction. Cut – in contrast with Hume’s principle and Basic Law V – is a restricted abstraction principle, in the sense that the domain for the abstraction comprises only cut-properties on a certain specified underlying domain of objects. It is obvious that if the side constraints on it are ignored, Cut is just a notational variant on Basic Law V. Clearly, then, from unrestricted Cut, we could derive Russell’s contradiction. If we define a Russell property R by:
Rx ↔ ∃F(x = #F ∧ ¬Fx),
then by unrestricted Cut we have:
#R = #R ↔ ∀x(Rx ↔ Rx),
whence: #R = #R – so #R exists, and we may proceed:
1     (1)   R(#R)                       assn
1     (2)   ∃F(#R = #F ∧ ¬F(#R))        1, Def R
3     (3)   #R = #F ∧ ¬F(#R)            assn
3     (4)   #R = #F                     3 ∧E
      (5)   #R = #F ↔ ∀x(Rx ↔ Fx)       (unrestricted) Cut
3     (6)   ∀x(Rx ↔ Fx)                 4, 5 ↔E
3     (7)   R(#R) ↔ F(#R)               6 ∀E
3     (8)   ¬F(#R)                      3 ∧E
3     (9)   ¬R(#R)                      7, 8 ↔E
1     (10)  ¬R(#R)                      2, 3, 9 ∃E
      (11)  R(#R) → ¬R(#R)              1, 10 →I
12    (12)  ¬R(#R)                      assn
      (13)  #R = #R                     =I
12    (14)  #R = #R ∧ ¬R(#R)            12, 13 ∧I
12    (15)  ∃F(#R = #F ∧ ¬F(#R))        14 ∃I
12    (16)  R(#R)                       15 Def R
      (17)  ¬R(#R) → R(#R)              12, 16 →I
      (18)  R(#R) ↔ ¬R(#R)              11, 17 ↔I
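As a rough illustration, which is not part of the original argument, the derivation has a finite counterpart that can be checked by brute force. The sketch below assumes that concepts are individuated extensionally (as subsets of a small finite domain) and that an abstraction operator # is simply any assignment of objects to concepts. For every such assignment the Russell concept R collides with some distinct concept F, so unrestricted Cut fails on that domain. All function names are hypothetical.

```python
from itertools import combinations, product


def concepts_on(domain):
    """All extensional concepts (subsets) on a finite domain."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def russell_concept(abstract_of):
    """R = {x : x = #F for some concept F such that not-F(x)}."""
    return frozenset(x for F, x in abstract_of.items() if x not in F)


def find_collision(abstract_of):
    """Return (R, F) with #F = #R and not-F(#R), or None if there is none."""
    R = russell_concept(abstract_of)
    r = abstract_of[R]
    for F, x in abstract_of.items():
        if x == r and r not in F:
            return R, F
    return None


def check_domain(domain):
    """Check every assignment # : concepts -> objects; return how many were checked."""
    domain = list(domain)
    concepts = concepts_on(domain)
    checked = 0
    for values in product(domain, repeat=len(concepts)):
        abstract_of = dict(zip(concepts, values))
        collision = find_collision(abstract_of)
        # The diagonal argument guarantees a collision for every assignment.
        assert collision is not None
        R, F = collision
        # R and F share an abstract although they are not co-extensive.
        assert R != F and abstract_of[R] == abstract_of[F]
        checked += 1
    return checked
```

Calling `check_domain([0, 1, 2])` runs through every assignment on a three-element domain. The check says nothing about restricted Cut, where R need not be a cut-property and #R need not lie in the underlying domain.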
With the constraints on Cut in place, however, this derivation will not go through without two further assumptions: to establish the existence of #R, and to justify the (second-order) ∀E step involved at line (5), we must assume that R is a cut-property on R^{N+}; and for the application of ∀E at line (7), we must further assume that #R is in R^{N+}. Since the contradiction at line (18) depends upon these further assumptions, we may apply reductio to infer that either R isn’t a cut-property on R^{N+}, or #R is not an element of R^{N+}. Does that settle the matter? Well, no. The particular cut-abstraction principle I’ve used may be viewed as a special case of a general schema which runs:
(#) #F = #G ↔ ∀a(Fa ↔ Ga)
where F, G are any cut-properties on a suitable domain Q and a ranges over Q. A suitable domain Q here will be any domain with an at least dense linear ordering, with respect to which cut-properties are definable. Two obvious questions which may be raised about this general schema are: Are all its instances safe? If not, what distinguishes those which are from those which are not? I’ll venture a few somewhat tentative thoughts about these questions. Perhaps the first thing I should say is that I am not, so far as I can see, committed to endorsing all instances of (#) – i.e. to defending its universal closure with respect to Q – though I would think that, should it prove that some of its instances are either prone to Russell trouble or otherwise unsafe, it should be possible to provide some principled characterisation/explanation of the limitations here. It is clear that so long as the underlying domain Q for an instance of (#) is not inclusive of all objects whatever, any derivation of Russell’s contradiction can be seen, not as showing the inconsistency of that instance of (#), but as a demonstration that either the Russell property R cannot be a cut-property on Q or the Russell cut #R cannot be an element of Q.
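The identity condition in the schema (#) can be illustrated with a small sketch. It rests on assumptions not drawn from this section: cut-properties are modelled as Python predicates on positive rationals, only a finite sample of the domain is inspected (so the check is a sanity test, not a proof of ∀a), and downward closure stands in for the full definition of a cut-property. All names are hypothetical.

```python
from fractions import Fraction


def sample_domain(max_numerator=40, max_denominator=12):
    """A finite sample of positive rationals, standing in for the domain Q."""
    return sorted({Fraction(n, d)
                   for n in range(1, max_numerator + 1)
                   for d in range(1, max_denominator + 1)})


def below_root_two(a):
    """One presentation of the cut at the square root of 2."""
    return a * a < 2


def below_root_two_again(a):
    """A different presentation of the same cut (for positive a)."""
    return a ** 4 < 4


def below_root_three(a):
    """A different cut."""
    return a * a < 3


def downward_closed_on(sample, F):
    """On the sample: whenever F holds of b and a < b, F holds of a."""
    flags = [F(a) for a in sorted(sample)]
    return all(earlier or not later
               for earlier, later in zip(flags, flags[1:]))


def same_cut_on(sample, F, G):
    """Right-hand side of (#), restricted to the sample: Fa <-> Ga for each a."""
    return all(F(a) == G(a) for a in sample)
```

On the sample, `below_root_two` and `below_root_two_again` agree everywhere, so (#) would assign them the same abstract, while `below_root_three` differs from both (for instance at 3/2).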
If the universe of all objects whatever constitutes an admissible underlying domain for cut-abstraction, then the Russell cut, if there is such an object at all, must belong to that domain – so the second option lapses. But the first remains open. There will be such an object as the Russell cut only if the Russell property is a cut-property on the universe. But, at least in the absence of any compelling independent reason to think (#) defective, a derivation of the Russell contradiction would seem to give us ample reason to think that the Russell property cannot be a cut-property on the universe. If what I have said is right, it is possible to block Russell trouble without challenging the assumption that the universe constitutes an admissible underlying domain for cut-abstraction. The point is, however, somewhat academic since there are other worries – having more to do with Cantor’s paradox than with Russell’s – which are, I think, best answered by rejecting that assumption. Briefly, cut-abstraction, for all I have said thus far, may be applied to any domain on which cut-properties are definable – that is, any domain with an at least dense linear ordering. If the chosen domain is strictly dense (i.e. dense – like the rationals – but not complete – like the reals), then an instance of cut-abstraction will inflate, in the sense that there are more abstracts ‘generated’
than there are objects in the underlying domain (i.e. the domain on which the cut-properties are defined). 22 If it is dense but complete, then there will be no inflation—the collection of abstracts will be isomorphic to the underlying object domain. If the universe of all objects whatever admits of a strictly dense linear ordering and can be taken as a domain for cut-abstraction, we shall wind up with more abstracts (and so more objects) than there are objects altogether! How should we avoid this disastrous conclusion? The answer I shall tentatively commend makes crucial play with the contrast I drew previously between unrestricted abstractions, such as Hume’s principle, and restricted ones, such as cut-abstraction. In the case of Hume’s principle, it is essential that the first-order quantifiers on its right-hand side be allowed to range unrestrictedly over all objects whatever, including—crucially—the numbers themselves. In this sense, the first-order quantifiers in Hume’s principle must be understood impredicatively. If instead those quantifiers were restricted so as to range only over objects other than numbers, we could not prove the infinity of the sequence of finite numbers—at least, not without the additional assumption that there exist infinitely many objects of some other kind. With cut-abstraction, by contrast, it is unnecessary—in order to ensure that the abstraction delivers all the abstracts we require—to construe its first-order quantifier impredicatively in this way. Moreover, if we do allow that—in particular, if we allow an instance of the cut-schema whose first-order quantifier ranges over all objects whatever—then we will (provided the universe admits of a strictly dense ordering) run into Cantor-type trouble. But we do not have to allow this. 
22 As an anonymous referee, Stewart Shapiro and his student Roy Cook (independently) pointed out to me, cut-abstraction inflates at every cardinality, in the sense that, for every cardinal κ, there is a domain of size κ with a strictly dense linear order on it, so that cut-abstraction applies to yield a ‘new’ domain of size 2^κ.
As I have explained, cut-abstraction is – in contrast with Hume’s principle, and Basic Law V – a restricted abstraction, in the sense that each instance of the cut-schema (#) involves a restriction to a specified underlying domain, over which its first-order quantifier ranges. All I have said thus far about what constitutes a suitable underlying domain is that it shall be some densely ordered collection of objects. But as far as I can see, nothing stands in the way of imposing a further restriction which will preclude application of cut-abstraction to the universe as a whole. It may seem that the most obvious way to do this would be to incorporate a ‘limitation of size’ requirement in the conditions for a suitable domain for cut-abstraction – the idea would be to require that any suitable domain Q for cut-abstraction be smaller than the universe. This would bring cut-abstraction much closer to the modified version of Basic Law V which George Boolos dubbed New V. Following Boolos, say that a concept F is a subconcept of a concept G iff ∀x(Fx → Gx), and that F goes into G iff F ≈ H for some subconcept H of G. Let V be the concept [x: x = x], and say that F is small iff V does not go into F. Define F to be similar to G iff (F is small ∨ G is small → ∀x(Fx ↔ Gx)). Similarity is an equivalence relation. New V is then the abstraction:
New V: *F = *G ↔ F is similar to G
If we agree – as I think we should – that the numbers may only properly be assigned to genuine sortal concepts – that is, roughly, concepts F with which are associated not only criteria of application but also criteria of identity – then we should be happy with this modification (of either cut-abstraction or Basic Law V) only if we are persuaded that self-identity is a genuine sortal. For if a concept F can have a number only if F is sortal, then, assuming Hume’s principle, F can be equinumerous with itself only if it is sortal. And if it can’t be equinumerous with itself, it can scarcely be equinumerous with any other concept. Since small is defined so that F is small iff self-identity doesn’t go into F, New V is a real restriction of Basic Law V only if self-identity is a genuine sortal. I do not think it is. A simple argument due to Crispin Wright shows, in effect, that if self-identity were a genuine sortal, many concepts which are plainly not sortal would qualify as such. The argument turns on the point that whenever a concept G is genuinely sortal, its restriction by any other (even merely adjectival) concept F – i.e. the conjunctive concept: F-and-G – will likewise be sortal. For example, since horse is, presumably, genuinely sortal, so is white horse, for all that the restricting concept white is no sortal. Thus if self-identical were a genuine sortal, so would be any restriction of it, such as white-and-self-identical. However, since white-and-self-identical is equivalent to white, it would follow that white is after all a sortal concept. Since white (or white thing) is not a genuine sortal, neither can self-identical be one. For the same reason, clearly, no concept which applies universally can be a genuine sortal concept. 23 If this is right, some other means of formulating the needed restriction is required. There is an obvious next thought.
Why should we not simply stipulate that a predicate Q determines a suitable domain for cut-abstraction only if Q is genuinely sortal? Since neither self-identity, nor any other predicate (such as ‘F ∨ ¬F’) which is guaranteed application to all objects whatever, is a genuine sortal, this will ensure that the universe of objects as a whole – even if it admits of a strictly dense ordering – is not an admissible domain for cut-abstraction.
23 cf. Wright “Is Hume’s principle analytic?”. Wright formulates the argument slightly differently, as follows: “Call a concept that is not sortal a mere predicable. Where F is a mere predicable, the question: “How many F’s are there?”, is deficient in sense and “the number of F’s” has no determinate reference. However, attaching a mere predicable to a genuine sortal, G, produces a complex, restricted sortal, F-and-G, such that there can be, and normally will be, a determinate number of objects falling under it. Thus if F is any mere predicable, and self-identity is a genuine sortal, there will be a determinate number of objects which are F and self-identical. But since F and self-identical is equivalent to F, it follows that there can be no such determinate number wherever there is no determinate number of F’s – i.e. wherever F is a mere predicable. So self-identity is not a sortal concept”.
A thorough defence of this proposal requires more space than I have here. To conclude, I should like to comment briefly on three points.
(i) It might be observed that a restriction of admissible domains to those specifiable by sortal concepts will not, on the face of it, exclude certain very large domains such as those comprising all ordinals, or all cardinals, or all sets (since the relevant concepts appear to qualify as genuinely sortal) – giving rise to concern that paradox may still be derivable from cut-abstraction by taking one or other of these collections as underlying domain. I think this might be met in either of two ways. First, any attempt to generate paradox from (#) by taking the ordinals, say, as domain will – so far as I can see – rely on the idea that the collection of all ordinals is universe-sized. That requires the assumption that the concept ordinal number is equinumerous with some concept under which every object – whether an ordinal number or not – falls. But if what I have already said is right, concepts can be equinumerous only if both are sortal, and there can be no universal sortal concept, so that this assumption can be rejected, and there will be no need to strengthen the restriction on cut-abstraction to preclude taking the ordinals, etc., as domains. But second, even if it should prove necessary to exclude the ordinals, etc., as admissible domains for cut-abstraction, there is a quite natural way to do this. Instead of requiring simply that an admissible domain be given by a sortal concept, we might require that such a domain should have a determinate cardinal size. Since being the extension of a sortal concept is at least a necessary condition for a collection to have a determinate size, this restriction would encompass the one already proposed. If this necessary condition is not sufficient – i.e. if certain sortal concepts fail to have determinately-sized extensions – then those concepts will be excluded by the revised restriction.
In particular, what Michael Dummett has called indefinitely extensible concepts, such as ordinal, cardinal and set itself, will be excluded.
(ii) It may be objected that restricting admissible domains for cut-abstraction in either of the ways suggested is arbitrary or ad hoc. And the objection might be thought to draw strength from the neo-Fregean’s willingness (and, indeed, need) to employ unrestricted abstractions such as Hume’s principle. I shall make just two quick points in reply, leaving – no doubt – much more to be said. First, as should by now be clear, it is in fact false that Hume’s principle is a completely unrestricted abstraction – although its first-order quantifiers are unrestricted, its initial second-order quantifiers are – crucially – restricted to sortal concepts. Second, my proposed restriction(s) on cut-abstraction appear to be no more arbitrary or ad hoc than the restriction which New V seeks to build into Basic Law V. It is true that the manner in which the restriction is imposed on (#) differs, formally, from what happens with New V – where what is done is not to restrict the range of any quantifier, but to complicate the equivalence relation – with the effect that when F and G are not small, *F and *G exist, but are identified irrespective of whether their concepts are co-extensive. But I think this difference is superficial. Provided that the conditions for a first-level concept to be sortal can be expressed (using only logical vocabulary) in a second- (or perhaps third-) order language, I can see no reason why (#) should not be recast in essentially the same mould as New V. And if they cannot be so expressed, that is bad news (if it really is bad) not only for (#) but for New V too, for reasons already mentioned. But I am not persuaded that it would be bad news – since I see no ground for assuming that every philosophically important concept must be capable of definitive expression in the purely logical vocabulary of a second- or third-order language.
(iii) Finally, a quick word about the state of the economy. Some recent writers 24 have claimed – plausibly, in view of the obvious risk of some form of Cantor’s paradox – that acceptable abstractions should be, in some sense, non-inflationary. Is cut-abstraction inflationary, in any objectionable sense? Some care needs to be exercised in characterising the relevant notion of inflationariness, since a great part of the point and interest of abstractions lies in the fact that they ‘generate’ objects which are ‘new’, and so, in a certain sense, ‘expand’ the underlying domain. So that in one way, inflation – or at least domain-expansion – is just what the neo-Fregean wants. Of course, this way of putting the matter is potentially very misleading, since it gives the entirely false impression of ontological prestidigitation – in which abstraction creates objects out of nothing, as it were, much as a practised conjurer appears to pull pigeons out of thin air. The neo-Fregean can, and should, insist upon a more sober description of what is going on.
What an abstraction does, if all goes well, is to set up a concept – of direction, or cardinal number, or whatever – by supplying necessary and sufficient conditions for the truth of identity-statements linking terms which purport reference to objects falling under it. It draws our attention to the possibility of redescribing – or reconceptualising – the state of affairs which consists in line a being parallel to line b, for example, in terms of the holding of the relation of identity between certain objects, the direction of a and the direction of b. 25 Accepting the proposed reconceptualisation does not – in and of itself – involve acknowledging the existence of these objects. What it involves, rather, is accepting that the question whether there are such objects reduces to the question whether suitable instances of the right-hand side of the abstraction principle are indeed true. So what an abstraction does is not to ‘create’ objects, but to equip us to recognise, identify and distinguish objects which we could not recognise, identify and distinguish before – i.e. in advance of grasping the concept which the abstraction introduces.
24 See Kit Fine “The limits of abstraction”, in M. Schirn ed. The Philosophy of Mathematics Today, Oxford University Press (1998), pp. 503–629.
25 For fuller discussion of this idea, see Wright “On the Philosophical Significance of Frege’s Theorem”, §I; Hale “Dummett’s critique of Wright’s attempt to resuscitate Frege”, §2; and Hale “Grundlagen §64” passim.
If inflation of this kind is acceptable, what kind might not be? Kit Fine writes:
Two necessary conditions for the truth of an abstraction principle hold as matter of logic. . . . In the first place, it follows from the truth of an abstraction principle that its underlying criterion of identity on concepts should be an equivalence relation . . . Secondly, it follows from the truth of an abstraction principle that the identity criterion should not be inflationary, the number of equivalence classes must not outstrip the number of objects. There must, that is to say, be a one–one correspondence between all of the equivalence classes, or their representatives, on the one hand, and some or all of the objects, on the other. It is, of course, on this score that Law V proves unacceptable; for where there are n objects, it demands that there be 2^n abstracts. 26
26 “The limits of abstraction”, p. 506.
There is, I think, some ambiguity or vagueness in these remarks which we need to resolve if avoidable confusion is to be avoided. Let us say that an abstraction A inflates on an underlying domain D if A’s equivalence relation partitions D into more equivalence classes than D has elements. Then one might say that an abstraction is weakly inflationary if there is some domain on which it inflates, and strongly inflationary if it inflates on every domain (or perhaps – a little less exiguously – on some domain of cardinality κ, for every cardinal κ). 27 To require of an acceptable abstraction that it should not be (even) weakly inflationary would stop the neo-Fregean project dead in its tracks, before it even got moving (as it were). It will be clear that I think there is no good ground to impose such a requirement, and I shall not discuss it further. It is much more plausible to require that acceptable abstractions should not be strongly inflationary. 28 Some of the neo-Fregean’s key abstractions, including the other crucial second-order abstraction, Hume’s principle, satisfy this requirement. 29 But whilst the requirement that abstractions not be strongly inflationary is more plausible, I can see no compelling reason to accept it in full generality – that is, as applying both to unrestricted abstractions and restricted ones. It may be necessary to insist that no unrestricted abstraction can be strongly inflationary. But, as I have tried to make plausible, it is unnecessary to require this of restricted abstractions. The cut-schema, in particular, is strongly inflationary in the sense that for every cardinality κ, there is an admissible domain of cardinality κ on which an instance of (#) inflates. But that, so far as I can see, does no harm, provided admissible domains are restricted to those given by genuine sortal concepts (or perhaps, those of determinate cardinal size).
27 This characterisation of weak and strong inflation applies directly only to abstractions – like Hume’s principle and Basic Law V – which are not restricted abstractions in the sense previously explained, i.e. are not such that their formulation already involves a specification of a particular domain as the underlying domain for the abstraction. Since any particular cut-abstraction, such as Cut, is restricted in this sense, there can be no question of its being strongly inflationary. We can, however, properly ask of the corresponding general schema – (#) in the case of Cut – whether it is strongly inflationary.
28 More plausible, because it might seem that strong inflation is bound to give rise to a version of Cantor’s paradox. It might also be thought that if an abstraction is strongly inflationary, then there could be no hope of showing that it is satisfiable, i.e. has a model – for let D be any domain, of cardinality κ, say. Then any strongly inflationary abstraction inflates on D, i.e. its equivalence relation partitions D into more than κ equivalence classes, and so ‘generates’ more than κ abstracts. Thus D cannot be a model for the abstraction. But D was any domain whatever, so our abstraction can have no models. On reflection, it should be apparent that this short argument involves an unstated assumption – that the domain of any putative model for an abstraction must be the underlying domain for the abstraction. As against this, I cannot see why, in setting up a model for a restricted abstraction – such as cut-abstraction – we should not choose as the domain of the model some larger collection which properly includes the collection which is to play the rôle of the underlying domain for the abstraction.
29 Hume’s principle inflates, of course, on any finite domain, but it can be shown – assuming Choice, but without assuming CH or GCH – that it does not inflate on any infinite domain.
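The notion of an abstraction inflating on a domain can be made concrete for finite domains. The following is a minimal sketch, assuming that concepts are individuated extensionally as subsets of the domain and that an abstraction is represented just by its equivalence relation on concepts; the function names are hypothetical. It counts equivalence classes for Basic Law V (co-extensiveness) and Hume’s principle (equinumerosity) and compares the count with the size of the domain.

```python
from itertools import combinations


def concepts_on(domain):
    """All extensional concepts (subsets) on a finite domain."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def count_abstracts(domain, equivalent):
    """Number of equivalence classes of concepts under `equivalent`."""
    representatives = []
    for concept in concepts_on(domain):
        if not any(equivalent(concept, rep) for rep in representatives):
            representatives.append(concept)
    return len(representatives)


def coextensive(F, G):
    """Equivalence relation of Basic Law V."""
    return F == G


def equinumerous(F, G):
    """Equivalence relation of Hume's principle (finite case only)."""
    return len(F) == len(G)


def inflates_on(domain, equivalent):
    """True if there are more abstracts than objects in the domain."""
    return count_abstracts(domain, equivalent) > len(list(domain))


if __name__ == "__main__":
    for n in range(5):
        print(n,
              count_abstracts(range(n), coextensive),
              count_abstracts(range(n), equinumerous))
```

On a domain of n objects the count is 2^n for Basic Law V and n + 1 for Hume’s principle, so both inflate on every finite domain. The difference between them appears only on infinite domains, which a brute-force count of this kind cannot reach.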
5. Summary and concluding remarks
My aim in this paper has been to set forth one plausible way in which a neo-Fregean account of arithmetic may be extended to encompass the real numbers. I have followed Frege himself in suggesting that the reals should be introduced as ratios of quantities. This approach, as Frege perceived, demands a prior analysis of the notion of quantity. I have agreed with Frege, too, in thinking that this should be done by providing a general characterisation of what he called quantitative domains, but have offered a somewhat different account of them from that given in Grundgesetze. Ratios of quantities are introduced by an abstraction principle based on the ancient theory of proportion which comes down to us from Eudoxos. The positive reals are then obtainable as ratios of quantities in a complete quantitative domain, and zero and the negative reals by essentially the move by which the integers are standardly constructed as difference-pairs of natural numbers. My construction, taken by itself, establishes only a conditional result: if there exists a complete quantitative domain, then the reals may be introduced as ratios of quantities on it. However, as I argue in the second half of the paper, there is a route by which a neo-Fregean may establish the existence of at least one complete domain, starting with the natural numbers (as given by Hume’s principle), by successively applying ratio-abstraction to obtain a full domain and a suitably adapted version of Dedekind’s method of cuts to obtain from this a complete domain.
Two points deserve emphasis: first, quantities, though (on my account) abstract objects which are sharply to be distinguished from the concrete entities which stand in various quantitative relations to one another, are not themselves to be identified with numbers; and second, although I use a version of Dedekind’s method in proving the existence of a complete domain, there is no question, on the present approach, of defining the reals as or in terms of Dedekind cuts. Here is not the place to elaborate upon the significance of these points. The first is, I believe, integral to the defence of my approach against several more or less familiar objections to older attempts to treat real numbers as directly abstracted from quantitative relations among concrete entities— but that defence is best conducted in the context of a more searching analysis
of the notion of quantity than I have had space for here. Such an analysis would also do much to motivate the axiomatic characterisation of quantitative domains which I have been obliged to state somewhat dogmatically, without the philosophical defence it surely requires. The second is essential to the claim of the present approach to respect Frege’s belief—I would say, insight— that a satisfying foundational account of the real numbers should introduce them in a way which expressly provides for their applications. 30, 31
30 If one disregards this constraint – as I think one should not – then it would, of course, be possible to obtain the reals by Fregean abstraction in a much simpler and more direct way than I have described. One might, for example, start with the natural numbers as given by Hume’s principle, obtain rationals by some form of ratio-abstraction (such as that employed here, but there are obviously other ways in which this might be done) and then directly introduce the reals as cuts by cut-abstraction (either as explained here, or in some similar way).
31 I am indebted to an anonymous referee for this journal, and to Roy Cook, Jim Edwards, Gary Kemp, Pierluigi Miraglia, Philip Percival, Stewart Shapiro, Neil Tennant and Crispin Wright for helpful discussion of earlier versions of this material, as well as to my audiences at presentations of parts of it in Cambridge, Columbus OH, Glasgow, L’Institut d’Histoire et Philosophie des Sciences et des Techniques in Paris and St. Andrews. Very special thanks are due to my colleague Adam Rieger. Work on this paper was carried out during my tenure of a British Academy Research Readership – I am most grateful to the Academy for its generous support.
THE STATE OF THE ECONOMY: NEO-LOGICISM AND INFLATION 1,2
Roy T. Cook
1. Introduction
In recent years there has been a resurgence of interest in logicism as a viable philosophy of mathematics, stemming in great part from Crispin Wright’s Frege’s Conception of Numbers as Objects [1984] and the formal and philosophical work of George Boolos. Before this work it was generally accepted that Frege’s project of reducing mathematics to pure logic was devastated by Russell’s detection of a paradox produced by Frege’s notorious Basic Law V. Frege’s project has recently been reborn, with some modifications. In this paper I explore some of the landscape surrounding this project, concentrating on the prospects for a successful neo-logicist reconstruction of the real numbers. I focus on Bob Hale’s “Reals by Abstraction” [2000] and his use of a cut abstraction principle, as this approach seems to be the one most likely to be generalizable to complex analysis, functional analysis, etc. There is a serious problem that plagues Hale’s project. Natural generalizations of the sort of principle needed to construct the reals imply that there are far more objects than one would expect from a position that stresses its epistemological conservativeness. In other words, the sort of abstraction needed to obtain a theory of the reals is rampantly inflationary. After arguing for this claim with respect to Hale’s treatment, I will indicate briefly why this problem is likely to reappear in any neo-logicist reconstruction of real analysis.
1 This paper first appeared in Philosophia Mathematica 10 [2002], pp. 43–66. Reprinted by kind permission of the editor and Oxford University Press.
2 This title is taken from a phrase used by Bob Hale [2000] in his own discussion of neo-logicist abstraction principles and domain inflation.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 197–218. © 2007 Springer.
2. Abstraction principles
An abstraction principle 3 is any second-order formula 4 of the form:
(∀P)(∀Q)[@(P) = @(Q) ↔ E(P, Q)]
“@” here is a function from properties (or relations) to objects, and E is an equivalence relation on the properties (or relations). Abstraction principles allow us to take, as objects, characteristics that the properties or relations have in common. Frege’s Basic Law V is: BLV : (∀P)(∀Q)[EXT(P) = EXT(Q) ↔ (∀x)(Px ↔ Qx)] Frege derives all of arithmetic from BLV plus second-order logic, but Russell’s discovery that BLV is inconsistent with the second-order comprehension axiom renders this result less noteworthy. The resurrection of logicism stems for the observation that Frege’s only ineliminable use of BLV occurs in his derivation of Hume’s Principle: HP : (∀P)(∀Q)[NUM(P) = NUM(Q) ↔ P ≈ Q] [P ≈ Q is the second-order formula asserting that there is a one-to-one correspondence between the P’s and the Q’s] 5 The “NUM” operator is, in effect, a number generating function, mapping properties onto the number corresponding to the cardinality of the extension of the property. Unlike BLV above, HP is consistent. It can be added to any theory that has an infinite model, and the new theory will have (infinite) models of the same cardinality 6 as the original theory. Frege’s derivation of arithmetic in the Grundgesetze can be reconstructed from second-order logic plus HP, thereby avoiding the troublesome BLV. 7 This result, quite remarkable as a mathematical fact independent of any philosophical implications, has come to be called Frege’s Theorem. 8 Of course, HP, with its explicit reference to numbers via the “NUM” function, is not a logical truth. Thus, the neo-logicist must abandon the hope that 3 I am ignoring “objectual” abstraction principles where the abstraction operator “@” maps objects onto objects, as the phenomena that interest us here involve only “conceptual” abstraction, where properties or relations are mapped onto objects. 
4 I assume standard set theoretic semantics for second-order logic, where the second-order predicate variables range over the full powerset of the domain, and which therefore satisfies the comprehension scheme:
(∃R)(∀x1 , x2 , . . . , xn )(R(x1 , x2 , . . . , xn ) ↔ ) for each formula not containing R free. For details see Shapiro [1991]. 5 Although I often phrase the equivalence relation for an abstraction principle in everyday English, every abstraction principle considered in this paper can be expressed using only the resources of second-order logic (plus, in some cases, previously defined abstraction operators). 6 This result depends on the axiom of choice. 7 See Boolos [1987] and Heck [1993]. 8 Another abstraction principle which will be used in examples later in the paper is a size-restricted version of Basic Law V: NewV : (∀P)(∀Q)[EXT(P) = EXT(Q) ↔ ((Pis “Big” ∧ Qis “Big”) ∨ (∀x)(Px ↔ Qx))]
The State of the Economy: Neo-Logicism and Inflation
199
one can reduce all of mathematics to truths of pure logic, but this is not surprising. The discovery of Russell’s paradox, coupled with the failure of Russell and Whitehead’s subsequent logicist attempt in the Principia Mathematica [1913], suffice to render the original logicist project implausible. In addition, Boolos argues that this sort of reduction of mathematics to pure logic is in principle impossible: mathematics has ontological commitments, while on the contemporary conception logic does not. 9 One can argue, however, that the crucial aspect of Frege’s logicism is not the reduction of all of mathematics to truths of logic. Instead, Frege’s main goal was to demonstrate the analyticity of mathematics, saving it from Kant’s charge of a priori yet synthetic (see Coffa [1991]): The problem becomes, in fact, that of finding the proof of the proposition, and of following it up right back to the primitive truths. If, in carrying out this process, we come only on logical laws and on definitions, then the truth is an analytic one . . . If however, it is impossible to give the proof without making use of truths which are not of a general logical nature, but belong to the sphere of some special science, then the proposition is a synthetic one. (Frege [1884], p. 4, emphasis added)
According to Frege, the aprioricity of mathematics is a direct consequence of its analyticity: For a truth to be a posteriori, it must be impossible to construct a proof of it without including an appeal to facts, i.e. to truths which cannot be proved and are not general. But if, on the contrary, its proof can be derived exclusively from general laws, which themselves neither need nor admit of proof, then the truth is a priori. ([1884], p. 4)
[where “P is Big” is an abbreviation for the second-order formula asserting that the P’s are equinumerous with the entire domain]. NewV is consistent, and satisfied by the hereditarily finite sets Vω. Many of the standard axioms of ZFC (but not infinity or powerset) can be reconstructed using NewV.
9 Except for (∃x)(x = x), the claim that there is at least one object. Even this, however, is only accepted for convenience. We could easily formulate a logic that countenanced the empty model.
10 The assumption that logic is analytic if anything is seems unproblematic. Even Quine admits something like this in “Two Dogmas of Empiricism” [1951]!
Thus, the reduction of mathematics to logic was just the particular strategy Frege adopted to secure the analyticity 10 and apriority of mathematics. Although Frege abandoned his project, Wright has revived it, stressing that the part of Frege’s project that is of interest is not the reduction of mathematics to logic but rather a demonstration of the analyticity, or at least aprioricity, of (much of) mathematics:
Frege’s Theorem will still ensure . . . that the fundamental laws of arithmetic can be derived within a system of second-order logic augmented by a principle whose role is to explain, if not exactly to define, the general notion of identity of cardinal number, and that this explanation proceeds in terms of a notion which can be defined in terms of the concepts of second-order logic. If such an explanatory
principle . . . can be regarded as analytic, then that should suffice . . . to demonstrate the analyticity of arithmetic. Even if that term is found troubling, as for instance by George Boolos, it will remain that Hume’s principle – like any principle serving implicitly to define a certain concept – will be available without significant epistemological presupposition . . . So one clear a priori route to the recognition of the truth of . . . the fundamental laws of arithmetic will have been made out. And if in addition [Hume’s principle] may be viewed as a complete explanation – as showing how the concept of cardinal number may be fully understood on a purely logical basis – then arithmetic will have been shown up by Hume’s principle . . . as transcending logic only to the extent that it makes use of a logical abstraction principle – one [that] deploys only logical notions. So, always provided that concept formation by abstraction is accepted, there will be an a priori route from mastery of second-order logic to a full understanding and grasp of the truth of the fundamental laws of arithmetic. Such an epistemological route . . . would be an outcome still worth describing as logicism. ([1997], pp. 210–211, emphasis added)
Although the neo-logicists are a bit vague regarding exactly what the special status of abstraction principles is, the general idea seems to be something along the following lines: Acceptable abstraction principles provide something akin to an implicit definition of the abstracts generated by the principle, providing an explanation, although not necessarily a complete 11 explanation, of what it is to be an abstract of the relevant sort. This explanation provides us with a method by which we can come to know truths about these abstracts a priori. 12 Thus, the abstraction principles are meant, among other things, to provide some sort of epistemological advantage – the idea being that we can get all of arithmetic, for example, from the epistemologically unproblematic HP. Finally, although abstraction principles are not logical truths, the fact that they invoke only logical terminology on the right-hand side of the biconditional in giving the truth conditions of the identity on the left supports the claim that neo-logicism provides “an outcome still worth describing as logicism”.
Of course, as is well known, most 13 of modern mathematics can be reconstructed quite nicely in Zermelo–Fraenkel set theory. In addition, some philosophers, such as Gödel [1947], argue that the axioms of set theory are a priori knowable. 14
11 Abstraction principles, according to many critics (and some defenders) of neo-logicism, notoriously fail to solve the “Caesar Problem”.
12 This emphasis on how we come to know the truths of mathematics seems to be what is crucial in the passages from Frege’s Grundlagen quoted above.
13 One standard textbook on set theory, Kunen [1980], contains the following as an exercise:
Verify that within ZC [ZFC minus replacement] one may develop at least 99% of modern mathematics. (p. 147)
14 Unlike the neo-logicists, Gödel would not have claimed that the axioms of set theory are analytic, as their truth depends not only on the meaning of the terms involved but also (in some manner) on our direct intuition of the set theoretic universe (see Gödel [1947]).
The interest of the neo-logicist project, then, depends on the extent to which it can be argued that the necessary abstraction principles
are epistemically “cheap” in a way in which the axioms of ZFC are not. A first step towards this goal is an account of which abstraction principles are neo-logicistically acceptable. Mere consistency is not enough for an abstraction principle to be acceptable. Presumably if two abstraction principles are both acceptable, then their conjunction should be as well, yet we can formulate consistent abstraction principles that are only satisfiable on finite domains. 15 Such a principle and HP are not jointly satisfiable (HP has only infinite models), so we need more stringent requirements on which abstraction principles are acceptable. The point of this paper is to examine one such constraint. Before moving on, a technical fact relevant to the satisfiability of abstraction principles needs to be noted: If the right-hand side of the biconditional in an abstraction principle contains no non-logical vocabulary, then the abstraction principle will be satisfiable on some domain of size κ if and only if it is satisfiable on every domain of size κ. A proof 16 of this result can be found in Fine [1998], although the reasoning behind it should be clear once one realizes that an abstraction principle only requires that there be a distinct object for each equivalence class of properties (or relations). It implies nothing regarding which object is associated with which class.
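The counting idea behind this technical fact can be illustrated on small finite domains. The following is only a sketch under stated assumptions: properties are modelled as subsets of a finite domain, and an abstraction principle is taken to be satisfiable on that domain just in case its equivalence relation yields no more classes than there are objects. The function names (`count_classes`, `satisfiable_on`, and so on) are hypothetical, and nothing here bears on the infinite cardinals that are the real concern of the paper.

```python
from itertools import combinations

def subsets(domain):
    """All subsets of a finite domain, as frozensets (stand-ins for properties)."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def count_classes(domain, equivalent):
    """Number of equivalence classes of subsets of `domain` under `equivalent`.

    Assumes `equivalent` really is an equivalence relation, so that comparing
    against one representative per class is enough.
    """
    representatives = []
    for p in subsets(domain):
        if not any(equivalent(p, q) for q in representatives):
            representatives.append(p)
    return len(representatives)

def coextensive(p, q):
    """Right-hand side of BLV: the same objects fall under P and Q."""
    return p == q

def equinumerous(p, q):
    """Right-hand side of HP: P and Q have the same cardinality."""
    return len(p) == len(q)

def finitely_different(p, q):
    """Right-hand side of NP: the symmetric difference of P and Q is finite.

    On a finite domain this holds for every pair of properties.
    """
    return True

def satisfiable_on(n, equivalent):
    """Can each class be assigned its own object in a domain of n >= 1 objects?"""
    return count_classes(range(n), equivalent) <= n

for n in range(1, 6):
    print(n,
          count_classes(range(n), coextensive),         # 2**n classes
          count_classes(range(n), equinumerous),        # n + 1 classes
          count_classes(range(n), finitely_different))  # 1 class
```

On these assumptions only the relation behind NP fits into a finite domain, while the relations behind HP and BLV always produce more classes than there are objects, which matches the remarks above about NP and HP.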
3. Inflation
A number of requirements on abstraction principles have been proposed. The constraint that concerns us here is the idea that suitable abstraction principles should be non-inflationary. Informally this is just a requirement that the abstraction principles should not imply the existence of too many objects, reflecting the intuition, due to von Neumann, that the way to avoid the set theoretic paradoxes is by avoiding collections that are too large, i.e. what are now known as proper classes. 17 Consider BLV. The source of its inconsistency can be traced, at least in part, to the fact that it assigns a distinct object (an extension) to each collection of objects in the domain, violating Cantor’s theorem. To avoid this sort of contradiction, it is a good start to require that acceptable abstraction principles do not involve equivalence relations that partition the domain into more collections than there are objects.
15 The following, much discussed abstraction principle has come to be called the Nuisance Principle:
NP: (∀P)(∀Q)[NUI(P) = NUI(Q) ↔ (P, Q)] [where (P, Q) abbreviates the second-order formula asserting that the collection of objects that are either P-and-not-Q or are Q-and-not-P is finite]. NP is satisfiable on any finite domain, but on no infinite one.
16 Fine’s argument can be generalized to universal generalizations of logical abstraction principles.
17 I use the term “proper class” in the technical sense, referring to collections of sets that are too big to themselves be sets. Intuitively, at least, there are in fact collections that are (or at least might be) bigger than proper classes, such as various large collections of proper classes.
Kit Fine argues that:
. . . the identity criterion should not be inflationary, the number of equivalence classes must not outstrip the number of objects. There must, that is to say, be a one–one correspondence between all of the equivalence classes, or their representatives, on the one hand, and some or all of the objects, on the other. ([1998], p. 506)
Along similar lines, Wright writes that: The cells into which the relevant equivalence relation partitions the universe of Concepts must not outrun the population of objects which constitute the range of the first-order variables in the abstraction principle. ([1997], p. 222)
This way of phrasing the prohibition on domain inflation begs the question against the neo-logicist reconstruction of the reals, since the strategy is to add a suitable abstraction principle to a theory satisfied by a countable domain to get a theory that is only satisfied by an uncountable domain. The neo-logicist approach depends on some abstraction principles being at least somewhat inflationary. We can get at the spirit of the ban on inflationary abstraction principles with something akin to Boolos’ take on the matter: . . . it was a central tenet of logical positivism that the truths of mathematics were analytic. Positivism was dead by 1960 and the more traditional view, that analytic truths cannot entail the existence either of particular objects or of too many objects, has held sway ever since. ([1997], pp. 249–250, emphasis added)
Ignoring the issue of whether analytic principles should entail the existence of particular objects, we can assume for the sake of argument that some acceptable abstraction principles might be inflationary, i.e. their addition to a theory with models of size κ might result in a theory whose models all have domains larger than κ. This domain inflation should not be too rampant, however. Acceptable abstraction principles should not imply the existence of too many objects, at least not if they are to be “epistemologically cheap”. There may be no way to delineate exactly what “too many” means in the previous sentence. Rather, as with the vague predicate “red”, there might be no sharp line marking off where the extension of “too many” begins. Even so, we can lay down a number of precise ways in which an abstraction principle might be inflationary, even if we cannot determine with certainty which of them are neo-logicistically acceptable and which are not.
Given an abstraction principle AP and a set of objects S, the restriction of AP to S is the result of replacing every first-order quantifier “∀x” (“∃x”) with “∀x ∈ S” (“∃x ∈ S”) and replacing every second-order quantifier “∀X” (“∃X”) with “∀X ⊆ S” (“∃X ⊆ S”). An abstraction principle AP generates κ objects when applied to the domain S iff, for every domain D such that S ⊆ D and D satisfies 18 the restriction of AP to S, the cardinality of D − S is at least κ.
18 Since all that is relevant to the satisfaction of a logical abstraction principle is the cardinality of the domain, I say that a set D satisfies an abstraction principle AP if there is some model with D as domain that satisfies AP.
(Here,
and below, κ, γ , and λ are infinite cardinals.) 19 We now define the notion of κ-inflationary: 20 An abstraction principle AP is κ-inflationary if, for any domain S of cardinality κ, the application of AP to S generates γ objects where γ > κ. 21
Using the notion of κ-inflation, we can now define some more general senses in which an abstraction principle can be inflationary:
Strictly Non-inflationary: An abstraction principle AP is strictly non-inflationary if there is no κ for which AP is κ-inflationary.
Locally Inflationary: An abstraction principle AP is locally inflationary if there are (only) finitely many κ’s such that AP is κ-inflationary.
Boundedly Inflationary: An abstraction principle AP is boundedly inflationary if there are infinitely many κ’s such that AP is κ-inflationary but there is some γ such that, for all λ > γ, AP is not λ-inflationary.
Unboundedly Inflationary: An abstraction principle AP is unboundedly inflationary if, for every κ, there is a γ > κ such that AP is γ-inflationary.
Universally Inflationary: 22 An abstraction principle AP is universally inflationary if, for every κ, AP is κ-inflationary. 23
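For comparison, the five notions can be restated in terms of a single class of cardinals. This is only a compact paraphrase of the definitions above, not an addition to them: the symbol \(\mathcal{I}(AP)\) is introduced here purely for illustration, and κ, γ, and λ range over infinite cardinals as before. The block assumes the amsmath and amssymb packages.

```latex
\begin{align*}
\mathcal{I}(AP) &= \{\kappa : AP \text{ is } \kappa\text{-inflationary}\} \\
AP \text{ is strictly non-inflationary} &\iff \mathcal{I}(AP) = \varnothing \\
AP \text{ is locally inflationary} &\iff \mathcal{I}(AP) \text{ is finite} \\
AP \text{ is boundedly inflationary} &\iff \mathcal{I}(AP) \text{ is infinite and }
  \exists \gamma\, \forall \lambda > \gamma\; (\lambda \notin \mathcal{I}(AP)) \\
AP \text{ is unboundedly inflationary} &\iff
  \forall \kappa\, \exists \gamma > \kappa\; (\gamma \in \mathcal{I}(AP)) \\
AP \text{ is universally inflationary} &\iff \forall \kappa\; (\kappa \in \mathcal{I}(AP))
\end{align*}
```

Read literally, these conditions are not mutually exclusive: every universally inflationary principle is also unboundedly inflationary, and a strictly non-inflationary principle counts as locally inflationary unless "finitely many" is taken to mean at least one.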
HP is strictly non-inflationary, since it can be added to any theory with an infinite model and the result will have a model of the same cardinality. Shapiro and Weir [1999] show that, if the Generalized Continuum Hypothesis holds, then NewV (see note 8) is unboundedly inflationary, since on this assumption it is satisfied at every successor cardinal but at no singular cardinal. 24
19 Notice that, for any abstraction principle AP containing only logical vocabulary on the right-hand side of the biconditional and any set S of cardinality κ, if AP applied to S generates γ objects, then γ > κ.
20 I am ignoring cases where abstraction principles inflate on finite domains since they are irrelevant to the case at hand. Hume’s Principle inflates on finite domains, yet this inflation has rarely been the target of serious criticism. In addition this sort of inflation is critical to the success of the neo-logicist project given the possibility that there are only finitely many non-abstract objects in the world.
21 Since the satisfaction of an abstraction principle depends solely on the cardinality of the domain, we could have phrased this as:
An abstraction principle AP is κ-inflationary if there is some domain S of cardinality κ such that the application of AP to S generates γ objects where γ > κ.
22 Locally inflationary and universally inflationary are equivalent (roughly) to Hale’s weakly and strongly inflationary, respectively (see Hale [2000], p. 121).
23 Universal inflation is one way of formalizing the intuition that some mathematical concepts given by abstraction principles are indefinitely extensible (see Dummett [1963] and note 47 below).
24 Shapiro and Weir also prove that it is consistent with ZFC that NewV has no uncountable models at all.
Finally, it is clear that any abstraction principle used to obtain the real numbers must
be at least locally inflationary, since the point of the abstraction is to take us from a countable domain to an uncountable one where we can formulate real analysis. The question thus becomes: Where should we draw the line with respect to domain inflation? As has already been pointed out, abstraction principles that are locally inflationary must be acceptable. In addition, we can treat locally inflationary principles and boundedly inflationary principles as roughly on a par, since in both cases we are confronted with cases where the abstraction may blow up our ontology, but only so much. At some point we reach an upper limit beyond which the abstraction principle does not inflate. Along similar lines, we can think of unboundedly inflationary and universally inflationary principles as equally problematic, since in both cases the problem, if any, has to do with the fact that its (possibly repeated) application might multiply the underlying ontology without limit. Thus, we need to determine whether unboundedly or universally inflationary abstraction principles are neo-logicistically acceptable. A number of considerations can be brought to bear against unboundedly and universally inflationary abstraction principles, in addition to the points already canvassed against inflation more generally. I will give a different argument against each sort of inflation, although unbounded and universal inflation are sufficiently similar that problems with one are likely to indicate problems with the other. The neo-logicist is claiming that the abstraction principles implicitly define, or at least ground our use of mathematical concepts and theories. Definitions of the abstract objects of mathematics, even implicit ones, ought to determine a unique group of objects which necessarily fall under the definition. If this “defining” abstraction principle is unboundedly inflationary, however, then the neo-logicist has failed in his task. 
Assume that we have some unboundedly inflationary abstraction principle AP and there are κ objects in the universe 25 (including the abstracts guaranteed to exist by AP). Let γ be the least cardinal > κ such that AP is γ-inflationary. Then, had there been γ objects in the universe, there would, by AP, have been more than γ (and thus more than κ) abstracts. But then the original abstracts are not all of the objects whose identity conditions are given by AP. This process can be repeated indefinitely (and transfinitely), so we never have all the objects that fall under the purview of AP. In other words, if AP is unboundedly inflationary then it fails to secure a definite collection of objects as the domain of its abstraction operator, but instead gives us different abstracts relative to how many objects exist. 26
25 This way of setting things up implies that AP is not κ-inflationary.
26 There is a tempting response at this point. One might point out that, in addition to the abstracts falling under the principle AP, other sorts of abstract objects such as real numbers, sets, and groups will also exist and will exist necessarily. Therefore, the cardinality of the actual world is the same as the cardinality of every possible world, namely, however many objects could possibly exist. In other words, even if AP alone does not secure a definite extension for the abstraction operator, this unique extension might be secured by AP plus the fact that all of the sets of ZFC exist. While this is true, it does not save unboundedly inflationary abstraction principles from the force of the objection, because an adequate definition should determine a unique extension independently of the existence of any other objects.
This argument against unbounded inflation is quite compelling, especially if we think that abstraction principles should generate abstract mathematical objects that exist necessarily instead of providing an undetermined multitude of objects whose existence depends on the number of non-abstract objects present in the universe. The case against universal inflation is a bit different, but equally worrying. In moving from an abstraction principle that is unboundedly inflationary to one that is universally inflationary we have replaced one problem with another. With unbounded inflation there was no unique collection of abstracts generated by the abstraction. In the case of universal inflation, there might be a unique collection of objects generated by the abstraction principle, but if so, then it is an extremely badly behaved collection. In other words, if an abstraction principle is universally inflationary, then it will be satisfied (if satisfied at all) only by a structure that is at least the size of the smallest proper class. To see this, assume that AP has a set-sized model M. Then there is some κ such that the domain of M has cardinality κ. If AP is universally inflationary, then application of AP to a domain of size κ produces γ objects for some γ > κ. M satisfies AP, so the domain of M contains at least γ objects, but then the cardinality of the domain is greater than κ. Contradiction. This sort of issue seems to be what Hale has in mind where he writes that, although boundedly inflationary abstraction principles are neo-logicistically acceptable:
It is much more plausible to require that acceptable abstraction principles not be strongly inflationary [equivalent to my universally inflationary]. Some of the neo-Fregean’s key abstractions, including the other crucial second-order abstraction, Hume’s Principle, satisfy this requirement. ([2000], p. 120)
He adds in a footnote that:
. . . it might seem that strong inflation is bound to give rise to a version of Cantor’s paradox. It might also be thought that if an abstraction is strongly inflationary, then there could be no hope of showing that it is satisfiable, i.e. has a model. (p. 120)
Hale touches on two main worries here, each of which deserves closer scrutiny. First, there is the claim that universally inflationary abstraction principles are likely to be susceptible to set-theoretic paradoxes such as Cantor’s paradox (or Russell’s or Burali-Forti’s). The idea is simple: If, for any κ-sized collection, the universally inflationary principle AP generates, say, 2^κ new objects, then it seems plausible that when applied to a proper class, or any other sort of structure, it would also inflate. This is one way of explaining what goes wrong with Frege’s Basic Law V. The reasoning is not general, however. There could be abstraction principles that inflated on all sets but did not inflate on proper classes. For example, principle AP might inflate on any collection that can be
well-ordered, but on no structure that cannot. If this is the case, then, as long as there are proper classes too large to be well ordered, AP might be satisfiable even though it is universally inflationary. We will see a potential candidate for such a satisfiable yet universally inflationary abstraction principle below. Hale’s second worry is that we might be faced with insuperable difficulties when attempting to prove the consistency/satisfiability of universally inflationary abstraction principles. The standard definition states that a sentence is satisfiable if and only if there is a set theoretic model (read: set as domain plus appropriate assignments to various bits of language) such that the sentence is true in that model. Any universally inflationary abstraction principle fails to be satisfiable in this sense, yet it is still possible that some structure (such as a proper class) might make the sentence true. This is a serious problem for the neo-logicist. As we have seen, some abstraction principles are consistent while others that resemble the former a great deal are not. Thus, one of the most important parts of defending a neo-logicist abstraction principle as acceptable is to demonstrate its satisfiability. This will prove difficult, if not impossible, for universally inflationary abstraction principles since our methods for studying and manipulating proper classes are less powerful and less secure than our set theoretic machinery. There is a disturbing historical irony here. The notion of proper class was introduced as a result of, among other things, reflection on what exactly went wrong in Frege’s Grundgesetze. The idea was to draw a distinction between the logically safe sets and the problematic proper classes, which were in some sense too large to be safely manipulated like sets. 
If the neo-logicist reconstruction of Frege’s project pushes us once again into the realm of proper classes, historical sensitivity (or perhaps merely superstition) should cause some worry. 27 Thus, the neo-logicist should be extremely wary of unboundedly and universally inflationary abstraction principles. While these sorts of abstractions are not necessarily susceptible to the sorts of paradoxes usually associated with “bad” abstraction principles, and are even (possibly) satisfiable, they nevertheless take us far from the epistemically innocent implicit definitions that the neo-logicists argue acceptable abstraction ought to provide.
4. Hale’s reconstruction of the reals
I take it as a desideratum of a successful philosophy of mathematics that it must account for enough mathematics to handle scientific applications. It follows that the neo-logicists need, at a minimum, to be able to reconstruct the theory of the real numbers. In fact, the success or failure of the neo-logicist project seems to hinge on their successful treatment of the real numbers.
27 This point was brought to my attention by Jon Cogburn.
If this difficult case can be dealt with, then it is plausible that most other areas of
contemporary mathematics can be handled by relatively unproblematic neo-logicist constructions building on the continuum. On the other hand, if the neo-logicist is unable to account for the reals, then the project has failed to provide a foundation 28 for mathematics. Reconstructing arithmetic from unproblematic abstraction principles might be interesting, and even mathematically important, but arithmetic is too simple a theory to allow us to conclude anything interesting about mathematics as a whole. It is at this point that the work of Bob Hale [2000] becomes relevant. Although others, including Simons [1987] and Dummett [1991], have written on Frege’s treatment of the reals, Hale is the first to attempt a full-scale neo-logicist account. Thus Hale’s work is of independent interest in an investigation of the prospects for a neo-logicist reconstruction of analysis. More importantly, given our purposes here, Hale’s account provides us with a useful case study of inflationary abstraction principles. Hale puts much stock in the fact that the reals are not just any sort of mathematical object but are, like the natural numbers and rational numbers, quantities:
The most striking and most important features of Frege’s treatment of the reals are two: (i) the real numbers are to be defined as ratios of quantities . . . and (ii) in regard to the analysis of the notion of quantity, the fundamental question requiring to be answered is not: What properties must an object have if it is to be a quantity? but: What properties must a concept have, if the objects falling under it are to constitute quantities of a single kind? ([2000], p. 104)
Hale’s strategy is, first, to set up a general theory of quantity; second, to argue that if a certain sort of quantity exists, then ratios on those quantities can serve as the reals; and third, to prove, with the use of a novel abstraction principle, that the requisite sort of quantities exist. Hale begins by giving definitions of various sorts of “Quantitative Domain”. The series of definitions he proposes is intended to flesh out the second feature of Frege’s treatment of the reals – determining what properties a concept must have if the objects falling under that concept are quantities.
28 This is not to say that the work stemming from neo-logicism does not have other interesting philosophical applications. For example, Harold Hodes [1984] considers the following Order Type Abstraction Principle:
OTA: (∀P)(∀Q)[ORD(P, <) = ORD(Q, ) ↔ ((P, <) is isomorphic to (Q, ))]
This principle, conjoined to a principle (such as HP) guaranteeing the existence of infinitely many objects, is susceptible to the reasoning of the Burali-Forti paradox, but not the Russell paradox. Thus, the study of the satisfiability of abstraction principles provides a promising approach to investigating the set theoretic paradoxes.
A minimal q(uantitative)-domain is:
. . . a non-empty collection Q of entities closed under an additive operation ⊕ which commutes, associates, and satisfies the strong trichotomy law that for any a, b ∈ Q, we have exactly one of: ∃c (a = b ⊕ c), ∃c (b = a ⊕ c), or
a = b. Any minimal q-domain is strictly ordered by <, defined by a < b ↔ ∃c (a ⊕ c = b). Multiplication of elements of Q by positive integers is easily defined – inductively – in terms of ⊕ (p. 106)
Next is the notion of a normal q-domain, defined as: . . . any minimal q-domain meeting the (Archimedean) comparability condition: ∀a, b ∈ Q∃m (ma > b) . . . m ranges over positive integers (pp. 106–107).
Once we have normal q-domains, Hale introduces ratios using the following abstraction principle (recast to fit the notation used here) for quantitative domains Q and Q*:
EM: ∀a, b ∈ Q ∀c, d ∈ Q* [RAT(a, b) = RAT(c, d) ↔ (∀m, n(ma = nb ↔ mc = nd) ∧ ∀m, n(ma < nb ↔ mc < nd) ∧ ∀m, n(ma > nb ↔ mc > nd))]
The new structure resulting from the application of EM to a normal q-domain Q is called R_Q. 29 Hale then defines a full q-domain to be a normal q-domain where we have:
∀a, b, c ∈ Q ∃q ∈ Q (RAT(a, b) = RAT(q, c)) (p. 107)
This can be reworded to avoid the reliance on abstraction. A normal q-domain is full iff:
∀a, b, c ∈ Q ∃q ∈ Q (∀m, n(ma = nb ↔ mq = nc) ∧ ∀m, n(ma < nb ↔ mq < nc) ∧ ∀m, n(ma > nb ↔ mq > nc))
Finally, a q-domain Q is said to be complete iff:
. . . every bounded above non-empty S ⊆ Q has a least upper bound. (p. 108)
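The ratio criterion on the right-hand side of EM can be checked mechanically in the simplest case, where both quantitative domains are the positive integers under addition. The sketch below is illustrative only: the function name `same_ratio` is hypothetical, and the finite search bound is an assumption that is justified only for positive integers, since taking m = b and n = a already separates any two unequal ratios.

```python
from fractions import Fraction

def same_ratio(a, b, c, d):
    """Right-hand side of EM for positive integers a, b, c, d.

    Checks that ma = nb iff mc = nd, ma < nb iff mc < nd, and
    ma > nb iff mc > nd, for all positive multipliers m, n up to a bound.
    """
    # For positive integers the multipliers m = b, n = a witness any
    # difference between a/b and c/d, so this bound is large enough.
    bound = max(a, b, c, d)
    for m in range(1, bound + 1):
        for n in range(1, bound + 1):
            if (m * a == n * b) != (m * c == n * d):
                return False
            if (m * a < n * b) != (m * c < n * d):
                return False
            if (m * a > n * b) != (m * c > n * d):
                return False
    return True

# The criterion agrees with ordinary equality of fractions on a small range.
agrees = all(
    same_ratio(a, b, c, d) == (Fraction(a, b) == Fraction(c, d))
    for a in range(1, 6) for b in range(1, 6)
    for c in range(1, 6) for d in range(1, 6)
)
print(agrees)
```

This is why the ratios on the positive naturals are, as the text goes on to say, an obvious candidate to serve as the positive rational numbers.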
This completes Hale’s first task – specifying what criteria a concept Q must meet for the objects falling under Q to be quantities of various varieties. Hale next points out that any two complete q-domains are isomorphic. 30 Applying the abstraction principle EM above, we get the result that, for any two complete domains Q and Q*, R_Q = R_{Q*}. Thus, according to Hale, the reals can be obtained as the ratios of any complete domain, as long as some such domain exists. All that remains is the third step in Hale’s argument – the proof that there is a complete q-domain.
29 The principle EM, which is equivalent to a sort of pairing axiom, is strictly non-inflationary.
30 This fact depends on the explicit use of second-order quantifiers in the definition of complete q-domains and their implicit use (in securing the fact that we are talking about the standard natural numbers and not some non-standard model of them) in the definition of normal q-domains.
Hale’s argument is relatively straightforward: The
neo-logicist already has access to the positive natural numbers N⁺ via HP. The natural numbers constitute a normal q-domain, but not a full one. An application of the abstraction principle EM to N⁺, however, gives us the ratios on the positive naturals R_{N⁺}, which is a full domain (although not complete) and an obvious candidate to serve as the positive rational numbers. The next move is to apply the following Cut Abstraction Principle to R_{N⁺} (reworded to fit the notation used here):
CA: (∀P)(∀Q)[CUT(P) = CUT(Q) ↔ ((∀x)((x ∈ R_{N⁺} ∧ P and Q are cut properties 31 on R_{N⁺}) → (Px ↔ Qx)))] 32
This gives us the required complete q-domain, and we need only apply EM once more to obtain the reals. So far the neo-logicist project looks good, as long as HP, EM, and CA are acceptable. We will assume that HP and EM are acceptable, and concentrate on CA, since CA is responsible both for guaranteeing that there are uncountably many quantities and for producing the complete q-domain.
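The notion of a cut property that CA relies on can be explored on small finite linear orders. The sketch below is a toy illustration, not part of Hale's construction: it enumerates the non-empty, proper, downward-closed properties of a finite order and then applies either Hale's extra condition (no greatest instance) or the modified condition given in note 31 (every instance lies strictly below some non-instance). All function names are hypothetical, and the order is assumed to be a list of integers under the usual <.

```python
from itertools import combinations

def nonempty_proper_subsets(order):
    """Non-empty proper subsets of a finite order, as sets."""
    items = list(order)
    return [set(c) for r in range(1, len(items))
            for c in combinations(items, r)]

def downward_closed(f, order):
    """Fa and b < a together imply Fb."""
    return all(b in f for a in f for b in order if b < a)

def no_greatest_instance(f, order):
    """Hale's condition: every instance has a strictly greater instance."""
    return all(any(b > a and b in f for b in order) for a in f)

def bounded_above(f, order):
    """Modified condition: every instance lies below some non-instance."""
    return all(any(b > a and b not in f for b in order) for a in f)

def cuts(order, extra_condition):
    """Cut properties on `order` under the chosen extra condition."""
    return [f for f in nonempty_proper_subsets(order)
            if downward_closed(f, order) and extra_condition(f, order)]

order = [1, 2, 3, 4]
print(cuts(order, no_greatest_instance))  # no cuts on a finite order
print(cuts(order, bounded_above))         # the proper initial segments
```

On a finite order Hale's original condition admits no cut properties at all, whereas the modified condition admits one for each proper initial segment; this shows how much the output of cut abstraction depends on the ordering it is applied to.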
5. Abstraction principles and abstraction processes
We have seen how CA generates a complete, uncountable q-domain from (something like) the rationals. There seems to be no principled reason for restricting this procedure, however. We should, prima facie, be able to apply cut abstraction to any linear ordering guaranteed to exist by previously accepted principles. To paraphrase Georg Kreisel, 33 one can argue that the evidence for the applicability of CA derives from the more general idea that cuts can be taken on any linear order whatsoever.
31 Hale defines a cut-property on R_{N⁺} as follows:
. . . a cut property is a non-empty property whose extension is a proper subset of R_{N⁺} and which is downwards closed [i.e., (∀a)(∀b)(Fa → (b < a → Fb))] and has no greatest instance [i.e., (∀a)(Fa → (∃b)(b > a ∧ Fb))]. (p. 112)
We can generalize this to any linear ordering by replacing R_{N⁺} with a name for the order in question. Hale’s definition, however, is a bit unwieldy, since it implies that the only linear orders with non-trivial cuts are dense. Thus, we can use a slightly modified version of Hale’s definition [for a linear order (Q, <)]:
. . . a cut property on (Q, <) is a non-empty property whose extension is a proper subset of Q and which is downwards closed [i.e., (∀a)(∀b)(Fa → (b < a → Fb))] and has an upper bound [i.e., (∀a)(Fa → (∃b)(b > a ∧ ¬Fb))].
All of the results in this paper hold on either definition of cut property.
32 Hale stresses that CA is a restricted form of Basic Law V, defining it as:
CA: (∀F)(∀G)[CUT(F) = CUT(G) ↔ (∀a)(Fa ↔ Ga)]
where F, G are any cut properties on R_{N⁺} and a ranges over R_{N⁺} (p. 112). He restricts the range of the first- and second-order quantifiers in BLV instead of building the restrictions directly into the abstraction principle.
33 Kreisel [1967] makes this point regarding the evidence for the first-order arithmetic induction scheme.
In other words, we need to distinguish between abstraction principles (such as CA) and the more general
abstraction processes they instantiate. Comparing HP and CA will help to make this distinction clearer. An application of HP to a particular domain assigns to each property (i.e. subset) an object that is to serve as the number of that property. The particular number assigned to a property by HP (more accurately, the conditions for when two distinct properties are assigned the same number) depends solely on a characteristic of the property itself, its cardinality. On the other hand, which object is assigned to which property by CA depends not only on the characteristics inherent in the property itself but also on the ordering of the objects of the domain. In other words, cut principles such as CA do not assign cuts to properties per se, but instead assign cuts to properties relative to particular orderings on the domain (R N + in this case). We can conclude that CA is an instance of a more general process of taking cuts on any linear order, or at least any linear order of some distinguished type. Thus, HP is the only instance of the general process of taking numbers as objects. 34 CA, however, is one particular instance of some more general process of taking cuts on linear orders. 35 Of course, arguing that the principle CA is an instance of a more general abstraction process is not the same as identifying the relevant process. In addition, the very notion of “process” itself is fraught with difficulty. We can at least accept that an abstraction process is a general procedure or operation (not necessarily constructive or algorithmic) for generating new mathematical ontology relative to an equivalence relation on a prior, given ontology. The interesting case occurs when the process gives different output for different initial ontologies. The process behind CA is of this sort. The main difficulty comes when we try to identify of which particular process CA is an instance. There are many potential candidates, some of which are: [ p1 ] [ p2 ] [ p3 ] [ p4 ]
Taking cuts on any linear order. Taking cuts on any dense linear order. Taking cuts on any linear order constructed by Bob Hale. Taking cuts on any linear order with an “R” in its name.
Some of these are more plausible as identifications of the relevant process than others. Nevertheless, it is extremely difficult to designate one of these as the correct description of the process that we are actually instantiating when we apply CA. We do not have to settle the issue here, however, since the problems that the neo-logicist faces regarding domain inflation occur on a number of the more plausible choices.

34 One could argue that HP is also a particular instance of a more general process. The process in question is the assignment of objects to properties based on some arbitrary equivalence relation between properties. On this reading, BLV would be another application of the same process, although admittedly an unacceptable one. Thus, the present worries regarding inflation can be seen as an instance of what has come to be called the “Bad Company” objection, although, as I attempt to show in the remainder of this paper, differentiating between the good instances of cut abstraction and the unacceptable ones might be more difficult than disengaging the prima facie unproblematic HP from BLV.
35 The distinction between principles and processes is analogous to the common type/token distinction.

Once we have distinguished between abstraction processes and abstraction principles, it is clear that the important questions about the acceptability of different sorts of abstraction (i.e. inflation, conservativeness, etc.) concern not the principles but the processes. If some abstraction principle is an instance of a more general process then any problems associated with the process ought to tell against the principle. Thus, with regard to domain inflation we are not interested in whether some particular abstraction principle such as CA is inflationary or not, but instead are interested in whether the process instantiated in this particular instance is inflationary in general. There is good evidence that the process underlying CA is in fact unacceptably inflationary, even though CA itself (or any single instance of applying cut abstraction to a particular linear order) is not.
6.
Generalizations of CA and inflation
Although identifying the process that a particular abstraction principle instantiates is difficult, we can make a tentative suggestion that, all else being equal, we should attempt to be as general as possible. Thus, in the case at hand, we need to determine whether cut abstraction, considered as a general process applied to any linear order, is unacceptably inflationary. The easiest way to test the acceptability of an abstraction process is to formulate a second-order formula that generates all of the possible abstracts at once. We will call this principle the Generalized Cut Abstraction Principle, of which CA is a special case:

GCA: (∀P)(∀Q)(∀{H, <})[CUT(P, {H, <}) = CUT(Q, {H, <}) ↔ ((∀x)((Hx ∧ P, Q are cut properties on {H, <}) → (Px ↔ Qx)))]

This principle just says that, given any linear order, we ought to be able to use cut abstraction to form the “Dedekind” cuts on that linear order. Before investigating whether GCA is acceptable, we need to convince ourselves that it is at least satisfiable. This much is trivial, however, as GCA is satisfied by simple finite models. 36 The more serious concern is whether HP + GCA is satisfiable. The difficulty results from the fact that HP is only satisfied by infinite models, and infinite models have far more interesting linear orders (and orderings in general) than finite models do. Here we will satisfy ourselves by noting that none of the standard paradoxes (Russell’s, Burali-Forti’s, etc.) that plague BLV are derivable (at least by the standard route) from HP + GCA. Thus, there seems no reason, as of yet, to doubt the satisfiability of HP + GCA.

36 This is a consequence of the fact that any finite linear order of size n has at most one cut property (on Hale’s definition of cut property) or n − 1 cut properties (on the revised version of the definition, see note 31).
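The count reported in note 36 can be checked by brute force on small finite orders. The following is only an illustrative sketch, not part of the original argument: the function name `cut_properties` and the labels "hale" and "revised" are assumptions introduced here, and the linear order is taken to be 0 < 1 < ... < n − 1.

```python
from itertools import combinations

def cut_properties(n, definition):
    """Enumerate cut properties on the finite linear order 0 < 1 < ... < n-1.

    `definition` is "hale" (no greatest instance) or "revised" (has an
    upper bound outside the property); both labels are hypothetical names
    for the two definitions quoted in note 31.
    """
    domain = range(n)
    cuts = []
    # Non-empty proper subsets have between 1 and n-1 elements.
    for size in range(1, n):
        for combo in combinations(domain, size):
            F = set(combo)
            # Downwards closed: everything below a member is a member.
            if not all(b in F for a in F for b in domain if b < a):
                continue
            if definition == "hale":
                # No greatest instance: every member has a larger member.
                ok = all(any(b > a and b in F for b in domain) for a in F)
            else:
                # Upper bound: every member lies below some non-member.
                ok = all(any(b > a and b not in F for b in domain) for a in F)
            if ok:
                cuts.append(frozenset(F))
    return cuts

for n in range(1, 7):
    print(n, len(cut_properties(n, "hale")), len(cut_properties(n, "revised")))
```

On these small cases the revised definition yields n − 1 cut properties, and Hale's definition yields none, since a non-empty finite set always has a greatest member; this is consistent with the "at most one" of note 36. In either case a finite order never has more cuts than elements, which is why cut abstraction does not inflate on finite domains.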
Although we have no clear reason for doubting that HP + GCA is satisfiable, its satisfiability is a substantial mathematical assumption, as the following result illustrates. For a linear order (A, <), let Comp(A, <) be the Dedekind cuts on (A, <):

Theorem 1: 37 (AC): Given an infinite cardinal κ, there is a linear order (A, <) such that |A| ≤ κ and |Comp(A, <)| > κ.

Proof: Given an infinite cardinal κ, let λ be the least cardinal ≤ κ such that 2^λ > κ. Let A be the subset of functions from λ (as an ordinal) into {0, 1} such that f ∈ A iff there is an ordinal γ < λ such that for all ordinals α ≥ γ, f(α) = 0. For f, g ∈ A, let f < g iff, at the least γ where f(γ) ≠ g(γ), f(γ) = 0. Then |A| ≤ κ by the following computation:

\[ |A| = \Bigl|\bigcup_{\gamma<\lambda} {}^{\gamma}2\Bigr| \le \sum_{\gamma<\lambda} 2^{|\gamma|} \le \sum_{\gamma<\lambda} \kappa \le \lambda \times \kappa = \kappa \]

But |Comp(A, <)| = 2^λ > κ, since Comp(A, <) is isomorphic to the set of all functions from λ to {0, 1}. 38

It follows that HP + GCA is universally inflationary. 39 Thus, if HP + GCA is satisfiable, then any structure that satisfies it has a domain at least the size of a proper class. 40

37 The dependence on choice in the proof seems unproblematic, since the current neo-logicist reconstructions of both the naturals and the reals depend on choice as well.
38 A number of less general variants of this result, all depending on the generalized continuum hypothesis, were proved independently by myself, Stewart Shapiro, and an anonymous referee of Hale [2000].
39 Take any infinite set of objects S of cardinality κ. The restriction of GCA to S will generate γ many objects for some γ > κ, since Theorem 1 implies that there is a linear order on S that has more than κ cuts.
40 Also, if GCA is satisfiable by a proper class, then this class is not well-orderable, i.e., if GCA is satisfiable, then Global Well Ordering (a very strong choice principle) fails. (For discussion of Global Well Ordering, see Shapiro [1991] and Shapiro and Weir [1999]).

The neo-logicist at this point might balk at our insistence that he accept GCA. After all, GCA is not an abstraction principle, but is a much stronger second-order universal generalization of a particular abstraction principle. He could argue that only some applications of GCA are epistemically “cheap” from the neo-logicist point of view, restricting the applicability of cut abstraction. A number of such restrictions can be ruled out. One route is to replace the second-order generalization GCA with a schema, thus avoiding the strength of the full second-order version:

GCA-schema: All formulae of the form:
(∀P)(∀Q)[CUT(P, {H, <}) = CUT(Q, {H, <}) ↔ ((∀x)((Hx ∧ P, Q are cut properties on {H, <}) → (Px ↔ Qx)))]
[Where < is a linear ordering on H]

This approach only lessens the problem, however; it does not eliminate it. With a schematic approach we can avoid the rather unappealing conclusion that the principle can only be satisfied by structures whose domains are larger than any set, but the result we get is not much more appealing. Let κ_1 be 2^{ℵ_ω}, and define κ_{n+1} as the least cardinal γ such that γ > κ_n and γ = 2^λ for some cardinal λ. If we allow tokens of the GCA-schema to contain arbitrary second-order variables, then HP + GCA-schema can only be satisfied by a structure with at least

\[ \lim_{i \to \omega} \kappa_i \]

many objects in its domain. In other words, there are infinitely many cardinalities larger than the continuum yet smaller than the smallest domain satisfying HP + GCA-schema.

The idea is as follows: We use HP to prove that there are infinitely many objects. Then, since Theorem 1 is expressible in purely second-order terminology, it is a truth 41 of second-order logic. Thus, we get that there exists a linear order on the objects existing so far such that there are 2^{ℵ_0} cuts on that ordering, and by an application of the GCA-schema, we get that there are (at least) 2^{ℵ_0} objects. Apply existential elimination to this statement, combined with the appropriate cut abstraction schema, and we have at least 2^κ objects where 2^κ > 2^{ℵ_0}. Repeat the process to prove that there are at least 2^λ objects where 2^λ > 2^κ. And so on. For any natural number n, we can prove that there are at least κ_n many objects using just HP, the GCA-schema, and second-order logic. 42 Thus, unlike the universally inflationary HP + GCA, the smallest structure 43 satisfying HP + GCA-schema might be a set, but if so it is a huge one.

Another way to limit cut abstraction is to restrict its application to explicitly definable linear orders. While this might solve the problem of domain inflation, there is a methodological price to pay. The neo-logicist project depends on the fact that second-order quantification supplies expressive power not found in first-order theories. The derivation of arithmetic (or real analysis) is a non-starter if HP (or CA) is replaced by a schema or restricted to definable predicates. To back away from this expressive richness when it causes trouble is a bit ad hoc, given that it was the same richness that made the project seem feasible in the first place.
This is not to say that the neo-logicist cannot come up with principled reasons for restricting cut abstraction, but the prospects for such an account look grim.

41 Since neo-logicists frequently make use of set and model theoretic arguments to ground the consistency and acceptability of various abstraction principles, we should feel no qualms in accepting that Theorem 1 is a legitimate second-order logical truth.
42 I have not proved that HP + GCA-schema is boundedly inflationary, but only that, if it is (which is still open) then the bound on the domain inflation is rather large.
43 I make no claim to have shown either that HP + GCA-schema is satisfiable or that the lower bound on the cardinality of a set satisfying HP + GCA-schema produced above is the best result possible. The result is enough to show that the ontological assumptions involved in HP + GCA-schema are substantial.

There is an additional worry regarding such restrictions. If one wants to reconstruct topology along neo-logicist lines then restrictions on the sort of cuts one can take will be a drawback. Often in topology new structures are obtained by adding to an existing structure the collection of limit points of the structure, and this is just a generalization of taking cuts. If we wish to preserve topology within the neo-logicist framework then restrictions on when limit points or cuts can be assumed to exist appear problematic. This is a technical issue that warrants further investigation, but as it stands it does little to ease our present worries.

There is another way the neo-logicist might try to avoid rampant domain inflation. We can take a hint from Hale’s emphasis on quantities and restrict cut abstraction to one or another sort of quantitative domain. Restricting the cut principle to minimal q-domains does no good, however, as the following result illustrates:

Theorem 2: (AC): Given an infinite cardinal κ, there is a minimal q-domain (A, ⊕) such that |A| ≤ κ and |Comp(A, ⊕)| > κ.

Proof: Given an infinite cardinal κ, let λ be the least cardinal ≤ κ such that 2^λ > κ. Let A be the subset of functions from λ (as an ordinal) into the set of rationals Q such that f ∈ A iff there is an ordinal γ < λ such that for all ordinals α ≥ γ, f(α) = 0. For f, g ∈ A, let h = f ⊕ g iff, for every α ∈ λ, h(α) = f(α) + g(α). It is easy to verify that A is a minimal q-domain. The proof that |A| ≤ κ and |Comp(A, ⊕)| > κ is similar to the proof of Theorem 1.

Thus, if we restrict GCA to minimal domains, it is still universally inflationary.
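The construction in the proof of Theorem 1, on which the result just stated also relies, can be made concrete in the countable case κ = ℵ_0, where λ = ω. What follows is a hedged sketch rather than anything in the original text: it assumes the (standard) identification of an eventually-zero 0/1 sequence with the dyadic rational it codes in binary, and the names `value`, `less`, `order_agrees` and `squeeze` are hypothetical helpers. It checks that the ordering from the proof agrees with the numerical order on short sequences, and then shows members of A closing in from both sides on 1/3 = 0.010101... (binary), a point that no member of A occupies.

```python
from fractions import Fraction
from itertools import product

def value(f):
    """Dyadic rational coded by a finite 0/1 tuple (trailing zeros implicit)."""
    return sum(Fraction(bit, 2 ** (i + 1)) for i, bit in enumerate(f))

def less(f, g):
    """Order from the proof: at the least place where f and g differ, f has 0."""
    n = max(len(f), len(g))
    f = f + (0,) * (n - len(f))
    g = g + (0,) * (n - len(g))
    for a, b in zip(f, g):
        if a != b:
            return a == 0
    return False  # identical sequences are not strictly ordered

def order_agrees(max_len):
    """Check, for all tuples up to max_len, that `less` matches numerical order."""
    seqs = [t for n in range(max_len + 1) for t in product((0, 1), repeat=n)]
    return all(less(f, g) == (value(f) < value(g)) for f in seqs for g in seqs)

def squeeze(bits):
    """Values of two members of A bracketing any sequence that starts with `bits`."""
    lower = value(tuple(bits))                 # truncation padded with zeros
    upper = lower + Fraction(1, 2 ** len(bits))  # next dyadic at that depth
    return lower, upper

print(order_agrees(4))
third = Fraction(1, 3)
for k in range(1, 6):
    lo, hi = squeeze((0, 1) * k)
    print(k, lo < third < hi, hi - lo)
```

Under these assumptions A is countable, yet the property "lies below 1/3" is a cut on A with no greatest instance and no least non-instance in A. Each non-eventually-zero sequence behaves the same way, which is the informal reason why the cuts on A outnumber its members.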
Hale is better off restricting cut abstraction to normal or full q-domains since (i) cut abstraction only inflates on dense domains and (ii) any dense, normal q-domain contains a substructure isomorphic to the rationals and is itself isomorphic to some subset of the reals. In other words, cut abstraction restricted to these types of quantitative domains is boundedly inflationary, and the upper bound in question is relatively low (2^{ℵ_0}). If we modify the definition of quantitative domains the worrisome inflation returns. Define a minimal q*-domain to be:

A non-empty collection Q that is (i) closed under an additive operation ⊕ that commutes and associates and (ii) ordered by a relation < that satisfies strong trichotomy: for any a, b ∈ Q, exactly one of a < b, a > b, or a = b holds.
Notice that, in addition to its obvious similarity to Hale’s original definition of minimal q-domain, this definition respects Hale’s intuition that:
. . . what makes the difference between quantitative ordering relations and others is that in the case of a quantitative ordering relation, but not otherwise, the entities which can significantly be asserted to stand in the relation can (at least in principle) be combined in such a way that compounds must come later in the relevant ordering than their components. In other words, for more than to be a quantitative ordering relation, there must be an operation of combination ⊕ on items lying in the field of more than, analogous to addition, such that for any a, b in more than’s field, a ⊕ b is more than a and a ⊕ b is more than b (p. 106)
In other words, the sum of two members of a quantitative domain must be larger than either member. 44 With this definition of minimal q*-domain in place, we get the following result:

Theorem 3: (AC): Given an infinite cardinal κ, there is a minimal q*-domain (A, ⊕) such that |A| ≤ κ and |Comp(A, ⊕)| > κ.

Proof: Given an infinite cardinal κ, let λ be the least cardinal ≤ κ such that 2^λ > κ. Let A be the subset of functions from λ (as an ordinal) into the set of positive rationals Q+ such that f ∈ A iff there is an ordinal γ < λ such that for all ordinals α, β ≥ γ, f(α) = f(β). For f, g ∈ A, let h = f ⊕ g iff, for every α ∈ λ, h(α) = f(α) + g(α). Let f < g iff, at the least α such that f(α) ≠ g(α), f(α) < g(α). It is easy to verify that A is a minimal q*-domain. The proof that |A| ≤ κ and |Comp(A, ⊕)| > κ is similar to the proof of Theorem 1.

If we define normal q*-domain by substituting “minimal q*-domain” for “minimal q-domain” in Hale’s definition of normal q-domain, and similarly define full q*-domain by replacing “normal q-domain” with “normal q*-domain”, then analogues of Theorem 3 hold for normal and full q*-domains. Given any f, g, h in the q*-domain A constructed in the proof of Theorem 3, let q be defined as: for all α < λ, q(α) = [f(α) × h(α)] ÷ g(α). It follows that q ∈ A and f : g = q : h, so A is a full q*-domain. Universal inflation has returned.

If the neo-logicist wishes to apply cut abstraction to full q-domains but prohibit taking cuts on full q*-domains, then some relevant difference between the two sorts of structure needs to be explained. Neither definition appears, prima facie, to be more natural or intuitive than the other as an explication of our pre-formal notion of quantity, and the technical differences between the two definitions are subtle.

44 The main difference between a minimal q-domain and a minimal q*-domain is that a minimal q*-domain does not have to be closed under subtraction. This does not seem absurd, at least if ordinary usage of language is our guide. It seems natural to say that “Cleopatra is as beautiful as Athena and Helen combined”, yet it is much less natural to say “Athena is as beautiful as Cleopatra is more beautiful than Helen” or “Athena’s beauty is equal to Cleopatra’s beauty minus Helen’s beauty”.
There is one final move the neo-logicist might try in order to avoid the problems hovering around cut abstraction. Since the original Fregean idea of taking extensions as objects was salvaged via NewV by making use of the notion of “Big”-ness (see note 7), perhaps the neo-logicist can restrict cut abstraction to linear orders that are not “Big”, formulating something like the following Size-restricted Cut Abstraction Principle:

SCA: (∀P)(∀Q)(∀H)(∀<)[CUT(P, H, <) = CUT(Q, H, <) ↔ ((< is a linear ordering on the H’s and P, Q define cuts on {H, <} and (∀x)(Hx and H is not “Big”)) → (Px ↔ Qx))] 45

This restriction defeats the entire purpose of taking cuts as objects in the first place. NewV + HP is satisfied by the (countably infinite) collection of hereditarily finite sets V_ω. On this model, all non-“Big” properties have finite extensions. Cut abstraction does not inflate on finite collections, so V_ω satisfies HP + SCA as well. But the point of introducing cut abstraction was to generate a theory that only had uncountable models. Thus, SCA is of no help in the search for a neo-logicist reconstruction of the reals. 46
7.
Possible neo-logicist responses
These results certainly throw doubt on the idea that something like cut abstraction can be used by the neo-logicist in an epistemically “cheap” reconstruction of the reals. Although CA is not rampantly inflationary, we have seen that natural generalizations of CA are. The challenge to the neo-logicist is to explain how CA delivers the sought-after epistemological “cheapness”. In attempting to answer this question, there are (at least) four possible moves that the neo-logicist could make. 47 I will leave the defense of one or more of these to the neo-logicists themselves, and just mention the main obstacles each is likely to face.

45 Hale ([2000], pp. 116–117) suggests something like this in his discussion of domain inflation. He argues that applications of cut abstraction, or any abstraction, on concepts that are coextensive with the entire universe are inadmissible, since we can only abstract on sortal concepts, and self-identity is not a sortal concept. In light of the result above, Hale seems to be shooting himself in the foot with this suggestion.
46 Although I have not explored the issue here, it is unlikely that invoking Cauchy sequences or the like instead of Dedekind cuts will do much good, as straightforward generalizations of the notion of convergent series are no harder to formulate than generalizations of the notion of cut. Since there are no other straightforward methods for generating uncountably many objects (except for the powerset axiom, which is what the neo-logicist is attempting to replace), it seems likely that the neo-logicist will have to be content with his epistemically “cheap” reconstruction of the natural numbers.
47 Thanks are due to Crispin Wright for suggesting (something very close to) this way of classifying the neo-logicist’s possible responses.

The first and most straightforward response is just to bite the bullet and accept the Generalized Cut Abstraction Principle and the resulting rampant inflation. If this is the route the neo-logicist takes, however, then he owes us an account of how unboundedly or universally inflationary abstractions can have the privileged epistemological status claimed for them. The discussion of inflation above should have already made clear the difficulties such an account must face.

Second, the neo-logicist can give up on the notion of abstraction processes, arguing that what is at issue is the acceptability of particular abstraction principles. Here, the neo-logicist would point out that CA is not unacceptably inflationary (nor is any other principle restricted to a particular domain and ordering), and would then claim that the only relevant consideration is the particular principles. On this response the neo-logicist owes us an explanation of why certain applications of cut abstraction, codified in particular abstraction principles, are licit while the general process underlying those applications is irrelevant, illegitimate, incoherent, or whatever.

The third option proceeds along similar lines, and faces similar worries. The neo-logicist might restrict the generalization of cut abstraction to particular sorts of linear orders in such a way that Theorem 1 is (and variants of it are) blocked. In other words, the neo-logicist could argue that the abstraction process instantiated by CA is more restricted than GCA. Such a move needs to be accompanied by principled reasons why cut abstraction is acceptable on certain linear orders but not others, and these reasons must be independent of considerations regarding inflation. While I do not claim to have shown that such a defence of CA is impossible, I do believe the results of the previous section render it unlikely. 48

Fourth, and finally, the neo-logicist can accept every instance of GCA, but not simultaneously, avoiding the problematic inflation through a constructive approach to abstraction.
The neo-logicist would accept that each instance of cut abstraction is true, and that many of these instances are boundedly inflationary, but by viewing each application of cut abstraction as akin to a construction and thus preventing himself from considering them all at once he can avoid the objection that there is any unbounded or universal inflation. This move seems counter to the spirit of the existing defenses of neo-logicism, however, in that it is decidedly anti-realist. If the subject matter of mathematics is objective and independent of our investigations then whether we consider abstraction principles individually or collectively would seem irrelevant to their truth or to their acceptability. 49

48 Perhaps the most promising such restriction on the applicability of cut abstraction is to make use of Dummett’s notion of indefinite extensibility. In his discussion of the implications of Gödel’s theorem for our understanding of the natural numbers, Dummett writes:
A concept is indefinitely extensible if, for any definite characterization of it, there is a natural extension of this characterization, which yields a more inclusive concept; this extension will be made according to some general principle for generating such extensions, and, typically, the extended characterization will be formulated by reference to the previous, unextended characterization. ([1963], pp. 195–196)
If it turns out that the concept “arbitrary linear order” is indefinitely extensible (Theorem 1 could be read as a “proof” of this), then the neo-logicist could possibly avoid the rampant domain inflation of GCA by restricting cut abstraction to those linear orders that are definite (i.e. not indefinitely extensible). The details of how such a defense would proceed are beyond the scope of this paper.
49 A version of this paper was presented to the members of Arché: The Center for the Philosophy of Logic, Language, Mathematics, and Mind at St. Andrews University and also to the Abstraction Workshop at Arché, and benefitted considerably from the resulting discussion. Thanks are also owed to Peter Clark, Jon Cogburn, Randy Dougherty, Bob Hale, Fraser MacBride, Graham Priest, Agustin Rayo, George Schumm, Stewart Shapiro, Neil Tennant, Alan Weir, and Crispin Wright for helpful comments and criticism.

References
Benacerraf, P. and H. Putnam [1983], Philosophy of Mathematics: Selected Readings, 2nd ed., Cambridge, Cambridge University Press.
Boolos, G. [1987], “The Consistency of Frege’s Foundations of Arithmetic”, in Demopoulos [1995]: 211–233.
Boolos, G. [1989], “Iteration Again”, Philosophical Topics 17: 5–21.
Boolos, G. [1997], “Is Hume’s Principle Analytic?”, in Heck [1997]: 245–261.
Coffa, A. [1991], The Semantic Tradition from Kant to Carnap, Cambridge, Cambridge University Press.
Demopoulos, W. (ed.) [1995], Frege’s Philosophy of Mathematics, Cambridge, Mass., Harvard University Press.
Dummett, M. [1963], “The Philosophical Significance of Gödel’s Theorem”, Ratio 5: 140–155, reprinted in Dummett [1978]: 186–201.
Dummett, M. [1978], Truth and other enigmas, Cambridge, Harvard University Press.
Dummett, M. [1991], “Frege’s Theory of Real Numbers”, in Demopoulos [1995]: 386–404.
Fine, K. [1998], “The Limits of Abstraction”, in Schirn [1998]: 503–629.
Frege, G. [1884], Die Grundlagen der Arithmetik, Breslau, Koebner; The Foundations of Arithmetic, Tr. by J. Austin, 2nd ed., New York, Harper, 1960.
Frege, G. [1893], Grundgesetze der Arithmetik I, Hildesheim, Olms.
Gödel, K. [1947], “What is Cantor’s Continuum Problem?”, in Benacerraf and Putnam [1983]: 470–485.
Hale, R. [2000], “Reals by Abstraction”, Philosophia Mathematica 3: 100–123.
Heck, R. [1993], “The Development of Arithmetic in Frege’s Grundgesetze der Arithmetik”, in Demopoulos [1995]: 257–294.
Heck, R. (ed.) [1997], Language, Thought, and Logic: Essays in Honour of Michael Dummett, Oxford, Oxford University Press.
Hodes, H. [1984], “Logicism and the Ontological Commitments of Arithmetic”, Journal of Philosophy 81: 123–149.
Kreisel, G. [1967], “Informal Rigour and Completeness Proofs”, in Lakatos [1967]: 138–186.
Kunen, K. [1980], Set Theory: An Introduction to Independence Proofs, Amsterdam, North Holland.
Lakatos, I. (ed.) [1967], Problems in the Philosophy of Mathematics, Amsterdam, North Holland.
Quine, W.V.O. [1951], “Two Dogmas of Empiricism”, Philosophical Review 60: 20–43.
Russell, B. and A. Whitehead [1913], Principia Mathematica, Cambridge, Cambridge University Press.
Schirn, M. (ed.) [1998], Philosophy of Mathematics Today: Proceedings of an International Congress in Munich, The Mind Association, Oxford, Oxford University Press.
Shapiro, S. [1991], Foundations without Foundationalism: The Case for Second-order Logic, Oxford, Oxford University Press.
Shapiro, S. and Weir, A. [1999], “NewV, ZF, and Abstraction”, Philosophia Mathematica 7: 901–929.
Simons, P. [1987], “Frege’s Theory of Real Numbers”, in Demopoulos [1995]: 358–385.
Wright, C. [1983], Frege’s Conception of Numbers as Objects, Aberdeen, Aberdeen University Press.
Wright, C. [1997], “On the Philosophical Significance of Frege’s Theorem”, in Heck [1997]: 201–244.
FREGE MEETS DEDEKIND: A NEO-LOGICIST TREATMENT OF REAL ANALYSIS 1 Stewart Shapiro
1.
Philosophical and technical preliminaries
This work takes off from the ongoing neo-logicist development of arithmetic that began with Wright [1983], and continues through many extensions, objections, and replies to objections. The basic plan is to develop branches of established mathematics using abstraction principles in the form:

(ABS) ∀a∀b(Σ(a) = Σ(b) ≡ E(a, b)),

where a and b are variables of a given type (typically individual objects or properties), Σ is a higher-order operator, denoting a function from items of the given type to objects in the range of the first-order variables, and E is an equivalence relation over items of the given type. In what follows, I will usually omit the initial universal quantifiers.

Frege [1884, 1893] himself employed three abstraction principles. One of them, used for illustration, comes from geometry:

The direction of l_1 is identical to the direction of l_2 if and only if l_1 is parallel to l_2.

Call this the direction principle. The second was dubbed N= in Wright [1983] and is now called Hume’s principle:

(Nx : Fx = Nx : Gx) ≡ (F ≈ G),

where F ≈ G is an abbreviation of the second-order statement that there is a one-to-one relation mapping the F’s onto the G’s. Hume’s principle states that the number of F is identical to the number of G if and only if F is equinumerous with G. Unlike the direction principle, the relevant variables, F, G here are second-order.

1 This paper first appeared in Notre Dame Journal of Formal Logic 41, [2000], pp. 335–364. Reprinted by kind permission of the editor and the University of Notre Dame.
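The way Hume’s principle constrains the size of a domain can be seen on small finite cases. The sketch below is an illustration only, with hypothetical function names (`equinumerosity_classes`, `hp_satisfiable_on`): it treats properties as subsets of an n-element domain, uses the fact that finite sets are equinumerous exactly when they have the same size, and searches for an assignment of objects to equinumerosity classes that gives distinct classes distinct numbers, as the principle requires.

```python
from itertools import combinations, product

def equinumerosity_classes(n):
    """Group the subsets of {0, ..., n-1} by size.

    For finite sets, a one-to-one correspondence exists exactly when the
    sizes agree, so grouping by size is grouping by equinumerosity.
    """
    classes = {}
    for size in range(n + 1):
        for combo in combinations(range(n), size):
            classes.setdefault(size, []).append(frozenset(combo))
    return classes

def hp_satisfiable_on(n):
    """Search for a number operator on an n-element domain.

    Hume's principle needs equinumerous properties to share a number and
    non-equinumerous ones to get different numbers, i.e. an injective map
    from the equinumerosity classes into the domain.
    """
    class_count = len(equinumerosity_classes(n))
    for assignment in product(range(n), repeat=class_count):
        if len(set(assignment)) == class_count:
            return True
    return False

for n in range(5):
    print(n, len(equinumerosity_classes(n)), hp_satisfiable_on(n))
```

An n-element domain carries n + 1 equinumerosity classes, so the search fails in every finite case tried; this is the finite shadow of the familiar fact that Hume’s principle has only infinite models.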
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 219–252. © 2007 Springer.
Let us call an abstraction principle logical if its right hand side contains only logical terminology and operators which are themselves introduced via logical abstractions. Hume’s principle is logical and the direction principle is not. The development of real analysis below invokes only logical abstractions. Frege’s Grundlagen [1884] contains the essentials of a derivation of the Peano postulates from Hume’s principle. This deduction, now called Frege’s theorem, reveals that Hume’s principle entails that there are infinitely many natural numbers. It is generally agreed that this is a powerful mathematical theorem. Who would have thought that so much could be derived from such a simple, obvious truth about cardinality? The third example is the infamous Basic Law V: (Ex : Fx = Ex : Gx) ≡ ∀x(Fx ≡ Gx). Like Hume’s principle, Basic Law V is a second-order, logical abstraction, but unlike Hume’s principle, it is inconsistent. An essential item on the neo-logicist agenda is to articulate principles that indicate which abstraction principles are legitimate and which are not (of which more later). For now, I simply assume that Hume’s principle is an acceptable abstraction principle, yielding the natural numbers. My purpose is to present other (logical) abstraction principles that can be employed to develop a theory of the real numbers, in much the same way that Hume’s principle yields a theory of the natural numbers. The crucial aspect of the treatment—where terminology for real numbers is introduced—roughly follows the development in Richard Dedekind’s celebrated Stetigkeit und irrationale Zahlen [1872], but I formulate the relevant existence principles as Fregean abstractions, rather than Dedekind-type structuralist principles. Dedekind considered himself a logicist, giving an analysis of continuity in logical terms. He was out to refute the Kantian view that continuity is an intuitive notion, underlying our perception of space and time. 
Dedekind argued that not only can we characterize continuity without invoking intuition, we have to. Continuity is not an intuitive notion, since intuition does not determine whether space is continuous. Dedekind’s own methodology invoked a rather different sort of abstraction. In contemporary terms, Dedekind gave a system that exemplified the target mathematical structure (arithmetic and analysis) and then abstracted the structure itself, as a “free creation” (see Shapiro [1997, Chapter 5, §4], and the more scholarly sources cited there). This seems to be an instance of a traditional, perhaps Aristotelian, process, where one abstracts a universal from one or more of its instances. Frege (e.g., [1884, §§13, 34], [1971, 125]) launched a sustained, bitter assault on abstraction procedures like these (see Shapiro [2000, 67–8]). There are two different perspectives toward the neo-logicist quest. One is the orientation of the established mathematician, observing the neo-logicist project. She is interested in determining which mathematical structures have been recaptured by the neo-logicist, and she inquires about the meta-theoretic
properties of the neo-logicist systems. Let us call this the external perspective. One prominent instance of this orientation is George Boolos’s [1987] proof that Hume’s principle is equiconsistent with second-order arithmetic, as well as his result that Hume’s principle is satisfiable on every infinite domain. From this perspective, the mathematician uses every tool at her disposal, whether the neo-logicist is able to reconstruct it or not. The external goal is to see what structures the neo-logicist has produced. The perspective is important for making sure that the neo-logicist reconstructions merge smoothly with established mathematics—to avoid unnecessary charges of revisionism, for example—and for assessing the scope and limits of the neo-logicist program. The other orientation tracked here is that of the neo-logicist herself. We focus on mathematical principles that can be stated and derived in a standard second-order logical deductive system, augmented with various abstraction principles. Call this the internal perspective. The deductive system might be the one presented in Frege’s Begriffschrift [1879]. Concerning the philosophical agenda of neo-logicism, the internal perspective is certainly the more important one. At bottom, neo-logicism is an epistemological program. The neo-logicist is trying to provide what is, or what could be, an epistemic foundation of mathematics. She wants to show how the mathematician comes to know, or could come to know, propositions about abstract objects. It will not do to presuppose any established mathematics along the way, since questions would then be begged. Frege’s theorem is a prime example of the internal perspective. Wright and Hale [2000] argue that abstraction principles are (or are like) implicit definitions. One lays down truth conditions for some new vocabulary, and, when successful, we introduce terms that denote abstract objects. 
Frege’s theorem thus shows how one can come to know the Peano axioms from an implicit definition of the number operator, assuming, of course, that Hume’s principle passes muster as an acceptable abstraction. Some of the abstraction principles employed here are not quite in the above form (ABS), since they operate with variables taken two at a time. To illustrate, consider a principle that introduces terminology for ordered pairs:
(PAIRS) ∀x∀y∀z∀w(π(x, y) = π(z, w) ≡ E(x, y, z, w)),
where E(x, y, z, w) is just (x = z & y = w). The principle (PAIRS) differs from the form (ABS) since it has four bound variables rather than two, and the relation E on the right hand side is not an equivalence since it is a four-place relation. Nevertheless, when the variables are taken two at a time, E has the properties corresponding to an equivalence relation. In particular, the relation is
reflexive: ∀x∀y E(x, y, x, y) (i.e., ∀x∀y(x = x & y = y)),
symmetric: ∀x∀y∀z∀w(E(x, y, z, w) → E(z, w, x, y)),
transitive: ∀x∀y∀z∀w∀r∀s((E(x, y, z, w) & E(z, w, r, s)) → E(x, y, r, s)).
Thus, (PAIRS) is the same kind of thing as an abstraction principle, with the variables taken two at a time. Hale [2000] uses four-place abstractions in his own development of real analysis. The principle (PAIRS) lies between first-order abstractions, like the direction principle, and second-order abstractions like Hume’s principle or Basic Law V. Like the direction principle, the bound variables in (PAIRS) are first-order, but like Hume’s principle, (PAIRS) is not satisfiable on any finite domain with more than one element, and so it is a principle of infinity. If the domain has size n, then we would need n² different ordered pairs. On an infinite domain of size κ, the satisfiability of the (PAIRS) principle is equivalent to κ² = κ, which is a consequence of the axiom of choice. So like Hume’s principle, if choice holds, then (PAIRS) is satisfiable on any infinite domain. In what follows I do not make direct use of (PAIRS), but many of the abstraction principles invoked do operate on objects taken two at a time. As an alternative, of course, we could first invoke (PAIRS) and then employ only ordinary abstractions defined over pairs.
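The finite-domain claims about (PAIRS) can be checked by brute force on very small domains. The following is only an illustrative sketch in Python, not part of the neo-logicist development: the function names are hypothetical, the domain is modelled as range(n), and a pairing operator is modelled as an arbitrary assignment of domain elements to ordered pairs.

```python
from itertools import product

def is_equivalence_on_pairs(domain):
    """Check that E(x, y, z, w) := (x == z and y == w) is reflexive,
    symmetric, and transitive when its arguments are taken two at a time."""
    pairs = list(product(domain, repeat=2))

    def E(p, q):
        return p[0] == q[0] and p[1] == q[1]

    reflexive = all(E(p, p) for p in pairs)
    symmetric = all(E(q, p) for p in pairs for q in pairs if E(p, q))
    transitive = all(E(p, r)
                     for p in pairs for q in pairs for r in pairs
                     if E(p, q) and E(q, r))
    return reflexive and symmetric and transitive

def pairing_exists(n):
    """Search for a map from D x D into D (with |D| = n) that sends
    distinct ordered pairs to distinct objects, as (PAIRS) requires."""
    domain = range(n)
    pairs = list(product(domain, repeat=2))
    # Try every way of assigning an element of D to each ordered pair.
    return any(len(set(values)) == len(pairs)
               for values in product(domain, repeat=len(pairs)))

print(is_equivalence_on_pairs(range(3)))
print([pairing_exists(n) for n in (1, 2, 3)])
```

The search is exponential in n², so it is only usable for n up to about 3; it is meant to make the counting argument (n² pairs, n objects) concrete, nothing more.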
2. Integers and rational numbers
Our first chore is to define the integers via an abstraction over the differences between pairs of natural numbers:
(DIF) INT(a, b) = INT(c, d) ≡ (a + d) = (b + c).
It is straightforward that the relevant analogues of reflexivity, symmetry, and transitivity hold for the equation on the right hand side. Next we define addition on the integers, in the straightforward way:
INT(a, b) + INT(c, d) = INT(a + c, b + d).
The definition is not circular, despite appearances. The “+” sign on the left hand side represents addition on the integers, while the “+” signs on the right hand side represent addition on the natural numbers. It is straightforward to show (internally) that addition is well-defined: if INT(a, b) = INT(a′, b′) and INT(c, d) = INT(c′, d′), then INT(a + c, b + d) = INT(a′ + c′, b′ + d′). Moreover, addition on the integers is associative and commutative. There is an identity element on the integers, namely INT(0, 0), and the integers are an abelian group. We define multiplication on the integers thus:
INT(a, b) · INT(c, d) = INT(a · c + b · d, b · c + a · d).
It is tedious but straightforward to verify that this function is well-defined, multiplication is associative and commutative, and multiplication distributes over addition. The integers form an integral domain.
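As a sanity check on (DIF) and the two definitions above, one can model INT(a, b) as a pair of Python natural numbers and test well-definedness on a small range. This is a sketch under that assumption only; the names same_int, add, mul, and respects_dif are hypothetical, and a finite check is of course no substitute for the internal derivation.

```python
from itertools import product

def same_int(p, q):
    """(DIF): INT(a, b) = INT(c, d) iff a + d == b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c

def add(p, q):
    """INT(a, b) + INT(c, d) = INT(a + c, b + d)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def mul(p, q):
    """INT(a, b) . INT(c, d) = INT(a.c + b.d, b.c + a.d)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, b * c + a * d)

def respects_dif(op, bound=4):
    """Check on all representatives below `bound` that equal inputs
    (in the sense of DIF) give equal outputs."""
    reps = list(product(range(bound), repeat=2))
    return all(
        same_int(op(p, q), op(p2, q2))
        for p, p2 in product(reps, repeat=2) if same_int(p, p2)
        for q, q2 in product(reps, repeat=2) if same_int(q, q2)
    )

print(respects_dif(add), respects_dif(mul))
# INT(2, 5) plays the role of -3 and INT(1, 3) the role of -2.
print(same_int(mul((2, 5), (1, 3)), (6, 0)))
```

Representing an integer by a pair rather than by its equivalence class keeps the sketch close to the abstraction principle: identity of integers is decided by the right hand side of (DIF), not by identity of representatives.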
The point here is that the relevant theorems are deductive consequences of Hume’s principle, the abstraction (DIF), and the other definitions, in a typical deductive system for second-order logic. In other words, the mathematical development can be carried out using only Fregean resources, and so it is internal to the neo-logicist program. The only open issue is whether the abstractions are kosher as implicit definitions. There is, of course, a natural embedding of the natural numbers in the integers: if a is a natural number, then define I(a), the integer a, to be INT(a, 0). It follows immediately from (DIF) that the embedding is one-to-one, and is a homomorphism. Can we simply identify the natural numbers with the corresponding integers? Can we say, for example, that the natural number 6 just is the integer INT(6, 0)? Mathematicians typically talk that way, saying that the integers are an extension of the natural numbers. However, Bertrand Russell [1993, 64] claimed that we cannot make the identifications: “[The integer] +m is under no circumstances capable of being identified with [the natural number] m . . . Indeed, +m is every bit as distinct from m as −m is”. The reason, of course, is that for Russell, natural numbers and integers occur at different places in the type hierarchy. In Russell’s system it is nonsense (not just false) to say that the integer +m is the same as the natural number m. Statements of identity are well formed only when applied to items of the same type. So despite Russell’s rhetoric, he held that it is also nonsense to say that +m is distinct from m. In contrast, Frege and the neo-logicist take natural numbers and integers to be individual objects, both within the range of the first-order variables. In the Fregean and neo-logicist framework, the identity relation is unrestricted. So 6 = INT(6, 0) is well formed, and so either 6 = INT(6, 0) or 6 ≠ INT(6, 0). Which is it?
Students of logicism and neo-logicism will recognize this as an instance of the Caesar problem (see Hale and Wright [2001]). The general problem is to find criteria for determining whether the objects yielded by an abstraction principle are the same as or distinct from objects not explicitly yielded by that abstraction principle. The instance here concerns the criteria for determining whether the values of one abstraction operator are the same as or distinct from the values of another. Having said that, I propose to avoid the issue here, and speak of the function I as a natural embedding. If context makes it clear, I will sometimes use the term “natural number” ambiguously to refer to both the natural numbers and the non-negative integers, and I will use a numeral like “6” to denote both the natural number 6 and the integer INT(6, 0). We move on to the rational numbers. Here is another abstraction principle, giving quotients:
(QUOT) Q(m, n) = Q(p, q) ≡ (n = 0 & q = 0) ∨ (n ≠ 0 & q ≠ 0 & m · q = n · p),
where m, n, p, q are integers. We define a rational number to be a quotient Q(m, n), where n ≠ 0 (i.e., n ≠ INT(0, 0)). Once again, it is straightforward that the relevant analogues of reflexivity, symmetry, and transitivity hold for the equation on the right hand side, owing to the associative and commutative properties of multiplication on the integers. We define addition and multiplication thus:
Q(m, n) + Q(p, q) = Q(m · q + p · n, n · q),
Q(m, n) · Q(p, q) = Q(m · p, n · q).
As is becoming usual, it is straightforward but tedious to show that addition and multiplication are well-defined on the rational numbers, that addition and multiplication are associative and commutative, and that multiplication distributes over addition. The additive identity is Q(0, 1) (i.e., Q(INT(0, 0), INT(1, 0))), and the multiplicative identity is Q(1, 1). It can be established that the rational numbers are an ordered field. All of these results are deductive consequences of the various abstraction principles and the other definitions. That is, everything so far is internal. Again, there is a natural embedding of the integers in the rational numbers. If m is an integer, then define I(m) = Q(m, 1). The embedding is one-to-one and preserves addition, multiplication, and the order. So the integers are isomorphic to a subset of the rational numbers. As with the natural numbers and integers, I duck this version of the Caesar problem, but I do sometimes speak ambiguously of some of the rational numbers as “integers”, and I use terms like “0” and “−1” ambiguously to denote the indicated integers and rational numbers.
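The quotient abstraction can be illustrated in the same spirit. The sketch below assumes that Python's built-in integers stand in for the integers INT(a, b), models Q(m, n) as a pair, and uses hypothetical names (same_quot, add, mul, is_transitive); the first disjunct of (QUOT) appears as the case in which both denominators are zero.

```python
from itertools import product

def same_quot(x, y):
    """(QUOT): Q(m, n) = Q(p, q) iff both denominators are zero, or
    neither is zero and m.q == n.p."""
    (m, n), (p, q) = x, y
    return (n == 0 and q == 0) or (n != 0 and q != 0 and m * q == n * p)

def add(x, y):
    """Q(m, n) + Q(p, q) = Q(m.q + p.n, n.q)."""
    (m, n), (p, q) = x, y
    return (m * q + p * n, n * q)

def mul(x, y):
    """Q(m, n) . Q(p, q) = Q(m.p, n.q)."""
    (m, n), (p, q) = x, y
    return (m * p, n * q)

def is_transitive(bound=3):
    """Check transitivity of the right hand side of (QUOT) on all pairs
    of integers between -bound and bound."""
    reps = list(product(range(-bound, bound + 1), repeat=2))
    return all(same_quot(x, z)
               for x in reps for y in reps if same_quot(x, y)
               for z in reps if same_quot(y, z))

print(is_transitive())
print(same_quot(add((1, 2), (1, 3)), (5, 6)))   # 1/2 + 1/3 against 5/6
print(same_quot((1, 0), (5, 0)))                # all zero-denominator quotients coincide
```

The last line shows why a rational number is defined as a quotient with non-zero second argument: (QUOT) assigns a single "improper" abstract to every pair whose second argument is zero.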
3. Enter Dedekind: the real numbers
If we start with a countable ontology and apply any abstraction principle that operates on pairs of first-order variables, then we will end up with at most countably many abstracts. So we cannot reconstruct the real numbers using the above techniques. Here we transform Dedekind’s insight into a second-order abstraction principle, via an equivalence relation on properties of rational numbers. Let P be a property (of rational numbers) and r a rational number. Say that r is an upper bound of P, written P ≤ r, if for any rational number s, if Ps then either s < r or s = r. In other words, P ≤ r if r is greater than or equal to any rational number that P applies to. Consider the Cut Abstraction Principle:
(CP) ∀P∀Q(C(P) = C(Q) ≡ ∀r(P ≤ r ≡ Q ≤ r)).
In words, the cut of P is identical to the cut of Q if and only if P and Q share all of their upper bounds.²
² I am indebted to John Burgess for suggesting that the abstraction be put this way.
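A small illustration of (CP), restricted to a case where the right hand side is decidable: properties with finitely many, and at least one, rational instances. This restriction is an assumption of the sketch, not of the text, and the names is_upper_bound and same_cut are hypothetical. For such properties it suffices to compare upper bounds among the instances themselves, since the larger of the two maxima separates the properties whenever the maxima differ.

```python
from fractions import Fraction as F

def is_upper_bound(P, r):
    """P <= r: every rational s with Ps satisfies s < r or s == r."""
    return all(s < r or s == r for s in P)

def same_cut(P, Q):
    """Right hand side of (CP) for finite, non-empty P and Q.
    Testing the members of P | Q is enough in this special case."""
    candidates = P | Q
    return all(is_upper_bound(P, r) == is_upper_bound(Q, r)
               for r in candidates)

P = frozenset({F(1, 3), F(1, 2)})
Q = frozenset({F(1, 2)})
R = frozenset({F(1, 2), F(2, 3)})

print(same_cut(P, Q))   # different extensions, same upper bounds
print(same_cut(P, R))   # 1/2 bounds P but not R
```

The first comparison shows that (CP) is coarser than coextensiveness: P and Q differ in extension but determine the same cut.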
It is easy to establish that the relation on the right hand side of (CP) is an equivalence. Notice that (CP) is (or abbreviates) a logical abstraction, in that all of the terminology on its right hand side is either logical or an operator introduced in another logical abstraction (i.e., DIF, QUOT). Define a property P to be bounded if there is a rational number r such that P ≤ r. That is, P is bounded if there is a rational number that is greater than or equal to every number to which P applies. Define a property P to be instantiated if there is a rational number s such that Ps. We hereby define a real number to be a cut C(P), where P is bounded and instantiated. This, of course, is the analogue of the usual “definition” of a real number as the cut of a bounded, non-empty set of rationals. As an aside, notice that the neo-logicist could also produce (a version of) real analysis by using Cauchy sequences instead of cuts: define a “sequence” to be a binary relation R between natural numbers and rational numbers such that for every natural number a there is exactly one rational number r such that Rar. Then introduce a limit abstraction on sequences, as follows: L(R₁) = L(R₂) if and only if for every rational number r, if 0 < r then there is a natural number a such that for every natural number b > a, and rational numbers s₁, s₂, if R₁bs₁ and R₂bs₂, then −r < (s₁ − s₂) < r. It is straightforward to define what it is for a sequence, so construed, to be Cauchy. Our neo-logicist could thus define “real number” to be the “limit” of a Cauchy sequence. Returning to Dedekind, if C(P) and C(Q) are real numbers, then define C(P) < C(Q) if C(P) ≠ C(Q) and for every rational number r, if Q ≤ r then P ≤ r. It is straightforward to verify that this is well-defined: if C(P) = C(P′), C(Q) = C(Q′), and C(P) < C(Q) then C(P′) < C(Q′). From excluded middle, we see that this relation is a linear order on the real numbers.
From an instance of the comprehension scheme, there is a property ZERO that holds of a rational number r if and only if r < 0. Define a real number C(P) to be zero if C(P) = C(ZERO). There is only one such real number. Define a real number C(P) to be positive if there is a rational number r > 0 such that Pr (and so C(ZERO) < C(P)). Define a real number C(P) to be negative if there is a rational number r such that r < 0 and P ≤ r. It is straightforward that for every real number C(P), exactly one of the following holds: C(P) is positive, C(P) is negative, or C(P) is zero. If P and Q are properties of the rational numbers, then define P + Q to be the property that holds of a rational r just in case r is less than the sum of a rational number of which P holds with a rational number of which Q holds:
(P + Q)r ≡ ∃x∃y(Px & Qy & r < x + y).
The existence of P + Q follows from an instance of the comprehension scheme of the second-order language. If C(P) and C(Q) are real numbers, then so is C(P + Q). We define addition on the real numbers: C(P) + C(Q) = C(P + Q). Addition is well defined, and C(ZERO) is the additive identity.
If P is a property of rational numbers, then let −P be the property that holds of a rational number r if and only if P ≤ −r. That is, −Pr if and only if −r is an upper bound for P. Notice that if C(P) = C(P′) then −P is coextensive with −P′. If P ≤ r then −P(−r). So if P is bounded then −P is instantiated. Suppose that Ps. Then for any rational number r, if −Pr then P ≤ −r. So either s = −r or s < −r. So either r = −s or r < −s. So −P ≤ −s. Thus, if P is instantiated, then −P is bounded. Therefore, if C(P) is a real number then so is C(−P). We now show that if C(P) is a real number, then C(−P) is its additive inverse. If (P + −P)r then there are rationals s₁, s₂ such that Ps₁, −Ps₂ and r < (s₁ + s₂). So P ≤ −s₂. So either s₁ = −s₂ or s₁ < −s₂. So either s₁ + s₂ = 0 or s₁ + s₂ < 0. Therefore r < 0. For the converse, suppose that r < 0. Then 0 < −r. Pick rationals s₁ and s₂ such that Ps₁, P ≤ s₂, and s₂ − s₁ < −r. We have that −P(−s₂) and r < s₁ + (−s₂). So (P + −P)r. Therefore, (P + −P)r if and only if r < 0. So C(P + −P) is C(ZERO). Thus, the real numbers are an abelian group under addition. The next item on the agenda is multiplication on the real numbers. I fear that matters get more tedious, mostly because we are dealing with the ordering of products of positive and negative rational numbers. Dedekind [1872, §6] himself gave a reasonably strict account of the addition of real numbers, and then added:
Just as addition is defined, so can the other operations of the so-called elementary arithmetic be defined, viz., the formation of differences, products, quotients, powers, roots, logarithms, and in this way we arrive at real proofs of theorems . . . which to the best of my knowledge have never been established before. The excessive length that is to be feared in the definitions of the more complicated operations is partly inherent in the nature of the subject but can for the most part be avoided.
However, Dedekind provided hardly any detail on how to avoid the “excessive length”, beyond a few remarks about continuity. If P and Q are properties of rational numbers, then define P · Q to be the property that holds of a rational number r if and only if:
∃s∃t(Ps & Qt & 0 < s & 0 < t & r < s · t)
∨ ∃s∃t(P ≤ s & Q ≤ t & (s < 0 ∨ s = 0) & (t < 0 ∨ t = 0) & r < s · t)
∨ (P ≤ 0 & ∃t(Qt & 0 < t) & ∀u∀v((P ≤ u & Qv & (u < 0 ∨ u = 0) & 0 < v) → r < u · v))
∨ (Q ≤ 0 & ∃t(Pt & 0 < t) & ∀u∀v((Q ≤ u & Pv & (u < 0 ∨ u = 0) & 0 < v) → r < u · v)).
The first disjunct is for the cases where C(P) and C(Q) are both positive; the second disjunct is for the cases where neither C(P) nor C(Q) is positive; the third disjunct is for the cases where C(P) is not positive and C(Q) is positive;
and the last disjunct is for the remaining cases where C(P) is positive and C(Q) is not positive. Using classical logic, it is tedious but straightforward to verify that multiplication is well-defined, and is a function on the real numbers. Define a property ONE that holds of a rational number r if and only if r < 1. It is straightforward to verify that for any real number C(P), C(P) · C(ONE) = C(P), and so C(ONE) is the multiplicative identity. If P is a property of rational numbers, then let P⁻¹ be the property that holds of a rational number r if and only if
∃s∃t∃u(Ps & 0 < s & P ≤ t & t · u = 1 & r < u) ∨ ∃t∃u(P ≤ t & t < 0 & t · u = 1 & r < u).
Notice that if P is bounded and instantiated by a positive rational number, then (by the first disjunct), P⁻¹ is instantiated. Moreover, if Ps and 0 < s, then P⁻¹ is bounded by s⁻¹. Similarly, if P is bounded by a negative rational number, then (by the second disjunct) P⁻¹ is instantiated. In this case also, if Ps, then P⁻¹ is bounded by s⁻¹. So if C(P) is a real number other than zero, then P⁻¹ is instantiated and bounded, and so C(P⁻¹) is a real number. It is straightforward to verify that if C(P) is a real number other than zero, then C(P) · C(P⁻¹) = C(ONE). So the real numbers are a field. Since the positive real numbers are closed under addition and multiplication, the real numbers are an ordered field. We have all this internal to the neo-logicist framework. It is straightforward to embed the rational numbers in the real numbers. If r is a rational number, then let Pᵣ be the property that holds of a rational number q if and only if q < r. Clearly, Pᵣ is instantiated and bounded. Let I(r) be the corresponding cut C(Pᵣ). Notice that I(0) = C(ZERO) and I(1) = C(ONE). The embedding I is one-to-one and preserves addition, multiplication, and the “less-than” relation. Therefore the rational numbers are isomorphic to a subset of the real numbers.
Again, I neither assert nor deny that I is the identity mapping, but I will still speak of some of the real numbers as “rational”, noting the (possible) ambiguity, and I will let a term like “.5” ambiguously denote the indicated rational number and the corresponding real number. The usual proof that the square root of 2 is irrational can be carried out in the indicated neo-logicist deductive system. Let Q be the property that holds of a rational number q just in case there is a rational number r such that r · r < 2 and q < r. Then Q is instantiated and bounded, and there is no rational number s such that C(Q) = C(Pₛ). The remaining axiom for the real numbers is the completeness principle, stating that if a non-empty set S (or property) of real numbers is bounded from above, then S has a least upper bound. It has a straightforward formulation as a second-order sentence:
∀X{(∃yXy & ∃x∀y(Xy → (y < x ∨ y = x))) → ∃x[∀y(Xy → (y < x ∨ y = x)) & ∀z(∀y(Xy → (y < z ∨ y = z)) → (x < z ∨ x = z))]}.
Here the second-order variable X ranges over all properties (or sets) of real numbers. Standard reasoning establishes that the completeness principle holds for the real numbers presented here. We present this argument with a little detail in order to show that the completeness principle is derivable from the above abstraction principles and other definitions in a typical second-order deductive system. That is, the completeness principle is derivable internally. Proceeding informally, let Σ be a property (or set) of real numbers and assume that Σ is non-empty and bounded from above. That is, there is a real number C(A) such that Σ(C(A)) and there is a real number C(B) such that for any real number C(P), if Σ(C(P)) then either C(P) = C(B) or C(P) < C(B). We need to show that Σ has a least upper bound. Define a property Q that holds of a given rational number r if and only if there is a real number C(P) such that Σ(C(P)) and Pr. That is, Qr holds if and only if r instantiates a real number of which Σ holds.³ We first show that C(Q) is a real number (i.e., that Q is instantiated and bounded). We have that Σ(C(A)). Since C(A) is a real number, there is a rational number q such that Aq. Thus Qq and so Q is instantiated. Since C(B) is a real number, B is bounded. Let B ≤ s. Suppose that Qq. Then there is a real number C(P) such that Σ(C(P)) and Pq. Since C(B) is an upper bound for Σ, either C(P) = C(B) or C(P) < C(B). Since B ≤ s, it follows that P ≤ s, and so either q < s or q = s. So Q ≤ s and Q is bounded. Thus, C(Q) is a real number. Next we show that C(Q) is an upper bound for Σ. Suppose that Σ(C(P)). For any rational number r, if Pr then Qr; so every bound of Q is a bound of P. Thus, either C(P) = C(Q) or C(P) < C(Q). Finally, we show that C(Q) is the least upper bound for Σ. So suppose that C(S) is an upper bound for Σ. And suppose that S ≤ t. We have to show that Q ≤ t. Suppose that Qr. Then there is a real number C(P) such that Σ(C(P)) and Pr.
Since C(S) is an upper bound for Σ, we have that C(P) is less than or equal to C(S). So P ≤ t, and thus r is less than or equal to t. So we have that Q ≤ t. Therefore, either C(Q) = C(S) or C(Q) < C(S).
³ If we think of Σ and the various P as sets, then Q is the union of Σ: Q = {r : ∃P(r ∈ P & P ∈ Σ)}.
In sum, we can derive from the various abstraction principles and other definitions that the real numbers, as presented here, constitute a complete, ordered field. This completes the analogue of Frege’s theorem. Of course, unlike Frege’s own derivation, there is not much originality here. Thanks to Dedekind, I knew what I was looking for. Still, we have an internal derivation of the axioms of real analysis, from the various abstraction principles and explicit definitions. Recall that the axiomatization of second-order analysis is categorical. Thus, from the external perspective of the classical mathematician, the neo-logicist has reconstructed an instance of the familiar real-number structure. The real numbers, as presented here, are isomorphic to the continuum, as
traditionally understood, and in particular, there are uncountably many real numbers. I presume that this is welcome confirmation of the present version of the neo-logicist program. Nevertheless, the last bit of information, concerning the size and structure of the real numbers, comes from the “outside”. It relies on the theorem of set theory that the real numbers are uncountable (i.e., Cantor’s theorem) and it relies on the well-known fact that all complete, ordered fields are isomorphic. As noted above, the neo-logicist is trying to capture as much of traditional mathematics as possible, from the “inside”, as it were. So the neo-logicist does not want to rely on an external set-theoretic meta-theory in order to establish the claims. But at least externally, we know that the neo-logicist has hit (a structure isomorphic to) the target, assuming, of course, that all of the invoked abstraction principles are acceptable. Fortunately, the relevant cardinality claim can be established internally. Within a pure second-order language augmented with the abstraction operators, one can formulate the statement that the neo-logicist’s real numbers are uncountable, and one can derive this statement in a typical deductive system. First, define a binary relation R to be a real-counter if for each natural number n there is exactly one real number C(P) such that RnC(P). That is, R is a real-counter if it establishes a function from the natural numbers to the real numbers. The real numbers are uncountable if and only if there is no real-counter that has every real number in its “range”. A diagonal argument establishes this:
Theorem 1: In a standard deductive system for second-order logic, one can deduce the following from the indicated abstraction principles and explicit definitions: for every real-counter R, there is a real number C(Q) such that for every natural number n, it is not the case that RnC(Q).
Proof sketch: The following is a sketch of a derivation within a typical second-order deductive system. It is a reproduction of a diagonal argument. Suppose that R is a real-counter. To fix notation, for each natural number n, let C(Pₙ) be the unique real number such that RnC(Pₙ). So the real numbers “counted” by R are C(P₀), C(P₁), . . . We now define a relation S between natural numbers and rational numbers. The relation S is to be a function, in the sense that for each natural number n, there is exactly one rational number r such that Snr. So we write S(n) = r for Snr. We can proceed by recursion, thanks to Dedekind’s and Frege’s techniques for converting definitions by recursion into explicit definitions in the second-order language (using an instance of the comprehension scheme). If P₀ ≤ 1 then let S(0) = 2; otherwise let S(0) = 0. If P₁ ≤ (S(0) + .1) then let S(1) = S(0) + .2; otherwise let S(1) = S(0). If P₂ ≤ (S(1) + .01) then let S(2) = S(1) + .02; otherwise let S(2) = S(1). In general, suppose that S(n) has been defined. If Pₙ₊₁ ≤ (S(n) + 10⁻⁽ⁿ⁺¹⁾) then let S(n + 1) = S(n) + 2 · 10⁻⁽ⁿ⁺¹⁾; otherwise let S(n + 1) = S(n).
Now define Q to be the property that holds of a rational number r if and only if there is a natural number n such that S(n) = r. Clearly, Q is instantiated, since either Q0 or Q2. One can show by induction that for each natural number n, S(n) < 4 − 10⁻ⁿ. A fortiori, for each natural number n, S(n) < 4, and so Q is bounded. Thus, C(Q) is a real number. All that remains is to show that C(Q) is not in the “range” of R. This amounts to showing that for each natural number n, C(Q) ≠ C(Pₙ). We proceed by induction. Recall that if P₀ ≤ 1 then S(0) = 2. In this case, we have that Q2, and so it is not the case that Q ≤ 1 and so C(Q) ≠ C(P₀). If it is not the case that P₀ ≤ 1 then S(0) = 0. In this case, we show by induction that for each natural number n, if 1 < n then S(n) < .5 − 10⁻ⁿ. So Q ≤ .5 and so Q ≤ 1. Thus, C(Q) ≠ C(P₀). The induction step is similar. If Pₙ₊₁ ≤ (S(n) + 10⁻⁽ⁿ⁺¹⁾) then S(n + 1) = S(n) + 2 · 10⁻⁽ⁿ⁺¹⁾. In this case we have Q(S(n) + 2 · 10⁻⁽ⁿ⁺¹⁾) and so it is not the case that Q ≤ (S(n) + 10⁻⁽ⁿ⁺¹⁾), and C(Q) ≠ C(Pₙ₊₁). If it is not the case that Pₙ₊₁ ≤ (S(n) + 10⁻⁽ⁿ⁺¹⁾) then S(n + 1) = S(n). Then as above we show by induction that for all m, S(m) < (S(n) + 10⁻⁽ⁿ⁺¹⁾). So Q ≤ (S(n) + 10⁻⁽ⁿ⁺¹⁾), and hence C(Q) ≠ C(Pₙ₊₁). The above result is an internal version of Cantor’s theorem. It indicates that the cut principle (CP) has increased the size of the ontology. Starting with the countably infinite domain of rational numbers, it produces uncountably many “cuts”. If the neo-logicist expands to third-order language, then she can state and prove, internally, that the real numbers are “equinumerous” with the properties of natural numbers. That is, one can show that there is a “one to one” relation from the real numbers onto equivalence classes of properties of natural numbers, under coextensiveness. This corresponds to the set-theoretic theorem that there are as many real numbers as sets of natural numbers.
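The recursion for S in the proof sketch of Theorem 1 can be run on a finite prefix. The sketch below is illustrative only and rests on assumptions not in the text: each counted real C(Pₙ) is supplied as an oracle that decides whether a given rational is an upper bound of Pₙ, and the sample enumeration `counted` (embedded rationals n/(n+1) at even places, 3 at odd places) is hypothetical. The names diagonal and separating_rational are likewise invented for the example.

```python
from fractions import Fraction as F

def diagonal(bounds, steps):
    """Compute S(0), ..., S(steps). bounds(n)(r) is assumed to decide
    whether the rational r is an upper bound of P_n."""
    S = [F(2) if bounds(0)(F(1)) else F(0)]
    for n in range(steps):
        eps = F(1, 10 ** (n + 1))          # 10^-(n+1)
        if bounds(n + 1)(S[n] + eps):
            S.append(S[n] + 2 * eps)       # jump past the bound of P_(n+1)
        else:
            S.append(S[n])
    return S

def separating_rational(S, n):
    """The rational that is an upper bound of exactly one of P_n and Q."""
    return F(1) if n == 0 else S[n - 1] + F(1, 10 ** n)

def counted(n):
    """Hypothetical real-counter: the embedded rational q_n, for which
    P_q <= r holds iff q <= r."""
    q = F(3) if n % 2 else F(n, n + 1)
    return lambda r: q <= r

S = diagonal(counted, 4)
print(S)
for n in range(5):
    r = separating_rational(S, n)
    # On this finite prefix, r bounds P_n iff it fails to bound the S-values.
    print(n, r, counted(n)(r), all(s <= r for s in S))
```

Only finitely many values of S are inspected, so the last column approximates "Q ≤ r"; the inductive bounds in the proof sketch are what guarantee that later values of S never overturn it.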
It is widely agreed that model theory, and meta-mathematics generally, are foreign to the Fregean program.⁴ However, Frege was able to use his logical system to recapitulate something sufficiently resembling meta-theory (see his later lectures on geometry ([1903, 1906], translated in [1971])). With characteristic rigor, Frege anticipated a technique now attributed to F. P. Ramsey [1925]: one replaces an axiomatization with an explicit definition of a second-level concept, i.e., a relation on relations. We can do the same here.
⁴ See, for example, van Heijenoort [1967], Goldfarb [1979], and Shapiro [1997, Chapter 5].
In the neo-logicist language, we can formulate a (third-order) formula, which corresponds to the statement that a given sequence of properties, objects, functions, and relations is a complete ordered field (i.e., a model of real analysis). The neo-logicist can prove that the real numbers, as defined above, together with the given functions and relations, satisfy this formula. Moreover, the neo-logicist can prove that any two sequences that satisfy this formula are isomorphic. That
is, the neo-logicist can show, internally, that the real numbers are a complete ordered field, and she can show that any two complete ordered fields are isomorphic. This completes the internal development. I noted above that Boolos [1987] showed that Hume’s principle is equiconsistent with second-order arithmetic. Similar techniques can be used to establish that Hume’s principle, (DIF), (QUOT), and (CP) together are equiconsistent with second-order real analysis. This theorem is “external” to the neo-logicist framework itself, in that the result is proved in the background set theory. Unlike the above categoricity result, the requisite background model theory has not been fully recaptured in the neo-logicist framework. One sticking point is the Boolos result that Hume’s principle is satisfiable on any infinite domain. In its full generality, this result uses the axiom of choice (in particular, that any set is equinumerous with a cardinal in the aleph-series). It turns out that this use of choice is necessary, since there are models of Zermelo–Fraenkel set theory in which Hume’s principle is not satisfiable on the continuum. However, in order to capture real analysis, the neo-logicist need not invoke the full power of Hume’s principle, since the only “cardinalities” used to develop real analysis are natural numbers. So for the purposes of developing real analysis, the neo-logicist may get by with a restricted version of Hume’s principle (see Heck [1997]). The issues involving consistency and choice are rather subtle, and I do not venture a conjecture as to whether the equiconsistency can be formulated and proved internally.
4. The acceptability of abstractions I: conservation
One important, outstanding philosophical issue concerns the extent to which (CP), and the other abstraction principles invoked above, are acceptable neo-logicist principles. The tragic example of Basic Law V reminds us that not every abstraction principle can serve as an epistemic foundation for a mathematical theory. The neo-logicist must articulate and defend criteria that distinguish the legitimate abstraction principles from their syntactically similar pretenders. The response to this “bad company” objection remains an ongoing project on the agenda (see, for example, Boolos [1997], Wright [1997], and Weir [2000]). Here I test (CP), and related abstraction principles, against some of the ideas put forth in the literature, and I suggest refinements on those criteria in light of the present framework.
One glaring difference between Basic Law V and Hume’s principle is that the latter is consistent while the former is not. Consistency is surely necessary for an abstraction principle to be acceptable, but it is not sufficient. Boolos [1997] has pointed out that there are consistent abstraction principles that have no infinite models. One such is the Nuisance principle presented in Wright [1997], which is satisfiable on any finite domain, but not on any infinite
domain. If we assume that Hume’s principle is an acceptable abstraction, then the Nuisance principle is not acceptable (and vice versa). The Nuisance principle cannot be satisfied on any domain that includes the natural numbers. Hume’s principle cannot be satisfied on any domain that satisfies the Nuisance principle.
One natural suggestion is that a legitimate abstraction principle should be a conservative extension of any theory to which it is added. Formally, let A be an abstraction principle and let T be a theory whose language does not contain the operator introduced by A. Then A is conservative over T if for any sentence Φ in the language of T, Φ is a consequence of T + A only if Φ is a consequence of T alone. That is, the addition of A to the theory T does not have any consequences in the old language that were not already consequences of the old theory. Suppose that A is conservative over every base theory, and suppose that Φ contains no non-logical terminology. Then Φ is a consequence of A only if Φ is logically true. Although this would be a nice feature for a view that calls itself logicist, the requirement is too strong (if neo-logicism is to have any chance of success). Let INF be a second-order statement, with no non-logical terminology, that entails that the universe is Dedekind-infinite (see Shapiro [1991, 100]). Let T be any theory that does not entail that the universe is Dedekind-infinite. Then Hume’s principle entails INF, but by hypothesis T itself does not. So Hume’s principle is not a conservative extension of any consistent theory that does not already entail the existence of infinitely many objects. Wright [1997, 230–239] points out that this violation of conservativeness is due solely to the existence of the natural numbers, and has nothing to do with the items recognized to be in the ontology of the base theory T.
He thus proposes a modification of the conservativeness requirement: an acceptable abstraction principle A should not have any consequences other than what follows from the existence of the abstract objects yielded by A. That is, a legitimate abstraction principle should have no new consequences concerning any objects already recognized to be in the ontology of the base theory. Of course, Basic Law V violates this conservativeness requirement—big time. If T is consistent, then it plus Basic Law V has lots of consequences about the ontology of T that do not follow from T alone. Although it is consistent, the Nuisance principle also violates this requirement. Recall that this particular abstraction is not satisfiable in any infinite domain. Suppose that we add the Nuisance principle to a consistent theory about rock stars. It follows, in the combined theory, that there are only finitely many rock stars. So unlike Hume’s principle, the Nuisance principle has consequences concerning the ontology of the base theory. Perhaps it is plausible enough that there are only finitely many rock stars, but this may not have been a consequence of our prior theory about rock stars. Invoking
an abstraction principle should not by itself tell us how many rock stars there are. Let κ be a cardinal number. It seems that a legitimate abstraction principle should not entail that there are at most κ-many things (or exactly κ-many things), unless the prior theory already entails this for the objects recognized to be in its ontology.
Wright [1997, 230–239] provides a first approximation to a rigorous formulation of the revised conservativeness requirement. Suppose that A is an abstraction principle, and let Sx be a predicate “true of exactly the referents of the” newly introduced terms. In the case of Hume’s principle, Sx states that x is a cardinal number (i.e., ∃F(x = Ny:Fy)). If Φ is a sentence in the language of the base theory, then let Φ^¬S be the result of restricting the quantifiers 5 in Φ to ¬S. So in the case of Hume’s principle, Φ^¬S states that Φ holds of the non-numbers. Let T be any theory. The conservativeness requirement is that for any sentence Φ in the language of T, A + T entails Φ^¬S only if T entails Φ. In other words, if the combined theory entails something about the non-abstracts, then that must be a consequence of the base theory alone.
This formulation is not quite right, for two reasons. First, let Φ be the sentence “Tony Blair is more intelligent than George W. Bush”. Since Φ has no quantifiers, Φ^¬S is just Φ. Let U be the theory with the single axiom: “If the universe is Dedekind-infinite, then Tony Blair is more intelligent than George W. Bush”: {(INF → Φ)}. Then U plus Hume’s principle entails Φ. However, U itself does not entail Φ. So Hume’s principle does not meet the letter of the present articulation of the conservativeness requirement. What went wrong? The intuitive idea behind the requirement is that abstraction principles should have no consequences concerning the “old” objects, the items not explicitly yielded by that very abstraction principle, and not explicitly recognized to be in the range of the first-order variables of the base theory.
But as the base theory U is formulated, its quantifiers (i.e., the quantifiers in INF) are not explicitly restricted to non-abstracts. This suggests the following formulation of the requirement, as the next approximation: 6 for any sentence Φ in the language of T, T^¬S plus the abstraction principle entails Φ^¬S only if T entails Φ. This handles the above counterexample. Recall that the theory is “If the universe is Dedekind-infinite, then Tony Blair is more intelligent than George Bush”. So T^¬S (with U in the role of T) is “If the non-abstracts are Dedekind-infinite, then Tony Blair is more intelligent than George Bush”. Hume’s principle has no untoward consequences concerning that proposition.
5 That is, we restrict the first-order quantifiers to ¬S and restrict the second-order quantifiers to the properties and relations on ¬S.
6 This version of the conservativeness requirement is close to that formulated in Field [1980, 12], in a different context. Fine [1998, 626–627] suggests something in the neighborhood of this requirement, on behalf of neo-logicism. See also Shapiro and Weir [1999].
The conservativeness requirement needs a little more tweaking. Like Wright’s original formulation of conservativeness, this last, corrected formulation makes the most sense when T is a theory about concrete objects. Assuming that no abstract objects are concrete, we can be sure that the intended range of the quantifiers of the base theory T does not include any of the objects yielded by the abstraction principle A. So in this case, it is appropriate to restrict the quantifiers of the base theory to non-abstracts since, presumably, the base theory is solely about non-abstracts. But this is not the most general case. In the above treatment, for example, we introduce (DIF), (QUOT), and (CP) on theories that are already about abstract objects: the natural numbers, the integers, and the rational numbers respectively. In each case, the quantifiers of the base theories range over abstract objects. Recall that we leave it open whether some of the “introduced” abstracts are already in the range of the quantifiers of the base theory. For example, we leave it open whether the real number 2 is identical to or distinct from the rational number 2, the integer 2, and the natural number 2. Depending on how these instances of the Caesar problem are resolved, it may not be correct to restrict the quantifiers of the base theory to the items not “introduced” by the abstraction principle in question, for some of those items may already be in the range of the quantifiers of the base theory. Instead, when one adds an abstraction principle A to a base theory T, she should restrict the quantifiers of T to whatever range it had previously, explicitly leaving it open whether there is any overlap between that range and the abstracts yielded by A. Formally, let O be a monadic predicate that is not in the language of the abstraction principle A or the base theory T.
Intuitively, the extension of O is to be the intended range of the quantifiers of the base theory, that is, the class of objects its variables are supposed to range over. If Φ is a formula, then let Φ^O be the result of restricting the quantifiers in Φ to O. Our final formulation of conservativeness is this: for any sentence Φ in the language of T, T^O + A entails Φ^O only if T entails Φ. Since O is a new predicate, there are no formal constraints on its extension. So if T^O + A entails Φ^O then it does so no matter how the Caesar question is resolved. If the neo-logicist has a general solution to the Caesar question (see Hale and Wright [2001]), we could further tweak the requirement so that the extension of O is the exact ontology of the base theory T, according to the Caesar resolution. The stronger, general requirement will do for present purposes.
We are not finished articulating the conservativeness requirement. There is still an interesting and important issue concerning how logical consequence is to be understood in this context. There is, first, a deductive approach. Say that an abstraction principle A is deductively conservative over a base theory T if for any sentence Φ in the language of T, if Φ^O can be deduced from T^O + A, then Φ can be deduced from T alone.
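For reference, the final formulation and its deductive variant can be displayed schematically. The LaTeX below is only a sketch: the clause-by-clause definition of quantifier restriction, the notation \mathcal{L}(T) for the language of T, and the use of \vdash for derivability in a standard deductive system are assumptions made for illustration, not notation taken from the text.

```latex
% Assumed notation: \Phi^{O} is \Phi with every quantifier restricted to O,
% and T^{O} = \{\Psi^{O} : \Psi \in T\}.  Requires amsmath for \text.
\[
  (\forall x\,\Psi)^{O} = \forall x\,(Ox \rightarrow \Psi^{O}), \qquad
  (\exists x\,\Psi)^{O} = \exists x\,(Ox \wedge \Psi^{O}),
\]
\[
  (\forall X\,\Psi)^{O} = \forall X\,\bigl(\forall x\,(Xx \rightarrow Ox) \rightarrow \Psi^{O}\bigr).
\]
% Final formulation, with the consequence relation left open:
\[
  \text{for every sentence } \Phi \in \mathcal{L}(T):\qquad
  T^{O} + A \ \text{entails}\ \Phi^{O}
  \;\Longrightarrow\;
  T \ \text{entails}\ \Phi .
\]
% Deductive conservativeness reads ``entails'' as derivability:
\[
  T^{O} + A \vdash \Phi^{O} \;\Longrightarrow\; T \vdash \Phi .
\]
```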
Unfortunately, the relevant results are not forthcoming for this notion of conservativeness, on standard deductive systems for second-order logic (see Shapiro [1991, Chapter 3]): 7
Theorem 2: The cut principle (CP) is not deductively conservative over its own base theory (Hume’s principle together with (DIF), (QUOT), and the explicit definitions).
Proof sketch: Recall that Frege’s theorem is a derivation of the axioms of second-order Peano arithmetic from Hume’s principle (plus explicit definitions). Let G be a standard Gödel sentence for second-order PA, so that G is true of the natural numbers, but G cannot be derived from the axioms of second-order Peano arithmetic. Boolos’s [1987] argument shows that Hume’s principle (plus the definitions) is conservative over second-order Peano arithmetic. So G cannot be derived from Hume’s principle. The abstractions (DIF) and (QUOT) used to introduce the integers and the rational numbers are conservative over Hume’s principle, since one can define a model of those structures in the natural numbers (with a pair function). So G cannot be derived from the base theory of (CP). However, we saw above that (CP) entails the axioms of second-order real analysis. The latter is equivalent to third-order Peano arithmetic, and is not deductively conservative over second-order Peano arithmetic. In particular, from (CP) one can define a truth predicate for second-order Peano arithmetic, and prove the Gödel sentence G^O for that theory.
I submit that if neo-logicism is to have any chance of success, then deductive conservativeness is the wrong requirement. Whether the program in this paper or the one in Hale [2000] succeeds or not, at some point the neo-logicist is going to (try to) introduce the real numbers from abstraction principles, and derive the axioms of second-order real analysis from those. The resulting theory will thus fail to be deductively conservative over Hume’s principle.
In general, strong theories are not deductively conservative over weaker ones. If the neo-logicist wants to develop theories as strong as classical real analysis, she must eschew deductive conservativeness. It is common in mathematics to learn more about a mathematical structure by embedding it in a richer one. In the case at hand, reference to the “new” abstracts, the real numbers, allows us to define properties (or sets) of natural numbers that cannot be defined in the language of arithmetic. Application of the induction principle to these properties yields the new theorems.
I maintain, however, that there is something right about conservativeness. One option would be for the neo-logicist to formulate a more subtle deductive notion that restricts the instances of comprehension to be used in the derivations (more tweaking). For example, she might insist that all predicative consequences of the combined theory be provable from the original theory. Besides looking ad hoc, however, this Lakatosian monster-barring misses the point.
7 The same goes for the restricted version of Basic Law V that Hale [2000] employs to construct a complete quantitative domain. As far as I know, it is open whether Hume’s principle is deductively conservative over relatively weak, consistent theories.
The resolution is to invoke a notion of logical consequence according to which the new theorems of T^O + A are still entailed by the base theory T (alone). Since second-order logic is not complete (see Shapiro [1991, Chapter 4]), model-theoretic consequence does not match deductive consequence. Say that an abstraction principle A is model-theoretically conservative over a base theory T if for any sentence Φ in the language of T, if Φ^O is true in every model of T^O + A, then Φ is true in every model of T. Suppose that A is model-theoretically conservative but not deductively conservative over a base theory T. Then by adding the abstraction principle, we can derive new theorems in the language of T, but these new theorems are in fact logical (model-theoretic) consequences of T alone. One might argue that the abstraction principle A allows us to see that these new theorems are in fact logical consequences of T alone.
There is an interesting model-theoretic property that bears on this matter. Let P be an interpretation of the combined language of an abstraction A and base theory T, and let d be a subset of the domain of P that is closed under the functions of T. Define the d-restriction of P to be the interpretation whose domain is d in which the extensions of the non-logical terminology of T are the restrictions of their extensions in P. Let M be an interpretation of the base theory T. Say that P is an extension of M, written M ∝ P, if there is a subset d of the domain of P such that M is isomorphic to the d-restriction of P. In other words, P is an extension of M if M is isomorphic to a submodel of P. Define an abstraction principle A to be uniformly compatible with a base theory T if for each model M of T there is a model P of A such that M ∝ P.
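The relation M ∝ P can be made concrete on small finite structures. The following is a minimal sketch, assuming a language with a single binary relation and no function symbols (so every subset d is trivially closed under the functions of T); the function name `is_extension` and the toy structures M, P, and Q are hypothetical and chosen only for illustration. It searches by brute force for a subset d of the domain of P whose d-restriction is isomorphic to M.

```python
from itertools import permutations

def is_extension(m_dom, m_rel, p_dom, p_rel):
    """Return True if M = (m_dom, m_rel) is isomorphic to the d-restriction
    of P = (p_dom, p_rel) for some subset d of p_dom (finite domains only).

    Each structure is a finite domain plus one binary relation, given as a
    set of ordered pairs.
    """
    m_dom = list(m_dom)
    # Every injective map from m_dom into p_dom picks out a candidate d,
    # namely the image of the map.
    for image in permutations(list(p_dom), len(m_dom)):
        f = dict(zip(m_dom, image))
        # f is an isomorphism onto the d-restriction iff it preserves the
        # relation in both directions on every pair from m_dom.
        if all(((x, y) in m_rel) == ((f[x], f[y]) in p_rel)
               for x in m_dom for y in m_dom):
            return True
    return False

# Toy base model M: a strict linear order on two elements.
M = ({"a", "b"}, {("a", "b")})
# Toy P: a strict linear order on three elements; it extends M.
P = ({0, 1, 2}, {(0, 1), (0, 2), (1, 2)})
# Toy Q: three elements with an empty relation; it has no two-element chain.
Q = ({0, 1, 2}, set())

print(is_extension(*M, *P))  # True
print(is_extension(*M, *Q))  # False
```

Uniform compatibility itself quantifies over all models of T, so it is not decided by a finite search of this kind; the sketch only illustrates the single-model relation on which the definition is built.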
Uniform compatibility is a nice feature for a proposed abstraction principle to enjoy: if A is uniformly compatible with T then every model of T can be extended to a model of the abstraction A, possibly by adding elements to the domain of discourse. The new elements, of course, are the abstracts, or some of the abstracts (depending on how the Caesar issue is resolved). This seems to be the main idea behind the neo-logicist program. Uniform compatibility is sufficient for model-theoretic conservativeness:
Theorem 3: If A is uniformly compatible with T then A is model-theoretically conservative over T.
Proof: Suppose that a sentence Φ is false in some model M of T. Let P be a model of A such that M ∝ P. Then M is isomorphic to a submodel of P. Let the extension of O be the domain of this submodel. So P satisfies T^O + A. Since Φ is false in M, Φ^O is false in P.
The desired results are now forthcoming, sometimes for rather trivial and unilluminating reasons:
Theorem 4: Hume’s principle is uniformly compatible with (and so model-theoretically conservative over) any theory T.
Proof: Let M be a model of T. Suppose, first, that the domain of M is finite. Let the domain of P consist of the domain of M together with the natural numbers and the one additional set ℵ0. Interpret the non-logical terminology of T in P as it is in M. For each subset F of the domain of P, define Nx:Fx to be the cardinality of F. It is straightforward to verify that P makes Hume’s principle true under this interpretation, and that M ∝ P. Now suppose that the domain of M is infinite. Then by a well-known result noted in Boolos [1987], it is possible to interpret the Nx:Fx operator on the domain of M itself to make Hume’s principle true. 8 Let P be the resulting interpretation. So M ∝ P (even without adding new elements). Thus, Hume’s principle is uniformly compatible with T.
Theorem 5: (DIF) is uniformly compatible with (and so model-theoretically conservative over) second-order Peano arithmetic. (QUOT) is uniformly compatible with (and so model-theoretically conservative over) the second-order theory of the integers. (CP) is uniformly compatible with (and so model-theoretically conservative over) second-order rational analysis.
Proof: The second-order theories of Peano arithmetic, the integers, and the rational numbers are all categorical. Each theory has only one model, up to isomorphism. The foregoing treatment shows how to extend the standard model of each theory to a model of the respective abstraction principle.
Model-theoretic conservativeness is not as illuminating as it might look. Since second-order Peano arithmetic is categorical, it is semantically complete. For any sentence Φ in the language of Peano arithmetic, either Φ is a model-theoretic consequence of the axioms or ¬Φ is a model-theoretic consequence of the axioms. Thus, every arithmetic truth is already a model-theoretic consequence of the theory.
So the only way an abstraction principle could yield “new” arithmetic consequences would be for it to have no models that contain a model of second-order Peano arithmetic. That is, the only way A could fail to be model-theoretically conservative over second-order Peano arithmetic would be for A to not have any (Dedekind-)infinite models at all, in which case it is incompatible with arithmetic! In general, let T be a semantically complete base theory. Then the only way a proposed abstraction principle A could fail to be model-theoretically conservative over T would be for T^O + A to have no models at all. Similarly, let T be categorical. Then the only way a proposed abstraction A can fail to be uniformly compatible with T is for there to be no models of T^O + A, so that A is in fact logically incompatible with T. Thus, for semantically complete base theories, model-theoretic conservativeness is not a discerning requirement; and for categorical base theories, uniform compatibility is not discerning. It just comes to joint satisfiability.
8 This result uses the axiom of choice. In particular, we assume that the domain of M can be well-ordered. This use of choice is necessary, since there are models of Zermelo–Fraenkel set theory in which Hume’s principle cannot be satisfied on the real numbers. I do not know whether one can show that Hume’s principle is uniformly compatible with (or model-theoretically conservative over) every theory, without using the axiom of choice.
As they stand, model-theoretic conservativeness and uniform compatibility make direct reference to models of the various theories, and thus presuppose a fairly substantial set theory (given that the languages are higher-order). If the neo-logicist manages to reconstruct a sufficiently strong set theory, she can formulate the constraints internally and invoke them from that point onwards. Those constraints might then serve as an after-the-fact check on the principles used to develop the set theory, and the other mathematical theories that appeared along the way. For now, however, model-theoretic conservativeness and uniform compatibility are taken in the external perspective. We investigate them using whatever techniques are available. They serve as the mathematician’s explication of an intuitive constraint that the neo-logicist places on abstraction principles.
To sum things up so far, the deductive articulation of conservativeness is available internally to the neo-logicist, but is inappropriate, since the powerful target theories (real analysis in this case) are not deductively conservative over the relatively weak base theories (arithmetic). The model-theoretic articulation is external. So how should the neo-logicist herself understand the conservativeness requirement internally? One option would be for the neo-logicist to simply leave the notion of logical consequence at an intuitive, pre-theoretic level. To say that a conclusion Φ is entailed by a set Γ of premises is to say that it is not possible for the members of Γ to be true and Φ false, or that Φ is somehow implicit in the members of Γ.
To be sure, this “articulation” of conservativeness makes it harder for the neo-logicist to prove that a proposed abstraction principle is acceptable. But the neo-logicist is not without resources. For Φ to be entailed by Γ, it is necessary, but perhaps not sufficient, for Φ to be true in all set-theoretic models of Γ; and it is sufficient, but not necessary, that Φ be deducible from Γ. If a proposed abstraction is in fact model-theoretically conservative over the relevant base theory or theories, the neo-logicist might take it as a working hypothesis that the abstraction principle is conservative, in the appropriate intuitive sense. She might adopt an attitude that the principle is acceptable until reason is shown otherwise, shifting the burden of proof to someone who wishes to challenge the principle. The remaining onus on the neo-logicist would be to deal with any violations of deductive conservativeness, perhaps on a case-by-case basis. Suppose, for example, that the neo-logicist assumes that the cut principle (CP) is conservative over Peano arithmetic, in the relevant intuitive sense of “conservative”. It follows that the standard Gödel sentence G for second-order Peano arithmetic is a consequence of the base theory, second-order Peano arithmetic. The neo-logicist would have to defend this conclusion,
and argue that the Gödel sentence is in fact implicit in the original theory, despite not being deducible from it.
5. The acceptability of abstractions II: inflation
Boolos [1997, 249–50] reiterates a view, widely held since at least Kant (Critique of Pure Reason, B622–623), that nothing in the neighborhood of an analytic truth has ontological consequences:
. . . it was a central tenet of logical positivism that the truths of mathematics were analytic. Positivism was dead by 1960 and the more traditional view, that analytic truths cannot entail the existence either of particular objects or too many objects, has held sway ever since.
The widely held view in question is that one cannot learn about what objects exist from meaning or conceptual analysis alone. Any logicist-style account that accepts the existence of mathematical objects must reject, or at least attenuate, this view. The neo-logicist claims that the existence of mathematical objects follows from abstraction principles (and logical truths). She might back off from the thesis that acceptable abstraction principles are analytic, or true solely in virtue of the meanings of the terms (as, for example, Wright [1997] does), but she maintains that acceptable abstractions have a privileged epistemic status, something at least akin to implicit definitions. The opposing view is that we cannot know about the existence of objects in such an epistemically inexpensive manner. Boolos’s phrase “either of particular objects or too many objects” suggests a compromise. Perhaps an acceptable abstraction principle can entail the existence of some objects, but not “too many” of them. We can then focus on the question of how many is too many.
Roy Cook [2001] is a detailed study of the “inflationary” aspects of abstraction principles like the cut principle (CP). Much of this section is a response to that paper. The discussion here (and in Cook [2001]) is, for the most part, external to the neo-logicist framework. We invoke a substantial set theory in order to compare the sizes of various models of various principles. Sometimes the unacceptability of a given abstraction principle is due to its inconsistency, which, of course, is internal. But not always.
Consider, first, Basic Law V. Of course, this abstraction is inconsistent and so it has no models at all. Suppose that an intended model of our base theory has a domain of size κ. Then there are 2^κ-many extensions composed of those items. So Basic Law V implies the existence of more abstracts than objects in the base theory. This “inflation” does not stop there.
Since the quantifiers of Basic Law V are unrestricted, it implies the existence of extensions of properties of the extensions of objects in the original domain. There are 2^(2^κ) of those. And Basic Law V entails the existence of extensions composed of
those extensions; and on it goes. The problem, of course, is that this inflation does not stop.
Now consider Hume’s principle. Suppose that the intended interpretation of the base theory is finite, of size n. Then Hume’s principle entails the existence of n + 1 cardinal numbers: zero, and one for each non-empty size of objects from the base theory (see Boolos [1987]). So there is some mild inflation. Since the quantifiers in Hume’s principle are unrestricted, it implies the existence of numbers of properties of those numbers. There are n + 2 such cardinal numbers; and so on. But, in a sense, this inflation does end. As above, the result of adding the natural numbers and ℵ0 to the domain of the original model is a structure that satisfies Hume’s principle. There is no more inflation, at least not on that model. The addition of Hume’s principle to any denumerably infinite domain does not inflate. In that context, Hume’s principle only entails the existence of countably many cardinal numbers, the same size as the domain we start with. Suppose that the intended domain of the base theory has cardinality ℵα. Then the addition of Hume’s principle yields ℵ0 + |α| ≤ ℵα cardinal numbers. So Hume’s principle does not inflate on this domain either. So under the axiom of choice, Hume’s principle does not inflate on any (Dedekind-)infinite set.
The following is a modification of Cook’s useful framework for treating the inflationary aspects of abstraction principles. Let A be an abstraction principle and, as above, let O be a monadic predicate that does not occur in A (or in the base theory, if there is one). Let A^O be the result of restricting all of the quantifiers in A to O. For example, if B is Basic Law V, then B^O says that all properties of objects that have O have extensions. It does not entail that the properties of those extensions have extensions. In fact, B^O is satisfiable. Let d be a set. Define a d-model of A to be a model of A^O in which the extension of O is d.
So A^O helps measure the abstracts yielded by A on d. Let κ be a cardinal number. If |d| = κ, then every d-model of Basic Law V has at least 2^κ members, and every d-model of Hume’s principle has at least κ + 1 members. Say that an abstraction principle A is κ-inflationary if for every set d of size κ, the cardinality of the domain of every d-model of A is greater than κ. In other words, A is κ-inflationary if, starting with a domain of size κ, A yields more than κ-many abstracts on that domain (if it is satisfiable on that domain at all). 9 If κ is finite, then Hume’s principle is κ-inflationary, and if κ is infinite and well-ordered then Hume’s principle is not κ-inflationary. It is independent of Zermelo–Fraenkel set theory whether Hume’s principle is inflationary on the continuum (see note 8).
Define an abstraction principle A to be strictly non-inflationary if there is no κ such that A is κ-inflationary. So if A is strictly non-inflationary, then for any domain d, A does not yield more than |d| abstracts. This, I presume, is best. Say that an abstraction A is boundedly inflationary if there is some cardinal λ such that for all κ > λ, A is not κ-inflationary. 10 This is second-best. If A is boundedly inflationary, then if the starting domain is sufficiently large, then A does not inflate on it. Define A to be unboundedly inflationary if it is not boundedly inflationary, and say that A is universally inflationary if for every κ, A is κ-inflationary. This is worst. Basic Law V, of course, is universally inflationary (worst). If we assume the axiom of choice, then Hume’s principle is boundedly inflationary (second-best). Shapiro and Weir [1999] show that Boolos’s [1989] New V (called “VE” in Wright [1997]) is unboundedly inflationary: it inflates on all singular cardinals.
9 Cook restricts κ to infinite cardinals. Since I do not follow that here, my definitions are slightly different from his.
Recall that since the quantifiers of many of the abstraction principles under consideration are unrestricted, their range includes the abstracts yielded by that very principle. In such cases, at least, we are more interested in A itself than in the restricted A^O. Notice that A is not κ-inflationary if and only if there is a model of A^O in which the extension of O and the domain of the model both have cardinality κ. If κ is finite, then the extension of O would have to be the whole domain, in which case the model of A^O is also a model of A. Assuming choice, we can establish something similar in general: A is not κ-inflationary if and only if A itself (and not just A^O) has a model of size κ. So if A is not κ-inflationary, then it is consistent with A for the universe to be of size κ exactly. If A is strictly non-inflationary then for every cardinal κ, A has a model of size κ. If A is boundedly inflationary, then there is some cardinal λ such that for all κ > λ, A has a model of size κ. So if A is unboundedly inflationary then for every cardinal λ there is a κ > λ such that A has no model of size κ.
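For finite κ the counts cited above (at least 2^κ abstracts for Basic Law V, κ + 1 for Hume’s principle) can be checked by brute force. The sketch below is illustrative only: it treats subsets of a finite set d as stand-ins for the properties of objects in d, and counts equivalence classes under the right-hand side of each principle. The names `count_abstracts`, `coextensive`, and `equinumerous` are hypothetical, and nothing here bears on infinite domains.

```python
from itertools import combinations

def count_abstracts(d, equivalent):
    """Count the equivalence classes of subsets of d under `equivalent`.

    Subsets of d stand in for the properties of objects in d, so the count
    is the number of distinct abstracts that an abstraction principle
    restricted to d must supply.
    """
    elems = list(d)
    subsets = [frozenset(c) for r in range(len(elems) + 1)
               for c in combinations(elems, r)]
    representatives = []
    for s in subsets:
        # Keep s only if it is not equivalent to a class already found.
        if not any(equivalent(s, t) for t in representatives):
            representatives.append(s)
    return len(representatives)

def coextensive(f, g):
    # Right-hand side of Basic Law V: the same objects fall under F and G.
    return f == g

def equinumerous(f, g):
    # Right-hand side of Hume's principle; for finite sets, equal size.
    return len(f) == len(g)

for n in range(5):
    d = range(n)
    blv = count_abstracts(d, coextensive)   # equals 2 ** n
    hp = count_abstracts(d, equinumerous)   # equals n + 1
    # Both counts exceed n, so every d-model has more than n members.
    print(n, blv, hp, blv > n and hp > n)
```

Since the abstracts are themselves members of any d-model, a count above n already shows that both principles are n-inflationary for these finite n.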
Say that A is unboundedly satisfiable if, for every cardinal λ, there is a κ > λ such that A has a model of size κ. Notice that if A is unboundedly satisfiable, then (assuming choice) we can turn any set into a model of A by adding more elements: for every set d, there is a model of A whose domain contains d. In the best cases, the “new” elements will be the new abstracts. Shapiro and Weir [1999] show that if the generalized continuum hypothesis is true, then New V is satisfiable at all regular cardinals, and so it is unboundedly satisfiable. However, it is independent of Zermelo–Fraenkel set theory (with choice) whether New V is actually unboundedly satisfiable. It might not have any uncountable models at all.
So, again, how much inflation is too much? Cook [2001] argues that only strictly non-inflationary and boundedly inflationary abstraction principles should be acceptable to a neo-logicist. Let us examine the arguments, since they go to the heart of the goals of neo-logicism.
10 The usefulness of this notion turns on the axiom of choice, since in that case, for any distinct cardinals κ, κ′, either κ < κ′ or κ′ < κ.
Concerning unbounded inflation, Cook writes: The neo-logicist is claiming that the abstraction principles implicitly define, or at least ground our use of mathematical concepts and theories. Definitions of the abstract objects of mathematics, even implicit ones, ought to determine a unique group of objects which necessarily fall under the definition. If this ‘defining’ abstraction principle is unboundedly inflationary, however, then the neo-logicist has failed in his task.
Assume, for example, that an abstraction principle A is unboundedly inflationary, and suppose that M is a model of both A and the background theory T. Let κ be the cardinality of the domain of M, and let γ be the smallest cardinal greater than κ such that A is γ-inflationary. Cook continues: . . . had there been γ objects in the universe, there would have, by [A], been more than γ (and thus more than κ) abstracts. But then the original abstracts are not all of the objects whose identity conditions are given by [A]. This process can be repeated indefinitely (and transfinitely), so that we never have all the objects that fall under the purview of [A]. In other words, if [A] is unboundedly inflationary then it fails to secure a definite collection of objects as the domain of its abstraction operator, but instead gives us different abstracts relative to how many objects exist.
In a note, Cook adds that “an adequate definition should determine a unique extension independently of the existence of any other objects”. Some abstraction principles do characterize a unique domain of objects, at least up to isomorphism. The present cut abstraction principle (CP), for example, yields all and only the real numbers, plus two extra abstracts. Another example would be the restriction of Hume’s principle to finite concepts (see Heck [1997]). This yields all and only (an isomorphic copy of) the natural numbers. Cook is correct that if an abstraction principle A is intended to characterize a unique structure (such as the natural numbers or the real numbers), then it should not be unboundedly inflationary. In this case, A should yield the required objects and no others. It should not inflate on any domain that is as large as or larger than the requisite structure. However, it is not true that every legitimate abstraction principle determines “a definite collection of objects” as the range of the defined operator. Some principles do yield “different abstracts relative to how many objects exist”. Consider Hume’s principle. Thanks to Frege’s theorem, it implies the existence of the natural numbers and the cardinality of the natural numbers (i.e., ℵ₀). But what of other cardinalities? Since Hume’s principle has countable models, it does not, by itself, entail that the cardinality of the continuum exists. But Hume’s principle does entail that if there is a property that holds of continuum-many objects, then the cardinality of the continuum exists. So, for example, Hume’s principle and (CP) together entail that the cardinality of the continuum exists. In general, which cardinal numbers exist depends on how many objects
there are. I, at least, do not see this as a problem with Hume’s principle as an abstraction. Some acceptable abstractions are open-ended, in the sense that the abstracts they yield depend on the ontology of the background theory. Increasing the ontology might increase the abstracts.
A different problem with unboundedly inflationary abstraction principles is that they may conflict with each other. Weir [2000] formulates a pair of “distraction” principles B, B′, such that B and B′ are each unboundedly satisfiable, but are mutually inconsistent. Suppose that the background theory has a model of size κ₀. To extend this to satisfy B, we add κ₁ > κ₀ abstracts. But this new model does not satisfy B′. To satisfy B′, we add κ₂ > κ₁ abstracts. But once we add these abstracts to satisfy B′, we no longer satisfy B. To (re-)satisfy B, we have to add κ₃ > κ₂ more abstracts. But then we no longer satisfy B′. In short, the conjunction B&B′ is universally inflationary. Faced with such a pair of abstractions, the neo-logicist must find a principled way to choose among them. Or else she can play it safe and reject any unboundedly inflationary abstraction principle, and require that all acceptable abstractions be boundedly inflationary. Then, once we are satisfied that the universe is sufficiently large, the abstraction will be satisfied no matter how much larger we go on to recognize the universe to be.
Let us turn to Cook’s treatment of universally inflationary principles—those that inflate on every cardinality. Of course, if an abstraction A is inconsistent, then it is unacceptable. Suppose that A is consistent, but universally inflationary. Let b be a set and κ = |b|. Since A is κ-inflationary, A cannot be satisfied on b. Since b is arbitrary, A has no models whose domain is a set (with a cardinality). As Cook puts it, A “will be satisfied (if satisfied at all) only by a structure that is at least the size of proper class”.
This, he says, is problematic, since proper classes are “extremely badly behaved”. The idea is that if A can be satisfied only on a proper class, then it yields a proper class of abstracts (so to speak). Thus, the abstraction takes “us far from the epistemically innocent implicit definitions that the neo-logicists argue acceptable abstraction ought to provide”. In sum, Cook’s claim is that the “generation” of a proper class of abstracts is incompatible with the epistemic goals of neo-logicism. Notice that this judgement comes from the external perspective. Internally, the neo-logicist claims that we can come to know about the existence of some objects through deduction from principles in the neighborhood of implicit definitions or analytic truths. Externally, we use the set-theoretic meta-theory, accepted already by the established mathematician, to show that a certain abstraction principle yields more objects than there are members of any element of the set-theoretic hierarchy. Cook seems to hold that a proper class of abstracts is indeed “too many” objects to obtain this way. As noted above, for neo-logicism to have a chance, we have to temper the widely-held view that definitions, or principles much like definitions, have no ontological consequences. Cook’s claim, in effect, is that enough is enough. He presupposes that from the external
perspective of an advocate of Zermelo–Fraenkel set theory, the abstracts must constitute (or be equinumerous with) a set. But the fact is that the objects of mathematics do not constitute a set, for well-known reasons. So Cook’s thesis entails that neo-logicism must fall short of its grand goal of providing an epistemic foundation for all of mathematics. A neo-logicist set theory and a neo-logicist theory of ordinals and cardinals are out of the question. Thus, the neo-logicist must rest content with an account of arithmetic, real and complex analysis, and perhaps a little more. The main (external) question that remains is just how big the neo-logicist’s ontology can be. What is the cardinality of the objects yielded by all acceptable abstractions together? Presumably, it will be ℵ_α, for some ordinal α. If the neo-logicist wants to avoid demanding revisions to established mathematics, she must provide some other epistemic foundation for those branches of mathematics—such as set theory, ordinal theory, and cardinal theory—whose ontology is not a set.
For what it is worth, I believe that Cook’s view begs the question against the neo-logicist quest. So far as I know, no argument has been given that the objects yielded by an abstraction principle must constitute a definite, set-sized totality. The neo-logicist thesis is that an acceptable abstraction is akin to an implicit definition, providing an epistemic foundation of the theory of the objects it yields. There is no requirement that the objects be limited in any way, or that they constitute a definite totality. Perhaps further discussion of this should await either specific arguments concerning the limits of abstraction principles or the presentation of particular candidate principles that do yield a proper class of abstracts. We briefly revisit the issue at the end of the next section.
6.
The acceptability of cut abstraction
I now turn to the inflation and satisfiability of the abstraction principles presented here: (DIF), (QUOT), and (CP). Unlike Basic Law V and Hume’s principle, the quantifiers in all three of these principles are restricted. Since the right hand side of the difference principle (DIF) explicitly invokes addition on the natural numbers, (DIF) entails the existence of a difference-abstract for each pair of natural numbers, but that is all. It says nothing about “differences” of (pairs of) other objects, and in particular, it does not yield difference-abstracts for (pairs of) difference-abstracts. Similarly, the quotient principle (QUOT) yields a ratio for each pair of integers, but nothing else. And (CP) yields a cut for each property of rational numbers, but nothing else. Since there are only countably many integers, the difference principle is satisfiable on any domain that contains the natural numbers (using standard coding techniques if necessary). In a sense, (DIF) is universally satisfiable in that it is satisfiable on any domain over which it is defined. And so it does not inflate on any such domain. Since there are only countably many
rational numbers, the quotient principle (QUOT) is satisfiable on any domain that contains the integers, i.e., on any domain on which it is defined. Since there are continuum-many distinct cuts, (CP) inflates on the rational numbers, but that is the end of its inflation. So (CP) is boundedly inflationary, in that it is satisfiable on any domain that is at least the size of the continuum and contains the rational numbers. Concerning inflation and satisfiability, the neo-logicist cannot do any better than this. If she hopes to recapture real analysis, she will need principles that yield continuum-many abstracts. The cut principle does that, and no more.
Perhaps we should not be sanguine. The main reason why (CP) does not inflate beyond the real numbers is that it only defines cuts for properties of rational numbers. But we can mimic the development of (CP) for any linear order 11 “<” defined on a set or class h. Let P be a property of items in h, and suppose that r ∈ h. Say that r is an upper bound of P, written P ≤ r, if for any s ∈ h, if Ps then either s < r or s = r. In other words, P ≤ r if r is greater than or equal to any object that P applies to (under the given linear order). Consider the following abstraction principle:
(h,<)-(CP) ∀P∀Q(C(P) = C(Q) ≡ ∀r(P ≤ r ≡ Q ≤ r)).
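The identity condition in this principle can be run mechanically on a small finite linear order. What follows is a minimal sketch under stated assumptions, not the author’s construction: properties are modelled as subsets of a finite h, a cut is represented by the set of upper bounds it determines, and the names `upper_bounds` and `cuts` are hypothetical. A finite order has a least and a greatest element, so it cannot exhibit the inflation that arises on orders such as the rational numbers.

```python
from itertools import combinations

def subsets(h):
    """Every subset of a finite collection h, as a frozenset (a stand-in for a property)."""
    items = list(h)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def upper_bounds(p, h, less):
    """All r in h with P <= r: every s that P applies to satisfies s < r or s == r."""
    return frozenset(r for r in h if all(less(s, r) or s == r for s in p))

def cuts(h, less, only_nonempty_bounded=False):
    """Distinct cuts over h: two properties get the same cut iff they share
    all of their upper bounds, so each cut is represented by that set."""
    result = set()
    for p in subsets(h):
        ub = upper_bounds(p, h, less)
        if only_nonempty_bounded and (not p or not ub):
            continue  # skip the empty property and properties with no upper bound
        result.add(ub)
    return result

h = range(4)

def less(a, b):
    """The natural order on the sample domain."""
    return a < b

print(len(cuts(h, less)))
print(len(cuts(h, less, only_nonempty_bounded=True)))
```

On this four-element order both calls report four cuts, one per member of h. The empty property shares its upper bounds with the property that holds only of the least element, so no extra abstract appears in the finite case.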
In strict analogy with (CP), the cut of P is identical to the cut of Q if and only if P and Q share all of their upper bounds (under <). The abstraction (h,<)-(CP) might inflate in the sense that it may yield more cuts than members of h. If the cardinality of h is κ, then there can be as many as 2^κ cuts. But in light of the above, this is the extent of the inflation for this one principle.
There is, of course, a linear order on the real numbers which extends the linear order on the rational numbers (under the natural embedding of the rational numbers into the real numbers). The version of the cut principle formulated on the natural linear order for the real numbers does not inflate. It is a consequence of the completeness property of the real numbers that the “cuts” of non-empty, bounded properties of real numbers are isomorphic to the real numbers themselves. A pleasing, well-known result, and more good news on the inflation front. Something similar holds for (h,<)-(CP) in general. Let h′ be the collection of cuts of non-empty, bounded properties. There is a natural embedding of h into h′, and an extension of the linear order “<” on h to a linear order “<′” on h′. But there is no new inflation on that linear order. The cuts yielded by (h′,<′)-(CP) are isomorphic to those yielded by (h,<)-(CP). So each of the various cut abstraction principles is at least relatively innocuous. Some of the cut-principles do inflate, but in each case, the inflation is contained.
The problem, if there is one, is that the totality of cut abstraction principles together might generate too much inflation.
11 Actually, the construction can be carried out on any binary relation, but I will stick to linear orders for convenience.
It seems ad hoc to claim
that the original (CP) is the only legitimate abstraction principle in this form. If (CP) is acceptable, then at least some of the others are. Perhaps they all are. Cook formulates a generalization of the cut-abstraction principle used in Hale [2000] (a restricted version of Basic Law V). In the present context, the analogous principle is a single, second-order sentence asserting the existence of the cuts of every linear order:
(GCP) ∀H∀R[if R is a linear order on H then ∀P∀Q[∀x(Px → Hx) & ∀x(Qx → Hx) → (C(P, H, R) = C(Q, H, R) ≡ ∀r(∀x(Px → (x = r ∨ Rxr)) ≡ ∀x(Qx → (x = r ∨ Rxr))))]].
That is, for any property H and any relation R, if R is a linear order on the objects that have H, then if P and Q are sub-properties of H, then the cut of P is identical to the cut of Q (relative to H and R) if and only if P and Q have the same upper bounds under R. Again, the neo-logicist who accepts (CP) might be committed to the existence of the cuts on any linear order. The sentence (GCP) is a formulation of that commitment. The alternative, of course, is to articulate a principled distinction between the linear orders that have cuts and those that do not. Cook establishes an interesting result that bears on the inflation of cut principles:
Theorem 6 (Cook [2001]): Assume the axiom of choice in the meta-theory. Let κ be an infinite cardinal number. There is a set h such that |h| ≤ κ, and a linear order “<” on h such that (h,<)-(CP) yields more than κ cuts.
Proof: Let λ be the least cardinal such that 2^λ > κ. Of course, λ ≤ κ. Let h be the set of subsets of λ (as an ordinal) that are smaller than λ. So h = {b ⊆ λ: |b| < λ}. A straightforward calculation shows that |h| ≤ κ (or see Cook [2001]). Define the linear order as follows: a < b ≡ ∃α(α ∈ b & α ∉ a & ∀β < α(β ∈ a ≡ β ∈ b)). In other words, a < b if the first ordinal on which they differ is in b.
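The order defined in the proof can be checked on a finite stand-in, with natural numbers in place of ordinals. This is only a sketch under that assumption: the theorem itself concerns an infinite λ, and the helper names (`less`, `small_subsets`, `is_linear_order`) are hypothetical. The sketch confirms that the relation is a linear order on a small family of sets and shows the resulting arrangement.

```python
from functools import cmp_to_key
from itertools import combinations

def less(a, b):
    """a < b iff the first number on which a and b differ is in b."""
    differing = a ^ b  # symmetric difference: the points where a and b disagree
    return bool(differing) and min(differing) in b

def small_subsets(lam, bound):
    """Subsets of {0, ..., lam - 1} with fewer than `bound` members."""
    return [frozenset(c) for r in range(bound) for c in combinations(range(lam), r)]

def is_linear_order(h):
    """Check trichotomy and transitivity of `less` on the finite family h."""
    trichotomy = all(less(a, b) + less(b, a) + (a == b) == 1 for a in h for b in h)
    transitive = all(less(a, c) for a in h for b in h for c in h
                     if less(a, b) and less(b, c))
    return trichotomy and transitive

def compare(a, b):
    """Three-way comparison built from `less`, for use with sorted()."""
    return -1 if less(a, b) else (1 if less(b, a) else 0)

h = small_subsets(3, 4)  # here: all subsets of {0, 1, 2}
ordered = [sorted(s) for s in sorted(h, key=cmp_to_key(compare))]
print(is_linear_order(h))
print(ordered)
```

In the finite case the order coincides with the lexicographic order on characteristic sequences, with membership of 0 as the most significant position.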
The cuts yielded by (h,<)-(CP) are isomorphic to the subsets of λ: if x ⊆ λ, then x corresponds to the cut of the property of being an initial segment of x. So there are 2^λ > κ such cuts.
The original cut principle (CP) yields continuum-many real numbers. By Theorem 6, there is a linear order on a subset of those real numbers, and the cut principle on that linear order yields more than continuum-many abstract objects. There is a linear order on a subset of those objects that yields even more abstract objects. And on it goes. The process can be carried into the transfinite: there is a linear order on (a subset of) the union of the objects yielded by the first, second, third, . . . , of these abstractions, such that the cut principle on that big linear order yields more abstract objects than the totality of objects yielded by the previous principles. Is the neo-logicist committed to
the thesis that we can know that all of these objects exist via principles that are a priori knowable, akin to implicit definitions? It might be objected that one cannot exactly define these linear orders internal to the neo-logicist program, unless the background theory T includes a pretty substantial set theory. That is, the neo-logicist cannot define the various linear orders (h,<) unless she has produced (internally) a set theory sufficient to manipulate sets of ordinals. If the neo-logicist manages that, then any worries about inflation should be focused on the set theory. The powerset axiom produces at least as much inflation as the various cut principles, possibly more (depending on the generalized continuum hypothesis). So perhaps the neo-logicist is not committed to the generalized cut principle (GCP). At most, she is only committed to the acceptability of those cut principles (h,<)-(CP) in which the linear order is definable internally. 12 Neo-logicism is, after all, an epistemic program. The goal is to show how mathematical principles can become known with minimal epistemic presuppositions. A mathematical domain is brought into the fold if its axioms (or characterizing properties) can be derived from abstraction principles which are akin to implicit definitions, all but analytic. To accomplish that, the abstraction principles must be explicitly formulated in an acceptable language. In the present context, this entails that the linear order must be definable. Nevertheless, if the original cut principle (CP) is acceptable, then one would think that for the neo-logicist, any other cut principle (h,<)-(CP) is acceptable at least in a relative sense. The idea is that if the objects in the domain h exist and are ordered as indicated, then the indicated cuts exist. It does not matter if the objects in h are themselves grasped through an abstraction principle. 
The neo-logicist is a realist in ontology, holding that mathematical objects exist independently of the mathematician. If mathematical objects are not of our making, then why think that the universe is constrained by the limited expressive resources of human languages? Recall that, at present, we are in the external perspective, seeing how various abstraction principles mesh with accepted mathematics. So it is fair to see what happens if principles in the form (h,<)-(CP) are added to various mathematical domains.
What of (GCP) itself? I do not know if (GCP) is consistent, but even if it is, it inflates a lot. It follows from Theorem 6 above that (GCP) cannot be satisfied on any set. It inflates universally. I do not know if (GCP) can be satisfied on a proper class, or on the set-theoretic hierarchy itself. There may be a problem for linear orders defined on proper classes. Let be the class of all sets of ordinals, and consider the corresponding variant of the linear order from Theorem 6, defined on : a < b ≡ ∃α(α ∈ b & α ∉ a & ∀β < α(β ∈ a ≡ β ∈ b)).
12 The neo-logicist herself cannot formulate this restriction internally, unless she has a coherent formulation of definability.
Again, a < b if the first ordinal on which they differ is in b. Call the principle (,<)-(CP). Notice that in the context of second-order Zermelo–Fraenkel set theory, the generalized cut principle (GCP) entails (,<)-(CP). I do not know whether (,<)-(CP) is consistent, but it does go beyond Zermelo–Fraenkel set theory. There are as many cuts yielded by (,<)-(CP) as properties (or classes) of ordinals: if P is a property of ordinals, then let P′ be the property (of sets) of being an initial segment of P (under membership). The cut of P′ under (,<)-(CP) corresponds to P. So there are more cuts than ordinals. The above internal proof that the real numbers are uncountable can be extended to (,<)-(CP). That is, we can show, internal to the neo-logicist framework, that the cuts yielded from (,<)-(CP) cannot be mapped one-to-one into the ordinals. Externally, if Zermelo–Fraenkel set theory is the meta-theory, then (,<)-(CP) entails that there are more sets than ordinals. A fortiori, the cuts cannot be well-ordered. This contradicts global choice. 13 This might be troublesome since many theorists, including Hilbert and Zermelo, hold that global choice is a logical truth (see Shapiro [1991, 106–108]). On the other hand, the fact that (,<)-(CP) goes beyond established set theory may not be problematic in itself (provided that it is consistent). Neo-logicism might have contributions to make to set theory itself.
There is a recognized problem with applying unrestricted abstractions to what are, in effect, proper classes. Boolos [1997] once pointed out that Hume’s principle entails that the property of being self-identical has a cardinal number. This would be the number of all objects whatsoever. Similarly, Hume’s principle entails that there is a number of all cardinal numbers, and in the context of a background set theory, Hume’s principle entails that there is a number of all sets and a number of all ordinals.
Boolos notes that prima facie, this presents a conflict with ordinary Zermelo–Fraenkel set theory: [I]s there such a number as [the number of all objects whatsoever?] According to [ZF] there is no cardinal number that is the number of all the sets there are. The worry is that the theory of number [based on Hume’s principle] is incompatible with Zermelo–Fraenkel set theory plus standard definitions. (Boolos [1997, 260])
Wright [1999] accepts the force of this objection, and grants “the plausible principle . . . that there is a determinate number of F’s just provided that the F’s compose a set”. Since “Zermelo–Fraenkel set theory implies that there is no set of all sets . . . it would follow that there is no number of sets”. Wright’s proposed response is to restrict the second-order variables in Hume’s principle, so that some properties do not have numbers—those which are what Dummett calls “indefinitely extensible”.
13 The abstraction principle New V entails that the universe is well-ordered (see Shapiro and Weir [1999]). So, as Cook notes, New V is inconsistent with (,<)-(CP).
According to Dummett, an “indefinitely extensible concept is one such that, if we can form a definite
conception of a totality all of whose members fall under the concept, we can, by reference to that totality, characterize a larger totality all of whose members fall under it” (Dummett [1993, 441], emphasis mine). Ordinal numbers and cardinal numbers are paradigm cases of indefinitely extensible notions. Wright writes: I do not know how best to sharpen [the notion of indefinite extensibility] . . . But Dummett could . . . be emphasizing an important insight concerning certain very large totalities—ordinal number, cardinal number, set, and indeed “absolutely everything”. If there is anything at all in the notion of an indefinitely extensible totality . . . one principled restriction on Hume’s Principle will surely be that [cardinal numbers] not be associated with such totalities. (Wright [1999, 13–14])
Thus, Wright suggests that the second-order variables in Hume’s principle be restricted to definite properties—those not indefinitely extensible. Hale [2000] follows suit. If this is sound, it suggests a general thesis that the second-order variables in acceptable abstractions be restricted to definite properties. This would rule out (,<)-(CP), since the sets of ordinals constitute an indefinitely extensible totality, if anything does. Moreover, let (GCP-) be the result of restricting the initial second-order variables of (GCP) to definite properties (i.e., to sets). The resulting principle is still universally inflationary, thanks to Theorem 6 above: given any set b with |b| = κ, there is a linear order on a subset of b which has more than κ cuts. As Cook notes, Theorem 6 shows that the notion of being a cut of a linear order is itself indefinitely extensible. Nevertheless, it is a theorem of Zermelo–Fraenkel set theory that if “<” is a linear order on a set h, then there is a function defined on the powerset of h that satisfies the consequent of (GCP). So the restricted (GCP-) does not go beyond Zermelo–Fraenkel set theory (with choice). It is satisfiable on the iterative hierarchy.
The problem now is to formulate the restriction to abstraction principles like Hume’s principle and (GCP) more rigorously. What is it for a property to be indefinitely extensible? Wright [1999] conceded that he does not have a more rigorous, internal articulation of the notion of indefinite extensibility. The details of the proposal go beyond the scope of this already lengthy section, and I do not have much to add in any case. I agree with Cook that the proposal to restrict abstractions to definite properties is “[p]erhaps the most promising . . . restriction on the applicability of cut abstraction”. The present state is programmatic. Articulating the Dummettian notion of indefinite extensibility is a central item on the agenda of neo-logicism. 14
14 See Clark [2000] and Shapiro [2003]. The basic theme of the latter is to restrict Basic Law V similarly, and resurrect set theory along neo-logicist lines. That paper contains a more detailed discussion of the notion of indefinite extensibility.
7.
Brief philosophical epilogue
Richard Heck [1997] makes an important distinction between interpreting a theory (like arithmetic or analysis) in an analytically true theory (or a theory based on abstraction principles), and showing that the theory itself can be derived from abstraction principles. Frege himself surely knew that Euclidean geometry could be interpreted in real analysis, and yet he did not hold that Euclidean geometry was analytic. Heck’s thesis is that Frege’s theorem does not, by itself, provide an epistemic foundation for arithmetic. We need to make sure that the relevant abstracts are indeed the natural numbers that we all know and love. What can be said about the present case? Can one claim that the cuts on bounded, instantiated properties of rational numbers are the real numbers that we all know and love? I do not know how to even begin a definitive resolution of this issue, but one or two points can be made. It seems to me that continuity is essential to the real numbers. So their neo-logicist characterization should have continuity built in. And both the present account and Bob Hale’s [2000] rival account do so. This is probably the most important place where we are indebted to Dedekind, who showed us what continuity is, and used only logical resources in the process.
Another sticky philosophical matter, related to Heck’s requirement, is Frege’s insistence that an account of the applications of a mathematical structure must be built into its characterization. Hume’s principle, for example, recapitulates an important application of the natural numbers—to measure cardinalities of sortal properties. The contrast is with Dedekind’s [1888] account of the natural numbers, in his second great foundational work. Dedekind provides a direct description of the natural number structure, and after that provides an account of the application of this structure—to both cardinal numbers and ordinal numbers. Frege would complain that this account of application comes too late.
It should be built into the very constitution of the natural numbers. Frege’s constraint is completely ignored in the present account. By now, we know exactly which structure we were looking for, and the present account zeros in on this very structure. To be sure, it would be easy to tack on an account of applications—the measurement of quantities—to this structure. We know how to do this for any complete ordered field. But for Frege, this account of application comes too late. It should be built into the characterization. Bob Hale’s [2000] rival account of the real numbers is more in line with Frege’s constraint, since he develops the real number structure from one of its applications, the measurement of ratios of complete quantitative domains which have no negative or zero quantities, and which are not “cyclical” (like angles). As a structuralist (Shapiro [1997]), I do not know what to make of Frege’s constraint. Hale’s account and the present one deliver the same structure
(eventually). With Dedekind, I’d say they are both accounts of the real number structure. If we take Frege’s constraint concerning applications seriously, however, then at least one of us—me presumably—has delivered an isomorphic imposter, perhaps in the same sense that R³ is an isomorphic imposter to Euclidean space. This metaphysical issue is pursued in Crispin Wright’s contribution to this issue.
Acknowledgments: I would like to thank the audiences at the Arché workshop on abstraction at the University of St. Andrews, Autumn, 2000 and the conference on logicism chronicled in this issue. Special thanks to Peter Clark, Roy Cook, Bob Hale, Richard Heck, Fraser MacBride, Alan Weir, and Crispin Wright for many hours of fruitful conversation.
References
Boolos, G. [1987], “The consistency of Frege’s Foundations of arithmetic”, in On being and saying: Essays for Richard Cartwright, edited by Judith Jarvis Thomson, Cambridge, Massachusetts, The MIT Press, 3–20.
Boolos, G. [1989], “Iteration again”, Philosophical Topics 17, 5–21.
Boolos, G. [1997], “Is Hume’s principle analytic?”, in Language, thought, and logic, edited by Richard Heck, Jr., Oxford, Oxford University Press, 245–261.
Clark, P. [2000], “Indefinite extensibility and set theory”, talk to Arché workshop on abstraction, University of St. Andrews.
Cook, R. T. [2001], “The state of the economy: neo-logicism and inflation”, Philosophia Mathematica (3) 10, 43–66.
Dedekind, R. [1872], Stetigkeit und irrationale Zahlen, Brunswick, Vieweg; translated as Continuity and irrational numbers, in Essays on the theory of numbers, edited by W. W. Beman, New York, Dover Press, 1963, 1–27.
Dedekind, R. [1888], Was sind und was sollen die Zahlen?, Brunswick, Vieweg; translated as The nature and meaning of numbers, in Essays on the theory of numbers, edited by W. W. Beman, New York, Dover Press, 1963, 31–115.
Dummett, M. [1993], The seas of language, Oxford, Oxford University Press.
Field, H. [1980], Science without numbers, Princeton, Princeton University Press.
Fine, K. [1998], “The limits of abstraction”, in The philosophy of mathematics today, edited by Mathias Schirn, Oxford, Oxford University Press, 503–629.
Frege, G. [1879], Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens, Halle, Louis Nebert; translated in van Heijenoort [1967], 1–82.
Frege, G. [1884], Die Grundlagen der Arithmetik, Breslau, Koebner; The foundations of arithmetic, translated by J. Austin, second edition, New York, Harper, 1960.
Frege, G. [1893], Grundgesetze der Arithmetik 1, Olms, Hildesheim.
Frege, G. [1903], “Über die Grundlagen der Geometrie”, Jahresbericht der Mathematiker-Vereinigung 12, 319–324, 368–375.
Frege, G. [1906], “Über die Grundlagen der Geometrie”, Jahresbericht der Mathematiker-Vereinigung 15, 293–309, 377–403, 423–430.
Frege, G. [1971], On the foundations of geometry and formal theories of arithmetic, translated by Eike-Henner W. Kluge, New Haven, Connecticut, Yale University Press.
Goldfarb, W. [1979], “Logic in the twenties: The nature of the quantifier”, Journal of Symbolic Logic 44, 351–368.
Hale, Bob [2000], “Reals by abstraction”, Philosophia Mathematica (3) 8, 100–123.
Hale, Bob, and C. Wright [2001], “To bury Caesar . . . ”, in The Reason’s Proper Study, by Bob Hale and Crispin Wright, Oxford, Oxford University Press, 335–396.
Heck, R. [1997], “Finitude and Hume’s principle”, Journal of Philosophical Logic 26, 589–617.
Ramsey, F. P. [1925], “The foundations of mathematics”, Proceedings of the London Mathematical Society (2) 25, 338–384.
Russell, B. [1993], Introduction to mathematical philosophy, New York, Dover (first published in 1919).
Shapiro, S. [1991], Foundations without foundationalism: A case for second-order logic, Oxford, Oxford University Press.
Shapiro, S. [1997], Philosophy of mathematics: structure and ontology, New York, Oxford University Press.
Shapiro, S. [2000], Thinking about mathematics: the philosophy of mathematics, Oxford, Oxford University Press.
Shapiro, S. [2003], “Prolegomenon to any future neo-logicist set theory: abstraction and indefinite extensibility”, British Journal for the Philosophy of Science 54, 59–91.
Shapiro, S. and A. Weir [1999], “New V, ZF, and abstraction”, Philosophia Mathematica (3) 7, 293–321.
Van Heijenoort, J. [1967], “Logic as calculus and logic as language”, Synthese 17, 324–330.
Weir, A. [2000], “Neo-Fregeanism, an embarrassment of riches?”, talk to Arché workshop on abstraction, University of St. Andrews.
Wright, C. [1983], Frege’s conception of numbers as objects, Aberdeen University Press.
Wright, C. [1997], “On the philosophical significance of Frege’s theorem”, in Language, thought, and logic, edited by Richard Heck, Jr., Oxford, Oxford University Press, 201–244.
Wright, C. [1999], “Is Hume’s principle analytic?”, Notre Dame Journal of Formal Logic 40, 6–30.
Wright, C. and Bob Hale [2000], “Implicit definition and the a priori”, in New essays on the a priori, edited by P. Boghossian and C. Peacocke, Oxford, Oxford University Press, 286–319.
NEO-FREGEAN FOUNDATIONS FOR REAL ANALYSIS: SOME REFLECTIONS ON FREGE’S CONSTRAINT¹
Crispin Wright
St. Andrews, New York and Columbia Universities
Abstract
We now know of a number of ways of developing Real Analysis on a basis of abstraction principles and second-order logic. One, outlined by Shapiro in his contribution to this volume, mimics Dedekind in identifying the reals with cuts in the series of rationals under their natural order. The result is an essentially structuralist conception of the reals. An earlier approach, developed by Hale in his “Reals by Abstraction”, differs by placing additional emphasis upon what I here term Frege’s Constraint, that a satisfactory foundation for any branch of mathematics should somehow so explain its basic concepts that their applications are immediate. This paper is concerned with the meaning of and motivation for this constraint. Structuralism has to represent the application of a mathematical theory as always posterior to the understanding of it, turning upon the appreciation of structural affinities between the structure it concerns and a domain to which it is to be applied. There is therefore a case that Frege’s Constraint has bite whenever there is a standing body of informal mathematical knowledge grounded in direct reflection upon sample, or schematic, applications of the concepts of the theory in question. It is argued that this condition is satisfied by simple arithmetic and geometry, but that in view of the gap between its basic concepts (of continuity and of the nature of the distinctions among the individual reals) and their empirical applications, it is doubtful that Frege’s Constraint should be imposed on a neo-Fregean construction of Analysis.
¹ This paper first appeared in Notre Dame Journal of Formal Logic 41, [2000], pp. 317–334. Reprinted by kind permission of the editor and the University of Notre Dame.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 253–272. © 2007 Springer.
1. The basic formal prerequisite for a successful neo-Fregean—or as I shall sometimes say: abstractionist—foundation for a mathematical theory is to devise presumptively consistent abstraction principles strong enough to ensure the existence of a range of objects having the structure of the objects of the intended theory. In the case of Number Theory, for instance, the task is to devise presumptively consistent abstraction principles sufficient to ensure the existence of a series of objects having the structure of the natural numbers: a series of objects that constitute an ω-sequence. As is now familiar, second-order logic, augmented by the single abstraction, Hume’s Principle, accomplishes this formal prerequisite.² The outstanding question is therefore whether Hume’s Principle, beyond being presumptively consistent, may be regarded as acceptable in a fuller, philosophically interesting sense. The neo-Fregean programme inherits from Frege the anterior conviction that in mainstream classical mathematics, we deal in bodies of necessary truths of which we have a priori knowledge. So in order for Hume’s Principle to serve the neo-Fregean purpose, the least that will have to be argued is that it too is necessary, and knowable a priori (and that second-order logic can serve as a medium for the transmission of those characteristics). That raises an intriguing complex of metaphysical and epistemological issues—with which I will not here be primarily concerned.
In parallel, the basic formal prerequisite for a successful abstractionist foundation of Real Analysis must be to find presumptively consistent abstraction principles which, again in conjunction with a suitable—presumably second-order—logic, suffice for the existence of an array of objects that collectively comport themselves like the classical real numbers; that is, compose a complete, ordered field. Recently a number of ways have emerged for achieving this result. In this volume Stewart Shapiro has described one which I’ll call the Dedekindian Way.³ We start with Fregean arithmetic, that is, Hume’s Principle plus second-order logic.
Then we use the Pairs abstraction:
(∀x)(∀y)(∀z)(∀w)(⟨x,y⟩ = ⟨z,w⟩ ↔ x = z & y = w)
to arrive at the ordered pairs of the finite cardinals so provided.⁴ Next we abstract over the Differences between such pairs:
Diff(⟨x,y⟩) = Diff(⟨z,w⟩) ↔ x + w = y + z
and proceed to identify the integers with these differences. We proceed to define addition and multiplication on the integers so identified and then, where m, n, p and q are any integers, form Quotients of pairs of integers in accordance with this abstraction:
Q⟨m,n⟩ = Q⟨p,q⟩ ↔ (n = 0 & q = 0) ∨ (n ≠ 0 & q ≠ 0 & m × q = n × p)
We now identify a rational with any quotient Q⟨m,n⟩ whose second term n is non-zero. Then, defining addition and multiplication and the natural linear order on the rationals so generated, we can move on to the objects which are to compose the sought-for completely ordered field via the Dedekind-inspired Cut Abstraction:
(∀P)(∀Q)(Cut(P) = Cut(Q) ↔ (∀r)(P ≤ r ↔ Q ≤ r))
where ‘r’ ranges over rationals and the relation, “≤”, holds between a property, P, of rationals and a specific rational number, r, just in case any instance of P is less than or equal to r under the constructed linear order on the rationals. Cuts are the same, accordingly, just in case their associated properties have exactly the same rational upper bounds. Finally we identify the real numbers with the cuts of those properties P which are both bounded above and instantiated in the rationals.
² This result is now commonly known as Frege’s Theorem. It is prefigured by Frege in [6] §§82–3 and reconstructed in detail by Wright in [15], §xix. Other detailed accounts of the proof are given in Boolos [1], in an appendix to Boolos [2] and in Boolos & Heck [4].
³ Shapiro [13].
⁴ Shapiro himself does not make direct use of the Pairs abstraction, moving directly to abstraction principles which “operate on objects taken two at a time”. However since the order in which the objects are taken matters for these principles, it seems better to signal the assumptions involved in an explicit principle and to treat their abstractive domains as composed by the appropriate ordered pairs delivered by it.
On the Dedekindian Way, then, successive abstractions take us from one-to-one correspondence on concepts to cardinals, from cardinals to pairs of cardinals, from pairs of finite cardinals to integers, from pairs of integers to rationals and finally from concepts of rationals to (what are then identified as) reals. Although the path is quite complex in detail and the proof that it indeed succeeds in the construction of a completely ordered field is at least as untrivial as Frege’s Theorem, it does make for a near-perfect abstractionist capture of the Dedekindian conception of a real number as the cut of an upper-bounded non-empty set of rationals. True, the series of abstractions used do not of course collectively provide for the transformability of any statement about the reals, so introduced, back into the vocabulary of pure second-order logic with which we started out. But that—pure logicist—desideratum was already compromised at the very first stage, in the construction of Number Theory on the basis of Hume’s Principle.
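The right-hand sides of the Difference and Quotient abstractions are simple enough to be checked on small cases. What follows is a minimal Python sketch under stated assumptions, not part of the construction itself: Python integers stand in for the finite cardinals and the integers, tuples stand in for the ordered pairs delivered by Pairs, the function names `same_diff` and `same_quotient` are hypothetical labels, and the second disjunct of the Quotient principle is read as requiring both second terms to be non-zero.

```python
def same_diff(pair_a, pair_b):
    """Right-hand side of the Difference abstraction (illustrative name):
    the pairs (x, y) and (z, w) share a difference iff x + w = y + z."""
    (x, y), (z, w) = pair_a, pair_b
    return x + w == y + z


def same_quotient(pair_a, pair_b):
    """Right-hand side of the Quotient abstraction (illustrative name):
    (m, n) and (p, q) share a quotient iff both second terms are zero,
    or neither is zero and m * q = n * p (assumed reading)."""
    (m, n), (p, q) = pair_a, pair_b
    return (n == 0 and q == 0) or (n != 0 and q != 0 and m * q == n * p)


# Pairs of finite cardinals with the same difference yield one integer.
print(same_diff((5, 3), (7, 5)))      # both pairs differ by two
print(same_diff((5, 3), (3, 5)))      # opposite differences
# Pairs of integers in the same proportion yield one quotient.
print(same_quotient((1, 2), (3, 6)))  # both pairs stand for one half
print(same_quotient((1, 0), (5, 0)))  # pairs with second term zero
print(same_quotient((1, 0), (1, 2)))  # zero versus non-zero second term
```

On this reading every pair whose second term is zero falls under a single quotient, which is then set aside when the rationals are identified with the quotients whose second term is non-zero.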
Something weaker but still interesting remains in prospect. Suppose we are persuaded that each of the successive abstractions serves to fix the meaning of contexts of the type schematised on its left-hand side just provided one already understands the corresponding right-hand side: then we allow that there is a route of successive concept formations that starts in second-order logic and winds up with an understanding of the Cuts and a canonical mathematical theory of them. If the abstraction principles involved can be regarded as epistemologically definition-like—as a kind of implicit definition of the type of contexts they serve to introduce on their left-hand sides—then the effect of the Dedekindian Way is to provide a foundation for Analysis in second-order logic and (implicit) definitions. Dedekind did not have the notion of an abstraction principle. But it seems likely that his
logicist sympathies would have applauded this construction and its philosophical potential.⁵
The Dedekindian Way contrasts significantly, however, with the route followed by Bob Hale in his important recent study.⁶ In claiming to supply a foundation for Analysis—in particular, in claiming that the series of abstractions involved effectively leads to the real numbers—the Dedekindian Way may be viewed as resting on an essentially structural conception of what a real number is: in effect, the idea of a real number as a location in a certain kind of—completely—ordered series. For one following the Dedekindian Way, success just consists in the construction of a field of objects—the Cuts, as defined—having the structure of the classical continuum. Against that, contrast what is accomplished by Hume’s Principle in providing neo-Fregean foundations for Number Theory. The corresponding formal result is that Hume’s Principle plus second-order logic suffices for the construction of an ω-sequence. That is certainly of mathematical interest. But it doesn’t distinguish the situation from what can be accomplished in a system consisting, say, of second-order logic and George Boolos’s axiom New V.⁷ What gives Frege’s Theorem its distinctive philosophical interest is that Hume’s Principle also purports to encapsulate an account of what cardinal numbers are. The philosophical payload turns not on the mathematical reduction as such but on the specific character of the abstraction by which the reduction is effected. Hume’s Principle effectively incorporates a variety of philosophical claims about the nature of number for which Frege prepares the ground philosophically in the sections of Grundlagen preceding its first appearance—for example, the claims
(i) that number is a second-level property—a property of concepts; concepts are the things that have numbers,
which is incorporated by the feature that the cardinality operator is introduced as taking concepts for its arguments; and
(ii) that the numbers themselves are objects;
which is incorporated by the feature that terms formed using the cardinality operator are singular terms. And in addition, of course, Hume’s Principle purports to explain
(iii) what sort of things numbers are.
It does so by framing an account of their criterion of identity in terms of when the things that have them have the same one: numbers, according to Hume’s Principle, are the sort of things that concepts share when one-to-one correspondent.
⁵ An illuminating brief discussion of Dedekind’s “logicism” may be found in Shapiro [14] at pp. 170–76.
⁶ Hale [8]. Hale’s was the first neo-Fregean treatment of the real numbers.
⁷ For discussion of which, see Hale [9].
Now you could, it is true, read a corresponding set of claims about real number off the Cut Abstraction principle featured in the Dedekindian Way. You would then conclude, correspondingly, that real numbers are objects, that the things which have real numbers are properties of rationals, and that real numbers are the sort of things that properties of rationals share just when their instances have the same rational upper bounds. One could draw these conclusions. But—apart from the first—they are strange-seeming conclusions to draw. There is no philosophical case that real number is a property of properties of rationals which stands comparison with Frege’s case that cardinal number is a property of sortal concepts. On the contrary, the intuitive case is that real number belongs to things like lengths, masses, temperatures, angles and periods of time. We could conclude that the Dedekindian Way incorporates poor answers to questions whose analogues about the natural numbers Hume’s Principle answers relatively well. But a better conclusion is that the Dedekindian Way was not designed to take those questions on. The fact is that Hume’s Principle accomplishes two quite separate tasks. There is, a priori, no particular reason why a principle intended to incorporate an account of the nature of a particular kind of mathematical entity should also provide a sufficient axiomatic basis for the standard mathematical theory of that kind of entity. It’s one thing to characterise what kind of entity we are concerned with, another thing to show that and why there are all the entities of that kind that we standardly take there to be, and that they compose a structure of the kind we intuitively understand them to do. Of course we can expect the two projects to interact.
But the striking feature of the neo-Fregean foundations for Number Theory is that the one core principle, Hume’s Principle, discharges both roles. This is not a feature which we should expect to be replicated in general when it comes to providing abstractionist foundations for other classical mathematical theories. And what the reflections of a moment ago suggest is that the Dedekindian Way, for its part, is best conceived as addressing only the second project. It is the distinction between these two projects—the metaphysical project of explaining the nature of the objects in a given field of mathematical enquiry and the epistemological project of providing a foundation for our standard mathematical theory of those objects—that, as I read his discussion, drives the approach taken by Hale and—so far as one can judge from the incomplete discussion in Grundgesetze—by Frege himself. If we start with the metaphysical questions: what kind of thing are real numbers, what is real number a property of—what are the things that have real numbers—and what is the criterion of identity for reals, we are taken straight to the territory to which Hale devotes the initial part of his discussion. Real numbers, as remarked, are things possessed by lengths, masses, weights, velocities, etc.—things which allow of some kind of magnitude or, in Hale’s preferred term, quantity. To
stress, though, quantities, or magnitudes, are not themselves the reals, but the things which the reals measure. As Frege says,
. . . the same relation that holds between lines also holds between periods of time, masses, intensities of light, etc. The real number thereby comes off these specific kinds of quantities and somehow floats above them. (Grundgesetze §185)⁸
If we want to formulate an abstraction principle incorporating an answer to the metaphysical question, what kind of thing are the reals, after the fashion in which Hume’s Principle incorporates an answer to the metaphysical question, what kind of thing are the cardinal numbers, then quantities will feature not as the domain of reference of the new singular terms which that abstraction will introduce but rather as the abstractive domain: as the terms of the abstractive relation on the right-hand side. On the other hand, it’s clear that individual quantities don’t have their real numbers after the fashion in which a particular concept, say speaker at the Notre Dame 2001 Logicism Reappraisal conference, has its cardinal number. We are familiar with different systems of measurement, like the Imperial and Metric systems for lengths, volumes and weights, or the Fahrenheit and Celsius systems for temperature, but there is no conceptual space for correspondingly different systems of counting. Of course, there can be different systems of counting notation: we can count in a decimal or binary system, for instance, or in Roman or Arabic numerals. But if they are used correctly, they won’t differ in the cardinal number they deliver to any specified concept, but only in the way they name that number. By contrast, the Imperial and Metric systems do precisely differ in the real numbers they assign to the length of a specified object. One inch is 2.54 cm. The real number properly assigned to a length depends on a previously fixed unit of comparison. So real numbers are relations of quantities, just as Frege says. Quick as they are, these reflections seem to enforce a view about what a principle would have broadly to be like whose metaphysical accomplishment for the real numbers matches that of Hume’s Principle for the cardinals. 
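The dependence of the assigned number on the unit, and the independence of ratios from it, can be checked with a small calculation. The following Python sketch is illustrative only: the two lengths, the variable names and the helper `measure` are assumptions introduced here, and exact fractions are used so that no rounding intrudes.

```python
from fractions import Fraction

CM_PER_INCH = Fraction(254, 100)  # one inch is 2.54 cm


def measure(length_cm, unit_cm):
    """Number assigned to a length relative to a chosen unit.

    Both arguments are recorded in centimetres purely for bookkeeping;
    the helper and its name are hypothetical."""
    return length_cm / unit_cm


# Two hypothetical lengths.
rod = Fraction(508, 10)     # 50.8 cm
plank = Fraction(1270, 10)  # 127 cm

rod_in_cm = measure(rod, Fraction(1))
rod_in_inches = measure(rod, CM_PER_INCH)

# The number assigned to the rod differs with the unit of comparison.
print(rod_in_cm, rod_in_inches)

# The ratio of plank to rod is the same whichever unit is used.
ratio_in_cm = measure(plank, Fraction(1)) / rod_in_cm
ratio_in_inches = measure(plank, CM_PER_INCH) / rod_in_inches
print(ratio_in_cm == ratio_in_inches)
```

The single length receives different numbers under the two systems, while the relation between the two lengths receives one number, which is the sense in which a real number attaches to a relation of quantities and not to a quantity taken alone.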
Where Hume’s Principle introduces a monadic operator on concepts, our abstraction for real numbers will feature a dyadic operator taking, in each use, as its arguments a pair of terms standing for quantities of the same type; more specifically, it will be a first-order abstraction:
Real Abstraction: R⟨a,b⟩ = R⟨c,d⟩ ↔ E(⟨a,b⟩,⟨c,d⟩)
where a and b are quantities of the same type, c and d are quantities of the same type (but not necessarily of the type of a and b) and E is an equivalence relation on pairs of quantities whose holding ensures that a is proportionately to b as c is to d. In effect, the analogy is between the abstraction of cardinal numbers from one–one correspondence on concepts, and abstraction of real numbers from equi-proportionality on pairs of suitable quantities.
⁸ The translation of this passage, and others from Grundgesetze (Frege [7]) given below, is Sven Rosenkranz’s.
With this preliminary analogy in place, it’s clear that the neo-Fregean now has his work cut out into three large sub-tasks. First a philosophical account is owing of what in the first place a quantity is—what the ingredient terms of the abstractive relation on the right-hand side of the Real Abstraction principle are. Second—if the aspiration is to give a logicist treatment in the sense in which Hume’s Principle provides a logicist treatment of Number Theory—it must be shown that, parallel to the definability of one–one correspondence using just the resources of second-order logic, both the notion of quantity and the relevant equivalence relation, E, allow of (ancestral)⁹ characterisation in (second-order) logical terms. (Should it prove impossible to do this, that would not necessarily deprive the abstractionist project of interest. But the point would have to be faced that an abstractionist treatment of Analysis would apparently have to originate in a special non-logical subject matter, with significant possible impact on the epistemological pay-off of the project.) Third a result needs to be established analogous to Frege’s Theorem: specifically, it needs to be shown that there are sufficiently many appropriately independent truths of the type depicted by the right-hand side of Real Abstraction to ground the existence of a full continuum of real numbers. And while, as stressed, Hume’s Principle itself suffices for the corresponding derivation for the natural numbers, here it is clear that additional input is going to be required to augment the Real Abstraction principle. Although not explicitly structured by a separation of these three issues, it is an achievement of Hale’s discussion that it contains points of response to each of them.
It is informed by taking to heart Frege’s injunction that to take the question, What is a quantity?, head-on is to put
. . . the wrong question. There are many different kinds of quantities: lengths, angles, periods of time, masses, temperatures, etc., and it will hardly be possible to specify in virtue of what the members of these various kinds of quantities are distinct from objects that do not belong to any kind of quantity. And nothing would be gained thereby anyway; for we would still lack the means to recognise which of these quantities belonged to the same realm of quantities. Instead of asking: which properties must an object have in order to be a quantity? one must ask: what must a concept be like in order for its extension to be a realm of quantities? For brevity’s sake, let us now use ‘class’ instead of ‘extension of a concept’. Then we can put the question as follows: which properties must a class have in order to be a realm of quantities? Something is not a quantity all by itself, rather it is a quantity only insofar as it belongs, with other objects, to a class which is a realm of quantities.¹⁰
⁹ “Ancestral” characterisation in the sense that a chain of effective implicit definitions, eventually grounding in concepts of second-order logic, may be reckoned good enough, even if it does not provide the resources for eliminative paraphrase of the definienda. This, of course, as noted, is the most that is achieved by the Dedekindian Way. But there is still a disanalogy: no issue arose, on that approach, concerning the logical character of the items in the abstractive domain for the abstraction that yields the reals. On the Dedekindian Way, those items were concepts (of ancestrally logical objects). On Hale’s route they are (pairs of) quantities.
The leading idea in Hale’s [8] is to distinguish a number of different kinds of quantitative domain—“realms of quantities”—with more complex kinds obtainable from simpler ones by successive abstractions on the latter, culminating in a quantitative domain of a kind to which the Real Abstraction principle can be applied so as to generate the full continuum of real numbers. I shall not here attempt to do justice to the detail, but the basic moves are not dissimilar to those followed in the Dedekindian Way. The route goes once again via the natural numbers, as provided by Hume’s Principle, and then via a Ratio abstraction principle to what Hale calls a full quantitative domain in which the ingredients exhibit a structure corresponding to that of the positive rationals. Since such a domain is countable, and since the Real Abstraction principle is first-order and so delivers uncountably many reals only if applied to an uncountable domain of quantities on its right-hand side, an intermediate step is now required in Hale’s construction to take us from a full quantitative domain to what he calls a complete quantitative domain, in which in addition every class of elements which is bounded above has a least upper bound. Hale’s proposal to turn this trick is an abstraction principle he too calls Cut. We consider a full quantitative domain—like the rationals—and restrict our attention to a special kind of property—what Hale calls cut-properties—of its elements. Cut-properties of the elements of such domains are non-empty, have no greatest instance, and are such that anything in the domain smaller than any instance of them is likewise an instance. 
With P and Q restricted to such properties, and the range of the objectual variable on the right-hand side restricted to elements in the domain in question, the relevant principle—Hale-Cut abstraction—then (ironically enough) turns out to be a syntactic doppelganger of Basic Law V:
(∀P)(∀Q)(Cut(P) = Cut(Q) ↔ (∀x)(Px ↔ Qx))
Applied to the full domain provided by a neo-Fregean construction of the rationals, Hale-Cut abstraction will generate a completely ordered field in just the way in which the Dedekindian Way’s Cut abstraction principle did so. (Indeed, Hale could just as well have used the latter at this stage of his construction.) But whereas, on the Dedekindian Way, the game ends once acceptable abstraction principles have been provided which lead to such a domain, all that construction serves to achieve, on the Hale route, are the needed raw materials for the right-hand side of the Real Abstraction principle itself. It remains, via that principle, to advance to the real numbers themselves, and to prove that they correspondingly compose a completely ordered field, thus bringing the mathematical construction into mesh with the overarching metaphysical account of what a real number is.
¹⁰ Frege [7] §161.
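The two cut principles differ in their right-hand sides: the Dedekindian Way compares rational upper bounds, Hale-Cut abstraction compares extensions. The Python sketch below puts the two side by side. It is a finite approximation only, under loudly stated assumptions: the grids `FINE` and `COARSE`, the three sample properties and all function names are introduced here for illustration, and agreement on a finite sample does not establish identity of cuts over all the rationals.

```python
from fractions import Fraction

# Hypothetical finite stand-ins for the rationals: instances of a property
# are sampled on a fine grid, candidate upper bounds on a coarser one.
FINE = [Fraction(k, 64) for k in range(1, 257)]
COARSE = [Fraction(k, 8) for k in range(1, 33)]


def upper_bounds(prop):
    """Sampled r with P <= r: every sampled instance of P is <= r."""
    instances = [x for x in FINE if prop(x)]
    return {r for r in COARSE if all(x <= r for x in instances)}


def same_upper_bounds(p, q):
    """Right-hand side of the Dedekindian Way's Cut Abstraction (sampled)."""
    return upper_bounds(p) == upper_bounds(q)


def coextensive(p, q):
    """Right-hand side of Hale-Cut abstraction (sampled)."""
    return all(p(x) == q(x) for x in FINE)


def below_two(x):
    # Non-empty, downward closed, no greatest instance among the rationals.
    return x < 2


def at_most_two(x):
    # Has a greatest instance, so it is not a cut-property.
    return x <= 2


def square_below_two(x):
    return x * x < 2


# Same sampled upper bounds, yet different extensions.
print(same_upper_bounds(below_two, at_most_two))
print(coextensive(below_two, at_most_two))
# Different sampled upper bounds, hence different cuts on either principle.
print(same_upper_bounds(below_two, square_below_two))
```

Within the sample, `at_most_two` shares its upper bounds with `below_two` without being coextensive with it. It has a greatest instance, so it is not a cut-property and falls outside the scope of Hale-Cut abstraction; among cut-properties, which are downward closed and lack a greatest instance, having the same upper bounds and being coextensive come to the same thing.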
2. Now we can get to our main issue. In the foregoing comparison I have deliberately encouraged the impression that the Dedekindian Way is best viewed as passing over certain legitimate general metaphysical questions: what is the nature of real number, and what is real number characteristic of—what are the things which have real numbers—to which the Frege/Hale approach rightly gives a central place. But are those questions rightly given a central place? Two well-known lines of thought converge on the contention that they are. First there is the tendency, exemplified for instance by Richard Heck, to think that there is a good distinction to be drawn between (the neo-Fregean delivery of) a theory which allows of interpretation as, say, Number Theory, or Analysis, or Geometry and (the delivery of) Number Theory, or Analysis, or Geometry itself. Heck writes:
What is required if logicism is to be vindicated is not just that there is some conceptual truth or other from which what look like axioms for arithmetic follow, given certain definitions: That would not show that the truths of arithmetic, as we ordinarily understand them, are analytic, but only that arithmetic can be interpreted in some analytically true theory. To put the point differently, if we are so much as to evaluate logicism, we must first uncover the ‘basic laws of arithmetic’, laws which are not just sufficient to allow us to prove translations of arithmetical truths, but laws from which arithmetical truths themselves can be proven. (The distinction is not a mathematical one, but a philosophical one. . . . )¹¹
The distinction seems plausibly made in at least some cases. There is no reason, for instance, why a derivation within ZFC of a theory which allows a geometrical interpretation, say, should do anything to illuminate the status of geometry—it all depends on the status of the principles from which the derivation proceeds after they receive whatever may be the corresponding interpretation. But if we restrict attention to second-order axiomatisations, then a theory will allow of interpretation as Number Theory in particular—or so we may take it for the present purpose—if and only if it is categorical: if all its (standard) models have domains comprising ω-sequences. So one who follows the tendency exemplified by Heck’s remarks is urging a type of distinction illustrated by that between a second-order theory which is so categorical and one which somehow, beyond that, genuinely concerns the finite cardinals themselves. Such a distinction can make no sense unless the finite cardinals have a nature which goes beyond their collective composition of an ω-sequence. One who presses Heck’s distinction is accordingly committed to taking seriously the general questions about the nature of numbers (of different kinds) which the Dedekindian Way, I have suggested, should be seen as passing by.
¹¹ Heck [11], at pp. 596–97.
Compare Frege’s own thought in Grundgesetze §159. There he writes:
The path that is to be pursued here thus lies between the old way of founding the theory of irrational numbers, the one H. Hankel used to prefer,
—in which geometrical quantities were predominant—
and the paths followed more recently [Cantor and Dedekind]. We retain the conception of real number as a relation of quantities . . . , but dissociate it from geometrical or any other specific kinds of quantities and thereby approach more recent efforts. At the same time, on the other hand, we avoid the drawback showing up in the latter approaches, namely that any relation to measurement is either completely ignored or patched on solely from the outside without any internal connection grounded in the nature of the number itself . . . our hope is thus neither to lose our grip on the applicability of [Analysis] in specific areas of knowledge nor to contaminate it with the objects, concepts and relations taken from those areas and so to threaten its peculiar nature and independence. The display of such possibilities of application is something one should have the right to expect from [Analysis] notwithstanding that that application is not itself its subject matter. Whether our plan can be carried out is something the attempt must show . . .
This is one of the clearest passages in which Frege gives expression to something that I propose we call Frege’s Constraint: that a satisfactory foundation for a mathematical theory must somehow build its applications, actual and potential, into its core—into the content it ascribes to the statements of the theory—rather than merely ‘patch them on from the outside’. The constraint is repeatedly emphasised, with approval, by Michael Dummett in [5]. A typical passage is as follows:
A correct definition of the natural numbers must, on [Frege’s] view, show how such a number can be used to say how many matches there are in a box or books on a shelf. Yet number theory has nothing to do with matches or with books: its business in this regard is only to display what, in general, is involved in stating the cardinality of the objects, of whatever source, that fall under some concept, and how the natural numbers can be used for their purpose. In the same way, analysis has nothing to do with electric charge or mechanical work, with length or temporal duration; but it must display the general principle underlying the use of the real numbers to characterise the magnitude of quantities of these and other kinds. A real number does not directly represent the magnitude of a quantity, but only the ratio of one quantity to another of the same type; and this is in common to all the various types. It is because one mass can bear to another the very same ratio that one length bears to another that the principle governing the use of real numbers to state the magnitude of a quantity, relatively to a unit, can be displayed without the need to refer to any particular type of quantity. It is what is in common to all such uses, and only that, which must be incorporated into the characterisation of the real numbers as mathematical objects: that is how statements about them can be allotted a sense which explains their applications, without violating the generality of arithmetic by allusion to any specific type of empirical application.¹²
What is it to observe Frege’s Constraint? To insist that the general principle governing the application of a type of number be built into their characterisation from the start is in effect just to insist such numbers be characterised by reference to a principle which explains what kind of entities they apply to—are of—and what it is for such entities to be associated with the same or different such numbers. And of course that is exactly what a suitable abstraction principle will do. It is a feature shared by both Hume’s Principle and the Real Abstraction principle. To view such principles as philosophically and mathematically foundational is accordingly to view the applications of the sorts of mathematical objects they concern as belonging to the essence of objects of those sorts.
To take stock. Frege’s Constraint and the insistence on a contrast between establishing a mathematical theory and merely establishing a theory which allows of interpretation as that theory have in common the thought that the objects of e.g. the classical theories of the natural and real numbers, or of classical geometry, have an essence which transcends whatever is shared by the respective types of models of even categorical (second-order) formulations of those theories. Frege’s Constraint explicitly incorporates the additional thought that this essence is to be located in the applications; and so much was tacitly built into my characterisations above of the basic metaphysical questions which a satisfactory foundation for a particular pure mathematical theory should address, in particular in the central role accorded to the question what kinds of thing the numbers in question are numbers of? Heck’s distinction—between deriving the axioms of Number Theory or Analysis and merely deriving a body of statements which allow of interpretation as those axioms—might in principle, I suppose, be grounded in some other kind of conception of what makes for the essence of natural or real number.
But no candidate is on the table besides that incorporated in Frege’s Constraint. And it is hard to see what alternative there could be. For the pure mathematical theories of those entities make no distinction between them and any other isomorphic structure—so what could distinguish them except something to do with application? Well, it should now stand out quite clearly what is arguably tendentious about Frege’s, Hale’s, Heck’s and Dummett’s position. It is, in effect, the presupposition that there has to be more to the natural or real numbers than any broadly structuralist view of them can accommodate. For structuralism, there is no essence shared by the natural numbers beyond their composition of an ω-sequence; and there is no essence shared by the real numbers beyond their composition of a complete, ordered field.
12 Dummett [5], at pp. 272–73.
We may, for certain purposes,
reify the ‘elements’ in these respective types of structure as though they were entities in their own right. But for structuralism, the real ‘objects’ of pure mathematical enquiry are the structures themselves; and the applications of the relevant pure mathematical theories derive from the appreciation of structural affinities between (segments of) the pure structures and certain structured collections of entities taken from the domain of application. From this perspective, the Dedekindian Way is not to be seen as neglecting a range of bona fide metaphysical questions which the Frege/Hale approach rightly takes seriously, but rather as discounting them—or better, as answering them in their only legitimate form, by providing for the derivation of a theory which appropriately characterises the collective structure which is the true subject matter of Analysis. No doubt there is a good philosophical issue about what provides for the applicability of a pure mathematical theory—what enables it to bestow on us knowledge of certain characteristics of the domains to which we do apply it. But structuralism may insist that it does not neglect this question; to the contrary, it provides a general rubric for a response to it—again, the applications of pure mathematical theories are grounded in our recognising certain structural affinities between (segments of) the pure structures they concern and situations they are applied to. (To assimilate, for example, applications of arithmetic, conceived as the pure science of ω-sequences, to the purpose of simple counting of ordinary collections of objects, we conceive the latter as suitably serially ordered in some way and ask which initial segment of the naturals (excepting 0) is isomorphic to that ordering.) Structuralism does not—in intention anyway—neglect the issue of application.
Its contention is rather—in flat contradiction to Frege, Dummett, Heck and Hale—that it is a philosophical mistake to think of natural or real numbers as having an objectual essence at all, whether or not grounded in their applications, which a satisfactory metaphysical account of them must build in from the start, rather than “patch on as an afterthought”. 13
13 For discussion of the applications of mathematics from a structuralist point of view, see Shapiro [14] chapter 8.
3. At this point it appears that a decision between the Dedekindian Way and something akin to the Hale construction must ultimately depend on a verdict about the adequacy of a broadly structuralist conception of the classical continuum. More generally, it appears that whether the abstractionist should respect Frege’s Constraint in recovering a given region of mathematics depends on whether we should think of that region structurally or not. If we should—if a full understanding of the mathematical theory in question invokes no specific conception of the kind of entities it is concerned with save as
occupants of particular nodes in the structure—then there is no need for the abstractionist to observe Frege’s Constraint, whatever aesthetic or other merits may be possessed by accounts which do so. But if understanding the theory requires grasping that its characteristic objects have a kind of distinguishing feature going beyond their occupancy of places in a structure—if, in particular, it requires grasping that they are of certain kinds of item, in the way that, for Frege, natural numbers belong to concepts, directions belong to lines and geometrical shapes belong to figures—then an abstractionist account which ignores Frege’s Constraint will not succeed in recovering the whole content of the targeted statements, and its claim to provide a foundation will thereby be compromised. It is implicit in the foregoing that the exigency of Frege’s Constraint may vary as a function of field. But how should we decide whether we should “think of a region of mathematics structurally”? Let me close on one type of consideration that might move us not to do so—the crucial question will be how wide a range of cases it covers. According to structuralism, the appreciation of any pure mathematical truth is the appreciation of a statement of it as holding good of any particular instance of a targeted kind of structure; applications of pure mathematics will then depend upon an additional appreciation of structural affinities between any such instance and the intended realm of application. Because additional, this appreciation may be lacking in one who understands the statement. So, the structuralist should claim, a grasp of the content of a pure mathematical statement need never per se involve knowledge of its applications. But this claim promises to be difficult to sustain in full generality. It seems clear that one kind of access to e.g. simple truths of arithmetic precisely proceeds through their applications.
Someone can—and our children surely typically do—first learn the concepts of elementary arithmetic by a grounding in their simple empirical applications and then, on the basis of the understanding thereby acquired, advance to an a priori recognition of simple arithmetical truths. I say “a priori” because I see no reason to deny that a child who reasons on her fingers, or with a diagram, say—
[Diagram not recoverable from the source: a counting schematic for 4 + 3 = 7]
—that 4 + 3 = 7 has indeed acquired a piece of knowledge a priori in much the same way that a general geometrical intuition can be facilitated by means of a construction with paper and pencil. But if that is right, then there is a kind of a priori arithmetical knowledge which flows from an antecedent understanding of the way that arithmetical concepts are applied. It is not that pure knowledge comes first, as the apprehension of an a priori truth about structures, with the applicability of the knowledge so acquired only dawning
on one after one has grasped how certain empirical situations can be viewed as, in effect, modelling aspects of that structure. Rather the content of the a priori knowledge in question already configures concepts drawn directly from the applications. The last is the important point. The objection to the structuralist account in such a case is not that it misrepresents the actual typical order and nature of the acquisition of at least some basic arithmetical knowledge—coming from a neo-Fregean, that would be pretty rich, for no-one actually gets their arithmetical knowledge by second-order reasoning from Hume’s Principle either! Rather, the significant consideration is that simple arithmetical knowledge, so acquired, has to have a content in which the potential for application is absolutely on the surface, since the knowledge is induced precisely by reflection upon sample, or schematic, applications. By contrast, the structuralist reconstruction of this knowledge will involve a representation of its content from which an appreciation of potential application will be an additional step, depending upon an awareness of certain structural affinities. So the structuralist will be open to the charge of changing the subject: whatever the detail of her epistemological story about the simplest truths of arithmetic, the content of the knowledge thereby explained will not be that of the knowledge we actually have—for, again, that can be grounded in reflection upon sample, or schematic applications. The point will also bear plausible illustration by simple geometrical knowledge. It is no part of a grasp of analytic geometry, as structurally conceived, to think of it as concerned with spatial figures at all. So the kind of account of knowledge of geometrical features of space that such a structuralist theory can provide will have to go via the recognition (a priori?) that space puts up a model of the pure theory. 
Again, the point is not that one could not arrive at geometrical knowledge that way, though it is manifest that in general we do not. Rather, it is that there is a route that goes through reflection—by all means, diagram-assisted reflection—on geometrical concepts as given by ordinary rough and ready empirical illustrations and leads to—apparently— a priori knowledge of simple geometrical truths on the basis of the concepts thereby understood. Think of how you first persuade yourself that: “A straight line divides a circle at exactly two points if at any”. Again, the crucial consideration is what this shows about the content of the knowledge thereby achieved. This suggests a distinction which, wherever it can be upheld, will mandate something close to Frege’s Constraint. It is one thing to explain how (a priori) knowledge could be acquired of a system which, taken in conjunction with certain supplementary reflections, can then be applied in the same ways as an entrenched mathematical theory. But that will not suffice to provide a correct (if idealised) reconstruction of the content of what we actually know in knowing that theory if at least some of that knowledge can be achieved just by the reflective exercise of concepts acquired and applied in the course of
ordinary counting and calculation, measurement, and the kind of geometrical routines employed in joinery. For in that case the (simple) pure mathematical statements thereby known take on a content which makes those applications immediate. It is accordingly not knowledge of those contents of which we have given an account—even an idealised account—if the statements to which a given theoretical reconstruction leads are ones which, even if knowable a priori, can be wholly grasped without any inkling of their applications at all. Perhaps it is by a development of this thought—and perhaps only thereby— that Frege’s Constraint can be made to prevail against what, I have suggested, are the essentially structuralist roots of resistance to it. But if that is right, then—to emphasise again—there is no reason to think it should prevail right across the board—and a doubt in particular about whether it should do so in the case we are now concentrating on: Real Analysis. The immediate obstacle is, briefly, that it is simply not the case that the distinctive concepts of Real Analysis can be grounded in their applications after the fashion in which, at least in principle, arithmetical concepts and simple geometrical concepts can. For instance, while the cardinal number of a group can be empirically determined, and the application of at least small cardinals schematised in thought, as in the little diagram above, no real number can ever be given as the measure of any particular empirically given quantity. There is simply no such thing as determining a real value of a quantity by measurement or indeed by any other empirical procedure—any set of measurements we take will be finite, and even in the best case there will be no empirical distinction between their convergence upon a particular real value as opposed to uncountably many others sufficiently close to but distinct from it. 
How then can analogues grip the reals of the kind of thought-experimental or imaginative routines that can engage the objects of arithmetic and geometry and which form the basis of the simplest kinds of reflective knowledge of them? And if no such analogues are possible, what reason is there to suppose that any of our knowledge of Analysis is of propositions whose applications are immediate? That, at any rate, is the issue. Frege’s Constraint is justified, it seems to me, when—and I am tempted to say, only when—we are concerned to reconstruct a branch of mathematics at least some—if only a very basic core—of whose distinctive concepts can be communicated just by explaining their empirical applications. However the fact is that both our concepts of the identity of particular real numbers and—more important—the entire overarching conception of continuity, as classically conceived—the density and completeness of the range of possible values within a parameter determined by measurement—are simply not manifest in empirical applications at all. Rather, so one would think, the flow of concept formation goes in the other direction: the classical mathematics of continuity is made to inform a non-empirical reconceptualisation of the parameters of potential variation in the empirical domains to which it is applied.
To explore that thought properly, one would have to take on a complex set of issues for which I have no space here, even if I were confident how to proceed. But if it is good, and if the only really compelling motivation for Frege’s Constraint is the one I have reviewed, there will be no significant shortcoming, from the neo-Fregean point of view, in an abstractionist reconstruction of the reals that follows the Dedekindian Way.
Appendix: Abstractionism and structuralism
I have suggested that an abstractionist reconstruction of a pure mathematical theory may be absolved from Frege’s Constraint in any case where it is appropriate to take a structuralist view of the content of that theory. That may seem an unstable claim. After all, the whole raison d’être of abstractionism is the recovery of an account of what is preconceived as knowledge of certain specific kinds of mathematical objects. By contrast, structuralists characteristically do not view mathematics as, in the appropriate way, object-directed in the first place but see the mathematician’s concern as being with the structural features that collections of objects—whose nature is otherwise irrelevant—may exemplify. So it may seem that Frege’s Constraint must be in force at least in all cases where abstractionism has any point—where there is a range of specific mathematical objects, with a proper intrinsic nature, which a targeted theory concerns; and that in cases where the Constraint does not apply, according to my proposal, because a structuralist view is appropriate, there is anyway no point to the abstractionist project that it might have constrained. Let me briefly explain why I do not think this is so—explain why and how abstractionism and structuralism can co-operate. There is a kind of structuralism whose whole purpose is ontological frugality. For this—eliminative—kind of structuralism, the point of the emphasis on pure mathematics’ (alleged) structural concerns is by way of a counterweight to, and thereby to liberate one from, what is viewed as the problematic notion that it is really concerned with any objects at all—that there is any such thing as specifically mathematical existence. This structuralism is indeed at odds with neo-Fregeanism. But its spirit is quite different to that of a second kind of structuralism, advocated by writers such as Resnik and Shapiro. 14
14 Resnik [12], Shapiro [14]. These two authors’ similar ontological views are married to large differences, however, concerning mathematical epistemology. The remarks to follow are addressed to the overall structuralist position advocated by Shapiro. Resnik’s epistemological views are more purely empiricist and Quinean.
For these theorists, to emphasise the concern of Number Theory, or Analysis, with certain distinctive kinds of structure goes with the idea not that we should think of such theories as innocent of ontological commitments but rather that they are precisely about articulated structures and that it does not matter what, if any, objects we take to be configured within them so long as they
collectively compose a structure of the appropriate kind. It is in that object— in the articulated structure itself—that the mathematical interest lies. It is structuralism of this ontologically liberal—as Shapiro styles it, ante rem—kind which is, as it seems to me, potentially consonant with a programme of neo-Fregean foundations. Ontologically frugal structuralism does not require there actually to be any examples of the various types of structure in which it represents the mathematician as interested. There might actually be no completely ordered fields; there may even be no ω-sequences; but for frugal structuralism, we can still investigate what such structures would be like if they existed. Mathematics, on this view, is the science of hypothetical structures. It describes how things would be if there were structured collections of entities of the various relevant kinds. By contrast, the ante rem structuralist takes a Platonic view of structures: they exist and are available for mathematical description as complex objects in their own right, whether or not exemplified by any independent collections of objects. The ante rem structuralist must therefore address the question: what guarantee can be given that, so conceived, classical mathematical structures, like the continuum, do indeed exist? And how do we gain knowledge of them? Shapiro’s answer 15 is nuanced but, in the end, broadly Hilbertian. In the best case, he holds, it is by giving a (categorical) characterisation of an intended structure that we grasp the structure—make it available as an object of intellection. And once so made available, it may be investigated by exploring the deductive and model-theoretic consequences of the characterisation by which it was communicated. The mere intelligibility of an appropriate characterisation is enough—enough not just to communicate a concept of the structure involved but to present the very object to the mind. 
For example, the second-order (categorical) Dedekind–Peano axioms themselves present the structure: ω-series. Thus in Shapiro’s final view, all that the theorist needs to do in order to explain how a particular mathematical structure is accessible to us as an object of mathematical investigation is to call attention to the fact that we are capable of grasping a canonical axiomatic description of it. Mathematical access is achieved merely by mathematical understanding. It seems to me that there are two ways that abstractionism may complement and assist this view. One is completely in keeping with the proposal and is indeed in effect remarked on by Shapiro himself. I said above that Shapiro’s position was broadly Hilbertian. But for Hilbert—at least so the legend goes—consistency was enough: the mere consistency of an axiom set sufficed to ensure the reality of a mathematical subject matter for those axioms to treat of. Shapiro’s view is more qualified. He does not accept that just any old consistent description serves to communicate—make accessible to us as an object of intellection—an ante rem structure. Tighter constraints are wanted, and he flags them in his notion of coherence.
15 Developed in chapter 4 of [14].
He has much to say about how
the relevant notion of coherence should be understood but I shall not attempt to evaluate the detail of his discussion here. Suffice it to say that, as he intends the notion, a characterisation is coherent just in case it is satisfiable in the standard iterative hierarchy of sets. (That, anyway, is the intended extension of the notion of coherence: it would be a serious concern for Shapiro’s account if the best that could be done to explicate coherence were simply to help oneself, in that way, to an assumed prior ontology and epistemology of sets, since one would be left with no (non-trivial) account of the coherence of the axioms of ZFC.) 16 Now the first way in which abstractionism may marry with this form of structuralism is precisely by delivering an assurance of the coherence of a given axiomatic characterisation. For however a notion of coherence, apt for Shapiro’s purpose, should be elucidated in general, it ought manifestly to suffice for the coherence of an axiom set if we can reach for an independently given domain of objects which those axioms may then be recognised to characterise. There should thus be no question about the coherence, in a sense consistent with Shapiro’s purpose, of axioms which we can model in a domain composed of objects independently furnished by suitable abstraction principles. The second point of complementarity is less friendly. The principal reservation abstractionism will have with Shapiro’s approach concerns his idea that, merely by giving a coherent axiomatisation, we can do more than convey a concept. Shapiro holds that we can, in addition, induce awareness of an articulate, archetypal object, at once representing the concept in question and embodying an illustration of it. But what ground is there for supposing so? Someone who writes a fiction, even the most coherent fiction, does not thereby create a range of entities whose properties and relations are exactly as the fiction depicts. 
Rather—it can be agreed on all hands—she merely creates a concept, a description of a possible scenario in which certain things, real or imagined, might be so qualified and related. It is implicit in Shapiro’s view, by contrast, that there can be no such thing as a fictionalised structure. Try to write about merely imaginary structures, as about imaginary people, and the very description of your fiction, if coherent, will defeat your purpose. Only write coherently and, willy-nilly, a Platonic entity—a Shapironian structure—will step forward to fulfil your descriptive demand. Shapiro is, naturally, fully self-conscious and deliberate about this aspect of his view. If he’s right, mathematical fictionalism is simply an incoherent philosophy of mathematics from the start. With structures, the coherence of a description suffices for the existence of a realiser.
16 If the observations at the conclusion of this Appendix are correct, this difficulty will come back to haunt the structuralist in any case.
Against that, abstractionism will set the orthodox idea, repeatedly stressed by Frege himself, that in mathematics, as elsewhere, there is a gap between concept and object, that it is
one thing to give a however precise and coherent characterisation and another to have reason for thinking that it is actually realised. If one takes this view, the question becomes pressing: what, if not Shapiro’s ‘fast track’, could constitute a recognition of the existence of structures conceived as pure objects in their own right, in the fashion of ante rem structuralism? And the obvious suggestion, in the present context, is to attempt a view of structures as arrived at by abstraction, taking pure structures as in effect the order-types associated with given domains of objects and specific ordering relations. 17 Thus, for example, the structure, ω-series, is the order type associated with the natural numbers under the less-than relation. And in general
Structure abstraction: Structure(F, R) = Structure(G, S) if and only if the Fs under relation R are isomorphic to the Gs under relation S.
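As a notational aid, the principle just stated can be set out in displayed form. This is only a sketch under stated assumptions: the operator name \mathrm{Structure}, the symbol \cong, and the unpacking of isomorphism as a relation-preserving bijection (with R and S taken to be binary) are conventions of this rendering, not notation used in the text.

```latex
% Requires amsmath for \text. Notation is assumed for illustration only.
% Structure abstraction, as stated in the text:
\[
  \mathrm{Structure}(F, R) = \mathrm{Structure}(G, S)
  \;\leftrightarrow\;
  (F, R) \cong (G, S)
\]
% Assumed unpacking of "isomorphic" for binary relations R and S:
\[
  (F, R) \cong (G, S)
  \;:\leftrightarrow\;
  \exists f \,\bigl[\, f \text{ maps the } F\text{s one-to-one onto the } G\text{s}
  \;\wedge\;
  \forall x\,\forall y\,\bigl( Fx \wedge Fy \rightarrow
    ( Rxy \leftrightarrow S\,f(x)\,f(y) ) \bigr) \bigr]
\]
```

On this reading, the ω-series example corresponds to taking F to be the natural numbers and R the less-than relation.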
Cognoscenti will immediately protest that we cannot have exactly this form of abstraction, since in full generality, it will implicate a form of the Burali-Forti paradox. 18 It remains, however, that it does correctly encode the ante rem structuralist’s implicit conception of the identity conditions of pure structures, and that the resolution of the attendant paradox, as analogously with Basic Law V, must accordingly consist, as a first step, in the recognition that not every pair, (F, R), determines a structure. Obviously there is much more to say; but it seems better to confront these questions squarely—even if it leaves the structuralist in difficulty in finding an overarching structure to accommodate the ordinals, for instance, or indeed the iterative hierarchy of sets—than to mask them with the, as the abstractionist will unkindly view it, mythology of coherence as sufficient for existence. This range of issues about ‘large’ structures to one side, it seems to me that the ante rem structuralist should welcome the situation whenever the abstractionist programme goes well locally and it can be shown how it is possible to arrive at a specific collection of abstracts exemplifying a given interesting structure, and to recognise that they do indeed exemplify it; for that is all that should be needed to set up an abstraction to the structure itself. Very simply, there are all the—ontological and epistemological—reasons to attempt an abstractionist treatment of structures as of any other kind of mathematical object. The foreseeable difficulties in recovering structures comprehending all the ordinals, or sets, should be seen not as exposing undesirable limitations in the approach but as pointers to genuine difficulties in the ascription to ourselves of valid conceptions of the appropriately comprehensive structures.
17 This idea surfaces in Shapiro’s own writing—see for instance [14] at p. 123.
18 As Harold Hodes first noted. See Boolos [3], pp. 175–76.
References
[1] Boolos, G (1987) ‘The Consistency of Frege’s Foundations of Arithmetic’ originally published in J. Thomson, ed. On Being and Saying: Essays in Honor of Richard Cartwright (Cambridge, Mass.: MIT Press), pp. 3–20, and reprinted in [3], pp. 183–201.
[2] Boolos, G (1990) ‘The Standard of Equality of Numbers’ in George Boolos, ed. Meaning and Method: Essays in Honor of Hilary Putnam (Cambridge: Cambridge University Press), and reprinted in [3], pp. 202–19.
[3] Boolos, G (1998) Logic, Logic and Logic (Cambridge, Mass.: Harvard University Press).
[4] Boolos, G & Heck, R (1998) ‘Die Grundlagen der Arithmetik §§82–3’ in M. Schirn, ed. Philosophy of Mathematics Today (Oxford, The Clarendon Press) pp. 407–28, and reprinted in [3], pp. 315–38.
[5] Dummett, M (1991) Frege: Philosophy of Mathematics (London, Duckworth).
[6] Frege, G (1884) Die Grundlagen der Arithmetik (Breslau: Wilhelm Koebner) translated into English by J.L. Austin as The Foundations of Arithmetic (Oxford, Blackwell, 1959).
[7] Frege, G (1893) Grundgesetze der Arithmetik 1 (Olms, Hildesheim).
[8] Hale, R (2000) “Reals by Abstraction”, Philosophia Mathematica (3) 8 pp. 100–23; reprinted in Hale and Wright [10], pp. 399–420.
[9] Hale, R (2002a) “Abstraction and Set Theory”, Notre Dame Journal of Formal Logic, 41(4), pp. 379–80.
[10] Hale, R & Wright, C (2001) The Reason’s Proper Study (Oxford, Clarendon Press).
[11] Heck, R (1997) “Finitude and Hume’s Principle”, Journal of Philosophical Logic 26 pp. 589–617.
[12] Resnik, M (1997) Mathematics as a Science of Patterns (Oxford, Clarendon Press).
[13] Shapiro, S (2000) “Frege Meets Dedekind: A neo-logicist treatment of real analysis” Notre Dame Journal of Formal Logic, 41(4), pp. 335–64.
[14] Shapiro, S (1997) Philosophy of Mathematics: Structure and Ontology (New York, Oxford University Press).
[15] Wright, C (1983) Frege’s Conception of Numbers as Objects (Aberdeen: Aberdeen University Press).
IV
BASIC LAW V AND SET THEORY
NEW V, ZF, AND ABSTRACTION 1,2
Stewart Shapiro
The Ohio State University
Alan Weir
University of Glasgow
Abstract
We examine George Boolos’s proposed abstraction principle for extensions based on the limitation-of-size conception, New V, from several perspectives. Crispin Wright once suggested that New V could serve as part of a neo-logicist development of real analysis. We show that it fails both of the conservativeness criteria for abstraction principles that Wright proposes. Thus, we support Boolos against Wright. We also show that, when combined with the axioms for Boolos’s iterative notion of set, New V yields a system equivalent to full Zermelo–Fraenkel set theory with a principle of global choice. This advances Boolos’s longstanding interest in the foundations of set theory.
1. Introduction: abstraction, numbers, and sets
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 275–302. © 2007 Springer.
1 This paper first appeared in Philosophia Mathematica 7, [1999], pp. 293–321. Reprinted by kind permission of the editor and Oxford University Press.
2 This paper was provoked by a talk given by Crispin Wright at a workshop in the philosophy of mathematics held in St. Andrews, Scotland, in April of 1996. It benefitted considerably from the discussion at the Boolos Memorial Symposium at Notre Dame. Thanks especially to Kit Fine, Richard Heck, Gabriel Uzquiano, and Wright. We appreciate the spirit of collegiality.
Much of George Boolos’s work on Frege and the foundations of arithmetic centers on an abstraction principle proposed by Frege, and now known as Hume’s principle. It states that for any properties F, G, the number of F is the number of G if and only if F and G are equinumerous:
#F = #G ≡ (F ≈ G),
where ‘F ≈ G’ abbreviates the second-order formula that there is a one-to-one function from the extension of F onto the extension of G:
(F ≈ G) : ∃R[∀x(Fx → ∃!y(Gy & Rxy)) & ∀y(Gy → ∃!x(Fx & Rxy))].
Hume’s principle is thus a formula in a second-order language augmented with a symbol ‘#’ denoting a function from properties to objects. A number of authors, including Boolos [1987] and Crispin Wright [1983], have pointed out that Frege’s development of arithmetic [1884, 1893] contains the essentials of a derivation of full second-order Peano arithmetic from Hume’s principle in second-order logic. This derivation is now called Frege’s theorem. Recent work reveals that Hume’s principle is consistent if second-order arithmetic is (see, for example, Boolos [1987]). Richard Heck [1993] showed that in Frege’s treatment of arithmetic in [1893], the only substantial use of the ill-fated Axiom V is to derive Hume’s principle. No one doubts that Frege’s theorem is a substantial mathematical achievement, illuminating the natural numbers and their foundation. Who would have thought that so much could be derived from such a simple and obvious fact about counting? Over the years, Boolos, Wright, and others carried on a spirited debate over the philosophical significance of Frege’s theorem. Define a neo-logicist or neo-Fregean to be someone who holds the following two theses: (i) a significant core of mathematical truths are knowable a priori, by derivation from rules which are analytic or meaning-constitutive; and (ii) this mathematics concerns an abstract realm of objects which are objective, or mind-independent in some sense. Neo-logicism may be attractive to those sympathetic to the traditional view of mathematics as a body of a priori, objective truths but worried about the standard epistemological problems faced by platonism. How can we know anything about a realm of causally inert abstract objects?
The neo-logicist answers: by virtue of our knowledge of what we mean when we use mathematical language. The neo-logicist is a contemporary heir of what Alberto Coffa [1991] calls ‘the semantic tradition’. Beginning with [1983], Wright takes Frege’s theorem to vindicate a version of neo-logicism concerning the natural numbers. He concedes that Hume’s principle may not be a definition of ‘cardinal number’ or of ‘identity of cardinal number’. Nevertheless,
Frege’s theorem will still ensure . . . that the fundamental laws of arithmetic can be derived within a system of second-order logic augmented by a principle whose role is to explain, if not exactly to define, the general notion of identity of cardinal number, and that this explanation proceeds in terms of a notion which can be defined in terms of second-order logic. If such an explanatory principle . . . can be regarded as analytic, then that should suffice . . . to demonstrate the analyticity of arithmetic. Even if that term is found troubling, as for instance by George Boolos, it will remain that Hume’s principle—like any principle serving implicitly to define a certain concept—will be available without significant epistemological presupposition . . . So one clear a priori route into a recognition of the truth of . . . the fundamental laws of arithmetic . . . will have been made out. And if in
New, V, ZF, and Abstraction
addition [Hume’s principle] may be viewed as a complete explanation—as showing how the concept of cardinal number may be fully understood on a purely logical basis—then arithmetic will have been shown up by Hume’s principle . . . as transcending logic only to the extent that it makes use of a logical abstraction principle—one [that] deploys only logical notions. So, always provided that concept-formation by abstraction is accepted, there will be an a priori route from a mastery of second-order logic to a full understanding and grasp of the truth of the fundamental laws of arithmetic. Such an epistemological route . . . would be an outcome still worth describing as logicism . . . (Wright [1997], pp. 210–211).
Like the original Fregean logicism, Wright’s program has a chance only if second-order logic is semantically and epistemically neutral, against Quine’s ([1986], Chapter 5) claim that second-order logic is set-theory in disguise, a ‘wolf in sheep’s clothing’. If substantial mathematics is already built into the logic, then as far as logicism goes, Frege’s theorem begs the question. Of course, boundary disputes are not particularly interesting or illuminating. What matters here is whether the axioms and rules of second-order logic preserve the privileged semantic and epistemic status claimed for Hume’s principle. It is absolutely essential to the logicist project that when attempting to establish a mathematical truth, we need not invoke Kantian or Gödelian intuition, empirical fruitfulness, etc. This applies to the starting points and the logic used to pursue the program. For the sake of argument, we do not question the status of second-order logic.³ We have other fish to fry.
One argument against neo-logicism (and logicism) is that Hume’s principle has ontological consequences: the existence of infinitely many numbers. Boolos argues that logic, properly so-called, should have no ontological consequences. We should not derive the existence of anything from considerations of meaning alone. If this restriction is observed, and if we follow Frege in taking arithmetic at face value, then logicism is a non-starter. Arithmetic has ontological consequences; logic does not.⁴ Rather than end the debate before it gets started, we note that the thesis that logic has no ontology begs the question against the logicist. Wright offers to retrench the logicist claim. He argues that Hume’s principle is an ‘explanation’, possibly a ‘complete
3 Michèle Friend [1997] argues that second-order logic is analytic, and is ‘logic’ in the relevant, traditional sense.
In contrast, Shapiro [1991] takes a more Quinean approach, defending the thesis that there is no sharp border between mathematics and logic. That perspective makes little sense of the traditional question of logicism. As far as we know, Wright and Hale do not defend the use of second-order logic in their programs. We discuss the appropriate logic for neo-logicism in Shapiro and Weir [2000].
4 See Boolos [1997]. Here is a relevant anecdote (SS). When I was writing Shapiro [1991], I considered introducing notation for a pair-function, in order to avoid separate variables ranging over relations and functions (and the resulting notational nightmare). Boolos talked me out of this, pointing out that a pair-function is a principle of infinity (since there is no pair-function on finite models) and so is not appropriate as a logical primitive. Surely, the same goes for Hume’s principle and numbers. Incidentally, to maintain the position that logic has no ontology, one must waive the nicety that ∃x(x = x) is a standard, first-order logical truth, derivable in virtually all systems. This ‘logical truth’ is more of an artifact of the inconvenience of empty models than a deep metaphysical commitment on the part of logicians. Those who insist on purity can adopt a system in which ∃x(x = x) is not logically true. The details are routine, but tedious (see Quine [1954]). In any case, ∃x∃y(x ≠ y) is not logically true in standard (non-logicist) systems.
explanation’, of the notion of ‘identity of cardinal number’, and that this explanation is formulated in logical terms. We depart from logic, strictly so-called, when we draw ontological consequences from the explanation, but Wright holds that with Frege’s theorem we still have something ‘worth describing as logicism’. Wright concedes that his logicism hinges on the proviso that ‘concept-formation by abstraction’ be accepted. This is the crux of his debate with Boolos.
Hume’s principle is one of a genus of abstraction principles of the form:
@α = @β ≡ E(α, β),
where α and β are variables of the same type (either first-order or higher-order), E(α, β) is an equivalence relation, and ‘@’ is a new function symbol, so that ‘@α’ and ‘@β’ are singular terms. Frege invokes two other abstraction principles. One is at least relatively innocuous: the direction of l is identical to the direction of l′ if and only if l is parallel to l′. The other example is Frege’s [1893] Axiom V:
Ext(F) = Ext(G) ≡ ∀x(Fx ≡ Gx),
which is part of his theory of extensions. Of course, Axiom V is inconsistent with the comprehension principle of classical (or intuitionistic) second-order logic. One response, following Whitehead and Russell [1910], would be to forbid impredicative definitions, so that we could not instantiate Axiom V with a property defined in terms of extensions. This restriction would restore consistency, but it would also keep us from instantiating Hume’s principle with properties defined with the #-sign. The restriction would thus block Frege’s theorem, the hoped-for route to logicism. So for the neo-Fregean who wishes to invoke Frege’s theorem, forbidding impredicativity would be to throw out the baby with the bath water (see Wright [1998] and Shapiro and Weir [2000]).
Boolos does not accept ‘concept-formation by abstraction’ as a legitimate maneuver for a prospective logicist. The most prevalent of his arguments is the ‘bad company objection’.
Boolos proposes that there is no non-ad hoc way to distinguish good abstraction principles like Hume’s principle, from bad ones like Axiom V. To be sure, Hume’s principle is consistent while Axiom V is not, but that distinction is too coarse-grained. Hume’s principle is an ‘axiom of infinity’ in the sense that it is satisfiable only in infinite domains. Boolos points out that there are consistent abstraction principles, with the same form as Hume’s principle (and Axiom V), that are satisfiable only in finite models (see also Heck [1992]). If Hume’s principle is acceptable, then so are these others, but they cannot all be correct. How then to distinguish the legitimate abstraction principles (see Weir [2003] and Fine [1998])?
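The cardinality facts behind this contrast can be checked by brute force on small finite domains. The following Python sketch is only an illustration and is not drawn from Boolos or Wright: the function names are hypothetical, and properties are modelled as subsets of a domain {0, ..., n-1}. It counts the equivalence classes that the right-hand side of an abstraction principle induces on the properties. Since the abstraction operator must send distinct classes to distinct objects, a domain of n objects can satisfy the principle only if there are at most n classes.

```python
from itertools import combinations

def properties(n):
    """All subsets of the domain {0, ..., n-1}; each subset stands in for a property."""
    return [frozenset(c) for k in range(n + 1) for c in combinations(range(n), k)]

def count_classes(n, equivalent):
    """Count the equivalence classes that `equivalent` induces on the properties."""
    representatives = []
    for prop in properties(n):
        if not any(equivalent(prop, rep) for rep in representatives):
            representatives.append(prop)
    return len(representatives)

def equinumerous(f, g):
    # Right-hand side of Hume's principle; for finite properties, same size.
    return len(f) == len(g)

def coextensive(f, g):
    # Right-hand side of Axiom V.
    return f == g

def has_room(n, equivalent):
    """True if the classes could be mapped one-to-one into a domain of n objects."""
    return count_classes(n, equivalent) <= n

# Domain size, classes needed by Hume's principle, classes needed by Axiom V.
for n in range(1, 6):
    print(n, count_classes(n, equinumerous), count_classes(n, coextensive))
```

On a domain of n objects the equinumerosity relation yields n + 1 classes and coextensiveness yields 2^n, so neither principle has room in any finite domain. The sketch says nothing about infinite domains, where the two principles come apart.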
Wright ([1997], pp. 230–239) rises to the challenge and proposes some conservativeness requirements that good abstraction principles must meet. Let A be an abstraction principle and let T be any theory whose language does not contain the operator introduced by A. A straightforward conservativeness requirement would be that if Φ is a sentence in the language of T, then Φ is a consequence of T + A only if Φ is a consequence of T alone. That is, the addition of A to a given theory T should not produce any consequences in the old language that were not already consequences of the old theory.⁵ Because of the ontological consequences of some legitimate abstraction principles, this requirement is too strong. For example, Hume’s principle is not a conservative extension of any consistent theory that does not already entail the existence of infinitely many objects.
Wright’s proposal is that a good abstraction principle should not have any consequences other than what follows from the existence of the newly introduced objects. For example, we establish the infinitude of the universe from Hume’s principle only because there are infinitely many natural numbers. Wright’s conservativeness requirement is that a legitimate abstraction principle should have no new consequences concerning any objects recognized to be in the previous ontology. For example, if we add Hume’s principle to a theory about cats, then there should be no new consequences concerning cats. Of course, Axiom V violates this conservativeness requirement, big time. Now suppose that A is satisfiable only in finite domains and not in infinite domains. Then if we add A to a theory about cats, we can derive a statement that there are only finitely many cats. This is plausible enough, but it may not have been a consequence of our prior cat theory. Invoking a theory via abstraction should not by itself tell us how many cats there are. Let κ be a cardinal number.
Wright finds it kosher for an abstraction principle to entail that there are at least κ things, but not kosher for an abstraction principle to entail that there are at most κ things (or exactly κ things). He provides a rigorous formulation of the conservativeness requirement: let Sx be a predicate ‘true of exactly the referents of the’ newly introduced terms. In the case of Hume’s principle, Sx states that x is a number (i.e., Sx is ∃F(x = #F)). Let Φ be a sentence in the new language. Define the restriction of Φ to ¬S, written Φ^{¬S}, to be the result of restricting the range of the quantifiers⁶ in Φ to ¬S. So in the case of Hume’s principle, Φ^{¬S} states that Φ holds of the non-numbers. Let T be any theory. The conservativeness requirement is that for any sentence Φ in the language of T, T plus the abstraction principle entails Φ^{¬S} only if T entails Φ. So, for example, for any theory consistent with Hume’s principle and any Φ in the language of that theory, if the theory plus Hume’s principle entails the restriction of Φ to non-numbers, then the original theory should entail Φ itself.
5 For now, we will ignore the distinction between deductive consequence and semantic model-theoretic consequence, but this matter will become relevant later.
6 That is, we restrict the first-order quantifiers to ¬S and the second-order quantifiers to the properties and relations on ¬S.
As formulated, this conservativeness requirement is not quite right. Let Φ be the sentence ‘Clinton lied under oath’, and let T be the theory with the single axiom: ‘If the universe is infinite, then Clinton lied under oath’. Then T plus Hume’s principle entails Φ, and Φ^{¬S} is just Φ. So T plus Hume’s principle entails Φ^{¬S}. However, T itself does not entail Φ. So Hume’s principle does not meet the letter of Wright’s conservativeness requirement. The main idea behind Wright’s requirement is that abstraction principles should have no consequences concerning the non-abstracts, the non-numbers in this case. To elaborate this, we need to restrict the quantifiers of the old theory explicitly to non-abstracts. The following is thus a friendly proposal, which seems to get at what Wright has in mind. Let T be any theory, and let T^{¬S} be the result of restricting the quantifiers of T to ¬S. The amended conservativeness requirement⁷ is that for any sentence Φ in the language of T, T^{¬S} plus the abstraction principle entails Φ^{¬S} only if T entails Φ. If ‘entailment’ is formulated in terms of model-theoretic consequence, then Hume’s principle satisfies the amended conservativeness requirement. If ‘entailment’ is formulated in terms of deductive consequence, then, as far as we know, it is open whether Hume’s principle satisfies the amended conservativeness requirement.
Even if Wright is correct that Frege’s theorem is a successful defense of something resembling logicism concerning arithmetic, the neo-logicist should not rest content. Arithmetic is only a small fragment of the mathematics that Frege wanted to reduce to logic. What of real analysis, complex analysis, functional analysis, etc.? Wright ([1997], pp. 233–244) proposes to seek an abstraction principle that does for real analysis what Hume’s principle does for arithmetic. This principle must entail that there are at least continuum-many objects, and yield a standard axiomatization of, say, second-order analysis.
As a step toward this goal, Wright considers an abstraction principle for extensions developed in Boolos [1989]. The idea is to come as close as we can to Axiom V:
Ext(F) = Ext(G) ≡ ∀x(Fx ≡ Gx),
without encountering disaster. Of course, Axiom V requires that there be too many extensions. A version of Cantor’s theorem, that there is no one-to-one association of properties with objects, can be derived in pure second-order logic.⁸
7 Wright mentions that his notion of conservativeness is adapted from Hartry Field’s [1980] (p. 12) defense of nominalism. The present amended requirement is closer to Field’s own. Fine ([1998], pp. 626–627) suggests something in the neighborhood of our amended requirement. See also Weir [2003].
8 A function from properties to objects would be a ‘third-order’ item. However, such a function can be ‘coded’ as a binary relation, and Cantor’s theorem can be formulated in a second-order language with no non-logical terminology. See Shapiro ([1991], p. 104).
One rescue would be to restrict Axiom V to a class of ‘safe’ properties, holding that the rest do not have extensions. This would move us in the direction of free logic, since for many (indeed most) properties F, Ext(F) would not exist. Boolos suggests instead that we identify the extensions of many properties, and he proposes the following:⁹
Ext(F) = Ext(G) ≡ [(‘F is bad’ & ‘G is bad’) ∨ ∀x(Fx ≡ Gx)],
where ‘F is bad’ is some property of properties characterized in a second-order language. Call this the good-extensions principle. It follows from good-extensions that there is at least one ‘bad’ property. One candidate for ‘F is bad’ is ‘F applies to at least κ distinct objects’, where κ is a fixed cardinal definable in second-order logic. Examples include finite, infinite, uncountable, ℵ₁₂, continuum-many, inaccessibly-many, and hyper-Mahlo-cardinal-many (see Shapiro [1991], §5.1.2). It follows from such an articulation of the good-extensions principle that there is a property that applies to at least κ objects, and so it follows that there are at least κ objects. If κ is at least the size of the continuum, then real analysis can be developed from the abstraction principle in standard ways. If κ is at least the size of the powerset of the continuum, then functional analysis can be developed.
For the Fregean, this maneuver has the feeling of theft over toil. She wants to show how real analysis, and the existence of the real numbers, flows from an abstraction principle, and she picks a version of the good-extensions principle which almost explicitly states the existence of the relevant ontology. Moreover, these versions of the good-extensions principle seem to violate another constraint that Wright proposes. The abstraction principles ‘embed a paradox’ in the sense that the crucial clause of Axiom V is a disjunct of the central equivalence relation, and Wright points out that the existence of κ objects exploits this ‘paradoxical component’.
His requirement is that an abstraction is . . . acceptable provided it meets another . . . constraint: roughly, that any consequences which may be elicited by exploiting its paradoxical component should be, a priori, in independent good standing. Theorems of logic are so par excellence. But ‘independent good standing’ might also reasonably be taken to cover the case where a consequence elicited from such an abstraction by ‘fishy’—paradox-exploitive—means can also be obtained . . . innocently from additional resources provided by that very abstraction. (Wright [1997], p. 237)
Call this the near-paradox constraint. Prima facie, we seem to have no independent reason to believe that the universe is the size of the continuum
9 There is a potential confusion of terminology here. We use the word ‘extension’ for the referents of the Ext function, so that ‘the extension of F’ is a singular term denoting an object in the domain of discourse. That is, extensions are in the range of the first-order variables. By contrast, in much of the literature, the ‘extension’ of a predicate is the class of objects that satisfy the predicate, and this class may not be an object in the range of the first-order variables.
(or its powerset), beyond the derivation from these versions of the good-extensions principle or a question-begging reference to the real numbers.¹⁰ Wright concedes that this constraint is only suggestive, and is not rigorous. Here we can leave things at this level.
Another natural characterization of ‘F is bad’ for the good-extensions principle is ‘F is equinumerous with the universe’. Call such properties ‘big’:
F is big: ∃R[∀x(Fx → ∃!y Rxy) & ∀y∃!x(Fx & Rxy)].
Define a property to be ‘small’ if it is not big. The abstraction to be considered is thus:
Ext(F) = Ext(G) ≡ [(‘F is big’ & ‘G is big’) ∨ ∀x(Fx ≡ Gx)].
Boolos dubs the resulting principle ‘New V’, and Wright calls it ‘VE’, for ‘V enlightened’. Wright uses ‘SOLVE’ to denote the second-order theory whose only axiom is New V. With New V, Boolos calls Ext(F) the ‘subtension’ of F, but we will continue to call it the ‘extension’. Under New V, the extensions of all big properties are the same. Identifications among the extensions of small properties are made as in Axiom V. In the case of New V, the existence of a ‘bad’ property is just the existence of a property equinumerous with the universe. This, of course, is an innocuous logical truth, and so the existence of a ‘bad’ property is of ‘independent good standing’ if anything is. So the gist of Wright’s near-paradox constraint seems to be met. So far, so good.
Boolos’s concern with New V is related more to his longstanding interest in the philosophical foundation of set theory than to Fregean logicism. New V is a straightforward expression of the ‘limitation of size’ idea due to von Neumann (see Lévy [1968], Boolos [1971], and Hallett [1984]). Boolos shows how to develop a set theory from New V. First define membership in the straightforward manner: x is a member of y if y is the extension of a property that holds of x. That is, x ∈ y ≡ ∃F(Fx & y = Ext(F)). Boolos defines y to be a set if y is the extension of a small property.
When restricted to sets, the axioms of extensionality, separation, and replacement follow from New V. Let b be the extension of the universal property, defined by (x = x). Then b is also the extension of every big property, and we have ∀x(x ∈ b) and, in particular, b ∈ b. Of course, b is not a set. Let {b} be the extension of the property defined by (x = b). One can show that New V entails that the universe is infinite. It follows that {b} is a set (since it has only one element). Therefore, some sets have members that are not sets. Moreover, {b} and b have a member
10 Wright argues that if κ is ℵ₀, then the relevant existence principles are independently in good standing. However, this does not yield real analysis, and the argument does not generalize to a larger κ.
in common (namely b itself), and so the principle of foundation fails (even for sets). The axiom of unions, restricted to sets, also fails. The union of {b} is b, which is not a set. Following Lévy [1968], Boolos gives a rigorous definition of a ‘pure set’. Roughly, a set is pure if it is a member of the iterative hierarchy, with no urelements. So {b}, for example, is not pure. When the quantifiers are restricted to pure sets, extensionality, separation, pairing, unions, foundation, choice, and replacement all follow from New V.
Frege’s own development of arithmetic cannot be carried out in the theory of extensions. Suppose we define the number 1 to be the extension of the property of being a singleton set and we define the number 2 to be the extension of the property of being a doubleton set. Well, both of these properties are big and so 1 = 2 = b. However, arithmetic can be developed in the set theory underlying New V by following Zermelo, von Neumann, or Dedekind. Define the number zero to be the extension of the property defined by (x ≠ x). Define the number one to be the extension of (x = 0), etc. The second-order Peano postulates are forthcoming. This is almost as exhilarating as Frege’s theorem. Who would have thought that so much could be derived from so little?
Moreover, New V is consistent. Let the domain of discourse be the collection of hereditarily finite sets, Vω. Let f be any one-to-one function from Vω into Vω such that some set c is not in the range of f. Let A be a subset of the domain. If A is infinite, then let Ext(A) be c. If A is finite, then let Ext(A) be f(A). It is easy to see that New V holds under this interpretation. The result entails that New V falls short of even Zermelo set theory. Although all models of New V are infinite, it does not follow from New V that there is an infinite set. Thus, the axiom of infinity is not derivable from New V. One also cannot derive a powerset principle, that for every set, there is a power set.
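The construction of the numerals described above can be pictured with a short Python sketch. It rests on two assumptions that are not in the text: extensions of small properties are modelled as Python frozensets, and the ‘etc.’ is read in the Zermelo style, so that each numeral after zero is the extension of the property of being identical to the previous numeral. The function names are hypothetical.

```python
def zermelo_numeral(n):
    """Return the n-th numeral: zero is the empty set, and m + 1 is the singleton {m}."""
    numeral = frozenset()                # stands in for the extension of (x != x)
    for _ in range(n):
        numeral = frozenset([numeral])   # extension of (x = previous numeral)
    return numeral

def predecessor(numeral):
    """The unique member of a non-zero numeral."""
    (member,) = numeral
    return member

# The first six numerals, all distinct hereditarily finite sets.
numerals = [zermelo_numeral(n) for n in range(6)]
```

Each numeral here is the extension of a property that applies to at most one object, so none of the relevant properties is big in an infinite universe, and the collapse 1 = 2 = b that affects Frege's own definitions does not arise.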
Since Vω is countable, one cannot recapture real analysis via New V alone. So New V is not the sole rescue for neo-logicism. However, if we augment New V with a principle that entails that there is an infinite extension, then real and complex analysis can be developed along standard lines—following Dedekind for example. If we further augment the theory with the powerset principle, the axioms of full Zermelo–Fraenkel set theory would be true of the pure sets, and thus virtually all of classical mathematics could be developed. Not bad for three simple axioms. Once again the issue here is the philosophical significance of this, and its relation to logicism in particular. Wright ([1997], p. 241) concedes that it will not do just to stipulate that the principles of infinity and powerset hold: To one interested primarily in the mathematical capabilities of SOLVE, [the inability to derive the infinity and powerset principles] will simply invite the addition of supplementary axioms . . . But for the neo-Fregean, such a move is
tantamount to folding one’s hand. The project was to explain how a recognition of the existence of a domain of objects called for by classical mathematical theories might be accomplished. The mooted additional axioms would contribute nothing to that project: they merely amount to stipulations about the size of the intended domains—the fundamental question, how the existence of such domains might be recognized in the first place, would be left wholly unaddressed.
Wright proposes that we get the infinite extensions by combining New V with some other abstraction principle, to be named later: The sought-after abstraction does not have to lead directly to the reals. Any uncountable population will provide a backcloth against which [the natural numbers] will not be too big in the sense of VE. So SOLVE would, in the context of such a background ontology, provide the resources for a standard set-theoretic construction of Analysis. In other words, augment SOLVE by any otherwise acceptable abstraction—it need not even be a logical abstraction— which demands an uncountable population of objects and you will have the resources to carry through a set-theoretic construction of Analysis along tried and tested lines without making any direct use of the new objects at all. In such a scenario, the reals could still be viewed as logical objects—since they would be identified with certain of the abstracts introduced by VE—but the recognition of their existence, in contrast with that of the natural [numbers], would not be grounded in a purely logical abstraction. (Wright [1997], pp. 243–244)
The assumption underlying this proposal is that New V is itself in good standing for the would-be Fregean logicist, and that this principle provides the structural background for developing real analysis and beyond. The logicist looks elsewhere for the ontology.
2. ZF, New V, and models thereof
This finally sets the stage for our entry into the fray. We propose to further the technical comparison of New V and ZF as alternative foundations, and alternative conceptions of the notion of ‘set’, pursuing Boolos’s interest in the foundations of set theory. Our results cast doubt on the presupposition that New V is an acceptable abstraction principle, given the standards and goals of neo-logicism. Thus, we side with Boolos against Wright.
Begin with a second-order language, whose logic contains a comprehension scheme:
∃R∀x₁ … xₙ (Rx₁ … xₙ ≡ Φ),
one instance for each formula Φ not containing R free. The idea behind the comprehension scheme is that each formula defines a relation on the universe. With standard semantics, in each interpretation the monadic predicate variables range over the entire powerset of the domain, and similarly for the other higher-order variables (see Shapiro [1991], Chapter 4). Under this semantics, the comprehension scheme is a logical truth, even though it is impredicative.
Augment the language with λ-terms, so that if Φ is a formula and x a first-order variable, then (λxΦ) is a predicate term, and can occupy the predicate place in an atomic formula. We allow λ-terms in which the embedded formula itself contains λ-terms. If t is a singular term, then (λxΦ)(t) is a formula equivalent to the result of substituting t for all free occurrences of x in Φ. The language also contains binary and n-place λ-terms. In light of the comprehension scheme, the language with λ-terms is a conservative extension of the corresponding language without them.¹¹ The λ-term (λxΦ) is a convenient ‘name’ for the property defined by Φ. As above, we further augment the language with a higher-order function constant Ext, so that if P is a predicate term (such as a property variable or a one-place λ-term) then Ext(P) is a singular term denoting something in the range of the first-order variables. To repeat, New V is the following sentence in this language:
∀F∀G[Ext(F) = Ext(G) ≡ ((‘F is big’ & ‘G is big’) ∨ ∀x(Fx ≡ Gx))],
where ‘F is big’ is an abbreviation of:
∃R[∀x(Fx → ∃!y Rxy) & ∀y∃!x(Fx & Rxy)].
As an alternative, we could define ‘F is big’ as ‘F is equinumerous with (λx(x = x))’.
In the following sections, we approach New V from several different perspectives. Section 3 deals with deductive and semantic consequences of New V, using an ordinary second-order logic. According to Wright’s suggestion, this is the framework for developing mathematics and sustaining much of the logicist insight. We show that New V runs afoul of Wright’s conservativeness requirement, the amended conservativeness requirement, and the near-paradox constraint. In Section 4 we assess the prospects for adding New V to more standard set theories. Our purposes are two-fold. One concerns the foundations of set theory. As noted above, Boolos claimed that there are two different notions of ‘set’, the iterative conception and the limitation-of-size conception. New V captures the latter.
We show that the combined theory is richer than full ZFC, and is thus much more powerful than either theory alone. This sharpens Boolos’s [1989] thesis that no single conception of set underlies ZFC (see also Parsons [1983], Essays 10–11). Our second purpose concerns neo-logicism. If New V is to serve as part of a foundation for all mathematics, it should be compatible with the most powerful theories around. Even if the neo-logicist manages to use New V (plus other things) to recapture analysis and functional analysis, it will be a blow to the program if New V does not fit smoothly into the now comfortable, but non-logicist, set-theoretic foundation for mathematics. In short, to determine whether the neo-logicist can have his cake and eat
11 Each instance of the comprehension scheme can be derived from the corresponding instance of the conversion scheme: (λxΦ(x))(ι) ≡ Φ(ι).
it too, we have to see what is involved in the existence of an Ext-function defined on the entire set-theoretic hierarchy, the ‘domain’ of all set-theoretic properties. We show that New V can be made true if, but only if, a powerful choice principle is assumed.
Our final Section 5 is model-theoretic. We examine set-sized models of New V. We establish that some central aspects of New V are independent of ZFC. In particular, we show that it is consistent with ZFC that there are no uncountable models of New V. Thus, it is consistent that Wright’s program of using second-order logic and New V to recapture real analysis is mathematically impossible. This orientation presupposes a background theory (an ordinary set theory) from which we do the model theory. Much of the Boolos–Wright discussion presupposes a framework like this. For example, both of them point out that Hume’s principle is satisfiable in any infinite domain and in no finite domain, and they point out that New V is satisfiable in a countable domain. In a sense, this third perspective is that of the classical model-theorist examining the work of the neo-logicist, trying to determine the prospects from the model-theorist’s point of view. Presumably, Wright envisions a time when the model-theoretic ladder is kicked away or possibly recaptured in the object language via the abstraction principles.¹²
3. A troubling consequence of New V
The main result here is that an exceptionally strong choice principle is a deductive consequence of New V. As above, a more or less standard theory of sets can be developed in the language of New V. To repeat, we define an object x to be a set if it is the extension of a small property, and define membership in a straightforward manner, in terms of predication:
Set(x) ≡ ∃F(x = Ext(F) & ‘F is not big’),
y ∈ x ≡ ∃F(x = Ext(F) & Fy).
So a set is the extension of a property that is not the same size as the universe. Boolos [1989] shows that an object x is a pure set if x is a set and if all members of x are pure. As usual, define x to be transitive if every member of every member of x is a member of x:
∀y∀z((z ∈ y & y ∈ x) → z ∈ x),
and define x to be an ordinal if x is a pure set, x is transitive, and every member of x is transitive. Let On(x) be the formula in the language of New V that states that x is an ordinal. The usual set-theoretic arguments establish that every member of an ordinal is an ordinal and every ordinal is well-ordered by membership.
12 See Fine ([1998], pp. 511–516) for an insightful discussion of the relationship between neo-logicism and the background set theory.
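The definitions of ‘transitive’ and ‘ordinal’ given in this section can be tried out on hereditarily finite sets. The Python sketch below is an illustration under stated assumptions, not part of the formal development: sets are modelled as nested frozensets built up from the empty set (so purity holds automatically), and the helper names, including `von_neumann`, are hypothetical.

```python
def is_transitive(x):
    """Every member of every member of x is a member of x."""
    return all(z in x for y in x for z in y)

def is_ordinal(x):
    """x is transitive and every member of x is transitive.

    Purity is automatic here: nested frozensets built up from the empty set
    contain no urelements.
    """
    return is_transitive(x) and all(is_transitive(y) for y in x)

def von_neumann(n):
    """The finite von Neumann ordinal n = {0, 1, ..., n - 1}."""
    ordinal = frozenset()
    for _ in range(n):
        ordinal = ordinal | {ordinal}
    return ordinal

three = von_neumann(3)
singleton_of_one = frozenset([von_neumann(1)])   # {{0}}: 0 is a member of a member, but not a member
```

With these definitions, `is_ordinal(three)` holds, while `singleton_of_one` fails the transitivity test. The members of a finite von Neumann ordinal are again ordinals, and any two distinct members are related by membership in exactly one direction, which is the finite shadow of the well-ordering claim.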
Boolos shows that every non-empty property of ordinals has a least element:
∀G[∃x(On(x) & Gx) → ∃x(On(x) & Gx & ∀y(y ∈ x → ¬Gy))].
It then follows that the ordinals are themselves strongly well-ordered under membership. Let On be the property λx(On(x)). The reasoning underlying the Burali-Forti paradox establishes that On is big:
Theorem 1: It follows deductively from New V that there is a one-to-one function from the ordinals onto the universe.
Proof: [sketch] Let o = Ext(On). Suppose that On is small. Then o is a set. Every member of o is an ordinal and so every member of o is a pure set. It follows that o is itself a pure set. Moreover, if x is a member of o and y is a member of x then y is an ordinal and so y is a member of o. Again, every member of o is transitive, since it is an ordinal. Thus o is itself an ordinal and so o ∈ o. This contradicts the well-foundedness of the pure sets. Thus, On is big, which entails that there is a one-to-one function from the ordinals onto the universe.
One striking consequence of Theorem 1 is that the well-ordering of the ordinals gives rise to a well-ordering of the universe:
Corollary 2: It follows deductively from New V that there is a well-ordering of the universe.
Proof: Let R be a one-to-one function from the ordinals onto the universe. Define a relation x ≺ y as follows:
x ≺ y ≡ ∃u∃v(On(u) & On(v) & Rux & Rvy & u ∈ v).
Since membership is a well-ordering on the ordinals and R is one-to-one, the relation ‘≺’ is a well-ordering of the universe.
A principle of global choice follows from the well-ordering principle:
∀R(∀x∃y Rxy → ∃f ∀x Rx f(x)).
The various local choice principles, like the axiom of choice for first-order ZFC, follow from global choice. One such principle is that for every set s of pairwise disjoint non-empty sets, there is a set consisting of exactly one member of each member of s.
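The step from a one-to-one enumeration to a well-ordering, and from a well-ordering to a choice function, can be seen in miniature on a finite universe. The Python sketch below is only an analogy under stated assumptions: positions in a list stand in for the ordinals, the list itself plays the role of the function R, and all names (`induced_order`, `least`, `choice_function`) are hypothetical.

```python
def induced_order(enumeration):
    """Turn a one-to-one enumeration of the universe into the relation x ≺ y.

    Position i in the list plays the role of the ordinal u with Rux.
    """
    rank = {obj: position for position, obj in enumerate(enumeration)}
    if len(rank) != len(enumeration):
        raise ValueError("the enumeration must be one-to-one")

    def precedes(x, y):
        return rank[x] < rank[y]

    return precedes

def least(candidates, precedes):
    """The ≺-least element of a non-empty collection."""
    candidates = list(candidates)
    return next(x for x in candidates
                if not any(precedes(y, x) for y in candidates))

def choice_function(relation, universe, precedes):
    """Given R with ∀x∃y Rxy, return f with Rx f(x): f(x) is the ≺-least y such that Rxy."""
    return {x: least([y for y in universe if (x, y) in relation], precedes)
            for x in universe}

universe = ["a", "b", "c"]
precedes = induced_order(["b", "c", "a"])          # b ≺ c ≺ a
relation = {("a", "a"), ("a", "b"), ("b", "a"), ("b", "c"), ("c", "c")}
f = choice_function(relation, universe, precedes)
```

Here `f` sends each object to the ≺-least object related to it, so no arbitrary choices are needed once the ordering is fixed. In the finite case this is trivial; the force of Corollary 2 is that New V delivers such an ordering for the entire universe.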
Corollary 2 indicates that New V violates Wright’s original conservativeness requirement, the above-amended version, and the near-paradox constraint. Let us start with the latter. Wright points out that New V ‘embeds a paradox’ in the sense that one of its disjuncts is the crucial clause of Frege’s original Axiom V. Here we invoke another paradox. The proof of Theorem 1 reproduces the reasoning leading to the Burali-Forti paradox, and then makes a disjunctive syllogism to conclude that On is big. Thus, for Wright, the
proof and its corollary ‘exploit the paradoxical component’ of New V. The near-paradox constraint is that any such consequence should be ‘a priori, in independent good standing’. Does the existence of a well-ordering of the universe enjoy independent good standing? Can one show that the universe is well-ordered ‘innocently’, without ‘exploiting the paradoxical component’?
The same question arises, in sharper focus, with the conservativeness requirements. Wright asserts that an acceptable abstraction principle should have no consequences in the original language, other than what follows from the mere existence of the newly defined objects. Recall the precise requirement: let Sx be a predicate ‘true of exactly the referents of the’ newly introduced terms. Here Sx is ∃F(x = Ext(F)), stating that x is an extension. Let Φ be a sentence in the new language. Define the restriction of Φ, written Φ^{¬S}, to be the result of restricting the range of the quantifiers in Φ to ¬S, so that Φ^{¬S} states that Φ holds of the non-extensions. Let T be any theory. The original requirement is that for any sentence Φ, T plus New V entails Φ^{¬S} only if T entails Φ. The amended requirement is that for any sentence Φ, T^{¬S} plus New V entails Φ^{¬S} only if T entails Φ. New V fails both of these, if they are formulated in terms of deductive consequence. Let WO be a second-order sentence asserting the existence of a well-ordering of the universe (see Shapiro [1991], p. 106) and let T be any theory such that WO is not a theorem of T and not a theorem of T^{¬S}. Corollary 2 is that WO can be deduced from New V. The sentence WO^{¬S} is an immediate consequence of WO, since the restriction of the global well-ordering to the non-extensions is a well-ordering of the non-extensions. So WO^{¬S} is a theorem of T plus New V. By hypothesis, WO is not a theorem of T or T^{¬S} alone and so the conservativeness requirements fail. We can mimic Wright’s claims concerning abstraction principles that he says are not conservative.
Suppose that we add New V to a theory about cats, space–time points, and space–time regions. In the new theory, it follows that there is a well-ordering of the cats and the space–time points and regions. There may be nothing in the original theory that entails the existence of this well-ordering. Of course, no one will find the existence of a well-ordering of the cats all that implausible, since cats are born one at a time. Recall, however, that Wright rejects some abstraction principles as non-conservative just because they entail that there are finitely many cats, which is similarly plausible. Moreover, given ordinary physics, a well-ordering of the space–time points and regions is not at all obvious. For example, the Tarski–Banach paradox follows from the existence of such a well-ordering. Of course, one can prove the existence of a well-ordering of space–time in the theory of space–time plus ZFC, but there the choice principle is an explicit axiom. Here it follows from a supposedly analytic truth, or explanation of a concept, or something without significant epistemological presupposition.
In response, Wright might demur from the talk of deductive consequence and argue that the well-ordering principle is a semantic consequence of any
theory T. This amounts to a claim that WO is logically true. Something along these lines would also be needed to argue that the well-ordering principle is in ‘independent good standing’ in order to sustain the near-paradox constraint. Here, of course, we cannot be as precise, but we do submit that the principle is not in independent good standing. At the least, WO is a substantial mathematical assumption. The well-ordering principle is, of course, connected with the axiom of choice, in light of Zermelo’s [1904] (and [1908]) celebrated proof that the axiom of choice entails that any set can be well-ordered. There is an established precedent for arguing that the axiom of choice is logically true. Zermelo, for example, regarded his [1904] derivation to be a proof of the principle that any set can be well-ordered, not just a hypothetical inference. He pointed out that choice principles are implicit in much of mathematics, as it had developed in his day. If anything, choice is even more entrenched in contemporary mathematics. Hilbert argues similarly, and the principle of global choice is an axiom in Hilbert and Ackermann’s [1928] treatment of second-order logic (and in Shapiro [1991], Chapter 3). Hintikka [1996] claims that a principle equivalent to global choice is implicit in the very meaning of the quantifiers, the ‘∀x∃y’ combination in particular (see also Miraglia [1996]). This would make global choice a logical truth par excellence. At best, these arguments indicate that some choice principles are logical truths, but they do not establish that the second-order sentence WO is a logical truth, nor do they show that it is in ‘independent good standing’. Zermelo’s theorem that any set can be well-ordered is equivalent to the so-called local axiom of choice of first-order set theory. Global choice is stronger than local choice, and WO is even stronger than global choice.
That is, Zermelo’s theorem does not establish global choice and, more important for us here, global choice does not entail WO. 13 Admittedly, this looks strange in light of the fact that the well-ordering principle for sets follows from the local axiom of choice (i.e., choice for sets). Why does the global well-ordering principle (WO) not follow from global choice, i.e., choice for properties? An analysis of Zermelo’s proof resolves the puzzle. To ‘construct’ a well-ordering for a given set s, Zermelo invokes a choice function on the powerset of s. Since s is a set (an element of the intended domain), the powerset of s is a set (assuming the powerset axiom) and so the axiom of choice yields the requisite choice function. Although the global-choice principle gives a choice function on the universe, to well-order the universe we need a choice function on what would be the powerclass of the universe, something along the lines of a third-order version of the global-choice principle. 14 It is too much to claim (without further ado) that this powerful choice principle is somehow implicit in the meaning of the logical terminology of second-order languages, or is otherwise analytic or without significant epistemological presupposition.
13 See Shapiro [1991], pp. 106–108. Notice, incidentally, that WO is a theorem of second-order Zermelo–Fraenkel set theory plus global choice. From global choice, there is a function f such that for every ordinal α, fα is a well-order of the rank V_α. Then for two sets x, y, define x ≺ y if either the rank of x is less than the rank of y, or else the ranks are equal and x comes before y in fα, where α is the rank of x. This relation is a global well-ordering. The proof here makes essential use of the axiom of foundation in ZFC (which entails that every set has a rank).
Perhaps our neo-logicist will invoke Zermelo’s theorem anyway, claiming that he is envisioning a set-sized domain of discourse for New V. If the axiom of choice is true (in the set-theoretic hierarchy) then the principle WO (as well as global choice) is technically a semantic logical truth since it holds in any model whose domain is a set. However, this maneuver would give up a major Fregean component. The first-order variables of the relevant abstraction principles are supposed to range over the entire universe, not just some set-sized collection. The goal of logicism and neo-logicism is to show how the truths of arithmetic and analysis are all but logically true, by showing how they follow from the abstraction principles alone. To invoke an external set theory, and the entire set-theoretic hierarchy, would undermine the epistemic gains of the old logicism and the new neo-logicism. The failure of conservativeness indicates that the set-theoretic ladder cannot be kicked away, and that logicism must be augmented with a presumably non-logical theory of sets. We return to this model-theoretic perspective in the final Section 5.
A better response, more in line with Fregean logicism, would be for Wright to reformulate his conservativeness principle so that consequences like WO are acceptable. The infinitude of the universe is an acceptable consequence of Hume’s principle, so why should WO disqualify New V? If Wright wishes to maintain that New V is an acceptable abstraction principle, he will need to argue that the global well-ordering principle has the same status that he attributes to arithmetic truths.
To adapt what he says about arithmetic ([1997], p. 210), Wright will have to maintain that WO is a consequence of ‘a system of second-order logic augmented by a principle whose role is to explain, if not exactly to define [a] general notion . . . If such an explanatory principle . . . can be regarded as analytic, then that should suffice . . . to demonstrate the analyticity of’ WO. Moreover, New V ‘like any principle serving implicitly to define a certain concept . . . will be available without significant epistemological presupposition . . . So one clear a priori route into a recognition of the truth of [WO] . . . will have been made out.’ This is quite a burden for the logicist to bear. Recall that the purpose of New V is to allow the extension of logicism to real and complex analysis. We submit that the epistemological cost is too high. It is hard to show just how WO is known ‘without significant epistemological presupposition’.
14 See Shapiro [1991], §5.1.3 (and some of the references cited there), for a comparison of several choice principles formulated in second-order languages. For example, one can formulate a version of Zorn’s lemma using only the logical terminology of second-order languages. The principle WO is the strongest of the choice principles considered.
4. Set theory plus New V
Here we examine the prospects for adding New V to the axioms of established set theories. If New V has the privileged epistemological status claimed by Wright, then it should blend with any established mathematical theory. Thus, it is worth examining the interactions between New V and some of the more powerful theories around. How comfortable would a set-theorist be with New V, defined over the entire set-theoretic hierarchy? Of course, the neo-logicist has the option to reject any set theory that she cannot recapture with New V, but perhaps this desperate maneuver can be avoided.
One potentially confusing matter is that there are two different notions of ‘set’ and two different membership relations on the table. One is represented by the primitive ‘∈’-symbol in the language of set theory and the other is defined from the Ext operation of New V, as above. Here, we envision both notions of ‘set’ and both membership relations on the same domain of discourse. For the time being, we use ‘∈_Z’ for the set-theoretic primitive and we call the relevant sets of set theory ‘Z-sets’. We use ‘∈_V’ for the defined relation of New V and ‘V-set’ for the sets of New V, the extensions of small properties. The two membership relations are different because, for example, ∈_Z is supposed to be well-founded while it follows from New V that ∈_V is not well-founded. Under New V, there is one object b such that ∀x(x ∈_V b) and so b ∈_V b. It is a theorem of Zermelo set theory that ∀x(x ∉_Z x). Of course, the pure V-sets are well-founded, but not every V-set is pure. Relations between Z-sets and V-sets are determined by the resolution of the so-called Caesar problem. We also distinguish Z-ordinals, defined within set theory, from V-ordinals, defined in terms of ∈_V as above. The same goes for just about every other set-theoretic construction.
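The contrast between the two membership relations can be spelled out in a few lines. The sketch below is illustrative: U is a name introduced here for the universal property, and the steps simply unpack the definition of V-membership together with the fact that, by New V, all big properties share a single extension.

```latex
% Requires amsmath. U and b are names used only in this sketch.
\begin{align*}
&Ux \equiv (x = x), \qquad b = \mathrm{Ext}(U)
   && \text{($U$ is big, so $b = \mathrm{Ext}(F)$ for every big $F$)}\\
&\forall x\,\exists F\,\bigl(b = \mathrm{Ext}(F) \wedge Fx\bigr)
   && \text{(take $F = U$)}\\
&\forall x\,(x \in_{V} b), \ \text{hence } b \in_{V} b
   && \text{($\in_{V}$ is not well-founded)}\\
&\forall x\,\neg(x \in_{Z} x)
   && \text{(theorem of Zermelo set theory)}
\end{align*}
```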
4.1 The iterative conception vs. limitation-of-size
As noted above, one of Boolos’s projects was to delineate two notions of ‘set’, to see what foundational properties support and follow from each one. Here we pursue this by seeing how the two notions interact. The limitation-of-size conception is captured by New V. If the quantifiers are restricted to pure V-sets, then New V entails every axiom of ZFC except infinity and powerset. The iterative notion is that Z-sets are arranged in ‘stages’, much like ranks. Boolos argues that the iterative conception yields all of Zermelo set theory except choice (when the existence of a transfinite ‘limit’ stage is assumed). The intended models of the iterative conception are isomorphic to ranks V_λ, where λ > ω is a limit ordinal, and every such rank satisfies second-order Zermelo set theory. However, contrary to what may be a popular belief, second-order Zermelo set theory is not an adequate formalization of the iterative conception of set. Gabriel Uzquiano [1999] has shown that second-order Zermelo set theory has a standard model in which Z-membership
is not well-founded. 15 Moreover, if the axiom of infinity is formulated as the existence of a Z-set containing the Zermelo numerals (i.e., {∅, {∅}, {{∅}}, . . . }), then one cannot prove the existence of Z-ω. Indeed, there are standard models which do not contain the set corresponding to Z-ω. If, instead, the axiom of infinity is taken to state the existence of Z-ω, then the existence of a Z-set containing the Zermelo numerals does not follow. In sum, the axiom of replacement is more useful than one might think. Fortunately, there is an adequate axiomatization of the iterative conception of set, due to Dana Scott [1974]. The underlying motivation accords well with Boolos [1989]. Here we simply replace Scott’s axiom of ‘reflection’ (which entails both infinity and replacement) with an axiom stating the existence of a transfinite stage. The details do not matter here. Call the result iterative set theory. We note for emphasis that replacement is not assumed. We need not include choice among the axioms of iterative set theory, since it follows from the well-ordering principle WO, once New V is added (see Corollary 2 above).
Most contemporary versions of Zermelo and Zermelo–Fraenkel set theory entail that there are no urelements, so that everything is a Z-set. If we followed that here, we would have to insist that all of the extensions are Z-sets, which would pre-judge the Caesar problem. Instead, we allow urelements, as Zermelo himself did. In this sub-section, we adopt the common axiom that there is a Z-set whose members include all of the Z-urelements. Under this assumption, all standard models of iterative set theory are isomorphic to ranks V_λ, where λ is a limit ordinal greater than ω (and an appropriate axiomatization can be obtained from Zermelo set theory by replacing the axiom of foundation with a statement that every Z-set is a subset of a rank V_α; thanks to Gabriel Uzquiano here as well).
In the context of neo-logicism, the assumption that there is a Z-set consisting of all of the Z-urelements may be unjustified. It entails that almost all V-sets are also Z-sets (perhaps with different members), and so the assumption pre-judges the Caesar problem. As far as we can tell, there is nothing in the neo-logicist program so far that prevents the abstracts (including the V-sets) from being disjoint from the original ontology (including the Z-sets). In this case, the V-sets will all be Z-urelements and may constitute a Z-proper class (see Fine [1998], pp. 515 and 560 for a fuller discussion of this). The present sub-section is concerned more with the foundations of set theory than with neo-logicism.
With this background, we consider the result of simply adding New V to iterative set theory. Since the quantifiers of iterative set theory are not restricted to the non-abstracts (i.e., non-extensions), we are not pursuing the (amended version of) Wright’s conservativeness requirement. We show that New V is a powerful set-theoretic principle, with serious ontological consequences. The first item concerns the Z-ordinals.
15 With second-order theories, a model is ‘standard’ if the n-place predicate variables range over the entire collection of n-tuples of the domain. So, for example, the monadic variables range over the powerset of the domain. The proof that every standard model of second-order ZFC is well-founded makes essential use of replacement. See Shapiro [1991], pp. 113–114.
Theorem 3: Let O be any unbounded property of Z-ordinals (i.e., for every Z-ordinal α, there is a Z-ordinal β such that α ∈_Z β and Oβ). Second-order iterative set theory plus New V entails that O is big. A fortiori, the property of being a Z-ordinal is itself big.
Proof sketch: We work informally within iterative set theory with New V as an additional axiom. Assume that O is small. Let F be an arbitrary property and let α be a Z-ordinal. The separation principle entails that there is a Z-set α_F whose Z-members are all and only the z such that (Fz & z ∈_Z V_α). Define a property F′ as follows: F′x if and only if x is a Z-ordered pair ⟨α, α_F⟩ such that Oα. Notice that F′ is equinumerous with O and so F′ is small. So for any properties F, G, Ext(F′) = Ext(G′) if and only if ∀x(F′x ≡ G′x). However, since O is unbounded, it is straightforward to verify that ∀x(Fx ≡ Gx) if and only if ∀x(F′x ≡ G′x). Thus, ∀F∀G[∀x(Fx ≡ Gx) ≡ Ext(F′) = Ext(G′)]. But this is a version of Axiom V and yields a contradiction. So O is big.
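The structure of the proof sketch of Theorem 3 may be easier to follow as a numbered chain. What follows is a sketch under stated conventions: F′ (typeset as F') denotes the property of ordered pairs built from F in the proof, α_F is the Z-set obtained by separation, and the justifications on the right paraphrase the text.

```latex
% Requires amsmath. Assumption for reductio: O is small (not big).
\begin{align*}
&\alpha_F = \{\, z : Fz \wedge z \in_{Z} V_\alpha \,\}
   && \text{(separation)}\\
&F'x \equiv \exists \alpha\,\bigl(O\alpha \wedge x = \langle \alpha, \alpha_F \rangle\bigr)
   && \text{(definition of $F'$)}\\
\text{(1)}\quad & F' \text{ is equinumerous with } O,\ \text{so } F' \text{ is small}
   && \text{(assumption on $O$)}\\
\text{(2)}\quad & \mathrm{Ext}(F') = \mathrm{Ext}(G') \equiv \forall x\,(F'x \equiv G'x)
   && \text{(New V, by (1))}\\
\text{(3)}\quad & \forall x\,(Fx \equiv Gx) \equiv \forall x\,(F'x \equiv G'x)
   && \text{($O$ is unbounded)}\\
\text{(4)}\quad & \forall F\,\forall G\,\bigl[\forall x\,(Fx \equiv Gx) \equiv
   \mathrm{Ext}(F') = \mathrm{Ext}(G')\bigr]
   && \text{(from (2) and (3))}
\end{align*}
```

Line (4) says that the map taking F to Ext(F′) behaves as Frege's Axiom V requires, which is inconsistent; hence the assumption that O is small must be rejected.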
It is consistent with iterative set theory that Z-ω is the only infinite limit Z-ordinal. However, we see that iterative set theory plus New V entails that there are lots of uncountable Z-ordinals. The set V_{2ω} is a model of iterative set theory, but not iterative set theory plus New V. With a transfinite induction, it follows from Theorem 3 that the Z-ordinals are isomorphic to the V-ordinals, under their respective membership relations. So within iterative set theory plus New V, the Z-ordinals have the same structure as the V-ordinals. We presume that this much is pleasing to the neo-logicist. We noted above (Section 1) that New V entails a replacement principle for V-sets. Our next theorem is that the same goes for Z-sets. That is, New V ratchets the relatively mild-mannered iterative set theory (at least) up to its powerful Zermelo–Fraenkel cousin.
Theorem 4: It follows from second-order iterative set theory and New V that the replacement principle holds for Z-sets.
Proof sketch: Let x be a Z-set and f a function defined on the universe. We have to show that there is a Z-set whose members are all and only the fy where y ∈_Z x. Let O be a property of Z-ordinals defined as follows: Oβ if and only if there is a y ∈_Z x such that β is the Z-rank of fy (i.e., β is the smallest Z-ordinal γ such that fy ∈_Z V_γ). Clearly, O is equinumerous with a Z-subset of x and so O is small. It follows from Theorem 3 that there is a Z-ordinal α greater than
every ordinal β such that Oβ. So we have that for all y ∈_Z x, fy ∈_Z V_α. The requisite collection is a subset of V_α and so is a Z-set by separation.
In the foregoing, we combined the two notions of ‘set’ in a single theory, but did not combine their ‘ontologies’, other than the assumption that there is a Z-set containing all of the Z-urelements. Beyond this, we left the extent of the Z-sets and the V-sets open. Corollary 2 and Theorem 4 show that the two notions interact sufficiently so that full ZFC holds on the Z-sets. It is a corollary of Theorem 4 that a property P is small if and only if there is a Z-set equinumerous with P. This entails that full ZFC also holds on the pure V-sets:
Theorem 5: It follows from second-order iterative set theory and New V that the axioms of infinity and powerset hold on the pure V-sets.
As usual, define the ‘set-theoretic hierarchy’ to be the (proper) class of all Z-sets that have no urelements in their transitive closures. The set-theoretic hierarchy is the intended interpretation of most contemporary set theories. A transfinite induction establishes the following:
Theorem 6: Under second-order iterative set theory and New V, the pure sets under ∈_V are isomorphic with the set-theoretic hierarchy under ∈_Z.
This sharpens Boolos’s claim that the contemporary notion of ‘set’ is a mixture of the iterative and limitation-of-size conceptions. Recall that the iterative notion based on a theory of stages yields all of Zermelo set theory except replacement and choice, and the limitation-of-size notion yields all of ZFC except infinity and powerset, once the quantifiers are restricted to pure V-sets. Here we see that when the theories are combined, the two notions have the same structure. Replacement and choice hold on the Z-sets; infinity and powerset hold on the pure V-sets.
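Looking back at the proof sketch of Theorem 4, its steps can be displayed as follows. This is a sketch for orientation only: the notation rank_Z for Z-rank is introduced here, and the step labelled as following from Theorem 3 uses that theorem in contrapositive form (a small property of Z-ordinals is bounded).

```latex
% Requires amsmath. x is a Z-set, f a function on the universe;
% rank_Z(a) is the least Z-ordinal gamma with a in V_gamma (notation assumed here).
\begin{align*}
&O\beta \equiv \exists y\,\bigl(y \in_{Z} x \wedge \beta = \mathrm{rank}_{Z}(fy)\bigr)
   && \text{(definition of $O$)}\\
&O \text{ is equinumerous with a $Z$-subset of } x,\ \text{so } O \text{ is small}\\
&\exists \alpha\,\forall \beta\,(O\beta \rightarrow \beta \in_{Z} \alpha)
   && \text{(Theorem 3, contrapositive)}\\
&\forall y\,\bigl(y \in_{Z} x \rightarrow fy \in_{Z} V_\alpha\bigr)\\
&\{\, fy : y \in_{Z} x \,\} \subseteq V_\alpha \ \text{is a $Z$-set}
   && \text{(separation)}
\end{align*}
```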
4.2 ZF
Our next item of business is the extent to which a set-theorist can accept New V, whether or not New V is regarded as analytic, a definition, an explanation of a concept, etc. What is involved in the assumption that there is an interpretation of the Ext operator on the set-theoretic hierarchy such that New V is true? 16 If New V has some substantial set-theoretic presuppositions, then it is not without significant epistemological presupposition, contra Wright.
16 We could speak here of the ‘satisfiability’ of New V over the set-theoretic hierarchy (along the lines of Shapiro [1991], §6.1). However, given the different perspectives discussed in this article, we only use the word ‘satisfiability’ for a semantic relation between formulas, assignments, and sets. Rather than speak of satisfiability over the universe, we use locutions like ‘can be made true’ or ‘the Ext operator can be interpreted in the set-theoretic hierarchy’. Formally, this does not require explicit semantic notions.
For simplicity, we focus on ordinary ZFC, which entails that there are no urelements. Because New V is a single second-order sentence, the statement that New V can be made true on the set-theoretic hierarchy is formulated in a language of third-order set theory. Let ℰ be a variable ranging over functions from set-theoretic properties to sets. The following sentence amounts to a statement that one can interpret the Ext operator so that New V is true: 17
(INT) ∃ℰ∀F∀G[ℰ(F) = ℰ(G) ≡ ((‘F is big’ & ‘G is big’) ∨ ∀x(Fx ≡ Gx))].
Is (INT) true of the set-theoretic hierarchy? What follows from (INT) and what entails it? These are important questions for the would-be neo-logicist, if he hopes to convince the set-theorist that none of his mathematics is lost by adopting New V. By Corollary 2, New V entails that there is a well-ordering of the universe, WO. Similarly, (INT) entails WO. Our next theorem establishes a converse:
Theorem 7: (WO → (INT)) is a theorem of (third-order) Zermelo–Fraenkel set theory.
Proof sketch: First, two lemmas:
Lemma A: It follows from second-order Zermelo–Fraenkel set theory and WO that the Z-ordinals are equinumerous with the universe.
Proof sketch: Let R be a well-ordering of the universe. Say that x ≺ y if either the rank of x (under ∈_Z) is less than the rank of y, or the rank of x is the same as the rank of y and Rxy. The relation ‘≺’ is a well-ordering of the universe. We can show, by transfinite induction on (∈_Z) ranks, that, for each y, there is a unique Z-ordinal isomorphic to {x | x ≺ y}. Call this ordinal fy. The function f is a bijection of the universe onto the Z-ordinals.
Lemma B: It follows from second-order Zermelo–Fraenkel set theory that WO is equivalent to: ∀F(F is big ∨ ∃x∀y(y ∈_Z x ≡ Fy)). In other words, WO is equivalent to a statement that every small property is coextensive with a Z-set.
Proof sketch: The right-to-left direction is similar to the proof of Theorem 1 and its corollary. For the left-to-right direction, let R be a well-ordering of the universe. Let F be a property and define a function g on Z-ordinals such that gα is the smallest x (under R) such that (Fx & ∀β < α(x ≠ gβ)). There are two possibilities. One is that the function g is defined on all of the Z-ordinals. Then F is equinumerous with the Z-ordinals and so F is big by Lemma A. The other possibility is that there is a Z-ordinal α such that g is defined on every Z-ordinal in α but g is not defined on α itself. In this case there is no x such that (Fx & ∀β < α(x ≠ gβ)), so F is equinumerous with the members of α. By the axiom of replacement, there is a Z-set x such that ∀y(y ∈_Z x ≡ Fy).
So, to return to the proof of Theorem 7, assume WO and define the function ℰ as follows. If a property F is big then let ℰ(F) be the Z-empty-set. If F is not big, then by Lemma B, there is a Z-set x such that ∀y(y ∈_Z x ≡ Fy). Let ℰ(F) = x ∪ {x}. It is straightforward to verify that this makes (INT) true.
It follows that the well-ordering principle is both necessary and sufficient for a neo-logicist to add New V to second-order Zermelo–Fraenkel set theory and, thus, to interpret New V on the set-theoretic hierarchy. Perhaps the best alternative for the neo-logicist is to concede that New V is not conservative over set theory, and to formulate and defend a requirement that specifies just which kinds of theories abstraction principles should be conservative over.
17 Although (INT) is a straightforward way to formalize the question at hand, some queasy philosophers may think we have gone too far in this ascent to third-order. For what it is worth, there is a second-order sentence equivalent to (INT). One can code the relevant part of the Ext function with a binary relation on the domain (following a technique in Shapiro [1991], pp. 103–104). If R is a binary relation and x is an object, then let R_x be the projected property λyRxy. The idea is to write a second-order sentence stating that there is an R such that the ‘relation’ between R_x and x is the restriction of the Ext function to non-empty small properties: ∃R[∃b∀yRby & ∃c∀y¬Rcy & ∀F(F is big ∨ ∀y¬Fy ∨ ∃!x∀y(Fy ≡ Rxy))].
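The interpretation of Ext constructed at the end of the proof of Theorem 7 can be written as a definition by cases. In this sketch the script letter stands for the third-order function variable of (INT), x_F is a name introduced here for the Z-set coextensive with a small property F (it exists by Lemma B), and the final remarks indicate, under the axiom of foundation, why the definition should satisfy (INT).

```latex
% Requires amsmath. x_F: the Z-set with the same members as the small property F (notation assumed here).
\[
\mathcal{E}(F) =
\begin{cases}
\varnothing & \text{if $F$ is big,}\\[2pt]
x_F \cup \{x_F\} & \text{if $F$ is small, where } \forall y\,(y \in_{Z} x_F \equiv Fy).
\end{cases}
\]
% Two checks, both relying on standard facts about ZFC:
%   (i)  x_F \cup \{x_F\} is never empty, so no small F receives the value assigned to big properties;
%   (ii) x \mapsto x \cup \{x\} is one-to-one (by foundation), so for small F and G,
%        \mathcal{E}(F) = \mathcal{E}(G) iff x_F = x_G iff \forall y (Fy \equiv Gy).
```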
5. Set-sized models
In this final section we examine set-sized models of New V and other abstraction principles. This is the perspective of a set-theorist who is assessing the mathematical prospects of the neo-logicist quest, and the deductive and semantic consequences of New V. As noted, the existence of a countable model rules out the possibility of using New V alone to develop real analysis. Wright envisions augmenting New V with other abstraction principles in order to ensure that the models are uncountable. Well, what are the possibilities? Are there uncountable models of New V? This is equivalent to the question of whether there are any models of New V in which at least one V-set has infinitely many members. Is there a model of New V whose size is exactly that of the continuum? If so, the neo-logicist can develop real analysis in a structure with no more objects than there are real numbers. Is there a model of New V whose size is exactly that of the powerset of the continuum? This will allow the smooth development of functional analysis. We show that Zermelo–Fraenkel set theory does not determine the answers to these questions, and so in a sense the mathematical viability of Wright’s program is independent of set theory.
For the most part, the only sets, ordinals, cardinals, and membership we need to invoke in this section are from the set-theoretic meta-theory (here ZFC). That is, they are Z-sets, Z-cardinals, etc. So we omit the ‘Z’, unless there is some danger of confusion. Let κ be a cardinal number. Clearly, there are at least κ small subsets of κ. Say that a cardinal κ has the small-subsets
property if there are exactly κ small subsets of κ. There is a model of New V of size κ if and only if κ has the small-subsets property. We first take up successor cardinals:
Theorem 8: Let κ be a cardinal and let κ^+ be its successor. Then (INT) is true in a domain of size κ^+ if and only if 2^κ = κ^+. In other words, there is a model of New V of size κ^+ if and only if the instance of the generalized continuum hypothesis (GCH) holds at κ.
Proof sketch: A subset of κ^+ is small if and only if its cardinality is at most κ. There are at least 2^κ such subsets since each subset of κ is a small subset of κ^+. It follows from choice that κ^+ is regular, and so each small subset of κ^+ is bounded. Thus, there are at most (κ^+)·2^κ small subsets of κ^+, but this last is 2^κ. So there are exactly 2^κ small subsets of κ^+. New V can be satisfied on a domain of size κ^+ if and only if κ^+ has the small-subsets property, i.e., there are no more than κ^+ small subsets of the domain. Of course, there are κ^+ such members. So New V can be satisfied on a domain of size κ^+ if and only if 2^κ = κ^+.
For any infinite cardinal κ, it is independent of ZFC whether the relevant instance of the GCH holds, and thus whether the small-subsets property holds for κ^+. Thus,
Corollary 9: For any infinite cardinal κ, it is independent of ZFC whether there is a model of New V of size κ^+.
If the GCH holds generally, then every successor cardinal is a model of New V. If the continuum hypothesis holds, then there is a model of New V the size of the continuum (which would be ℵ_1), and real and complex analysis can be developed in that model. However, if the continuum hypothesis fails, the neo-logicist must scramble some more. Boolos ([1989], p. 19) stated that the powerset axiom (over V-sets) does not follow from New V plus the existence of an infinite V-set. He suggested that this can be shown ‘by tinkering with the set of hereditarily countable sets’. Parsons ([1997], p.
268) directly claimed that New V can be satisfied over the hereditarily countable Z-sets. Since the set of hereditarily countable sets is the size of the continuum, Parsons is correct if the continuum hypothesis holds. However, if, say, the continuum is ℵ_2 and the powerset of ℵ_1 is larger than ℵ_2 then by Theorem 8, New V cannot be satisfied on the hereditarily countable sets. There would be too many small subsets of the hereditarily countable sets.
The Boolos/Parsons construction can be adapted to provide a Henkin model. Let the first-order domain of the model M be the set of all constructible hereditarily countable Z-sets, and let the n-place relation variables range over the constructible n-place relations on this domain. 18 Let P be an M-‘property’, that is, a constructible subset of the first-order domain. Then P is small in M if and only if there is no constructible function from P onto the domain. Since the continuum hypothesis holds in the constructible universe, P is small in M only if there is a constructible function from ω onto P. Thus, P is small in M only if P is countable (or finite). In this case P is hereditarily countable and so P is in the first-order domain of M. Thus, M satisfies the small-subsets property and so this model satisfies New V. The model M also satisfies the axiom of infinity (on V-sets), but M does not satisfy the powerset axiom (on V-sets). Thus,
Theorem 10: The powerset principle cannot be derived from New V and the principle of infinity.
18 In the constructible universe L, M is the standard model consisting of the hereditarily countable sets.
Next we turn to limit cardinals. Recall that a cardinal κ is weakly inaccessible if it is a regular limit cardinal. If κ is a weak inaccessible and if for every cardinal λ < κ, 2^λ < κ, then κ is a strong inaccessible and V_κ is a standard model of second-order ZFC. Say that κ is a nearly strong inaccessible if κ is weakly inaccessible and for every cardinal λ < κ, 2^λ ≤ κ.
Theorem 11: Let κ be a weak inaccessible. Then κ satisfies New V if and only if κ is nearly strong.
Proof sketch: Suppose that κ is not nearly strong, so that there is a cardinal λ < κ where 2^λ > κ. Then there are more than κ λ-sized subsets of κ, and so κ does not have the small-subsets property and New V is not satisfiable on κ. Now suppose that κ is nearly strong. Since κ is regular, every small subset of κ is also a subset of some λ < κ. There are at most 2^λ ≤ κ subsets of each such λ, and there are exactly κ such cardinals λ. Thus there are at most κ · κ = κ small subsets of κ. So κ has the small-subsets property and New V can be satisfied at κ.
The final cases in this long and tortuous journey are singular cardinals. Here ZFC does decide the question, in the negative:
Theorem 12: Let κ be a singular cardinal. Then (INT) is false in any model of size κ, and so there is no model of New V of size κ.
Proof sketch: Suppose, first, that there is a cardinal λ < κ such that 2^λ > κ. Then since every subset of λ is a small subset of κ, there are more than κ small subsets of κ, and so κ does not have the small-subsets property. This precludes the ability to satisfy New V on κ. So now assume that for each cardinal λ < κ, 2^λ ≤ κ. Since κ is singular, let o be a set of cardinals such that o is cofinal in κ and the cardinality of o is λ < κ. For each α ∈ o, let f_α be a one-to-one function from the powerset of α into κ and assume that if α and β are distinct members of o, then the ranges of f_α and f_β are disjoint. Now if d is any subset of κ then let d′ be the set {f_α(d ∩ α) | α ∈ o}. So d′ is a subset of κ whose cardinality is λ. Moreover, by the various conditions, we have that d1 = d2 if and only if d1′ = d2′. So there are just as many λ-sized subsets of κ as there
are subsets of κ altogether, namely 2^κ of them. A fortiori, κ does not have the small-subsets property, and so New V cannot be satisfied in any model of size κ.

For example, New V has no models of size ℵ_{α+ω} for any ordinal α. So we have:

Corollary 13: For every cardinal κ there is a cardinal λ > κ such that there is no model of New V of size exactly λ. That is, the cardinals which fail to satisfy New V are unbounded.

Thus, for example, there is a model of New V whose size is exactly the continuum if and only if either the continuum is a successor cardinal where the GCH applies to its predecessor or else the continuum is a nearly strong inaccessible.19 It is consistent with ZFC for the continuum to fail to be a model for New V.

To summarize the results of this section, the best-case scenario for New V is for the GCH to hold. In this case, every weak inaccessible is a strong inaccessible. Thus, New V holds at all successor cardinals and at all inaccessibles, and it fails at all singular cardinals. The worst-case scenario is for the GCH to fail ‘globally’ in the sense that for every infinite cardinal κ, 2^κ > κ^+ and for there to be no uncountable, nearly strong inaccessibles other than strong inaccessibles. This worst case is consistent with ZFC (assuming some large cardinal principles). In particular, there is a model of ZFC in which for all infinite κ, 2^κ = κ^{++} (see Foreman and Woodin [1991]). In this model, every weak inaccessible is a strong inaccessible. If we restrict this model to the elements whose rank is less than the first inaccessible, we get the worst case of all:

Corollary 14: It is consistent that there are no uncountable models of New V.

In this doomsday scenario, there are no set-sized models in which our neologicist can develop real analysis via New V.

Corollary 13 marks an important difference between New V and Hume’s principle. Heck ([1992], p. 494, n.
5) suggests that a ‘promising necessary condition’ on abstraction principles is that there be a cardinal κ such that for every λ ≥ κ, the abstraction principle is satisfiable in every domain of cardinality λ. Call this the strong unbounded condition, since it calls for a fixed lower bound on the cardinals that fail to satisfy the abstraction principle. One can think of an abstraction principle as like an implicit definition of the operator it introduces. One straightforward condition on implicit definitions is that any model of the base theory be extendible to a model of the base theory plus the definition. If logicism is to have a chance, this ‘extendability requirement’ is too much to ask of an abstraction principle since, as we have seen, Hume’s principle has no finite models. Neither does New V. The strong unbounded condition is the next best thing to the extendability requirement. It demands that any structure of sufficiently large size can be made to satisfy the principle. Boolos and Wright have shown that Hume’s principle meets the strong unbounded condition. Corollary 13 shows that New V fails it.

The same goes for a number of other abstraction principles. Recall that New V has the following form:

Ext(F) = Ext(G) ≡ [(‘F is bad’ & ‘G is bad’) ∨ ∀x(Fx ≡ Gx)],

where, of course, we interpret ‘bad’ as ‘equinumerous with the universe’. As noted above, Wright briefly considers conditions in which ‘is bad’ is interpreted in other ways, such as ‘is infinite’ or ‘is the size of the continuum’. Let κ be any infinite cardinal number and define the ‘κ-principle’20 to be the above abstraction with ‘F is bad’ as ‘there are at least κ Fs’. The κ-principle thus calls for a distinct extension for every subset of the domain whose size is smaller than κ. Thus, the κ-principle is satisfiable in a model of size λ if and only if λ has no more than λ subsets that are smaller than κ. Let α be any ordinal. The construction in the proof of Theorem 12 can be adapted to show that there are as many countable subsets of ℵ_{α+ω} as there are subsets of ℵ_{α+ω} altogether. A fortiori, if κ is uncountable then the κ-principle is not satisfiable in any domain of size ℵ_{α+ω}. Thus, for any uncountable cardinal κ, the κ-principle fails to meet the strong unbounded condition.

Perhaps a more reasonable condition on abstraction principles is that for every cardinal κ the principle should be satisfiable in a domain whose size is at least κ. That is, for any κ, there is a λ > κ such that the abstraction principle has a model of size λ. Call this the weak unbounded condition. The idea is that the most we can demand of an abstraction principle is that it does not put an upper bound on the cardinality of the universe. This is consonant with the idea that a legitimate abstraction can entail that there are at least κ objects, for some κ, but it cannot entail that there are at most κ objects. Let A be an abstraction principle formulated with purely logical terminology (other than the introduced operator). If A meets the weak unbounded condition, then it also satisfies the amended conservativeness requirement (see Section 1). Any structure at all can be extended, possibly by adding new objects, to become a model of the abstraction principle. New V meets this weak unbounded condition if, but only if, the property of being either an instance of the GCH or a nearly strong inaccessible is unbounded. Although ZFC does not determine whether this holds, it is at least consistent with ZFC that New V meets the weak unbounded condition.

19 It is consistent with ZFC for the continuum to be a nearly strong inaccessible. If we start out with a (Henkin) model of ZFC in which κ is inaccessible and ‘add’ κ ‘Cohen reals’ by forcing, then the continuum is of size κ and κ remains weakly inaccessible. A ‘chain condition’ argument establishes that κ is nearly strong. We are indebted to Matthew Foreman here.

20 Of course, the κ-principle is not a sentence unless κ is a definable cardinal.
Curiously, the statement that New V meets the weak unbounded condition follows from the limiting axiom V = L and also from the maximizing principle that the nearly strong inaccessibles are unbounded.

One problem now, however, is that there are abstraction principles that meet the weak unbounded condition, but are incompatible with each other. Consider, for example, an abstraction principle that is satisfiable only at cardinals of the form ℵ_{α+ω}. This meets the weak unbounded condition, but it is not compatible with New V, nor with the κ-principle for any uncountable κ. Here is a pair of incompatible abstraction principles, both in the form

Ext(F) = Ext(G) ≡ [(‘F is bad’ & ‘G is bad’) ∨ ∀x(Fx ≡ Gx)].

For Principle A, define ‘is bad’ to be ‘has the size of a limit in the series of inaccessibles’ and for Principle B define ‘is bad’ to be ‘has the size of a successor in the series of inaccessibles’. If the inaccessibles are unbounded, then both principles meet the weak unbounded condition, but they are incompatible with each other. Principle A is satisfiable only at limits in the series of inaccessibles and Principle B is satisfiable only at successors in the series of inaccessibles.

The Bad Company objection has just returned, and so we are back near the place we started. How is the neo-logicist to decide which of the abstraction principles is acceptable? What condition admits the good ones and excludes the bad ones?
References

BENACERRAF, P., and H. PUTNAM, eds. [1983]: Philosophy of Mathematics. Second edition. Cambridge: Cambridge University Press.
BOOLOS, G. [1971]: ‘The iterative concept of set’, in Benacerraf and Putnam [1983], pp. 486–502.
BOOLOS, G. [1987]: ‘The consistency of Frege’s Foundations of Arithmetic’, in Judith Jarvis Thomson, ed., On Being and Saying: Essays for Richard Cartwright. Cambridge, Mass.: The MIT Press, pp. 3–20.
BOOLOS, G. [1989]: ‘Iteration again’, Philosophical Topics 17, 5–21.
BOOLOS, G. [1997]: ‘Is Hume’s principle analytic?’, in Heck [1997], pp. 245–261.
COFFA, A. [1991]: The Semantic Tradition from Kant to Carnap. Cambridge: Cambridge University Press.
FIELD, H. [1980]: Science Without Numbers. Princeton: Princeton University Press.
FINE, K. [1998]: ‘The limits of abstraction’, in Schirn [1998], pp. 503–629.
FOREMAN, M., and W. H. WOODIN [1991]: ‘The generalized continuum hypothesis can fail everywhere’, Annals of Mathematics 133 (2), 1–35.
FREGE, G. [1884]: Die Grundlagen der Arithmetik. Breslau: Koebner; The Foundations of Arithmetic. Trans. by J. Austin. Second edition. New York: Harper, 1960.
FREGE, G. [1893]: Grundgesetze der Arithmetik 1. Hildesheim: Olms.
FRIEND, M. [1997]: Second-order Logic Is Logic. Ph.D. Dissertation, University of St. Andrews.
HALLETT, M. [1984]: Cantorian Set Theory and the Limitation of Size. Oxford: Oxford University Press.
HECK, R. [1992]: ‘On the consistency of second-order contextual definitions’, Noûs 26, 491–494.
HECK, R. [1993]: ‘The development of arithmetic in Frege’s Grundgesetze der Arithmetik’, Journal of Symbolic Logic 58, 579–601.
HECK, R., ed. [1997]: Language, Thought, and Logic: Essays in Honour of Michael Dummett. Oxford: Oxford University Press.
HILBERT, D., and W. ACKERMANN [1928]: Grundzüge der theoretischen Logik. Berlin: Springer.
HINTIKKA, J. [1996]: The Principles of Mathematics Revisited. Cambridge: Cambridge University Press.
LÉVY, A. [1968]: ‘On von Neumann’s axiom system for set theory’, American Mathematical Monthly 75, 762–763.
MIRAGLIA, P. [1996]: Can we intend an interpretation? Ph.D. Dissertation, The Ohio State University.
PARSONS, C. [1983]: Mathematics in Philosophy. Ithaca: Cornell University Press.
PARSONS, C. [1997]: ‘Wright on abstraction and set theory’, in Heck [1997], pp. 263–271.
QUINE, W. V. O. [1954]: ‘Quantification and the empty domain’, Journal of Symbolic Logic 19, 177–179.
QUINE, W. V. O. [1986]: Philosophy of Logic. Second edition. Englewood Cliffs, New Jersey: Prentice-Hall.
SCHIRN, M., ed. [1998]: The Philosophy of Mathematics Today. Oxford: Oxford University Press.
SCOTT, D. [1974]: ‘Axiomatizing set theory’, in T. Jech, ed., Axiomatic Set Theory: Proceedings of Symposia in Pure Mathematics 13 (Part II), pp. 207–214.
SHAPIRO, S. [1991]: Foundations Without Foundationalism: A Case for Second-order Logic. Oxford: Oxford University Press.
SHAPIRO, S., and A. WEIR [2000]: ‘ “Neo-logicist” logic is not epistemically innocent’, Philosophia Mathematica (3) 8, 163–189.
UZQUIANO, G. [1999]: ‘Models of second-order Zermelo set theory’, Bulletin of Symbolic Logic 5, 289–302.
VAN HEIJENOORT, J. [1967]: From Frege to Gödel. Cambridge, Mass.: Harvard University Press.
WEIR, A. [2003]: ‘Neo-Fregeanism: an embarrassment of riches’, Notre Dame Journal of Formal Logic 44, 13–48.
WHITEHEAD, A. N., and B. RUSSELL [1910]: Principia Mathematica 1. Cambridge: Cambridge University Press.
WRIGHT, C. [1983]: Frege’s Conception of Numbers as Objects. Aberdeen: Aberdeen University Press.
WRIGHT, C. [1997]: ‘On the philosophical significance of Frege’s theorem’, in Heck [1997], pp. 201–244.
WRIGHT, C. [1998]: ‘On the harmless impredicativity of N= (Hume’s principle)’, in Schirn [1998], pp. 339–368.
ZERMELO, E. [1904]: ‘Beweis, dass jede Menge wohlgeordnet werden kann’, Mathematische Annalen 59, 514–516; trans. in van Heijenoort [1967], pp. 139–141.
ZERMELO, E. [1908]: ‘Neuer Beweis für die Möglichkeit einer Wohlordnung’, Mathematische Annalen 65, 107–128; trans. in van Heijenoort [1967], pp. 183–198.
WELL- AND NON-WELL-FOUNDED FREGEAN EXTENSIONS 1

Ignacio Jané
Departament de Lògica, Universitat de Barcelona, E-08028 Barcelona, Spain

Gabriel Uzquiano
Pembroke College, University of Oxford
Abstract

George Boolos has described an interpretation of a fragment of ZFC in a consistent second-order theory whose only axiom is a modification of Frege’s inconsistent Axiom V. We build on Boolos’s interpretation and study the models of a variety of such theories obtained by amending Axiom V in the spirit of a limitation of size principle. After providing a complete structural description of all well-founded models, we turn to the non-well-founded ones. We show how to build models in which foundation fails in prescribed ways. In particular, we obtain models in which every relation is isomorphic to the membership relation on some set as well as models of Aczel’s anti-foundation axiom (AFA). We suggest that Fregean extensions provide a natural way to envisage non-well-founded membership.

Keywords

AFA, Axiom V, extensions, George Boolos, Gottlob Frege, New V, non-well-founded sets, second-order set theory, ZFC
1 This paper first appeared in the Journal of Philosophical Logic 33 [2004], pp. 437–465. Reprinted by kind permission of the editor and Springer Academic Publishers.

Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 303–329. © 2007 Springer.

There have been many recent and interesting attempts to develop much of standard set theory, by which we mean Zermelo–Fraenkel set theory plus the axiom of choice (ZFC), from some consistent modification of Frege’s original Axiom V in the framework of second-order logic. One prominent example occurs in [2], where George Boolos interprets a fragment of ZFC in the second-order theory whose only axiom is a variant of Axiom V based on von Neumann’s principle of limitation of size. More recently, Stewart Shapiro
has surveyed in [5] efforts to extend the Frege–Boolos strategy for developing standard set theory from variants of Axiom V that are acceptable from the point of view of neo-Fregeanism as championed by Crispin Wright and Bob Hale. Each of these attempts has been driven by a different overall aim, but what is of interest to us is how such developments illuminate the relation in which Cantorian sets, the objects of standard set theory, stand with respect to Fregean extensions as captured by consistent restrictions on Axiom V.

The aim of this article is to further the comparison between Cantorian sets and Fregean extensions. Our point of departure is the family of consistent modifications of Axiom V informed by the limitation of size doctrine. These principles determine a class of models in which most, and even all, of the axioms of second-order Zermelo–Fraenkel set theory plus choice minus foundation (ZFC−) are satisfied. Some of these models are well-founded and may be of interest for those engaged in the neo-Fregean project to secure set theory within a theory of extensions. It is unclear whether a limitation of size modification of Axiom V may in fact serve as a neo-Fregean foundation of set theory,2 but at least a complete structural description of the well-founded models of such modifications is available, as we shall show.

Yet, a crucial difference between Cantorian sets and Fregean extensions is that while Cantorian sets are well-founded, it is not difficult to motivate the existence of non-well-founded Fregean extensions. It takes, in fact, very little effort to provide models of limitation of size based modifications of Axiom V in which not all extensions are well-founded with respect to membership. Some of these models are of additional interest because they violate the axiom of foundation in prescribed ways that are in line with different axioms of anti-foundation.
One variant of anti-foundation that is particularly well motivated in this context is one according to which every extensional graph is isomorphic to some transitive extension. But it is also possible to construct models of Aczel’s axiom of anti-foundation (AFA). We will thus suggest that Fregean extensions provide us with a natural tool for modelling non-well-founded set theories.

What follows is a brief description of the contents of the paper. In Sections 1 and 2 we build on Boolos’s interpretation of set theory in a theory of extensions to motivate our notion of a (κ, λ)-model. As introduced in Section 3, (κ, λ)-models are the natural objects to consider if, generalizing on Boolos’s proposal, we distinguish between small and large concepts in any reasonable way and assign an extension only to the small ones. We devote Section 4 to the well-founded (κ, λ)-models, of which we offer a complete structural description. The last three sections are devoted to the construction of rich non-well-founded (κ, λ)-models, which, unlike the well-founded ones, are very diverse. All our work takes place in ZFC.

2 Both [3] and [5] discuss constraints on acceptable abstraction principles any neo-Fregean foundation for set theory must fulfill.
1. Extensions
Frege assumed that with every concept F there is associated a certain object ext(F), the extension of F. He also assumed that this assignment of objects to concepts is governed by Axiom V:

(Axiom V) ext(F) = ext(G) ↔ ∀x(Fx ↔ Gx).

Frege’s theory of extensions consists of this single axiom in the context of axiomatic second-order logic. We may interpret set theory in it by taking sets to be extensions and defining membership (E) as:

u E x ↔ ∃F(x = ext(F) ∧ Fu). (1)

With the help of Axiom V, we obtain: x = ext(F) → ∀u(u E x ↔ Fu). But then, if a is the extension of the concept F defined by Fu ↔ ¬u E u, we conclude that a E a ↔ ¬a E a, which is a contradiction.

We shall say that two concepts are coextensive just in case exactly the same objects fall under them. Frege’s inconsistent Axiom V requires that two concepts be assigned the same extension if and only if they are coextensive. In [2], Boolos weakened Frege’s requirement according to von Neumann’s principle of limitation of size. Define two concepts F and G to be equinumerous (F ∼ G) if there is a one-to-one correspondence between the objects falling under F and the objects falling under G. Call a concept large if it is equinumerous with the universal concept V, i.e., the concept under which all objects fall, and call it small if it is not large. Boolos replaced Frege’s Axiom V by:

(New V) ext(F) = ext(G) ↔ ((F ∼ V ∧ G ∼ V) ∨ ∀x(Fx ↔ Gx)).

As in Frege’s case, Boolos assumes that each concept is assigned an extension (he called it a ‘subtension’), but he requires that any two large concepts be assigned the same extension, whether or not they are coextensive. Sameness of extension implies coextensiveness only for small concepts.

Boolos called ‘FN’ the second-order theory whose only axiom is New V (for Frege–von Neumann). To see that it is consistent, let g be a one-to-one mapping on the set of finite subsets of ω into ω\{0}. Define the extension operation ext: 𝒫(ω) → ω by:

ext(F) = g(F) if F is finite,
ext(F) = 0 if F is infinite.

It is clear that ⟨ω, ext⟩ is a model of FN (the concepts are the sets of natural numbers, a small concept is a finite set, and 0 is the common extension of all large concepts).

In order to develop set theory in FN, Boolos defined membership by (1), but he took a set to be a small extension, i.e., the extension of a small concept.
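
The consistency argument for FN can be spot-checked on a finite sample. The following Python sketch is only illustrative and rests on two assumptions that are not in the text: it fixes one particular one-to-one map, g(F) = 1 + Σ 2^n for n in F, and it represents every infinite set of naturals by a single hypothetical sentinel LARGE, since infinite sets cannot be enumerated.

```python
from itertools import combinations

# Hypothetical sentinel standing in for every infinite set of naturals.
LARGE = object()

def g(finite_set):
    """Assumed one-to-one map from finite subsets of omega into omega minus {0}."""
    return 1 + sum(2 ** n for n in finite_set)

def ext(concept):
    """Extension operator of the toy model: 0 for large concepts, g(F) otherwise."""
    return 0 if concept is LARGE else g(concept)

def members(x):
    """For x > 0, decode the finite set F with ext(F) == x."""
    code = x - 1
    return frozenset(n for n in range(code.bit_length()) if (code >> n) & 1)

def is_member(u, x):
    """u E x, following definition (1)."""
    if x == 0:
        return True  # 0 = ext(V), and every object falls under V
    return u in members(x)

# Spot-check New V on all subsets of {0, 1, 2, 3}: for small concepts,
# sameness of extension coincides with coextensiveness.
sample = [frozenset(c) for r in range(5) for c in combinations(range(4), r)]
new_v_holds = all((ext(F) == ext(G)) == (F == G) for F in sample for G in sample)
print(new_v_holds)
```

The check covers only finitely many small concepts, so it illustrates the construction rather than proving anything; the clause for large concepts is built into `ext` by fiat.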
However, the common extension of all large concepts makes this choice somewhat unsatisfactory, since the union of the singleton {ext(V)} is not small. So Boolos restricted attention to what he called ‘pure sets’ and we call ‘pure extensions’. He defined a concept to be closed if whenever all members of an extension fall under it, so does the extension, i.e., F is closed if and only if for every extension x, ∀y(y E x → Fy) → Fx. And he defined an extension to be pure if it falls under all closed concepts. He then showed that pure extensions are small and that all the axioms of ZFC except power set and infinity hold when restricted to pure extensions. To be more specific, the axioms of extensionality, separation, empty set, pair, union, replacement, choice and foundation are provable in FN for pure extensions.

We want to shed some light on Boolos’s notion of a pure extension and, particularly, on the fact that the axiom of foundation holds when restricted to them. We begin by remarking that we need not confine attention to pure extensions merely for the purpose of excluding unwanted objects such as {ext(V)}; for that purpose, it is enough to consider hereditarily small extensions.

A concept F is transitive if whenever an extension falls under F so do all its members. An extension is transitive if it is the extension of a transitive concept. An extension is hereditarily small if it falls under some transitive concept under which only small extensions fall. The intuitive picture of a hereditarily small extension is a small extension whose members are small extensions, the members of whose members are small extensions, etc.

The arguments of Boolos in [2] establish that all the axioms of ZFC except power set, infinity and foundation hold when restricted to hereditarily small extensions. However, foundation cannot be proved for them. For, as before, let g be a one-to-one mapping on the set of finite subsets of ω onto ω\{0}, but now require that g({1}) = 1.
If we define ext: 𝒫(ω) → ω by:

ext(F) = g(F) if F is finite,
ext(F) = 0 if F is infinite,

New V is satisfied. But foundation fails, because in the membership relation defined from ext, 1 E 1 and 1 is a hereditarily small extension (since the concept under which only 1 falls is transitive and small). So, although we can prove in FN that all pure extensions are hereditarily small, we cannot prove that all hereditarily small extensions are pure.

Remark 1: If all members of a small extension are hereditarily small extensions, so is the extension.

Proof: Let a be a small extension and suppose that all members of a are hereditarily small extensions. Let F be the concept under which an object
x falls if and only if either x = a or x is a hereditarily small extension. F is clearly transitive, under F only small extensions fall, and a falls under F. Hence, a is a hereditarily small extension.

The (von Neumann) ordinals are important examples of hereditarily small extensions, where, to adapt the usual definition, an ordinal is a transitive hereditarily small extension a that is well-ordered by E. We let OR be the concept of being an ordinal.

A well-founded concept is a concept F such that under every non-empty subconcept G falls some E-minimal object. As an example, OR is a well-founded concept. A well-founded extension is an extension that falls under some transitive well-founded concept. Since OR is a well-founded concept, each ordinal is a well-founded extension.

Remark 2: If all members of an extension are well-founded, so is the extension.

Proof: Let a be an extension all of whose members are well-founded. Let F be the concept under which an object x falls if and only if either x = a or x is well-founded. F is clearly transitive. It is also well-founded and a falls under F. Hence, a is a well-founded extension.

As we now show, pure extensions coincide with well-founded hereditarily small extensions.

Remark 3: An extension is pure if and only if it is a well-founded hereditarily small extension.

Proof: By Remark 1, the concept of being a hereditarily small extension is closed. By Remark 2, so is the concept of being a well-founded extension. But the conjunction of two closed concepts is also a closed concept. Consequently, all pure extensions are well-founded hereditarily small extensions. For the converse, let a be a well-founded hereditarily small extension and let F be a closed concept. Assume, towards a contradiction, that a doesn’t fall under F. Since F is closed, there is a member b of a that doesn’t fall under F.
Let G and H be transitive concepts witnessing that a is a hereditarily small extension and a well-founded object, respectively, and let J be the concept defined by: Jx ↔ Gx ∧ Hx ∧ ¬Fx. Since J is a subconcept of H, J is well-founded. Since b falls under J, J is non-empty. Let d be some E-minimal object falling under J. Thus ¬Fd. Since J is a subconcept of G, d is a hereditarily small extension. Since G and H are transitive, and d is E-minimal in J, all members of d must fall under F. But then, since d is an extension and F is Boolos-closed, d falls under F. Hence, both Fd and ¬Fd. Absurd.
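
The counterexample to foundation given earlier in this section (a one-to-one map g onto ω\{0} with g({1}) = 1) can also be made concrete. The Python sketch below is a hedged illustration under an assumption of our own: it takes the binary coding 1 + Σ 2^n and swaps the two values 1 and 3, which is just one way to obtain a map with g({1}) = 1. The helper `transitive_closure` is likewise hypothetical; it returns None when the large extension 0 is reached, so a non-None result witnesses that an extension is hereditarily small.

```python
SWAP = {1: 3, 3: 1}  # assumed permutation forcing g({1}) == 1

def g(finite_set):
    """One-to-one map from finite subsets of omega onto omega minus {0}, with g({1}) == 1."""
    value = 1 + sum(2 ** n for n in finite_set)
    return SWAP.get(value, value)

def members(x):
    """For x > 0, the finite set F with g(F) == x."""
    code = SWAP.get(x, x) - 1
    return frozenset(n for n in range(code.bit_length()) if (code >> n) & 1)

def transitive_closure(x):
    """Objects reachable from x via membership, or None if 0 = ext(V) is reached."""
    seen, todo = set(), [x]
    while todo:
        y = todo.pop()
        if y == 0:
            return None  # a large extension occurs hereditarily
        if y not in seen:
            seen.add(y)
            todo.extend(members(y))
    return seen

print(1 in members(1))        # 1 E 1, so foundation fails
print(transitive_closure(1))  # {1}: a transitive concept containing only small extensions
print(transitive_closure(2))  # None: 2 is the singleton of ext(V)
```

In this toy model 3 plays the role of the empty extension and 2 is the unwanted object {ext(V)}, which is small but not hereditarily small.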
2. New V−
Since ext(V) is unnecessary for the purpose of interpreting set theory in New V, we may as well avoid it and modify our theory by assigning extensions only to small concepts. More generally, without restricting ourselves to Boolos’s account of the distinction between large and small concepts, we propose to look at modifications of Frege’s Axiom V in which only the concepts in some prescribed class have an extension. To this end, we consider a second-order language with an extension predicate EXT and a partial extension operator ext, which is defined only for concepts satisfying EXT. If EXT(F), we say that F has an extension, and we call ext(F) the extension of F. Extensions are governed by the following restricted form of Axiom V:

(RV) If EXT(F) and EXT(G), ext(F) = ext(G) ↔ ∀x(Fx ↔ Gx).

Of course, RV (for Restricted V) must be supplemented with a description of what concepts have an extension. The particular case of RV in which a concept has an extension just in case it is small (i.e., EXT(F) ↔ F ≁ V) we call New V−:

(New V−) If F ≁ V and G ≁ V, ext(F) = ext(G) ↔ ∀x(Fx ↔ Gx).

One obvious difference between New V− and New V is that in the former V has no extension. Notice that New V− is trivially consistent, since it has a model where there is only one object, which is the extension of the empty concept. This is the only finite model, since, as is easily seen, New V− implies that if there is more than one object then there are infinitely many objects.3 For a countably infinite model of New V−, take the set HF of hereditarily finite sets and let ext be the identity map on HF. In fact any injective function on HF into HF can play the role of ext. This, as will be seen later, yields 2^ℵ0 pairwise non-isomorphic countable models of New V−.

The development of set theory from Boolos’s New V can be carried out with little modification from New V− plus the assumption that there is more than one object, an assumption which we make all along.
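
The countable model on HF with ext the identity can be pictured with nested frozensets. The sketch below is illustrative only. It additionally uses Ackermann's coding of hereditarily finite sets by natural numbers (a standard bijection, but not something the text invokes) to show that the same model can be viewed as living on ω; the function names are our own.

```python
def ext(concept):
    """Identity map on HF: a finite collection of HF sets is its own extension."""
    return frozenset(concept)

def ackermann(hf_set):
    """Ackermann code of a hereditarily finite set given as nested frozensets."""
    return sum(2 ** ackermann(member) for member in hf_set)

def decode(n):
    """Inverse of ackermann: the HF set whose members are coded by the set bits of n."""
    return frozenset(decode(m) for m in range(n.bit_length()) if (n >> m) & 1)

empty = frozenset()
one = ext([empty])       # {∅}
two = ext([empty, one])  # {∅, {∅}}

# Membership defined from ext coincides with ordinary membership.
print(empty in two, two in two)
# Every natural number codes exactly one HF set, and conversely.
print(ackermann(two), decode(3) == two)
```

Replacing `ext` by another injective map on HF changes which object counts as the extension of a given finite concept, which is how further countable models arise.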
3 Boolos, G.: Iteration again, Philos. Topics 17 (1989), 15–21.

In other words, we deal with the second-order theory whose axioms are New V− and ∃x∃y x ≠ y. Since large concepts have no extension, the hereditarily small extensions of New V are just the hereditary extensions of New V−, these being the extensions that fall under a transitive concept under which only extensions fall. Suitably adapted, the arguments in Boolos’s paper [2] establish the following proposition:

Proposition 4: The relativizations to hereditary extensions of the axioms of extensionality, separation, empty set, pair, union, choice and replacement are
theorems of New V− + ∃x∃y x ≠ y. So are the relativizations of these axioms and of the axiom of foundation to well-founded hereditary extensions.

As is the case with New V, one can prove from New V− that there is a relational concept which well-orders the universe. One exploits the reasoning behind the Burali–Forti paradox to show that the concept OR of being an ordinal has no extension, which means that OR ∼ V. Accordingly, the canonical well-ordering of OR induces a well-ordering of V.

There are, however, variants of RV that are similar to New V− in that they assign extensions only to concepts under which relatively few objects fall, but which nevertheless block the previous route to the proof of the existence of a well-ordering of the universe. For every model of each of these variants there is an infinite cardinal λ such that a concept has an extension if and only if fewer than λ objects fall under it. In any such model, OR is well-ordered in type λ, but the cardinality of the model can be larger than λ. If it is, the argument just sketched for the provable existence of a well-ordering of V will be blocked. As an example of a variant of RV with these characteristics we define EXT so that a concept F is assigned an extension just in case there is a maximum inaccessible cardinal µ and F has cardinality strictly less than µ:

(IN) EXT(F) ↔ ∃G(F ≺ G ∧ In(G) ∧ ∀H(In(H) → H ≼ G)).4
As was the case with New V−, RV with EXT defined by (IN) is very weak, since it is trivially satisfied when no concept has an extension. So, when working with this variant of RV we should assume that some concept has an extension, that is, we should consider the second-order theory whose axioms are RV and ∃F EXT(F).

Two cardinals are relevant to the models ⟨M, ext⟩ of the variants of RV where EXT depends exclusively on size, viz. κ, the cardinality of the universe, and λ, the cardinal such that a concept has an extension if and only if it has cardinality less than λ. We call these models ‘(κ, λ)-models’. If the variant of RV we consider is New V−, then κ = λ. On the other hand, for the case where EXT is given by (IN), λ is an inaccessible cardinal and there is no inaccessible cardinal µ such that λ < µ ≤ κ.

We now leave New V− and turn to the study of general (κ, λ)-models.
3. (κ, λ)-Models

From now on, κ and λ are always infinite cardinals.
4 In(G) is a second-order formula which expresses that G is inaccessible, i.e., it is a formula which is satisfied in any given structure if and only if the cardinality of the set assigned to G is inaccessible. F ≺ G and F ≼ G are also second-order formulas. F ≺ G expresses that the cardinality of F is strictly less than the cardinality of G, while F ≼ G expresses that the cardinality of F is less than or equal to the cardinality of G. (See [4], pp. 100–106.)
A (κ, λ)-model is a pair ℳ = ⟨M, h⟩, where M is a set of cardinality κ and h is a one-to-one map on [M]^{<λ}, the set of all subsets of M of cardinality less than λ, into M. A λ-model is a (κ, λ)-model, for some cardinal κ.

Let κ^{<λ} = sup_{µ<λ} κ^µ. Since for µ < κ there are exactly κ^µ subsets of κ of cardinality µ, we conclude that there is a (κ, λ)-model iff κ = κ^{<λ}.

The (κ, λ)-models are the models of the variants of RV we consider. In particular, the models of New V− are the (κ, κ)-models. If ℳ = ⟨M, h⟩ is a (κ, λ)-model, the ℳ-concepts are the subsets of M, and an ℳ-concept X has an extension, namely h(X), just in case its cardinality is < λ, i.e., just in case X ∈ [M]^{<λ}.

A (κ, λ)-model ℳ = ⟨M, h⟩ determines a structure ⟨M, E_h, S_h⟩, where S_h = ran(h), the range of h, and E_h is the binary relation on M defined by:

x E_h y ↔ y ∈ S_h ∧ x ∈ h^{-1}(y).

Thus whenever a ∈ M and X ∈ [M]^{<λ}: a E_h h(X) ↔ a ∈ X.5

We refer to the values of h as h-sets. Thus S_h is the set of all h-sets. An h-atom is a member of M which is not an h-set. We also speak of E_h as the h-membership relation.
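
The passage from a (κ, λ)-model ⟨M, h⟩ to the structure ⟨M, E_h, S_h⟩ can be traced in a small computable case. The following Python sketch assumes κ = λ = ℵ0 with M = ω, and it assumes one specific injection, h(X) = 2 · Σ 2^n for n in X, chosen by us so that the range of h is the set of even numbers and the odd numbers are h-atoms. It is a sketch of the definitions above under those assumptions, not a construction taken from the text.

```python
def h(finite_set):
    """Assumed one-to-one map from finite subsets of omega into omega (range: even numbers)."""
    return 2 * sum(2 ** n for n in finite_set)

def is_h_set(y):
    """y belongs to S_h, the range of h."""
    return y % 2 == 0

def is_h_atom(y):
    """y is a member of M that is not an h-set."""
    return not is_h_set(y)

def h_inverse(y):
    """For an h-set y, the finite set X with h(X) == y."""
    code = y // 2
    return frozenset(n for n in range(code.bit_length()) if (code >> n) & 1)

def E_h(x, y):
    """h-membership: x E_h y iff y is an h-set and x belongs to h^{-1}(y)."""
    return is_h_set(y) and x in h_inverse(y)

X = frozenset({1, 5})
print(h(X), E_h(1, h(X)), E_h(2, h(X)))  # an h-set with members 1 and 5
print(is_h_atom(3), E_h(0, 3))           # 3 is an atom and has no h-members
```

Here atoms such as 1 and 5 can be members of h-sets but have no members themselves, which matches the reading of ⟨M, E_h, S_h⟩ as a structure for set theory with urelements.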
3.1 λ-Models and axioms of ZFCU−
If ⟨M, h⟩ is a (κ, λ)-model, ⟨M, E_h, S_h⟩ is a structure for the language of set theory with atoms (or urelements). These structures are models of fragments of second-order Zermelo–Fraenkel set theory plus choice with urelements (ZFCU), which is the theory investigated by Zermelo in [6]. We are especially interested in second-order ZFCU−, whose axioms are those of ZFCU minus foundation. With the notable exception of foundation, which axioms of second-order set theory are satisfied in ⟨M, E_h, S_h⟩ depends only on λ.

Proposition 5: Let ℳ = ⟨M, h⟩ be a λ-model. The structure ⟨M, E_h, S_h⟩ satisfies the second-order axioms of extensionality, separation, empty set, pair, replacement and choice. Moreover:
1. ⟨M, E_h, S_h⟩ satisfies the axiom of infinity iff λ is uncountable,
2. ⟨M, E_h, S_h⟩ satisfies the axiom of union iff λ is regular,
3. ⟨M, E_h, S_h⟩ satisfies the axiom of power set iff λ is a strong limit.
Proof: See Appendix 1.
Thus ⟨M, E_h, S_h⟩ is a model of second-order ZFCU− iff λ is an inaccessible cardinal. Hence, if λ is inaccessible and ᏹ is atomless, i.e., if S_h = M, then ⟨M, E_h⟩ is a model of second-order Zermelo–Fraenkel set theory plus the axiom of choice minus foundation (ZFC−).
5 We could read this as: “a belongs to the extension of a ᏹ-concept just in case it falls under it”.
Note that every model of second-order ZFCU− corresponds to some λ-model for some inaccessible cardinal λ. For let κ be an infinite cardinal and suppose that ⟨M, E, S⟩ is a model of second-order ZFCU− of cardinality κ. For each a ∈ S, i.e., for each E-set a, let (a)_E = {x ∈ M : x E a}. Notice that for no a ∈ S, |(a)_E| = κ. For otherwise, by second-order replacement in ⟨M, E, S⟩, there would be b ∈ S such that (b)_E = M, which is impossible. Thus let λ be the least cardinal ≤ κ such that for no a ∈ S, |(a)_E| = λ. By choice of λ,
• for every E-set a ∈ S, (a)_E ∈ [M]^{<λ}.
Let X ∈ [M]^{<λ}. By minimality of λ there is a ∈ M such that |X| ≤ |(a)_E|. Let F be a function on (a)_E onto X. By second-order replacement in ⟨M, E, S⟩, there is some b ∈ S such that (b)_E = X. Thus, since extensionality holds in ⟨M, E, S⟩,
• for every X ∈ [M]^{<λ}, there is a unique a ∈ M such that (a)_E = X.
So, we can define h : [M]^{<λ} → M by:
h(X) = the unique a ∈ M such that (a)_E = X.
Thus ⟨M, h⟩ is a (κ, λ)-model and ⟨M, E, S⟩ = ⟨M, E_h, S_h⟩. Consequently,
• λ is inaccessible.
We record our conclusion in a proposition.
Proposition 6: Let κ be a cardinal, M a set of cardinality κ, E ⊆ M × M and S ⊆ M. The structure ⟨M, E, S⟩ is a model of ZFCU− iff there is an inaccessible cardinal λ and a (κ, λ)-model ⟨M, h⟩ such that ⟨M, E, S⟩ = ⟨M, E_h, S_h⟩.
The results of this subsection leave open the question of whether and when the axiom of foundation is satisfied in a (κ, λ)-model. Since New V− has non-well-founded models and all models of New V− are λ-models, there are (κ, λ)-models in which foundation fails. On the other hand, since any model of ZFCU− comes from a λ-model, there are also (κ, λ)-models in which foundation holds. Our next goal is to provide a uniform description of all well-founded (κ, λ)-models. After that, we will be concerned with the failure of foundation. But first of all we deal with isomorphism between (κ, λ)-models and with their submodels.
3.2 Isomorphism
Let ᏹ = ⟨M, h⟩ and ᏺ = ⟨N, g⟩ be (κ, λ)-models. An isomorphism between ᏹ and ᏺ is a bijection F between M and N such that, for all X ∈ [M]^{<λ},
g(F“X) = F(h(X)). (2)
This is an adequate notion of isomorphism, as we proceed to show. It is clear that the identity map on a set M is an automorphism of any model with domain M. If F is an isomorphism between ᏹ and ᏺ, then F⁻¹ is an isomorphism between ᏺ and ᏹ. For let Y ∈ [N]^{<λ}. We have to see that h(F⁻¹“Y) = F⁻¹(g(Y)). But let X = F⁻¹“Y. We compute:
F⁻¹(g(Y)) = F⁻¹(g(F“X)), since Y = F“X
= F⁻¹(F(h(X))), by (2)
= h(X)
= h(F⁻¹“Y).
If 𝒦 = ⟨K, j⟩ is also a (κ, λ)-model and G is an isomorphism between ᏺ and 𝒦, so that for every Y ∈ [N]^{<λ},
j(G“Y) = G(g(Y)), (3)
then G∘F is an isomorphism between ᏹ and 𝒦. For if X ∈ [M]^{<λ}, then
j((G∘F)“X) = j(G“(F“X))
= G(g(F“X)), by (3)
= G(F(h(X))), by (2)
= (G∘F)(h(X)).
Moreover,
Lemma 7: If ᏹ = ⟨M, h⟩ and ᏺ = ⟨N, g⟩ are (κ, λ)-models and F : M → N, then F is an isomorphism between ᏹ and ᏺ iff it is an isomorphism between the structures ⟨M, E_h, S_h⟩ and ⟨N, E_g, S_g⟩.
Proof: Let F be a function on M to N.
(⇒) Suppose F is an isomorphism between ᏹ and ᏺ. Let a, b ∈ M. We check that (i) if a ∈ S_h, then F(a) ∈ S_g, and (ii) if a E_h b, then F(a) E_g F(b). (This will be enough, since F⁻¹ is also an isomorphism between ᏺ and ᏹ.) Now if a ∈ S_h, let X = h⁻¹(a). Then X ∈ [M]^{<λ}, and, by (2), g(F“X) = F(h(X)) = F(a), so that F(a) ∈ S_g. As to (ii), if a E_h b, then b ∈ S_h and a ∈ h⁻¹(b). Let X = h⁻¹(b). Thus F(a) ∈ F“X, and, by (2), g(F“X) = F(h(X)) = F(b). Hence, F(a) E_g F(b), by definition of E_g.
(⇐) Suppose now that F is an isomorphism between ⟨M, E_h, S_h⟩ and ⟨N, E_g, S_g⟩. Let X ∈ [M]^{<λ} and set a = h(X). Thus, a ∈ S_h, and F(a) ∈ S_g, i.e., F(a) ∈ ran(g). By definition of E_h, X = {b ∈ M : b E_h a}. Thus, by our assumption on F, F“X = {F(b) : b ∈ M ∧ F(b) E_g F(a)} = {c ∈ N : c E_g F(a)}. Hence, by definition of E_g, g(F“X) = F(a), i.e., g(F“X) = F(h(X)), which is Equation (2). Thus, F is an isomorphism between ᏹ and ᏺ.
3.3 Submodels
Let ⟨M, h⟩ be a (κ, λ)-model. A subset N of M is h-closed if h“[N]^{<λ} ⊆ N. N is h-transitive if for all a ∈ N ∩ S_h, h⁻¹(a) ⊆ N. Thus N is h-transitive iff whenever a ∈ N and x E_h a, then x ∈ N.
Let ⟨M, h⟩ and ⟨N, g⟩ be λ-models. We say that ⟨N, g⟩ is a submodel of ⟨M, h⟩ iff N ⊆ M and g = h↾[N]^{<λ}. Thus the submodels of ⟨M, h⟩ are determined by the h-closed subsets of M. If ⟨N, g⟩ is a submodel of ⟨M, h⟩ and N is a h-transitive subset of M, we say that ⟨N, g⟩ is a transitive submodel of ⟨M, h⟩. Notice that a submodel ⟨N, g⟩ of ⟨M, h⟩ is transitive iff S_h ∩ N = S_g, i.e., iff no g-atom is a h-set. Hence, every atomless submodel of ⟨M, h⟩ is transitive.
Lemma 8: If ⟨M, h⟩ and ⟨N, g⟩ are λ-models and N is a h-transitive subset of M, then ⟨N, g⟩ is a submodel of ⟨M, h⟩ iff ⟨N, E_g, S_g⟩ is a substructure of ⟨M, E_h, S_h⟩.
Proof: Let ⟨M, h⟩ and ⟨N, g⟩ be λ-models, with N a h-transitive subset of M.
(⇒) Suppose that ⟨N, g⟩ is a submodel of ⟨M, h⟩. Since N is h-transitive, S_g = S_h ∩ N. So, we are left to show that for all a, b ∈ N, a E_g b iff a E_h b. Let a, b ∈ N. If a E_g b, then b ∈ ran(g) and a ∈ g⁻¹(b); since g⁻¹(b) ∈ [N]^{<λ}, h(g⁻¹(b)) = b, so a E_h b. Conversely, if a E_h b, then a ∈ h⁻¹(b) and, since N is h-transitive, h⁻¹(b) ∈ [N]^{<λ}. Hence g(h⁻¹(b)) = b, so that a E_g b.
(⇐) Suppose that ⟨N, E_g, S_g⟩ is a substructure of ⟨M, E_h, S_h⟩. We have to check that g = h↾[N]^{<λ}, that is, that whenever X ∈ [N]^{<λ}, g(X) = h(X). Let g(X) = b. Since ⟨N, E_g⟩ is a substructure of ⟨M, E_h⟩, for all a ∈ N, a E_g b ↔ a E_h b, i.e., a ∈ X ↔ a ∈ h⁻¹(b). Thus X = h⁻¹(b) ∩ N. Since N is h-transitive, h⁻¹(b) ⊆ N. Thus X = h⁻¹(b), i.e., h(X) = b. Conversely, assume that h(X) = b. As before, for all a ∈ N, a ∈ g⁻¹(b) ↔ a ∈ X. So, since g⁻¹(b) ⊆ N, g⁻¹(b) = X, i.e., g(X) = b.
The assumption of h-transitivity of N is necessary for both conditionals, as shown by the following two examples.
Example 1: Let ⟨N, g⟩ be a (κ, λ)-model with at least one atom a. Let M be a set including N with |M\N| = |M| = |N| = κ and let X be a subset of M such that X ∩ N ≠ ∅ and X\N ≠ ∅. Let h be any injective function on [M]^{<λ} onto M extending g and such that h(X) = a. N is not h-transitive, since h⁻¹(a) ⊈ N. By construction, ⟨N, g⟩ is a submodel of ⟨M, h⟩, but ⟨N, E_g, S_g⟩ is not a substructure of ⟨M, E_h, S_h⟩, since a ∈ S_h ∩ N, but a ∉ S_g. Indeed, not even ⟨N, E_g⟩ is a substructure of ⟨M, E_h⟩, since if b ∈ X ∩ N, then b E_h a, but not b E_g a.
Example 2: Let ⟨N, g⟩ be an atomless (κ, λ)-model. Let M be a set including N with |M\N| = |M| = |N| = κ and let e ∈ M\N. Let h be any injective function on [M]^{<λ} onto M such that for each X ∈ [N]^{<λ}, h(X ∪ {e}) = g(X). N is not h-transitive, since for no a ∈ N is h⁻¹(a) ⊆ N. It is obvious that ⟨N, g⟩ is not a submodel of ⟨M, h⟩. However, ⟨N, E_g, S_g⟩ is a substructure of ⟨M, E_h, S_h⟩, since for each a ∈ N, g⁻¹(a) = h⁻¹(a) ∩ N.
For a ∈ M, let T_h(a) be the h-transitive closure of a, i.e., the smallest h-transitive subset of M containing a. The h-support of a, in symbols supt_h(a), is the set of all h-atoms contained in T_h(a). If A is a set of h-atoms, we say that a is an object with support in A if supt_h(a) ⊆ A. If supt_h(a) = ∅, we say that a is a hereditary h-set. It is not hard to see that the set of all objects in M with support in any given set of h-atoms is a h-closed and h-transitive subset of M. In particular, so is the set of all hereditary h-sets.
4. Well-founded (κ, λ)-models
Let λ be an infinite cardinal and let ᏹ = ⟨M, h⟩ be a (κ, λ)-model. The intersection of any family of h-closed subsets of M is a h-closed subset of M. Thus, for each set of h-atoms A there is a minimum h-closed subset of M including A, which we denote by ‘M_h(A)’, or, if no confusion is likely to arise, simply by ‘M(A)’. By ‘ᏹ(A)’ we denote the submodel of ᏹ with universe M(A). We describe M(A) by recursion. Let
A_0 = A,
A_{α+1} = A_α ∪ h“([A_α]^{<λ}),
A_α = ⋃_{β<α} A_β, for α limit.
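The first few stages of this recursion can be computed directly in a small case. The sketch below is an illustration of ours under simplifying assumptions: λ = ω, a single atom represented by the string "a", sets represented as Python frozensets, and h taken to be the identity map on finite sets. The function names finite_subsets and stages are hypothetical.

```python
from itertools import combinations

def finite_subsets(universe):
    """All subsets of a finite collection, returned as frozensets."""
    items = list(universe)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}

def stages(atoms, n):
    """Return [A_0, ..., A_n], where A_{k+1} = A_k ∪ h"[A_k]^{<ω}
    and h is assumed to be the identity on finite sets."""
    A = set(atoms)
    out = [set(A)]
    for _ in range(n):
        A = A | finite_subsets(A)
        out.append(set(A))
    return out

levels = stages({"a"}, 3)
sizes = [len(A) for A in levels]   # 1, 3, 9, 513
```

With one atom the stage sizes grow as 1, 3, 9, 513, because each new stage consists of the atom together with all subsets of the previous stage.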
Notice that each A_α is h-transitive and that A_α ⊆ A_β, whenever α < β. If λ is an infinite cardinal, let λ∗ be the least cardinal of cofinality ≥ λ. Thus
λ∗ = λ, if λ is regular,
λ∗ = λ⁺, if λ is singular.
We note that if κ is any cardinal such that κ^{<λ} = κ, then λ∗ ≤ κ.
Lemma 9: M(A) = A_{λ∗}. Hence, M(A) is h-transitive.
Proof: Since M(A) is the smallest h-closed subset of M including A and since A ⊆ A_{λ∗}, we have to show that (i) A_{λ∗} is h-closed, and (ii) A_{λ∗} ⊆ M(A). To show (i), let X ∈ [A_{λ∗}]^{<λ}. Since |X| < λ ≤ cf λ∗, there is α < λ∗ such that X ∈ [A_α]^{<λ}. Thus h(X) ∈ A_{α+1} ⊆ A_{λ∗}. To show (ii), one verifies by induction that for each ordinal α, A_α ⊆ M(A).
Lemma 10: E_h is well-founded on M(A).
Proof: Let X ⊆ M(A), X ≠ ∅. If X contains some h-atom, we are done. So suppose each member of X is a h-set. We must find a ∈ X such that h⁻¹(a) ∩ X = ∅. Let α be the least ordinal such that X ∩ A_α ≠ ∅ and let a ∈ X ∩ A_α. There is β < α such that h⁻¹(a) ⊆ A_β. By minimality of α, A_β ∩ X = ∅. So, h⁻¹(a) ∩ X = ∅.
We say that an object a ∈ M is hereditarily h-well-founded if E_h is well-founded in the h-transitive closure of a.
Lemma 11: M(A) is the set of all hereditarily h-well-founded objects in M with support in A.
Proof: Let N be the set of all hereditarily h-well-founded objects in M with support in A. We can easily see that N is h-closed, A ⊆ N and E_h is well-founded in N. Since N is h-closed and includes A, M(A) ⊆ N. For the converse, let a ∈ M(A). Since M(A) is h-transitive, T_h(a) ⊆ M(A), so that E_h is well-founded on T_h(a), i.e., a ∈ N.
Lemma 12: If ⟨N, g⟩ is a transitive submodel of ⟨M, h⟩, A is the set of g-atoms and E_g is well-founded on N, then N = M(A). Briefly, ᏹ(A) is the unique well-founded transitive submodel of ⟨M, h⟩ with set of atoms A.
Proof: Since N is h-transitive, A is a set of h-atoms. Hence, by minimality of M(A), M(A) ⊆ N. For the converse inclusion, assume a ∈ N. Since N is h-transitive, T_g(a) = T_h(a). Since E_g is well-founded on N, it is well-founded on T_g(a). But then, E_h is well-founded on T_h(a), i.e., a is hereditarily h-well-founded. Hence a ∈ M(A) and N ⊆ M(A).
As we shall now see, ᏹ(A) is the unique, up to isomorphism, well-founded λ-model with a set of atoms of the cardinality of A. First we compute the cardinality of M(A). Then we show that for any infinite cardinal λ there is a well-founded λ-model with a set of atoms of any prescribed cardinality. Finally, we prove that any two well-founded λ-models with sets of atoms of the same cardinality are isomorphic.
FACT 13. If |A| = µ, |M(A)| is the least cardinal κ ≥ µ such that κ^{<λ} = κ.
Proof: Let |A| = µ, |M(A)| = ν, and let κ be least such that κ ≥ µ and κ^{<λ} = κ. We show that κ = ν. Since M(A) is h-closed, ν^{<λ} = ν and, thus, λ ≤ ν. Since A ⊆ M(A), µ ≤ ν. Hence, by minimality of κ, κ ≤ ν. Since λ∗ ≤ κ (as cf κ ≥ λ), in order to know that ν ≤ κ it is enough to see that, for each α < λ∗, A_α has cardinality ≤ κ. This we see by induction on α: Let α < λ∗ and suppose that |A_β| ≤ κ for all β < α. Since α < λ∗ ≤ κ, |A| = µ ≤ κ and κ^{<λ} = κ, A_α is the union of at most κ sets of cardinality ≤ κ, thus its cardinality is at most κ.
Proposition 14: If λ and µ are cardinals, with λ infinite, there is at least one well-founded λ-model with exactly µ atoms.
Proof: Let κ be a cardinal such that µ ≤ κ and κ^{<λ} = κ. Let M be a set of cardinality κ, and let A be a subset of M of cardinality µ such that |M\A| = κ. Let h be any injective function on [M]^{<λ} onto M\A. Thus ᏹ = ⟨M, h⟩ is a (κ, λ)-model with A as the set of h-atoms. ᏹ(A) is the desired model.
Corollary 15: If λ is an infinite cardinal and κ is any cardinal such that κ^{<λ} = κ, there is a well-founded (κ, λ)-model.
Proof: By the proof of the preceding proposition and Fact 13.
Proposition 16: Let λ be an infinite cardinal and suppose that ᏹ = ⟨M, h⟩ and ᏺ = ⟨N, g⟩ are any two well-founded λ-models. Suppose as well that F is a bijection between the set A_h of h-atoms and the set A_g of g-atoms. Then F can be uniquely extended to an isomorphism F∗ between ᏹ and ᏺ. Hence, for each cardinal µ there is up to isomorphism exactly one well-founded λ-model with µ atoms.
Proof: Define F∗ : M → N by E_h-recursion:
F∗(a) = F(a), if a ∈ A_h,
F∗(a) = g({F∗(x) : x E_h a}), if a ∈ S_h.
First, we see that F∗ is injective by E_h-induction. The restriction of F∗ to A_h is clearly injective, so let a ∈ S_h and suppose that
(∀x E_h a)(∀y ∈ M)(F∗(x) = F∗(y) → x = y). (4)
We must see that for no b ≠ a is F∗(a) = F∗(b). So, assume that F∗(a) = F∗(b). Since g is injective,
{F∗(x) : x E_h a} = {F∗(x) : x E_h b}. (5)
We must conclude that a = b, or, equivalently, that h⁻¹(a) = h⁻¹(b). Suppose x ∈ h⁻¹(a), i.e., x E_h a. By (5), there is y ∈ h⁻¹(b) such that F∗(x) = F∗(y). By (4), x = y. Thus x ∈ h⁻¹(b). Consequently, h⁻¹(a) ⊆ h⁻¹(b). If, conversely, y ∈ h⁻¹(b), by (5) there is x ∈ h⁻¹(a) such that F∗(x) = F∗(y). Again by (4), x = y. Thus h⁻¹(b) ⊆ h⁻¹(a). Hence h⁻¹(a) = h⁻¹(b).
Now we check that F∗ is onto N by E_g-induction. Let b ∈ N and suppose that g⁻¹(b) ⊆ ran(F∗). We must conclude that b ∈ ran(F∗). Let X = {x ∈ M : F∗(x) ∈ g⁻¹(b)}. By our assumption on b, F∗“X = g⁻¹(b), and thus g(F∗“X) = b. Since |g⁻¹(b)| < λ and F∗ is injective, X ∈ [M]^{<λ}. Let a = h(X). By definition of F∗,
F∗(a) = g({F∗(x) : x E_h a}) = g(F∗“X) = b.
Hence b ∈ ran(F∗).
Finally, we show that F∗ satisfies the equation of isomorphism. Let X ∈ [M]^{<λ} and let a = h(X). Thus, a ∈ S_h. Hence, by the definition of F∗, and owing to the fact that h⁻¹(a) = {x ∈ M : x E_h a}, F∗(a) = g(F∗“(h⁻¹(a))). In other words, F∗(h(X)) = g(F∗“X), which is Equation (2) for F∗.
We thus have a complete structural description of all well-founded (κ, λ)-models, as well as of all well-founded submodels of any (κ, λ)-model. Before leaving this topic, we give standard representatives of the atomless well-founded λ-models. For each infinite cardinal λ, let H_λ be the set of all sets whose transitive closure has cardinality less than λ (thus, H_ω = HF).
Remark 17: If λ is a regular cardinal and ⟨M, h⟩ is an atomless well-founded λ-model, then ⟨M, E_h⟩ ≅ ⟨H_λ, ∈⟩.
Proof: Let λ be regular and suppose that ⟨M, h⟩ is a well-founded λ-model without atoms. By regularity of λ, [H_λ]^{<λ} = H_λ. Hence, if g is the identity map on H_λ, ⟨H_λ, g⟩ is a λ-model and E_g is the true membership relation on H_λ. Thus, ⟨H_λ, g⟩ is a well-founded λ-model without atoms. By uniqueness of such models, ⟨M, h⟩ is isomorphic to ⟨H_λ, g⟩.
We observe that the unique isomorphism between the structures ⟨M, E_h⟩ and ⟨H_λ, ∈⟩ is the function F : M → H_λ given by F(a) = {F(x) : x E_h a}.
Remark 18: If λ is singular and ⟨M, h⟩ is an atomless well-founded λ-model, then ⟨M, E_h⟩ ≅ ⟨S_{λ⁺}, ∈⟩, where S_0 = ∅, S_{α+1} = [S_α]^{<λ}, and, for α limit, S_α = ⋃_{β<α} S_β.
Proof: Similar to the preceding one.
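For λ = ω, the isomorphism F(a) = {F(x) : x E_h a} of Remark 17 can be computed explicitly. The sketch below is our own illustration, assuming the atomless well-founded (ω, ω)-model on the natural numbers given by the Ackermann coding h(X) = Σ 2^x (an assumption of this example, not a construction from the text). It decodes each number into a hereditarily finite set, represented as a nested frozenset, and checks Equation (2) with g the identity.

```python
def h(X):
    """Assumed model: Ackermann code of a finite set of naturals."""
    return sum(2 ** x for x in X)

def members(a):
    """The E_h-members of a: the positions of the 1-bits of a."""
    return [i for i in range(a.bit_length()) if (a >> i) & 1]

def F(a):
    """F(a) = {F(x) : x E_h a}, by recursion on E_h.
    The recursion terminates because every member of a is smaller than a."""
    return frozenset(F(x) for x in members(a))

# Equation (2) with g the identity on hereditarily finite sets:
# F(h(X)) must equal the image F"X.
for X in [set(), {0}, {0, 1}, {2, 3}, {1, 4}]:
    assert F(h(X)) == frozenset(F(x) for x in X)
```

Here F(0) is the empty set, F(1) is {∅}, and F(3) is {∅, {∅}}, so the numbers 0, 1, 3 play the role of the first three von Neumann ordinals.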
5. Non-well-founded (κ, λ)-models
Now that we have a complete description of all well-founded (κ, λ)-models, we direct our attention to the non-well-founded ones. Our aim is twofold. One purpose is to illustrate the variety and abundance of non-well-founded (κ, λ)-models for each pair of infinite cardinals κ, λ such that κ^{<λ} = κ. Some of them, namely, those for which λ is inaccessible, will yield non-well-founded models of second-order ZFCU−. The other purpose is to answer one question that immediately arises in view of the existence of non-well-founded (κ, λ)-models: Is it possible to build (κ, λ)-models violating foundation in prescribed ways? And, in particular, is it possible to build (κ, λ)-models satisfying Aczel’s anti-foundation axiom?
Let us tackle our first task first. We know that whenever κ and λ are infinite cardinals such that κ^{<λ} = κ, there is a well-founded (κ, λ)-model. However, for κ large relative to λ, such models must include many atoms. If we demand that our model ⟨M, h⟩ have strictly fewer than λ atoms (which means that there is a h-set whose E_h-members are all the h-atoms) then there is an upper bound on the cardinality of a λ-model, namely the least cardinal κ such that κ^{<λ} = κ. In general, as we see from Proposition 16, the cardinality of well-founded λ-models is determined by the cardinality of the set of atoms. Indeed, if µ is any cardinal and κ is least such that µ ≤ κ and κ^{<λ} = κ, then any well-founded λ-model with µ atoms has cardinality κ.
Now we show that there is no upper bound on the cardinality of non-well-founded λ-models without atoms. In fact, for each cardinal κ such that κ^{<λ} = κ there is an atomless (κ, λ)-model: if |M| = κ = κ^{<λ} and h is an injective function on [M]^{<λ} onto M, then ⟨M, h⟩ is such a model. If we want to make sure that the model is non-well-founded, we choose h so that for some X ∈ [M]^{<λ}, h(X) ∈ X. For if h(X) = a ∈ X, then a E_h a.
Clearly, an injective function h : [M]^{<λ} → M yields a non-well-founded model iff there is a sequence ⟨a_n : n ∈ ω⟩ of members of M such that for all n ∈ ω, a_{n+1} ∈ h⁻¹(a_n). Functions like these are easy to find. Fix a cardinal κ such that κ^{<λ} = κ and let M be any set of cardinality κ. Given any one-to-one sequence ⟨a_n : n ∈ ω⟩ of members of M, there certainly exists an injective function h : [M]^{<λ} → M such that for each n, h({a_{n+1}}) = a_n. Then ⟨a_n : n ∈ ω⟩ is an E_h-descending sequence in ⟨M, E_h⟩. If h is onto M, the model ⟨M, h⟩ is atomless. Again, given n > 1 and n distinct elements of M, a_1, a_2, …, a_n, let h : [M]^{<λ} → M be an injective function such that h({a_1}) = a_2, h({a_2}) = a_3, …, and h({a_n}) = a_1. Then in ⟨M, E_h⟩, a_1 E_h a_2 E_h … E_h a_n E_h a_1.
Now we show that there are many non-well-founded (κ, λ)-models.
Proposition 19: If λ is an infinite cardinal and κ is the least cardinal such that κ^{<λ} = κ, there are 2^λ pairwise non-isomorphic non-well-founded (κ, λ)-models.
Proof: Let S be a non-empty subset of λ and let M be a set of cardinality κ. Let A ⊆ M be a set bijectable with S, ⟨u_α : α ∈ S⟩ an injective enumeration of A, and let h : [M]^{<λ} → M be such that ⟨M, h⟩ is a well-founded (κ, λ)-model with set of atoms A, i.e., M = M(A). Consider the sequence of ‘h-ordinals’ ⟨a_α : α < λ⟩ such that for each α < λ, a_α = h({a_β : β < α}) (this plays the role of the ordinal sequence in ⟨M, E_h⟩). Now define g_S : [M]^{<λ} → M thus (letting g = g_S):
• if X = {a_α, u_α} for some α ∈ S, g(X) = u_α,
• otherwise, g(X) = h(X).
Since each u_α is a h-atom, g is injective. Moreover, ran(g) = M\{h({a_α, u_α}) : α ∈ S}, i.e., {h({a_α, u_α}) : α ∈ S} is the set of g-atoms. Let ᏹ_S = ⟨M, g⟩. We show that for any b ∈ M,
b E_g b ↔ (∃α ∈ S)(b = u_α). (6)
On the one hand, if α ∈ S, then u_α ∈ g⁻¹(u_α), so that u_α E_g u_α. On the other hand, if b ∈ ran(g) and b ≠ u_α, for every α ∈ S, then b = h(X) for some X ∈ [M]^{<λ}. But then b = g(X), so that b E_g b iff b E_h b. Since E_h is well-founded, ¬b E_g b.
From (6), we see that we can recover S from ᏹ_S: For (i) a_α is, both in ⟨M, h⟩ and in ⟨M, g⟩, ‘the α-th ordinal’ (i.e., the set {x ∈ M : x E_h a_α} = {x ∈ M : x E_g a_α} is h-transitive and g-transitive and is well-ordered both by E_h and by E_g with order-type α). This is true by construction for E_h and it is also true for E_g, since (as we can easily check by E_h-recursion) the identity map is an isomorphism between the atomless well-founded λ-models ⟨M(∅), E_h⟩ and ⟨M(∅), E_g⟩. And (ii) a_α is the unique E_g-member of u_α distinct from u_α. Thus: α ∈ S iff there are a, b ∈ M such that
1. g⁻¹(b) = {a, b},
2. g⁻¹(a) is g-transitive, and
3. ⟨g⁻¹(a), E_g⟩ is a well-order of type α.
It follows that if S and T are distinct non-empty subsets of λ, the models ᏹ_S and ᏹ_T thus constructed are non-isomorphic. Thus we have 2^λ non-well-founded (κ, λ)-models.
All these models have atoms (the atoms in ᏹ_S being the h({a_α, u_α})’s, for α ∈ S). Now we modify the construction to get rid of them.
Proposition 20: If λ is an infinite cardinal and κ is the least cardinal such that κ^{<λ} = κ, there are 2^λ pairwise non-isomorphic non-well-founded (κ, λ)-models without atoms.
Proof: If λ is uncountable, let S be an infinite set of limit ordinals less than λ, while if λ = ω, let S be an infinite set of even numbers. As before, let A ⊆ M be a set of cardinality |S|, and let ᏹ = ⟨M, h⟩ be a well-founded λ-model with set of atoms A. Let T = {α + 1 : α ∈ S} (thus S ∩ T = ∅) and let ⟨u_α : α ∈ S ∪ T⟩ be an injective enumeration of A. Define the sequence ⟨a_α : α < λ⟩ of ‘h-ordinals’ as before, and let, for each α ∈ S ∪ T, b_α = h({a_α, u_α}). Notice that for all α, β ∈ S ∪ T, (i) b_α ≠ u_β, since u_β ∉ ran(h), and (ii) b_α ∉ a_β, since supt_h(a_α) = ∅, while u_α ∈ supt_h(b_α).
Let finally j be a bijection j : T → {u_α : α ∈ T} ∪ {b_α : α ∈ S ∪ T} such that j(α) ≠ u_α, for all α ∈ T. Now we can define g_S : [M]^{<λ} → M thus (letting g = g_S):
• g({a_α, u_α}) = u_α, if α ∈ S,
• g({a_α, u_α}) = j(α), if α ∈ T,
• g(X) = h(X), for any other X ∈ [M]^{<λ}.
By (i) and (ii) we see that g is one-to-one. We now verify that g is onto M, and thus S_g = M. Certainly, M = A ∪ ran(h). Now if a ∈ A, a = u_α, for some α ∈ S ∪ T. If α ∈ S, u_α = g({a_α, u_α}), while if α ∈ T, u_α = j(β) for some β ∈ T, and then u_α = g({a_β, u_β}). Thus A ⊆ ran(g). Now suppose a ∈ ran(h). If a = b_α, for some α ∈ S ∪ T, then a = j(β), for some β ∈ T, so that a = g({a_β, u_β}). If, for all α, a ≠ b_α, then a = h(X) for some X ≠ {a_α, u_α} (α ∈ S ∪ T). But in this case, a = g(X).
As before, for every b ∈ M, b E_g b iff b = u_α, for some α ∈ S. Also as before, S can be retrieved from g. Thus, there are 2^λ pairwise non-isomorphic (κ, λ)-models with no atoms.
Example 3: The construction in the proof of the preceding proposition works for every infinite cardinal λ. We now sketch a simpler construction of 2^ω non-isomorphic (ω, ω)-models with no atoms. Let A be the set of the positive even numbers. If S is an infinite set of natural numbers, partition A as A = ⋃_{n∈S} A_n, where for each n ∈ S, A_n is an n-element set: A_n = {a_n^1, a_n^2, …, a_n^n}. Let now h = h_S be any bijection h : [ω]^{<ω} → ω such that
• if n is odd, h({n}) = 3n,
• h({a_n^i}) = a_n^{i+1}, if 1 ≤ i < n,
• h({a_n^n}) = a_n^1.
It is not hard to see that S can be retrieved from ᏹ_S = ⟨ω, h_S⟩, because n ∈ S iff there is an ‘n-cycle of E_h-singletons’ (namely, a_n^1, …, a_n^n), i.e., there are x_1, x_2, …, x_n ∈ ω such that ∀y(y E_h x_2 ↔ y = x_1), ∀y(y E_h x_3 ↔ y = x_2), …, ∀y(y E_h x_1 ↔ y = x_n). As before, if S ≠ T, ᏹ_S and ᏹ_T are not isomorphic.
Corollary 21: For each inaccessible cardinal λ, there are 2^λ pairwise non-isomorphic non-well-founded (λ, λ)-models of ZFCU− without atoms.
We conclude that there are non-well-founded λ-models in great abundance. Now we turn to the task of building non-well-founded models with special characteristics.
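The remark made earlier in this section, that a model is non-well-founded as soon as h(X) ∈ X for some X, can be seen in a very small case. The following sketch is an illustration of ours and not one of the constructions above: it assumes the Ackermann coding h(X) = Σ 2^x on the natural numbers as a starting atomless well-founded (ω, ω)-model, and composes it with the transposition of 0 and 1, so that the new map g satisfies g({0}) = 0. The names swap, g, g_inv and E_g are hypothetical.

```python
def h(X):
    """Assumed starting model: Ackermann code of a finite set of naturals."""
    return sum(2 ** x for x in X)

def swap(y, p, q):
    """The transposition exchanging p and q, applied to y."""
    return q if y == p else p if y == q else y

def g(X):
    """g = (0 1) composed with h; still a bijection onto the naturals."""
    return swap(h(X), 0, 1)

def g_inv(y):
    """The finite set X with g(X) = y (a transposition is its own inverse)."""
    z = swap(y, 0, 1)
    return frozenset(i for i in range(z.bit_length()) if (z >> i) & 1)

def E_g(x, y):
    """g-membership: x E_g y iff x belongs to g_inv(y)."""
    return x in g_inv(y)

# g({0}) = 0, so 0 is a g-set that is its own only member.
self_members = [x for x in range(64) if E_g(x, x)]
```

Since g is onto, the resulting model is atomless, and among the numbers below 64 only 0 is a member of itself.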
6. Non-well-founded models and extensional graphs
If ⟨M, h⟩ is a (κ, λ)-model and a ∈ M, we say that a is a h-transitive h-set if h⁻¹(a) is a h-transitive subset of M. Thus, a is h-transitive iff whenever x E_h a and y E_h x, also y E_h a. We will now show that for every infinite cardinal λ, there is a λ-model ⟨M, h⟩ without atoms in which every extensional relation on a set of cardinality < λ is isomorphic to the E_h-relation on some h-transitive h-set. As is customary in the literature on non-well-founded sets, we often speak of graphs rather than relations. Thus we define:
A graph is a pair ⟨A, ◁⟩, where A is a non-empty set and ◁ is a binary relation on A. A graph ⟨A, ◁⟩ is extensional if whenever a, b ∈ A,
(∀x ∈ A)(x ◁ a ↔ x ◁ b) → a = b.
If ⟨A, ◁⟩ is a graph and a ∈ A, let (a)_◁ = {x ∈ A : x ◁ a}. Thus ⟨A, ◁⟩ is extensional iff the map a ↦ (a)_◁ is injective.
We say that a graph ⟨A, ◁⟩ is represented in a λ-model ⟨M, h⟩ by a h-set a ∈ M if ⟨A, ◁⟩ ≅ ⟨h⁻¹(a), E_h⟩. First, we deal with one single graph.
Lemma 22: Given any extensional graph ⟨A, ◁⟩ and any infinite cardinal λ > |A|, there is a λ-model ⟨M, h⟩ with no atoms in which ⟨A, ◁⟩ is represented by a h-transitive h-set.
Proof: Let κ be such that κ^{<λ} = κ and let M be a set of cardinality κ including A. Let a ∈ M\A. Since ⟨A, ◁⟩ is extensional, (x)_◁ ≠ (y)_◁, whenever x ≠ y. Let h be any bijection h : [M]^{<λ} → M such that
(1) h(A) = a,
(2) h((x)_◁) = x, for x ∈ A.
Since |A| < κ, such an h certainly exists. Since h is onto M, every element of M is a h-set. By (1), A = h⁻¹(a) = {x : x E_h a}, and, by (2), for all x, y ∈ A, x ◁ y ↔ x E_h y. Hence, ⟨h⁻¹(a), E_h⟩ = ⟨A, ◁⟩. Finally, a is h-transitive because if y E_h x E_h a, then certainly y ∈ A, i.e., y ∈ h⁻¹(a); that is, y E_h a.
Let ⟨A, ◁⟩ be an extensional graph. A subset B of A is ◁-transitive if whenever x ∈ B and y ◁ x, then y ∈ B. Let W be the union of all ◁-transitive subsets of A in which ◁ is well-founded. It is not hard to see that W is ◁-transitive and that ◁ is well-founded in W; thus W is the largest ◁-transitive subset of A in which ◁ is well-founded. By Mostowski’s collapsing theorem, the extensional subgraph ⟨W, ◁⟩ is isomorphic to a transitive set, i.e., there is a transitive set B such that ⟨W, ◁⟩ ≅ ⟨B, ∈⟩. From these remarks, we easily get the following lemma.
Lemma 23: If ⟨A, ◁⟩ is an extensional graph, there is a set B and a relation S on B such that (1) ⟨A, ◁⟩ ≅ ⟨B, S⟩ and (2) if B_0 is the largest S-transitive subset of B on which S is well-founded, then B_0 is a transitive set and S ∩ (B_0 × B_0) is the membership relation on B_0.
A top point of a graph ⟨A, ◁⟩ is a member a ∈ A such that (1) for no x ∈ A is a ◁ x, and (2) for all x ∈ A, x ◁ a iff x ≠ a. Clearly, a graph ⟨A, ◁⟩ has at most one top point. If a is the top point of the graph ⟨A, ◁⟩, the lower part of ⟨A, ◁⟩ is the subgraph of ⟨A, ◁⟩ with domain A\{a}.
Theorem 24: Let λ be an infinite cardinal and let κ be a cardinal such that κ^{<λ} = κ. There is a (κ, λ)-model without atoms ⟨M, h⟩ such that every extensional graph of cardinality less than λ is represented in ⟨M, h⟩ by some h-transitive h-set.
Proof: Since κ^{<λ} = κ, we see, with the help of Lemma 23, that there is a κ-sequence of triples,
⟨⟨X_ξ, ◁_ξ, a_ξ⟩ : ξ < κ⟩,
where |X_ξ| < λ, ◁_ξ ⊆ X_ξ × X_ξ, a_ξ ∈ X_ξ, and, letting X_ξ^0 be the largest ◁_ξ-transitive subset of X_ξ on which ◁_ξ is well-founded and X_ξ^1 = X_ξ \ X_ξ^0,
1. ⟨X_ξ, ◁_ξ⟩ is an extensional graph with top point a_ξ,
2. X_ξ^0 is a transitive set and ◁_ξ restricted to it is the membership relation,
3. X_ξ^1 ∩ X_η^1 = ∅, for ξ < η < κ,
4. X_ξ^0 ∩ X_η^1 = ∅, for ξ ≤ η < κ,
5. Each extensional graph of cardinality less than λ is isomorphic to the lower part of some ⟨X_ξ, ◁_ξ⟩.
Since each graph ⟨X_ξ, ◁_ξ⟩ is extensional, if x, y ∈ X_ξ, then x = y iff (x)_{◁_ξ} = (y)_{◁_ξ}. So we can define the injective functions f_ξ : {(x)_{◁_ξ} : x ∈ X_ξ} → X_ξ by f_ξ((x)_{◁_ξ}) = x. These functions are mutually compatible. For if ξ ≠ η and (x)_{◁_ξ} = (y)_{◁_η}, then, by conditions 3 and 4, x ∈ X_ξ^0 and y ∈ X_η^0; but then by condition 2, x = (x)_{◁_ξ} and y = (y)_{◁_η}, so that f_ξ((x)_{◁_ξ}) = x = y = f_η((y)_{◁_η}).
Let X = ⋃_{ξ<κ} X_ξ and f = ⋃_{ξ<κ} f_ξ. So |X| ≤ κ and f is an injective function on a subset of [X]^{<λ} onto X. Let now M be a set including X and such that |M\X| = κ. Since |[M]^{<λ} \ dom(f)| = κ, we can extend f to an injective function h on [M]^{<λ} onto M. The (κ, λ)-model ⟨M, h⟩ is the one we are looking for. For let ⟨A, ◁⟩ be an extensional graph with |A| < λ. By condition 5, there is ξ < κ such that ⟨A, ◁⟩ is isomorphic to the lower part of ⟨X_ξ, ◁_ξ⟩. Let b_ξ = h((a_ξ)_{◁_ξ}). By definition of h, x ∈ h⁻¹(b_ξ) iff x ∈ X_ξ\{a_ξ}, and, for x, y ∈ h⁻¹(b_ξ),
x E_h y iff x ∈ (y)_{◁_ξ} iff x ◁_ξ y,
so that ⟨A, ◁⟩ ≅ ⟨h⁻¹(a_ξ), E_h⟩.
We are left to show that b_ξ is h-transitive. Suppose x E_h y and y E_h b_ξ. This means that x ∈ (y)_{◁_ξ} and y ∈ (a_ξ)_{◁_ξ}. But since a_ξ is the top point of X_ξ, x ∈ (a_ξ)_{◁_ξ}, i.e., x E_h b_ξ.
Remark 25: Given a graph ⟨A, ◁⟩, there is a set B ⊇ A and a relation S on B such that the graph ⟨B, S⟩ is extensional and ◁ = S ∩ (A × A). Moreover, if A is finite, so is B, and if A is infinite, |A| = |B|.
Proof: Let B = A ∪ A∗ ∪ {u}, where A∗ = {a∗ : a ∈ A} is a set disjoint from A, a∗ ≠ b∗, for a ≠ b, and u ∉ A ∪ A∗. Further, let L be a strict linear order on A∗ ∪ {u} with minimum element u. Define the relation S on B by:
S = ◁ ∪ {⟨a∗, a⟩ : a ∈ A} ∪ L.
It is obvious that S ∩ (A × A) = ◁, and a routine check shows that ⟨B, S⟩ is extensional, since linear orders are.
Corollary 26: If κ and λ are infinite cardinals such that κ^{<λ} = κ, there is a (κ, λ)-model without atoms ⟨M, h⟩ in which every graph of cardinality less than λ is represented.
Proof: By Theorem 24, in view of Remark 25.
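The construction in the proof of Remark 25 is easy to carry out for finite graphs. The sketch below is a minimal illustration under assumptions of ours: a relation is stored as a set of pairs (x, a) meaning that x is related to a, the fresh copies a∗ are encoded as tuples ("star", a), the fresh point u as ("u",), and none of these tuples is assumed to occur in A. The function names are hypothetical.

```python
def ext(rel, a):
    """(a)_rel = {x : x rel a}, for rel given as a set of pairs (x, a)."""
    return frozenset(x for (x, y) in rel if y == a)

def is_extensional(nodes, rel):
    """A graph is extensional iff distinct nodes have distinct extensions."""
    exts = [ext(rel, a) for a in nodes]
    return len(set(exts)) == len(exts)

def extensional_extension(A, rel):
    """B = A ∪ A* ∪ {u} and S = rel ∪ {(a*, a)} ∪ L, as in Remark 25."""
    A = list(A)
    star = {a: ("star", a) for a in A}      # a*: a fresh copy of a (assumed new)
    u = ("u",)                              # a fresh point (assumed new)
    chain = [u] + [star[a] for a in A]      # L: strict linear order, u least
    L = {(chain[i], chain[j])
         for i in range(len(chain))
         for j in range(i + 1, len(chain))}
    S = set(rel) | {(star[a], a) for a in A} | L
    B = A + [star[a] for a in A] + [u]
    return B, S

# Two points with empty extension: not extensional until extended.
B, S = extensional_extension([1, 2], set())
```

In this small case the two original points, which both had empty extension, are separated in ⟨B, S⟩ because each acquires its own copy a∗ as a member.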
7. Models of AFA
As Corollary 26 makes clear, the (κ, λ)-models M, h whose existence was proven in Theorem 24 are very rich in non-well-founded h-sets, since every graph of cardinality less than λ is isomorphic to the E h -relation on some h-set. If λ is inaccessible, the structure M, E h is a model of second-order ZFC− plus Boffa’s Weak Axiom BA1 discussed in p. 57 of [1]. This is the variant of anti-foundation that seems to us most telling about Fregean extensions, since it brings to light the arbitrariness with which extensions can be assigned to concepts. Although it is not our purpose here to investigate what forms
324
The Arché Papers on the Mathematics of Abstraction
of anti-foundation are satisfiable in (κ, λ)-models, we nevertheless want to look at Aczel’s anti-foundation axiom, AFA, mainly because it is the one most discussed in the literature. We show how to obtain a λ-model of AFA from any λ-model in which every graph of cardinality less than λ is represented. Although we mean our treatment to be self-contained, we refer to [1] for details and motivation.
A bisimulation on a graph 𝒜 = ⟨A, ◁⟩ is a relation R on A such that whenever aRb, (∀x ◁ a)(∃y ◁ b) xRy and (∀y ◁ b)(∃x ◁ a) xRy. It is obvious that the union of any family of bisimulations on 𝒜 is a bisimulation on 𝒜. Consequently, there is a largest bisimulation on 𝒜, namely, the relation ≡_𝒜 defined by: a ≡_𝒜 b iff there is a bisimulation R on 𝒜 such that aRb. Clearly, Id_A, the identity relation on A, is a bisimulation on 𝒜, and the inverse relation of a bisimulation on 𝒜 is a bisimulation on 𝒜. Moreover, if R and S are bisimulations on 𝒜, so is their relative product R|S = {⟨x, y⟩ : ∃z(xRz ∧ zSy)}. From this we conclude that ≡_𝒜 is an equivalence relation on A. We say that a graph 𝒜 is strongly extensional if ≡_𝒜 is the identity on A, in other words, if every bisimulation on 𝒜 is a subrelation of Id_A.
Remark 27: Every strongly extensional graph is extensional.
Proof: Let 𝒜 = ⟨A, ◁⟩ be a graph and define the relation R on A by aRb iff (∀x ∈ A)(x ◁ a ↔ x ◁ b). R is a bisimulation on 𝒜. Hence, if 𝒜 is strongly extensional and aRb, then a = b. Thus, 𝒜 is extensional.
Let 𝒜 = ⟨A, ◁_𝒜⟩ be a graph and let B be a quotient of A by ≡_𝒜 with quotient map π. This means that π is a map on A onto B such that for every a, b ∈ A, a ≡_𝒜 b iff π(a) = π(b). We define the relation ◁_ℬ on B by:
u ◁_ℬ v iff (∃a, b ∈ A)(π(a) = u ∧ π(b) = v ∧ a ◁_𝒜 b)   (7)
and let ℬ = ⟨B, ◁_ℬ⟩.
Lemma 28: For all b ∈ A, {u ∈ B : u ◁_ℬ π(b)} = {π(a) : a ◁_𝒜 b}.
Proof: We have to show that for all a, b ∈ A and u ∈ B,
(i) a ◁_𝒜 b → π(a) ◁_ℬ π(b), and
(ii) u ◁_ℬ π(b) → (∃a ◁_𝒜 b)(u = π(a)).
Since (i) is immediate by (7), we turn to (ii). Let u ◁_ℬ π(b). By (7), there are a′, b′ ∈ A such that u = π(a′), π(b′) = π(b) and a′ ◁_𝒜 b′. Since b′ ≡_𝒜 b and ≡_𝒜 is a bisimulation on 𝒜, from a′ ◁_𝒜 b′ we conclude that there is a ◁_𝒜 b such that a′ ≡_𝒜 a; hence π(a) = u.
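For finite graphs, the largest bisimulation and the quotient defined in (7) can be computed directly. The following Python sketch is only an illustration under assumptions of ours, not part of the paper's construction: the graph is finite, it is encoded by a dictionary `members` sending each node a to the set of nodes that bear the graph relation to a, and all function and variable names are hypothetical. The procedure starts from the full relation on the nodes and discards pairs that violate the two bisimulation clauses until nothing changes; in the finite case what remains is the largest bisimulation.

```python
def largest_bisimulation(nodes, members):
    """Largest bisimulation on a finite graph (assumed encoding: members[a]
    is the set of nodes x standing in the graph relation to a)."""
    rel = {(a, b) for a in nodes for b in nodes}
    changed = True
    while changed:
        changed = False
        for a, b in list(rel):
            # every member of a is matched by a related member of b ...
            forth = all(any((x, y) in rel for y in members[b]) for x in members[a])
            # ... and every member of b by a related member of a
            back = all(any((x, y) in rel for x in members[a]) for y in members[b])
            if not (forth and back):
                rel.discard((a, b))
                changed = True
    return rel


def quotient(nodes, members):
    """Quotient graph as in (7); pi maps each node to its equivalence class."""
    rel = largest_bisimulation(nodes, members)
    pi = {a: frozenset(b for b in nodes if (a, b) in rel) for a in nodes}
    q_nodes = set(pi.values())
    q_members = {u: set() for u in q_nodes}
    for b in nodes:
        for a in members[b]:
            q_members[pi[b]].add(pi[a])  # pi(a) is a member of pi(b)
    return q_nodes, q_members, pi


# Hypothetical example: 0 is its own only member, 1 and 2 contain each
# other, 3 has no members, and 4 has members 0 and 3.
nodes = [0, 1, 2, 3, 4]
members = {0: {0}, 1: {2}, 2: {1}, 3: set(), 4: {0, 3}}
q_nodes, q_members, pi = quotient(nodes, members)

# The equation of Lemma 28, checked node by node.
lemma_28 = all(q_members[pi[b]] == {pi[a] for a in members[b]} for b in nodes)

# The quotient is strongly extensional: its largest bisimulation is the identity.
identity = {(u, u) for u in q_nodes}
strongly_extensional = largest_bisimulation(q_nodes, q_members) == identity
```

In this example the nodes 0, 1 and 2 are identified, so the quotient has three nodes, and both checks come out true.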
Lemma 29: ℬ is strongly extensional.
Proof: If R is a bisimulation on ℬ, then the relation {⟨a, b⟩ : π(a)Rπ(b)} is a bisimulation on 𝒜. Accordingly, if π(a)Rπ(b), then a ≡_𝒜 b; hence π(a) = π(b).
Let ℳ = ⟨M, h⟩ be a λ-model. Thus, ⟨M, E_h⟩ is a graph. Let N be the quotient of ⟨M, E_h⟩ by ≡_ℳ with quotient map π. Define the relation E_N as in (7):
u E_N v iff (∃a, b ∈ M)(π(a) = u ∧ π(b) = v ∧ a E_h b).   (8)
For each v ∈ N, let X_v = {u ∈ N : u E_N v}. By Lemma 28, we know that for each a ∈ M,
X_π(a) = {π(b) : b E_h a},   (9)
so that, for all v ∈ N, |X_v| < λ. By Lemma 29, ⟨N, E_N⟩ is strongly extensional. But the relation {⟨u, v⟩ : X_u = X_v} is clearly a bisimulation on ⟨N, E_N⟩. Hence
X_u = X_v → u = v.   (10)
Let now X be any subset of N of cardinality < λ. Let Y ∈ [M]^{<λ} be such that π“Y = X and let a = h(Y). We have: X = {π(b) : b ∈ Y} = {π(b) : b E_h a}. Hence, by (9), X = X_π(a). Thus, by (10), for each X ∈ [N]^{<λ} there is a unique v ∈ N such that X = X_v. We define the function g : [N]^{<λ} → N by
g(X) = the unique u ∈ N such that X = X_u.
The inverse relation of g being the function u ↦ X_u, g is injective and onto N. Hence 𝒩 = ⟨N, g⟩ is a λ-model with no atoms and E_g = E_N. We also refer to 𝒩 as the quotient of ℳ by ≡_ℳ with quotient map π.
Lemma 30: Every g-transitive g-set is strongly extensional. To be explicit, if a is a g-set and g⁻¹(a) is a g-transitive subset of N, then the graph ⟨g⁻¹(a), E_g⟩ is strongly extensional.
Proof: If a is a g-transitive g-set, every bisimulation on ⟨g⁻¹(a), E_g⟩ is also a bisimulation on ⟨N, E_g⟩, hence a subrelation of the identity.
Lemma 31: If ℳ = ⟨M, h⟩ is a λ-model in which every extensional graph of cardinality < λ is represented by an h-transitive h-set, and 𝒩 = ⟨N, g⟩ is the quotient of ℳ by ≡_ℳ with quotient map π, then every strongly extensional graph of cardinality < λ is represented in 𝒩 by a g-transitive g-set.
Proof: Let ℳ, 𝒩 and π be as stated and let 𝒜 = ⟨A, ◁⟩ be a strongly extensional graph with |A| < λ. By assumption, there is an isomorphism σ between 𝒜 and the graph ⟨h⁻¹(a), E_h⟩, for some h-set a. Let b = π(a). Since 𝒜 is strongly extensional, so is ⟨h⁻¹(a), E_h⟩, but then the restriction of π to h⁻¹(a)
is an isomorphism between ⟨h⁻¹(a), E_h⟩ and ⟨g⁻¹(b), E_g⟩. Hence π ◦ σ is an isomorphism between 𝒜 and ⟨g⁻¹(b), E_g⟩.
Theorem 32: For every infinite cardinal λ there is a λ-model ⟨N, g⟩ such that (i) every g-transitive g-set is strongly extensional, and (ii) every strongly extensional graph of cardinality < λ is represented in ⟨N, g⟩ by a g-transitive g-set.
Proof: By Corollary 26 and Lemmas 30 and 31.
As we show in Appendix 2, Aczel’s anti-foundation axiom (AFA) is equivalent in ZFC− to the joint assertion that (i) every transitive set is strongly extensional, and (ii) every strongly extensional graph is isomorphic to a transitive set.
ZFA is the theory whose axioms are those of ZFC− plus AFA. Since, whenever λ is inaccessible and ⟨M, h⟩ is an atomless λ-model, ⟨M, E_h⟩ is a model of ZFC−, the preceding theorem yields the next.
Theorem 33: For each inaccessible cardinal λ, there is a (λ, λ)-model of second-order ZFA.
Acknowledgments
Some of the results contained in this paper were first presented at an Arché Workshop at the University of St Andrews and at the 2002 Annual Meeting of the Association for Symbolic Logic in Las Vegas. Others were discussed at the Logic Seminar of the University of Barcelona. A version of this paper was presented at the 12th International Congress of Logic, Methodology and Philosophy of Science at Oviedo, Spain. We are grateful to the audiences at each meeting for helpful discussion. We are also grateful to Roy Cook for comments on an earlier draft. The first author gratefully acknowledges partial support from the Spanish DGICYT under grant BFM2002-03236.
Appendix 1. λ-Models and axioms of ZFU−
The purpose of this appendix is to provide a proof of Proposition 5. So, let ℳ = ⟨M, h⟩ be a λ-model, where λ is infinite. We must show that the structure ⟨M, E_h, S_h⟩ satisfies the second-order axioms of extensionality, separation, empty set, pair, replacement and choice, and that (1) it satisfies the axiom of infinity iff λ is uncountable, (2) it satisfies the axiom of union iff λ is regular, and (3) it satisfies the axiom of power set iff λ is a strong limit cardinal. Let us consider each axiom in turn:
Extensionality. If a, b ∈ S_h and ∀x ∈ M(x E_h a ↔ x E_h b), then h⁻¹(a) = h⁻¹(b). Since h is injective, a = b.
Separation. Let X ⊆ M and a ∈ S_h. We must show that there is some b ∈ S_h such that:
∀x ∈ M(x E_h b ↔ x E_h a ∧ x ∈ X).
Now, |h⁻¹(a) ∩ X| < λ. Let b = h(h⁻¹(a) ∩ X). We have:
x E_h b ↔ x ∈ h⁻¹(a) ∧ x ∈ X ↔ x E_h a ∧ x ∈ X.
Empty set. Since λ > 0, Ø ∈ [M]^{<λ}. Let a = h(Ø). Then a ∈ S_h and for no x ∈ S_h, x E_h a.
Pair set. Let a, b ∈ M. Since λ > 2, {a, b} ∈ [M]^{<λ}. Let c = h({a, b}). Then c ∈ S_h and for all x ∈ S_h, x E_h c iff x = a ∨ x = b.
Replacement. Let F : M → M and a ∈ S_h. We must show that there is some b ∈ S_h such that:
∀y ∈ M(y E_h b ↔ ∃x(x E_h a ∧ F(x) = y)).
Since |h⁻¹(a)| < λ, |F“h⁻¹(a)| < λ. Let b = h(F“h⁻¹(a)). We have:
y E_h b ↔ y ∈ F“h⁻¹(a) ↔ ∃x(x ∈ h⁻¹(a) ∧ F(x) = y) ↔ ∃x(x E_h a ∧ F(x) = y).
Choice. We deal with the version of the axiom of choice according to which if a is a disjointed set of non-empty sets, there is a set b which has exactly one element in common with each member of a. So suppose that a ∈ S_h is such that
1. ∀x ∈ M(x E_h a → ∃y(y E_h x)), and
2. ∀x, y ∈ M(x E_h a ∧ y E_h a ∧ x ≠ y → ¬∃z(z E_h x ∧ z E_h y)).
We must show that there is some b ∈ S_h such that:
∀x ∈ M(x E_h a → ∃!z(z E_h b ∧ z E_h x)).
Consider the set A = {h⁻¹(x) : x ∈ h⁻¹(a)}. By assumptions 1 and 2, A is a disjointed set of non-empty sets and |A| < λ. By the axiom of choice, there is a B ⊆ ⋃A such that |B ∩ h⁻¹(x)| = 1, for all x ∈ h⁻¹(a). Since |A| < λ, B ∈ [M]^{<λ}. Let b = h(B). For all x ∈ M,
x E_h a → x ∈ h⁻¹(a) → ∃!z(z ∈ B ∩ h⁻¹(x)) → ∃!z(z E_h b ∧ z E_h x).
Infinity. If the axiom of infinity holds in ⟨M, E_h, S_h⟩, then [M]^{<λ} certainly contains an infinite set. So λ must be uncountable.
For the converse, assume λ is uncountable. First we see that each h-set a has a successor, s(a), i.e., that there is an h-set b such that ∀x ∈ M(x E_h b ↔ x E_h a ∨ x = a). For, since a ∈ S_h, h⁻¹(a) ∈ [M]^{<λ}. Since λ ≥ ω, h⁻¹(a) ∪ {a} ∈ [M]^{<λ}. Let b = h(h⁻¹(a) ∪ {a}). For all x ∈ M, x E_h b iff x ∈ h⁻¹(a) ∨ x = a. Thus x E_h b iff x E_h a ∨ x = a. Let 0_h = h(Ø) and let A ⊆ M be the closure of {0_h} under successor. Since A is countable, A ∈ [M]^{<λ}. Let c = h(A). We check that c satisfies the axiom of infinity. Since 0_h ∈ A, 0_h E_h c. Further, if x E_h c, then x ∈ A; thus the successor of x, s(x), is a member of A and s(x) E_h c.
Union. It is clear that if λ is singular, union fails. For the converse, assume that λ is regular. Suppose a ∈ S_h. We will show that there is b ∈ S_h such that for all x ∈ M,
x E_h b ↔ ∃y(y E_h a ∧ x E_h y).
Let A = {h⁻¹(y) : y ∈ S_h ∧ y ∈ h⁻¹(a)}. Since both A and all members of A have cardinality < λ and λ is regular, ⋃A ∈ [M]^{<λ}. Let b = h(⋃A). For all x ∈ M,
x E_h b ↔ x ∈ ⋃A
↔ ∃z(z ∈ A ∧ x ∈ z)
↔ ∃z∃y(y ∈ S_h ∧ y ∈ h⁻¹(a) ∧ z = h⁻¹(y) ∧ x ∈ z)
↔ ∃y(y ∈ S_h ∧ y ∈ h⁻¹(a) ∧ x ∈ h⁻¹(y))
↔ ∃y(y E_h a ∧ x E_h y).
Power set. If there is X ∈ [M]^{<λ} whose power set has cardinality ≥ λ, then the power set axiom will fail for h(X) in ⟨M, E_h, S_h⟩. Thus, if the power set axiom holds, λ is a strong limit. For the converse, assume that λ is a strong limit cardinal. Suppose a ∈ S_h. We will show that there is a b ∈ S_h such that for all x ∈ M,
x E_h b ↔ ∀y(y E_h x → y E_h a).
Let B = {h(X) : X ∈ P(h⁻¹(a))}. Since h⁻¹(a) ∈ [M]^{<λ} and λ is a strong limit, B ∈ [M]^{<λ}. Let b = h(B). For all x ∈ M,
x E_h b ↔ x ∈ B
↔ ∃X(X ⊆ h⁻¹(a) ∧ x = h(X))
↔ h⁻¹(x) ⊆ h⁻¹(a)
↔ ∀y(y E_h x → y E_h a).
Appendix 2. AFA
The usual formulation of AFA is that every graph has a unique decoration. Now we define a decoration and show that the usual formulation is equivalent, in ZFC minus foundation, to the one we have given. For details, see [1]. A decoration of a graph 𝒜 = ⟨A, ◁⟩ is a function d on A such that for every a ∈ A,
d(a) = {d(b) : b ◁ a}.   (11)
We show in ZFC− that α₁ iff β₁, and α₂ iff β₂, where
α₁: every graph has at least one decoration,
α₂: every graph has at most one decoration,
β₁: every strongly extensional graph is isomorphic to a transitive set,
β₂: every transitive set is strongly extensional.
(α₁ ⇒ β₁) If 𝒜 = ⟨A, ◁⟩ is a graph and d is a decoration of 𝒜, then d“A is a transitive set. If 𝒜 is strongly extensional, d is injective (since the relation {⟨a, b⟩ : d(a) = d(b)} is a bisimulation on 𝒜), hence an isomorphism between 𝒜 and ⟨d“A, ∈⟩.
(α₂ ⇒ β₂) Let a be a transitive set and let R be a bisimulation on a. Define the relation ≺ on R by ⟨x, y⟩ ≺ ⟨u, v⟩ iff x ∈ u ∧ y ∈ v. Since a is transitive, each of the maps d₁ and d₂ on R defined by d₁(⟨x, y⟩) = x and d₂(⟨x, y⟩) = y is a decoration of ⟨R, ≺⟩. Thus, by α₂, d₁ = d₂, i.e., xRy → x = y. That is, ⟨a, ∈⟩ is strongly extensional.
(β₁ ⇒ α₁) Let 𝒜 = ⟨A, ◁_A⟩ be a graph. Let ℬ = ⟨B, ◁_B⟩ be its quotient by ≡_𝒜 with quotient map π. By β₁, there is a transitive set a and an isomorphism σ between ℬ and ⟨a, ∈⟩. The composition σ ◦ π is a decoration of 𝒜.
(β₂ ⇒ α₂) Suppose d₁ and d₂ are decorations of the graph 𝒜 = ⟨A, ◁_A⟩. Let a₁ = d₁“A and a₂ = d₂“A. Since a₁ and a₂ are transitive sets, so is their union a = a₁ ∪ a₂. Let R = {⟨d₁(x), d₂(x)⟩ : x ∈ A}. R is a bisimulation on a. Since a is strongly extensional, R ⊆ Id_a, so that d₁(x) = d₂(x), for all x ∈ A, i.e., d₁ = d₂.
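The defining equation (11) of a decoration can be tried out on small examples. The sketch below is ours and rests on a strong simplifying assumption: the graph is finite and acyclic, so that a decoration already exists and is unique without AFA (this is the Mostowski collapse), and it can be built from Python frozensets. Non-well-founded graphs, which are the point of AFA, cannot be handled this way. The encoding (`members[a]` lists the nodes b that stand in the graph relation to a) and all names are hypothetical.

```python
def decorate(members):
    """Return d with d(a) = {d(b) : b is a member-node of a}, as in (11).

    Assumes a finite acyclic graph; on a cyclic graph the recursion
    would not terminate.
    """
    cache = {}

    def d(a):
        if a not in cache:
            cache[a] = frozenset(d(b) for b in members[a])
        return cache[a]

    return {a: d(a) for a in members}


def is_transitive(sets):
    """Every element of an element of `sets` is itself in `sets`."""
    return all(x in sets for s in sets for x in s)


# A graph that is not extensional: "e" and "e2" both have no members.
redundant = {"e": [], "e2": [], "s": ["e"], "t": ["e2"], "p": ["e", "s"]}
d = decorate(redundant)
image = set(d.values())

# An extensional, well-founded graph: its decoration is injective.
tidy = {"e": [], "s": ["e"], "p": ["e", "s"]}
d_tidy = decorate(tidy)
```

On the first graph the decoration identifies the two memberless nodes, and hence also "s" and "t", so it is not injective, while its image is still a transitive set. On the second graph the decoration is injective, in line with the step (α1 ⇒ β1).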
References
[1] Aczel, P.: Non-Well-Founded Sets, CSLI Publications, 1988.
[2] Boolos, G.: Iteration again, Philos. Topics 17 (1989), 15–21.
[3] Hale, B. and Wright, C.: The Reason’s Proper Study: Essays towards a Neo-Fregean Philosophy of Mathematics, Oxford University Press, 2001.
[4] Shapiro, S.: Foundations without Foundationalism: A Case for Second-Order Logic, Oxford Logic Guides 17, Oxford University Press, 1991.
[5] Shapiro, S.: Prolegomenon to any future neo-logicist set theory: Abstraction and indefinite extensibility, British J. Philos. Sci. 54 (2003), 59–91.
[6] Zermelo, E.: Über Grenzzahlen und Mengenbereiche: Neue Untersuchungen über die Grundlagen der Mengenlehre, Fund. Math. 16 (1930), 29–47.
ABSTRACTION & SET THEORY 1 Bob Hale
1.
Preliminaries
The Neo-Fregean programme in the philosophy of mathematics aims to provide a foundation for a substantial part of mathematics in abstraction principles which can be regarded as implicitly definitional of fundamental mathematical concepts. By abstraction principles we mean, roughly, 2 principles of the shape:
∀α∀β(Σ(α) = Σ(β) ↔ α ≈ β)
where ≈ is an equivalence relation on entities of the type of α, β, . . . , and Σ is a function from entities of that type to objects. Prominent examples are the
Direction Equivalence: The direction of line a = the direction of line b ↔ a and b are parallel
in terms of which much of Frege’s original discussion 3 of such principles is conducted;
Hume’s Principle: ∀F∀G[Nx:Fx = Nx:Gx ↔ ∃R(F 1-1_R G)]
which he considers, but eventually rejects, as a means of defining (cardinal) number; and, of course:
Basic Law V: ∀F∀G[{x:Fx} = {x:Gx} ↔ ∀x(Fx ↔ Gx)]
(the set or class form of Frege’s ill-fated axiom on value-ranges). As is well known, in the neo-Fregean view, Hume’s Principle may (Frege’s and other misgivings notwithstanding) be taken as a means of implicitly defining the concept of number, and can serve as a foundational principle for, at least,
1 This paper first appeared in the Notre Dame Journal of Formal Logic 41, [2000], pp. 379–398. Reprinted by kind permission of the editor and the University of Notre Dame.
2 ‘Roughly’ because it is desirable to count as abstractions some principles which don’t, as they stand, have precisely this form. In the only case that matters for present purposes, Σ is a function of two arguments, not one, and the RHS relation is 4- rather than 2-termed. One can easily deal with this, either by introducing ordered pairs by abstraction or by generalising the notion of an equivalence relation.
3 In Frege (1884), §§62–7.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 331–352. © 2007 Springer.
elementary arithmetic. In its philosophical aspect, this claim is, of course, controversial. But it is not my purpose to engage further in that controversy here. Instead, I would like to explore—rather tentatively, I should stress— the prospects for developing a version of set theory along abstractionist lines. Can we find an abstraction principle, or principles, which might serve as the foundation for an interesting theory of sets? An abstraction principle which can plausibly be seen as implicitly defining the concept of set will do so by fixing the identity conditions for its instances, and will—on pain of changing the subject—take the identity of sets to consist in their having the same members. If we assume—what does not seem seriously disputable—that any plausible candidate will be a higher-order abstraction (i.e. will involve abstraction over an equivalence relation among concepts, rather than objects), then it seems clear, further, that the equivalence relation involved will have to be either co-extensiveness of concepts or a close relative of it. In other words, what we are looking for is, broadly speaking, a consistency-preserving restriction of Basic Law V. I think we can also assume that a suitable restriction—if one can be found—will, in effect (and in contrast with the restriction Frege himself tried), be a restriction on what concepts (can) have sets corresponding to them. If we schematically represent the sought after restriction using a secondlevel predicate ‘Good’, then the most obvious ways to restrict BLV are: (A) ∀F∀G[Good (F) ∨ Good (G) → ({x|Fx} = {x|Gx} ↔ ∀x(Fx ↔ Gx))] and (B) ∀F∀G[{x|Fx} = {x|Gx} ↔ (Good (F) ∨ Good (G) → ∀x(Fx ↔ Gx))]. The main difference between these is that (A) is a conditionalised abstraction principle, whereas (B) is unconditional, with the restriction built into the relation required to hold between F and G for them to yield the same set—fairly obviously, the resulting relation is an equivalence relation. 
4 Consequently, (B) yields a ‘set’ for every F, regardless of whether it is Good or not, whereas (A) yields a set {x|Fx} only if we have the additional premiss that F is Good. If neither F nor G is Good, the right hand side of (B) holds vacuously, so we get that {x|Fx} = {x|Gx}—regardless of whether F and G are co-extensive. That is, we get the same ‘set’ from all Bad concepts. We get real sets via (B)— that is, objects whose identity is determined by their membership—only from Good concepts. 4 Reflexivity and Symmetry are obvious. For Transitivity, suppose (a) Good(F) ∨ Good(G) → ∀x(Fx ↔ Gx) and (b) Good(G) ∨ Good(H) → ∀x(Gx ↔ Hx). If the antecedents of both (a) and (b) are both false, then ¬(Good(F) ∨ Good(H)), whence (c) Good(F) ∨ Good(H) → ∀x(Fx ↔ Hx). Likewise, if the consequents of both (a) and (b) are true, then ∀x(Fx ↔ Hx), whence (c). If (a)’s antecedent is false but (b)’s consequent is true, then ¬Good(H), whence ¬(Good(F) ∨ Good(H)) and so (c) again. Similarly if (a)’s consequent is true but (b)’s antecedent is false. Essentially this is the proof given in Wright (1997), fn. 32.
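The claim that the relation on the right-hand side of (B) is an equivalence relation, and that all Bad concepts collapse into a single class, can be checked by brute force on a small domain. The following Python sketch is an illustration under our own assumptions: concepts are modelled extensionally as subsets of a three-element domain (so Good automatically respects co-extensiveness), Good is allowed to be any collection of such concepts, and the function names are hypothetical. It only counts equivalence classes; it does not attempt to model the abstraction operator itself.

```python
from itertools import combinations


def powerset(items):
    """All subsets of `items`, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def related_B(F, G, good):
    """Right-hand side of (B): if F or G is Good, then F and G are co-extensive."""
    return not (F in good or G in good) or F == G


def is_equivalence(concepts, good):
    refl = all(related_B(F, F, good) for F in concepts)
    symm = all(related_B(G, F, good)
               for F in concepts for G in concepts if related_B(F, G, good))
    trans = all(related_B(F, H, good)
                for F in concepts for G in concepts for H in concepts
                if related_B(F, G, good) and related_B(G, H, good))
    return refl and symm and trans


def count_classes(concepts, good):
    """Number of equivalence classes, i.e. of distinct abstracts (B) calls for."""
    reps = []
    for F in concepts:
        if not any(related_B(F, G, good) for G in reps):
            reps.append(F)
    return len(reps)


concepts = powerset(range(3))    # the 8 concepts on a 3-object domain
all_goods = powerset(concepts)   # every way of choosing which concepts are Good
always_equivalence = all(is_equivalence(concepts, good) for good in all_goods)
```

For every choice of Good the relation passes the three tests, and the number of classes is the number of Good concepts plus one further class shared by all the Bad concepts, whenever there are any.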
2.
Goodness as Smallness (1)—New V
What is it for a concept to be Good? Various suggestions have been canvassed. One general approach picks up on the well-entrenched ‘limitation of size’ idea, that the set-theoretic paradoxes stem from treating as sets ‘collections’ which are in some sense ‘too big’—the collection of all sets, of all sets that are not members of themselves, of all ordinals, etc. On one version of this (Goodness is Smallness) approach, we define a concept to be Small if it is smaller (i.e. has fewer instances) than some concept under which everything, or at least every object, falls. And the favoured universal concept has been that of self-identity. Following George Boolos, 5 we say that a concept G ‘goes into’ a concept F iff there is a one–one function taking the Gs into the Fs, and that F is Small iff the concept self-identical does not go into F. If we frame our restricted set-abstraction in the style (B), the result is what Boolos called New V (and Wright calls VE 6 ): New V : ∀F∀G[{x|Fx} = {x|Gx} ↔ (Small(F) ∨ Small(G) → ∀x(Fx ↔ Gx))] Although, as is now well-known, a certain amount of set-theory can be obtained by adding New V to second-order logic, there are some problems with it. The most serious of these is that we don’t get enough set theory. As Boolos showed, neither an axiom of infinity nor the power set axiom can be obtained as theorems on this basis, so the theory is rather weak—and certainly weaker than a neo-Fregean requires, if he is to have a set-theoretic foundation for analysis. 7 I shall return to this below. A further, quite different, difficulty relates to the constraints needed to differentiate between good or acceptable abstraction principles and bad or unacceptable ones. It is clear that some constraints are needed, since not all abstractions are acceptable—as is dramatically illustrated by Basic Law V. Obviously consistency is one requirement. 
But it does not seem that it can be the only one, since, as Boolos also showed, one can formulate abstraction principles which are severally consistent but mutually incompatible. For example, 8 we can take as our equivalence relation the relation which holds between concepts F and G just when their symmetric difference is finite (i.e. when there are just finitely many objects which are either F-but-not-G or G-but-not-F). Writing this briefly as Δ(F, G), we can frame the abstraction which Wright calls:
Nuisances: ∀F∀G[ν(F) = ν(G) ↔ Δ(F, G)]
5 Boolos (1987), p. 178.
6 Wright (1997), p. 300.
7 This last point may not, in itself, be as damaging as might be at first supposed, since there are known to be abstractionist methods of obtaining the real numbers which avoid any essential reliance upon an underlying set theory. But a neo-Fregean should be concerned to develop as powerful a set theory as can be done using the resources (centrally, but not necessarily only, abstraction principles) at his disposal, and so may hope to do better than New V.
8 The example is not Boolos’s own, but a very similar one taken from Wright (1997), pp. 289–91.
As Wright shows, this is a consistent abstraction, but is satisfiable only in domains containing finitely many objects. Hume’s Principle by contrast, though likewise provably consistent, is satisfiable only in domains containing at least a countable infinity of objects. Since they are thus mutually incompatible, Hume’s Principle and Nuisances cannot both be acceptable. The problem for the neo-Fregean is to justify rejecting the latter as unacceptable. To this end, Wright proposes a constraint (his first conservativeness constraint 9) which has obvious affinities to the notion of conservativeness deployed by Hartry Field in his defence of nominalism. 10 His plausible thought is, roughly, that a satisfactory explanation of a concept, whether by means of an abstraction principle or other form of definition, should do no more than fix the truth-conditions of statements involving that concept. It should have nothing to say about the truth-values of statements which already have determinate truth-conditions independently of the introduction of that concept, and in particular, it should carry no implications for the extensions of other concepts unconnected with the concept the explanation seeks to introduce. If we think of an abstraction principle as added to an existing theory, the requirement can be expressed, still somewhat roughly, as that the abstraction should carry no implications regarding the ‘old’ ontology, that is, the ontology of the given theory; it should be conservative with respect to that theory in the sense that its addition to the theory does not settle the truth-values of any statements expressible in the old language which are left unsettled by that theory.
11 Precisely because it is satisfiable only in finite universes, Nuisances does carry such implications—it implies, for example, that there are at most finitely many aardvarks, or subatomic particles, or space-time points, etc.—it fails this constraint and should therefore be rejected as unacceptable, even though consistent. Hume’s principle, by contrast, whilst implying that there are at least countably infinitely many objects, places no restrictions—either upper or lower bounds—upon the extensions of concepts other than the concept it is intended implicitly to define. That it places no upper bound is obvious. But it is crucial that it places no lower bound either, since the requirement that there be at least infinitely many objects is satisfied by the abstracts—numbers—which it itself serves to introduce, so that it makes no demand that any other concepts should have infinitely many instances. So far, so good. But how do things stand with regard to New V? Stewart Shapiro and Alan Weir, 12 exploiting a point made originally by Boolos, 13 have argued that New V violates Wright’s own first constraint. In essence, 9 ‘first’, because Wright introduces a second quite distinct conservativeness constraint, which will be discussed briefly later. 10 See Field (1980) and subsequent papers collected together in Field (1989). 11 For a fuller discussion and more precise articulation of the proposed constraint, see Wright (1997), pp. 295–7, especially fn. 49. 12 Shapiro & Weir (1999). 13 Boolos (1989), p. 102.
the argument is simple enough. If the concept ordinal is Small, New V yields a set (and not just a ‘set’) of all the ordinals and we have the Burali-Forti contradiction. Hence ordinal must be Big. But in that case it is exactly as big as the universe, i.e. there is a one–one correspondence between ordinal and a(ny) universal concept, say self-identity. But this, together with the fact that the ordinals are well-ordered by membership, entails Global Well-Ordering— the existence of a well-ordering of the universe. Given that the existence, or otherwise, of such a well-ordering may reasonably be taken to be independent of existing theory, New V must be reckoned non-conservative.
3.
Goodness as Smallness (2)—Small2 V
Since the difficulty for New V just explained crucially exploits the definition of Small as, in effect, smaller than the (or a) universal concept, it is possible that it could be avoided by a suitable re-definition of Smallness. As a first step in the direction I have in mind, one might, in preference to defining Smallness in terms of being smaller than some specified universal concept, say that a concept is Small if it is smaller than some concept or other (where F < G if there is a one–one function from F into G but not-(F ∼ G)) and take as our set-abstraction either New V with Small so defined, or perhaps a conditional ((A)-type) abstraction:
Small V: ∀F∀G[Small(F) ∨ Small(G) → ({x|Fx} = {x|Gx} ↔ ∀x(Fx ↔ Gx))]
However, it is obvious that, at least so long as 14 some universal concept, V, is in play and can serve as providing an upper bound, so to speak, on the potential sizes of concepts, this simple suggestion makes no advance over New V as originally understood. For then, since any concept can be no bigger than our universal concept V, a concept F will be smaller than some concept G only if smaller than V, and if smaller than V, will certainly be smaller than some concept or other; so ∃G F < G iff F < V. But now if ordinal is Small in this sense, and so Good, we shall have the Burali-Forti again, so that ordinal must be Bad, i.e. ordinal ∼ V, and we have Global Well-Ordering, just as before. Flawed as our simple proposal is, there is a refinement 15 of it which really does avoid the Global Well-Ordering problem. This is to interpret Goodness as Double Smallness, where a concept is doubly small if and only if it is (strictly) smaller than some concept which is itself (strictly) smaller than some concept, i.e.
Small2 (F) ↔ ∃G∃H (F < G < H) Interpreting Goodness as Smallness2 blocks the reasoning which shows that New V, as originally understood with Small as meaning ‘smaller than the 14 The proviso is not merely decorative—we shall later consider possible reasons to feel misgivings about it. 15 I am heavily indebted to Crispin Wright for this suggestion and much useful discussion of it.
universe’, implies Global Well-Ordering. We can still show, of course, that ordinal cannot be Good (that is, now, Small2) since if it were, we would have the Burali-Forti paradox, just as before. So we have to agree that ordinal is Bad. But that just means that it is not Small2, and from this we cannot infer that it is bijectible onto any universal concept (or indeed onto any concept). 16 As has already been indicated, Wright’s first conservativeness constraint is one of a pair. The second constraint he proposes 17 concerns abstraction principles which, as he puts it, embed a paradoxical component; centrally, abstractions of the type:
(D) ∀F∀G[Σ(F) = Σ(G) ↔ ((φ(F) ∧ φ(G)) ∨ ∀x(Fx ↔ Gx))]
of which New V is an instance. 18 In general, by exploiting the reasoning that leads from BLV to contradiction, we can prove, from any instance of (D), that ∃F φ(F). For example, from New V, one can prove, via the Russell contradiction, that there are Bad concepts (i.e. that not all concepts are Small). In particular, we can prove that self-identity is Big. But as Wright observes, that is a result which we can prove independently of New V, as a theorem of second-order logic. The second constraint proposes that this last condition should be met by any (D)-type abstraction or, perhaps more generally, any abstraction which embeds a paradoxical component. As Wright at one point expresses it: “any consequences which may be elicited [from the abstraction] by exploiting its paradoxical component should be, a priori, in independent good standing.” 19 The precise force of this constraint depends, obviously, on what is to be understood by consequences being in independent good standing. Being independently provable in logic alone would clearly suffice, but Wright does not wish to accept that as a necessary condition: “. . . ‘independent good standing’ might also reasonably be taken to cover the case where a consequence elicited from such an abstraction by ‘fishy’ – paradox-exploitative – means can also be obtained not from logic alone but, as it were,
16 Roy Cook has pointed out that whilst the Global Well-Ordering result is blocked, one still gets a significant result: that, since ordinal is not Small2, there cannot be any concept that is bigger than ordinal but smaller than the universe. Though a weaker result, this is, Cook remarks, independent of second-order ZFC. Perhaps so, but it is not a non-conservativeness result for a set theory based on New V with Small interpreted as Small2 (or Small2 V, i.e. Small V with Small so re-interpreted) in the sense in which Global Well-Ordering is a non-conservativeness result for original New V. Global Well-Ordering implies well-ordering for the ‘old’ ontology (i.e. the universe of objects as a whole, including those not included among the abstracts provided by New V) in violation of Wright’s first conservativeness constraint (see Wright (1997), p. 296ff). By contrast, if our set-theory is to be based on Small2 V, then the theory of ordinals will be naturally construed as a part of it, and the ordinals themselves will be a species of the new abstracts so introduced. That this species is at most singly small would seem to have no bearing on anything, cardinal or ordinal, essentially to do with the old ontology.
17 In Wright (1997), pp. 300–2.
18 On (D)-type abstractions, see Wright (1997), section IV. φ is any property of concepts for which co-extensiveness is a congruence, i.e. φ(F) and ∀x(Fx ↔ Gx) jointly entail φ(G). New V comes from the schema by reading φ as Big. New V as formulated here is not strictly of the form (D), but is obviously equivalent to an abstraction of that form.
19 Wright (1997), p. 303. A more precise formulation is given at p. 304.
innocently from additional resources provided by that very abstraction.” 20 Given this qualification, it is at least not clear that the derivability of Global Well-Ordering via the Burali-Forti constitutes a violation of conservativeness in Wright’s second sense, as distinct from his first. And for essentially the same reason, the fact that Small2 V, although not implying Global Well-Ordering, does imply that there can be no concept larger than ordinal but smaller than the universe 21 is not clearly in breach of the second constraint either. But, at least pending further clarification of the key notion of independent good standing (and especially of what it is for a result to be establishable only in a viciously paradox-exploitative way), it is anyway not clear what form, if any, of the second constraint should be respected. Certainly ‘paradox-exploitative’ had better not be understood so liberally as to render any proof by reductio ad absurdum as such. No acceptable constraint, and certainly none that Wright intended, should require that we may accept a result established by reductio ad absurdum only when we can independently prove it by constructive means. The other, and almost certainly the more serious, of the two problems New V faces, as we noted, is that it suffices for only a rather weak set theory. In particular, it doesn’t give us either an axiom of infinity or a power set axiom as theorems. And the same goes, of course, for New V re-interpreted with Good as Small2 and Small2 V. As Wright has observed, however, this need not be a crippling drawback from the neo-Fregean’s point of view, if he can justify supplementing New V, or Small2 V, with other principles (perhaps other abstraction principles) which compensate for its weakness. On this more catholic approach, we separate two distinct rôles one might ask a set-abstraction principle to discharge: fixing the concept of set, on the one hand and, on the other, serving as a comprehension principle.
The claim would be that New V’s (or Small2 V’s) shortcomings as a comprehension principle need not debar it from successfully discharging a concept-fixing rôle, that of serving as a means of introducing the concept, whilst leaving its extension to be determined, largely or even entirely, 22 by other principles. I want to discuss a couple of ways in which this might be done. I’ll concentrate on the possibility of supplementing Small2 V 23 with other abstraction principles; I don’t think it is obvious that the neo-Fregean could not justify using supplementary
20 Wright, loc. cit.
21 See note 16.
22 Largely, but not entirely, if one works with a (B)-type abstraction, such as New V with Good understood as Small2, but entirely if one works with a conditionalised, (A)-type abstraction, such as Small2 V. I shall make no attempt to adjudicate here whether there are compelling reasons to favour one approach over the other. Very roughly, the fragment of standard set theory which one can recover without appeal to non-set-theoretic abstraction principles, though existentially very weak, is larger if one works with a (B)-type principle such as New V rather than an (A)-type principle such as Small2 V. This might be thought a reason for preferring New V over Small2 V.
23 A good deal of what I shall be saying applies, with relatively minor adjustments, to a development based on New V with Good interpreted as Small2. It would unduly complicate the discussion to keep both alternatives in play throughout.
338
The Arché Papers on the Mathematics of Abstraction
principles other than abstractions, but I won’t pursue that alternative here. The general strategy, then, is to look for other abstraction principles which might be used to set up sortal concepts in the presence of which we get an interesting range of Small2 sortal concepts which have sets corresponding to them.
4. Cut principles
The first approach I want to consider makes essential use of a kind of abstraction principle which plays a key rôle in a neo-Fregean construction of the real numbers I developed a little while ago. 24 In outline, the leading idea was to get the real numbers in broadly the way that Frege proposed to do, by defining them as ratios of quantities. The concept of a ratio of quantities is itself to be introduced by means of an abstraction corresponding to the ancient equimultiples principle:
The ratio a:b = the ratio c:d iff for all positive integers m and n, ma is equal to, greater or less than nb according as mc is equal to, greater or less than nd.
Here a, b are quantities of some single kind, and likewise c, d. Crucially, c and d need not be of the same kind as a and b. 25 Quantities are themselves abstract objects (defined by abstraction over quantitative equivalence relations 26 ). To get all the positive reals (and, with a little extra work, all the reals), ratio abstraction has to be applied to a sufficiently rich abstract structure—what I called a complete quantitative domain. A kind of quantity Q constitutes a complete domain if and only if it is closed under a commutative and associative operation ⊕ such that exactly one of: a = b, ∃c(a = b ⊕ c), ∃c(b = a ⊕ c) holds for any ‘elements’ a, b of Q, and the following further conditions are met:
[Archimedean condition] ∀a, b ∈ Q ∃m(ma > b)
[Fourth proportionals] ∀a, b, c ∈ Q ∃d ∈ Q(a:b = d:c)
[Completeness] Every bounded non-empty property P on Q has a least upper bound
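As a hedged illustration, the ratio abstraction (equimultiples principle) stated above can be written out formally as follows. The symbol \mathbb{N}^{+} for the positive integers and the numerical instance are additions made here for illustration only; they are not notation taken from the text. The second display needs the amssymb package for \gtreqless.

```latex
% Ratio abstraction (equimultiples principle).
% a, b are quantities of one kind; c, d are quantities of one (possibly different) kind.
\[
a:b = c:d \;\leftrightarrow\; \forall m, n \in \mathbb{N}^{+}\,
\bigl[(ma = nb \leftrightarrow mc = nd) \wedge
      (ma > nb \leftrightarrow mc > nd) \wedge
      (ma < nb \leftrightarrow mc < nd)\bigr]
\]
% Illustrative instance (assumed, not from the text): a = 2, b = 4, c = 3, d = 6.
% Comparing 2m with 4n and comparing 3m with 6n both reduce to comparing m with 2n,
% so the right-hand side holds for every m, n and hence 2:4 = 3:6.
\[
2m \gtreqless 4n \iff m \gtreqless 2n \iff 3m \gtreqless 6n
\]
```

The instance uses numbers only for convenience; the principle itself allows a, b and c, d to come from different kinds of quantity, which is what lets the same ratio measure both.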
24 In Hale (2000).
25 Crucially, because we want the same real numbers to be applicable in measuring quantities of different kinds.
26 For some explanation of this, see Hale (2000), section II.
Given a complete domain of quantities, it is hardly surprising that one gets the (positive) reals by ratio abstraction over it. The obvious question is: can the neo-Fregean justify the assumption that there exists at least one such domain? If attention is restricted to domains of physical quantities (i.e. quantities ‘belonging’ to physical objects), then the answer is almost certainly: No. But nothing in the definition of quantity, or that of quantitative domains, precludes
recognising numbers themselves as a kind of quantity. In particular, the positive natural numbers—whose existence the neo-Fregean can justify by appeal to Hume’s principle—form a quantitative domain meeting all but the last two conditions. And the ratios of positive natural numbers, R^{N+}, form a domain meeting all but the last condition—completeness. To demonstrate, a priori, the existence of a complete domain, the neo-Fregean can mimic Dedekind’s construction. Define a property P of ratios of positive natural numbers to be a cut-property just in case P is non-empty, bounded above, downwards closed and has no greatest instance. Then we can abstract over the cut-properties on R^{N+}, using:
Cut abstraction: ∀F∀G[Cut(F) = Cut(G) ↔ ∀x(Fx ↔ Gx)]
where x varies over just R^{N+} and F, G just cut properties on R^{N+}
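The four defining clauses of a cut-property can be set out formally as in the sketch below. The predicate name CutProp and the example property are assumptions introduced here for illustration; the example treats ratios of positive natural numbers as positive rationals with their usual order and multiplication.

```latex
% P is a cut-property on the domain of ratios, ordered by <.
\[
\mathrm{CutProp}(P) \;\leftrightarrow\;
\underbrace{\exists x\,Px}_{\text{non-empty}} \;\wedge\;
\underbrace{\exists b\,\forall x\,(Px \rightarrow x \leq b)}_{\text{bounded above}} \;\wedge\;
\underbrace{\forall x\,\forall y\,(Px \wedge y < x \rightarrow Py)}_{\text{downwards closed}} \;\wedge\;
\underbrace{\forall x\,(Px \rightarrow \exists y\,(Py \wedge x < y))}_{\text{no greatest instance}}
\]
% Illustrative instance (assumed): the property of having square less than 2.
\[
P_{\sqrt{2}}(x) \;\leftrightarrow\; x \cdot x < 2
\]
```

On this reading the example property is non-empty (1 has it), bounded above (by 2), downwards closed, and has no greatest instance, yet no ratio is its least upper bound. Its Cut is therefore an abstract that corresponds to no ratio, which is the sense in which cut abstraction over a merely dense domain adds something new.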
The Cuts thus obtained can then be shown to form a complete domain.
For present purposes, it is Cut abstraction that is of primary interest to us. On the basis of Hume’s principle, we can define the property of being a natural number, and show that it has no end of instances. Appealing then to Cut abstraction, we can define the property of being a Cut on R^{N+} and show that the property of being a natural number is smaller than this property, and so is Small. If an abstractionist set theory could be based on Small V, this would suffice to give us an infinite set—the set of natural numbers. That is, Small V combined with Hume’s Principle and Cut abstraction would give us the effect of an axiom of infinity. However, if we are working with Small² V, we need to show that natural number is doubly small before we can obtain the corresponding set. Can we do so? Consideration of an objection that has been brought against my use of Cut abstraction 27 suggests a way in which we might. Cut abstraction may be viewed as an instance of a general schema:
Cut-schema: ∀F∀G[Cut(F) = Cut(G) ↔ ∀x(Fx ↔ Gx)]
where x varies over a suitable domain Q and F, G over cut properties on Q
What counts as an instance of this schema is, of course, unclear until we say what counts as a suitable domain. Although the definition of cut-property makes sense whatever domain we assume, there only exist cut-properties if the domain is suitably structured—it will need to be at least densely ordered. It is certainly implausible to suppose that the particular Cut principle I’ve used is the only acceptable instance of this schema.
27 By Roy Cook in Cook (forthcoming). I am grateful to him both for letting me see earlier versions of this paper, and for helpful discussion of it.
Cut principles, like Hume’s principle and unlike the Direction Equivalence, are second-order abstractions—they abstract over an equivalence relation on concepts, rather than objects. But in other respects, they differ significantly
from Hume’s principle. Let us assume that we are concerned with the application of abstraction principles to domains of definite cardinal size, finite or infinite. With Hume and cut principles, the underlying domain is a domain of concepts. There will be more of these than there are objects, however concepts are individuated. If there are κ objects, and concepts are individuated purely extensionally, there will be exactly 2^κ concepts. Thus a second-order abstraction can ‘generate’ up to 2^κ abstracts, when the initial domain of objects is κ-sized. Applied to any domain of concepts, BLV generates the maximum collection of abstracts—one for each concept. By contrast, Hume’s principle is quite modest. Because its equivalence relation partitions the concepts into equivalence classes by equinumerosity, and there are just κ + 1 such classes when there are κ objects over which the concepts are defined, Hume generates more abstracts than there are other objects when, but only when, κ is finite. When κ is infinite, Hume generates κ + 1 abstracts from the 2^κ concepts, and κ + 1 = κ. Hume inflates on finite domains, but not on infinite domains. This ensures that Hume has no finite models. But it is stable at infinite cardinalities.
Cut principles behave quite differently. As noted, if a cut principle is applied to a finite domain, it generates no abstracts at all, as there are no cut properties on the domain. If the underlying domain is infinite and at least densely ordered, what happens depends upon whether the domain is strictly densely ordered (like the rationals) or completely ordered (like the reals). Applied to a strictly dense domain, cut abstraction inflates, giving a completely ordered domain of abstracts. Applied to a completely ordered domain, however, it gives a domain of abstracts isomorphic to the underlying domain, and so does not inflate.
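The finite half of the counting claims above (2^κ abstracts for BLV, κ + 1 for Hume's principle) can be checked by brute force. The following is a minimal sketch, assuming concepts are individuated purely extensionally and so can be modelled as subsets of a finite domain; the function names are hypothetical and chosen here only for illustration.

```python
from itertools import combinations


def concepts(domain):
    """All extensionally individuated concepts on a finite domain, modelled as subsets."""
    items = list(domain)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def count_abstracts(domain, key):
    """Count equivalence classes of concepts, where two concepts are
    equivalent iff `key` assigns them the same value."""
    return len({key(c) for c in concepts(domain)})


def blv_abstracts(domain):
    """BLV: concepts are equivalent iff coextensive, so each concept is its own class."""
    return count_abstracts(domain, key=lambda c: c)


def hume_abstracts(domain):
    """Hume's principle: concepts are equivalent iff equinumerous;
    on a finite domain that means having the same number of instances."""
    return count_abstracts(domain, key=len)


if __name__ == "__main__":
    for n in range(5):
        print(n, blv_abstracts(range(n)), hume_abstracts(range(n)))
```

For a domain of n objects the two counts come out as 2^n and n + 1 respectively, both greater than n, which is the sense in which both principles inflate on finite domains. The infinite case, where κ + 1 = κ, is a matter of cardinal arithmetic and cannot be illustrated by enumeration.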
That is, what happens with the rationals and reals when cut-abstraction is applied and then re-applied is representative of what happens in general. So any one cut principle inflates on a strictly dense domain, but is not rampantly inflationary (in the sense that its iterated application leads to unlimited inflation). Various people 28 have observed that, on certain standard set-theoretic assumptions, any set of definite cardinal size can be put in a (strictly) dense linear order, so that whilst it is true—as I have claimed—that the re-application of a cut principle to the (complete and so not strictly dense) domain of abstracts generated by its application to a strictly dense domain does not inflate, there will be a strictly dense ordering of the domain of cut abstracts thus generated on which a cut principle does inflate. Thus if we start with a countable strictly dense ordering and apply a cut principle to get a 2^{ℵ0}-sized domain, C, of cuts, there is a strictly dense ordering of C, call it C*, to which another cut principle may be applied to get a new domain of cuts of size 2^{2^{ℵ0}}, and so on.
28 Including Stewart Shapiro, Roy Cook, and one (still) anonymous referee of the paper in which some of these ideas were first put forward.
In particular, Roy Cook has proved:
Cook’s Theorem: 29 For any infinite cardinal κ, there is a linear order (A, <) such that |A| ≤ κ and |Comp(A, <)| > κ.
Proof: Given an infinite cardinal κ, let λ be the least cardinal ≤ κ such that 2^λ > κ. Let A be the subset of functions from λ (as an ordinal) into {0, 1} such that f ∈ A iff there is an ordinal γ < λ such that for all ordinals α ≥ γ, f(α) = 0. For f, g ∈ A, let f < g iff, at the least γ where f(γ) ≠ g(γ), f(γ) = 0. Then |A| ≤ κ by the following computation:
|A| = |⋃_{γ<λ} ^γ2| ≤ Σ_{γ<λ} 2^{|γ|} ≤ Σ_{γ<λ} κ ≤ λ × κ = κ
But: |Comp(A, <)| = 2^λ > κ, since Comp(A, <) is isomorphic to the set of all functions from λ to {0, 1}.
29 This statement of the theorem and its proof are taken verbatim from Cook (2001), section (6). By ‘Comp(A, <)’ Cook means the set of Dedekind Cuts on (A, <).
Cook thinks his result is disastrous for the neo-Fregean logicist, because he thinks that the neo-Fregean should only endorse abstraction principles that are ‘epistemologically modest’, but that certain ‘natural generalisations’ of my cut-principle are clearly epistemologically extravagant. Specifically, he argues that the conjunction of Hume’s principle with a generalised cut-principle:
GCA ∀P∀Q∀{H, <}[Cut(P, {H, <}) = Cut(Q, {H, <}) ↔ ∀x((Hx ∧ P and Q are cut properties on {H, <}) → (Px ↔ Qx))]
has, at best, only proper class sized models, and that its conjunction with a generalised cut schema:
GCA-Schema All formulae of the form: ∀P∀Q[Cut(P, {H, <}) = Cut(Q, {H, <}) ↔ ∀x(Hx ∧ P and Q are cut properties on {H, <} → (Px ↔ Qx))]
may have set-sized models, but if so, can have only models of cardinality infinitely many times up from that of the continuum.
I don’t have space to discuss this objection in the detail it deserves, so I shall be brief and somewhat dogmatic. I am unmoved by it for two main reasons. First, Cook seems to me to give no compelling reason why a neo-Fregean abstractionist must endorse either his generalised cut-principle or even all instances of his generalised cut-schema. GCA is not itself an abstraction principle, and it is not clear why an abstractionist should be committed to it. No doubt there are many instances of the generalisation to which the abstractionist
should have no objection, but that does not amount to a reason for thinking that he must assert the generalisation itself. After all, there are doubtless many instances of the general abstraction schema:
∀α∀β[Σ(α) = Σ(β) ↔ α ≈ β]
with which the abstractionist should have no quarrel, but he can hardly be expected to endorse its generalisation:
∀≈∀Σ∀α∀β[Σ(α) = Σ(β) ↔ α ≈ β]
which implies, inter alia, BLV, and is therefore outright inconsistent! I would of course agree that an abstractionist shouldn’t reject any instance of GCA-Schema without good reason, but that is not the same thing as being committed to all instances.
Second, Cook’s understanding of epistemological modesty seems to me flawed, and indeed, simply question-begging. Cook takes it that an abstraction will be immodest if it ‘generates too many objects’. If by ‘too many’ he meant ‘too many to avoid inconsistency’, there could be no disagreeing with him. But he doesn’t—if he did, he would have an objection only if he’d shown that the generalisations, coupled with HP, lead to contradiction. In fact, it’s not clear that he means anything more precise than ‘rather a lot, by set-theoretician’s standards’. The short answer to that is a question: Why should that be objectionable? Wouldn’t it actually be a rather good result, from the neo-Fregean’s perspective, if it turned out that his principles are mathematically quite powerful?
Indeed—to return to our main business—it might seem that the neo-Fregean can turn Cook’s result to his own advantage. We saw how, by appealing to the fact that the concept Natural number is smaller than the concept Cut on R^{N+}, he can show that the former concept is at least (singly) Small. But if something sufficiently close to Cook’s result were at his disposal, why should he not apply it to the cuts on R^{N+}, together with a suitable linear ordering, to obtain a further cut concept strictly larger than Cut on R^{N+}?
He would then be in position to apply Small² V to obtain the countably infinite set of natural numbers. And if it can be done once, why shouldn’t it be done again, and again, to obtain larger and larger uncountably infinite sets?
Before he succumbs to euphoria at the prospect of a quite powerful Small² set theory, however, the neo-Fregean should remind himself that he does not get Cook’s result for free. We have already noted that its proof relies upon Choice. It is not obvious that Choice must be out of bounds for the neo-Fregean—that he could not argue for it as a logical principle, or secure its effect by means of a suitable abstraction. But it is equally not obvious that he could do so. I have not yet been able to get a clear view on the matter, and so must leave this question for further investigation. But there is, in any case, another—glaringly obvious and seemingly more troublesome—fly in the ointment. Cook’s proof begins: ‘Given an infinite cardinal κ, let λ be the least cardinal ≤ κ such that 2^λ > κ . . . ’. But how do we know that there is such a
λ? What, in other words, justifies the assumption that there are cardinals > κ? I can see no way of justifying it without appealing to the Power Set Axiom and Cantor’s Theorem, or something at least as problematic, from the neo-Fregean’s point of view. If that is right, then it would seem that Cook’s result cannot after all be the blessing in disguise that it may at first appear to be.
The difficulty is not decisive. It might be suggested 30 that a neo-Fregean who looks askance at Cook’s proof because it is a proof in set theory is being unduly fastidious. Consider Boolos’s proof of the equiconsistency of Frege Arithmetic (HP + second-order logic) with second-order arithmetic—this is likewise a proof in set theory, but that need not mean that the assurance it provides is unavailable to neo-Fregeans. We need a distinction between the ‘internal perspective’—which is concerned with what results can be obtained using only resources available to the neo-Fregean—and the ‘external perspective’—which is concerned with what results can be obtained, perhaps by making indispensable use of other resources, about the neo-Fregean enterprise. Why shouldn’t the neo-Fregean welcome Cook’s proof as it stands, as demonstrating ‘from the outside’ that a neo-Fregean set theory based on Small² V + HP + (a suitably restricted) Cut schema is agreeably powerful?
This suggestion raises delicate issues. Their resolution depends, in part at least, upon what principled attitude the neo-Fregean can take towards reasoning that makes essential use of principles which cannot be justified on a neo-Fregean basis, and to what extent he can justify reliance on such reasoning. I don’t think we know how much mathematics—and in particular how much set theory—is amenable to neo-Fregean reconstruction. My guess—and I imagine just about everyone’s—is that there may be quite severe limits on what can be so reconstructed.
That raises the question: what should the neo-Fregean say about the parts that neo-Fregeanism doesn’t reach? I think he is bound to regard them as having a significantly different epistemological and ontological status from the reachable parts. But that need not mean that he must dismiss them as worthless. Perhaps he can find an indirect justification for relying on them. If so, then there may be a way to uphold the present suggestion. But even if there is, the difficulty is serious enough to warrant exploration of an alternative strategy.
30 I am grateful to Michael Potter for this suggestion.
5. Power concepts
Since the doubt whether the neo-Fregean can exploit Cook’s result turns on the need to appeal to the Power Set Axiom, one might wonder whether one can secure some of the effect of Cantor’s Theorem in a higher-order logic without using the Power Set Axiom—enough to secure a significant range of concepts as Small², and so as having sets corresponding to them. For any first-level concept F, we can form a second-level concept F^P—the concept: subconcept of F—defined by: F^P(G) ↔ ∀x(Gx → Fx). Define
F ≤ G iff F ∼ H for some subconcept H of G, and define F < G iff F ≤ G but ¬(F ∼ G). Then we can prove, by an obvious adaptation of the usual proof of Cantor’s Theorem, that: ∀F(F < F^P).
Proof: (i) For each x falling under F, there is a unitary subconcept of F—the concept: = x—under which x alone falls. Denote the (second-level) concept under which all these unitary subconcepts fall by F^Unit. Obviously ∀G(F^Unit(G) → F^P(G)). Define the relation S by:
S(x, G) ↔ Fx ∧ ∀y(Gy ↔ y = x)
Then x bears S to G iff G is that unitary subconcept of F under which x alone falls, and, since S is obviously one–one, we have F ∼ F^Unit under S. So F ≤ F^P.
(ii) Suppose F ∼ F^P under some one–one R. Define a subconcept D of F by:
Dx ↔ Fx ∧ ∀G(R(x, G) → ¬Gx)
By the assumption that F ∼ F^P under R, we have R(x, D) for some x falling under F. Suppose R(d, D). Suppose Dd. Then by definition of D, we have:
Fd ∧ ∀G(R(d, G) → ¬Gd)
whence R(d, D) → ¬Dd, whence ¬Dd. Suppose, then, that ¬Dd. Since R(d, D) and R is one–one, R(d, G) iff G is D (i.e. R(d, G) ↔ ∀x(Gx ↔ Dx)), it follows that ∀G(R(d, G) → ¬Gd). Hence, again by the definition of D, Dd. So Dd ↔ ¬Dd. Contradiction! Hence ¬(F ∼ F^P).
This is a restricted form of Cantor’s Theorem, asserting that any first-level concept is strictly smaller than its (second-level) power-concept. To state and prove it, we need third-order logic. If we ascend to fourth-order logic, we can prove that any second-level concept is strictly smaller than its (third-level) power-concept. And, presumably, and so on . . . The prospect opens up of obtaining each finite restriction of Cantor’s Theorem—i.e. each instance of the schema: ∀φ(φ < φ^P) for φ of level n and φ^P of level n + 1 in a logic of order ω.
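For readers who want the diagonal step (ii) of the proof above in one place, here is a hedged line-by-line layout. It follows the argument as stated, with d as the object correlated with D; the arrangement into numbered lines and the justifications in the right-hand column are additions made here for illustration. It needs the amsmath package.

```latex
\begin{align*}
&\text{1. } R \text{ is one--one and correlates the instances of } F \text{ with the subconcepts of } F
  && \text{(assumption for reductio)}\\
&\text{2. } Dx \leftrightarrow Fx \wedge \forall G\,(R(x,G) \rightarrow \neg Gx)
  && \text{(definition of } D\text{)}\\
&\text{3. } Fd \wedge R(d,D)
  && \text{(from 1, since } D \text{ is a subconcept of } F\text{)}\\
&\text{4. } Dd \rightarrow \forall G\,(R(d,G) \rightarrow \neg Gd)
  && \text{(from 2)}\\
&\text{5. } Dd \rightarrow \neg Dd
  && \text{(from 3, 4, taking } G \text{ to be } D\text{)}\\
&\text{6. } \neg Dd \rightarrow \forall G\,(R(d,G) \rightarrow \neg Gd)
  && \text{(from 1, 3: } R(d,G) \text{ only if } G \text{ is coextensive with } D\text{)}\\
&\text{7. } \neg Dd \rightarrow Dd
  && \text{(from 2, 3, 6)}\\
&\text{8. } Dd \leftrightarrow \neg Dd
  && \text{(from 5, 7; contradiction, so } \neg(F \sim F^{P})\text{)}
\end{align*}
```

Line 6 is where the one–one character of R does its work: without it, d might also be correlated with some concept under which it falls, and the inference to Dd would fail.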
Of course, even going up this far doesn’t give us anything approaching the full strength of the Power Set Axiom, but it does suggest a method of establishing the Smallness² of a significant series of larger and larger concepts, by noting that they are doubly smaller than the power concepts of their own power concepts.
Before we turn to what difficulties may stand in the way of this approach, it is worth noticing that the recourse to logic of order ω may be avoidable—we may not need to go above fifth-order logic. Let F be a first-level concept
for which we can show, as above, that F < F^P < F^{PP}. Then F is Small². Since any subconcept of a Small² concept is Small², any subconcept G of F is Small², and so has a set corresponding to it by Small² V. Define F^{P*} to be the first-level property which an object y has iff y = {x|Gx} for some subconcept G of F. Let R be the relation which holds between G and y iff G ⊆ F and y = {x|Gx}. Then obviously F^P ∼ F^{P*} under R. Since we can prove F^P < F^{PP} < (F^{PP})^P in fifth-order logic, F^P is Small², so that F^{P*} is also Small². So by Small² V, we have {x|F^{P*}(x)}—the power set corresponding to F^P, i.e. the set of all subsets of {x|Fx}.
Since F^{P*} is first-level, we have in third-order logic that it is smaller than its power concept F^{P*P}, which in turn can be shown (in fourth-order logic) to be smaller than its power concept. There is a first-level concept, F^{P**}, corresponding to F^{P*P} as F^{P*} does to F^P. So we can repeat the foregoing reasoning to get the power set {x|F^{P**}(x)} corresponding to F^{P*P}, i.e. the set of all subsets of {x|F^{P*}(x)}. And generally, for any set of objects, X, we have each of the ascending sequence of powersets—℘(X), ℘(℘(X)), ℘(℘(℘(X))), . . . Since Nat is smaller than Nat^P, which is in turn smaller than Nat^{PP}, Nat is Small², so we have N = {x|Nat(x)}, ℘(N), ℘(℘(N)), . . . So, taking Nat as our starting point, we can obtain first-level concepts and corresponding sets of increasing transfinite cardinality ℵ0, 2^{ℵ0}, 2^{2^{ℵ0}}, . . . Perhaps, then, we may be able to get a small but non-negligible theory of sets, by supplementing fifth-order logic with Small² V.
Perhaps . . . , but there is, once again, a more or less obvious fly in the ointment. For on the face of it, our special cases of Cantor’s Theorem in higher-order logic are, in one crucial respect, perfectly general.
When we proved, in third-order logic, that (the first-level concept) F is strictly smaller than its (second-level) power concept F^P, F could be any first-level concept. But with no restriction on our choice of F, we can let it be, say, self-identical, and following our route, show that that concept is Small², since twice smaller than the power concept of its power concept. So applying Small² V, we have a universal set of all self-identicals. But we shall also, by the same route, be able to show that self-identical^P is Small², whence we shall also have the powerset of that set, and so Cantor’s paradox. And similar moves with ordinal will get us the Burali-Forti. Clearly, then, some further restriction is needed, if anything like the last proposal we’ve been reviewing is to have any chance of getting anywhere useful. In my closing section I want to indicate two rather different ways in which one might try to frame and motivate a suitable restriction.
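To see concretely why a restriction is needed, here is one hedged way of spelling out how the unrestricted argument collapses. Writing U for the concept self-identical is an abbreviation introduced here, and the appeal to a Schröder–Bernstein-style step is an assumption of this sketch rather than something the text supplies. It needs the amsmath package.

```latex
\begin{align*}
&\text{1. } U < U^{P} < U^{PP}
  && \text{(restricted Cantor's Theorem, with no restriction on } F\text{)}\\
&\text{2. } U \text{ is Small}^{2}\text{, so each subconcept } G \text{ of } U \text{ has a set } \{x \mid Gx\}
  && \text{(Small}^{2}\text{ V)}\\
&\text{3. } U^{P} \sim U^{P*}
  && \text{(correlate each } G \text{ with } \{x \mid Gx\}\text{)}\\
&\text{4. } U^{P*} \leq U
  && \text{(every object, sets included, is self-identical)}\\
&\text{5. } U^{P} \leq U
  && \text{(from 3, 4)}\\
&\text{6. } U \sim U^{P}
  && \text{(from 1, 5, by the assumed Schr\"oder--Bernstein step)}\\
&\text{7. } \neg(U \sim U^{P})
  && \text{(from 1, by definition of } <\text{)}
\end{align*}
```

Lines 6 and 7 contradict one another. The same pattern with ordinal in place of self-identical is what the text has in mind for the Burali-Forti, though the details there differ and are not reconstructed here.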
6. Definiteness and restricted cardinality relations
The first suggestion I shall discuss has its origin in a third—and, if well-taken, fundamental—misgiving one might feel about New V as originally
understood. Initially, this focuses on the suitability of ‘smaller than self-identity’ as an explication of Goodness. On the face of it, it makes good sense to think of one concept F as having as many instances as, or fewer instances than, another concept G only if F and G are both sortal concepts—that is, roughly, concepts with which are associated both criteria of application and criteria of identity. Thus on the widely accepted assumption that brown is a merely adjectival, non-sortal, concept, it makes no sense to speak of the number of brown objects, or of there being as many brown objects as there are Fs, for any bona fide sortal F. The worry about ‘smaller than self-identity’ stems, initially, from a doubt on this score.
To get it into focus, it will be helpful to digress briefly to re-consider an objection Boolos made 31 to Hume’s principle, turning upon the existence of the universal number—anti-zero—the number of all the objects there are, defined as Nx: x = x. Boolos claimed that since neo-Fregeans are happy to define zero as Nx: x ≠ x, they can hardly refuse to admit the existence of anti-zero, defined as proposed. But that, he argued, is disastrous, since it puts the neo-Fregean reconstruction of arithmetic in direct conflict with ZF plus standard definitions, from which it follows that there can be no such number. A crucial part of Wright’s reply 32 to this objection was that, contrary to what Boolos claimed, the neo-Fregean has very good reason to deny that there is such a number as anti-zero. For the question: How many Fs are there? to be in good order (and so for ‘the number of Fs’ to have determinate reference), F has to be a sortal concept. But self-identical is, Wright argued, no sortal. It seems undeniable that if F is any sortal concept, then so will be its restriction by any other concept G, irrespective of whether G is sortal or merely adjectival.
For example, given that horse is sortal, brown horse must likewise be sortal, even though brown (or brown-thing) is itself no sortal. But now if self-identical were sortal, brown self-identical would likewise have to be so. But since every object is necessarily self-identical, brown self-identical is equivalent to brown simpliciter—necessarily an object is brown and self-identical just in case it is brown. Since brown is not a sortal, neither can brown self-identical be one. Nor therefore, can self-identical be one. If this is right, then the seemingly good question: How many self-identicals are there? has no determinate answer, and ‘Nx: x = x’ has no determinate reference. There is no universal number.
There is also space, I think, for a further doubt, about whether the contexts ‘There are just as many Fs as Gs’ and ‘There are fewer Fs than Gs’ are well-defined, or have determinate truth-conditions, when one or both of F and G is non-sortal, and hence whether self-identical can be a suitable filler for G in those contexts. If not, then there is a further reason to deny that the proposed explication of Goodness as smaller than self-identity is satisfactory.
31 Boolos (1997), pp. 313–4.
32 Wright (1998).
I think an objector might concede that a concept F must be sortal for the how many question and talk of the number of Fs to be in good order, and agree that self-identical is therefore, as it stands, unsuitable, but argue that we can get around this and re-instate anti-zero, by defining it slightly differently. First note that if F is sortal, then so is self-identical F (i.e. the concept for which the predicate ‘x is the same F as x’—briefly ‘x =_F x’—stands). Of course, one can’t get around Wright’s objection to anti-zero, or the related difficulty I’ve raised, just by picking some particular sortal concept F and using self-identical F in place of self-identical. More precisely, self-identical F will—though sortal—fail to apply to every object unless F itself does so; but if F itself is a universal sortal, then the detour through self-identity is a waste of time, since anti-zero could then be just defined as Nx: Fx, and we could simply explain Good as smaller than F. We may, however, form the complex predicates ‘For all F, x =_F y’ and ‘For some F, x =_F y’. And from these in turn we may form ‘For all F, x =_F x’ and ‘For some F, x =_F x’. Presumably the first of these last two is true of no object whatever, and it would seem that every object whatever must satisfy the second. And this—or so it might be supposed—gives us a way out of both difficulties: just define anti-zero as Nx: ∃F x =_F x, and ‘smaller than the universe’ as ‘smaller than the concept ∃F x =_F x’.
Of course, this way out is good only if the concept ∃F x =_F x is itself a genuine sortal concept. The mere fact that ‘For some F, x =_F x’ is true of every object is certainly not enough to make it a sortal predicate—any more than the fact that ‘x has mass’ is true of every physical object is enough to make it a sortal predicate of physical objects. For being, for some F, the same F as itself to be a genuine sortal, there needs to be a criterion of identity for the objects falling under it.
But so, it seems, there is. Let us abbreviate our predicate ‘For some F, x =_F x’ by ‘Vx’. Suppose b and c both satisfy ‘Vx’. What condition is both necessary and sufficient for b and c to be one and the same V? Well, the obvious answer is that b and c are one just in case for some single F, b =_F c. ‘V’ has thus both a criterion of application—Vx iff for some F, x =_F x—and a criterion of identity—x =_V y iff for some F, x =_F y. It thus appears that V is a genuine sortal concept.
Does that show that Boolos was after all right, and Wright wrong? I don’t myself think so. A concept F’s being sortal is a necessary condition for the how many question to be in good order and for the corresponding term ‘Nx: Fx’ to have determinate reference. But I think it is arguably not sufficient. Indeed, it is fairly obviously insufficient, if there are—as there certainly seem to be—concepts which are sortal but indefinitely extensible in Dummett’s sense (however one thinks that difficult notion is best to be explicated). The concepts of ordinal number, cardinal number and set all seem to be in this case. And, since the ordinals, cardinals and sets are among the objects that there are, it is plausible that any universal sortal concept must likewise
be indefinitely extensible. 33 But in any case, there is a particular reason to doubt that ‘Nx: Vx’ can have a determinate reference. For—given that our proposed definition of the universal concept V involves quantification over (sortal) concepts—it could do so only if it were already determinate what sortal concepts there are. It can scarcely be that there is a determinate answer to the question: How many objects are there?—where this is construed as: For how many x do we have ∃F x =_F x?—unless there is a determinate answer to the question: What sortal concepts are there? It is at least not obvious that there can be a determinate answer to that question. Someone might protest: “There is no difficulty over that. For any given domain of objects, the corresponding domain of concepts is fixed. For each and every way of dividing the domain of objects, there is a concept, and those are all the concepts. If the domain of objects comprises k objects, there are thus 2^k concepts.” But there is an obvious difficulty with this answer. The use of the phrase “For any given domain of objects” gives the game away. Whether there is or is not a determinate domain of objects (i.e. all objects whatever) is precisely our problem—clearly if there is, there is nothing amiss in the assumption that it has a definite cardinality, even if we are unable to determine what that cardinality is. Thus to assume a domain of objects ‘given’ is simply to fail to engage with the problem, or to assume it somehow solved. We cannot both assume a given domain of objects as a means of fixing the range of the quantifier ‘For some F’, and at the same time use that quantifier to define the sortal concept ‘Vx’ (i.e. ‘x is an object’). If a domain of objects is already somehow fixed as comprising k objects, then it is, of course, quite right that there are 2^k concepts on the domain—at least provided that concepts are individuated extensionally.
However, whilst there is no general objection to treating concepts extensionally—as, in effect, determined simply by what objects fall under them—it is questionable whether they are appropriately so treated in the present context, for at least two, and perhaps three, reasons. The first, more general, reason is that it makes sense to think of concepts extensionally only if it is already determinate what objects belong to the domain on which they are to be thought of as defined. That condition may well be met in a particular case—it will be met if, for example, we are considering the domain comprising exactly the natural numbers. But it clearly cannot be assumed met in the present case. Secondly, and more specifically, the whole point of insisting that an identity-statement x = y has to be understood as asserting that x and y are one and the same F, for some appropriate sortal F, is lost, if the covering sortal F is thought of as determined purely extensionally. The point is—at least in part—that objects cannot be individuated save as instances of some 33 This will be so, if we can assume that if F is indefinitely extensible and ∀x(Fx → Gx), then G is likewise indefinitely extensible.
sortal concept or other, so that unless some appropriate sortal is specified or understood from the context, it is simply not determinate what is being asserted, when it is said that x = y. If objects could be individuated simply as objects, there would be no justification for insisting that ‘x = y’ must be understood as elliptical for ‘x =_F y’ for some specific sortal F, such as horse, person, number or the like; any identity-statement x = y could be understood as claiming simply that x is the same object as y. The third reason, which should, I think, weigh with the neo-Fregean, but may not be felt compelling by others, is that taking the sortal concepts to comprise just the extensionally individuated concepts on some supposed fixed domain of objects seems, in effect, simply to beg the question against the idea that abstraction principles give a way of introducing ‘new’ sortal concepts, with a ‘new’ range of objects falling under them.

If what I’ve said is right, the universal concept self-identical under F, for some sortal F exhibits something akin to the property of indefinite extensibility. I’m not sure that it is indefinitely extensible in the usual sense, which requires, for a concept G to be indefinitely extensible, that given any definite collection of Gs, there is an object satisfying the intuitive requirements for being G which cannot be one of that collection. But even if the universal concept isn’t strictly indefinitely extensible, it seems clear that it has a similar kind of indeterminacy. Leaving open the question whether this coincides with indefinite extensibility, I shall say that it is sortally indeterminate, and for brevity say that a concept is indefinite if it is either indefinitely extensible or sortally indeterminate. Like Wright, I think that no determinate number can be associated with any indefinitely extensible concept.
And the same goes, in my view, for sortally indeterminate concepts like the universal concept, even if they are not indefinitely extensible in the usual sense (however exactly that is to be explained). Even if it is right that no determinate number can be assigned to any indefinite concept, it does not straightforwardly follow from this that where F is an indefinite concept, there cannot be functions from F into other concepts. It is of course true that if a concept F is indefinitely extensible, there can be no functions from F (to other concepts, definite or not) which are not themselves indefinitely extensible. But that is only to be expected and constitutes no clear objection to the idea that there may be functions from an indefinitely extensible concept to others—anyone who accepts that there are indefinitely extensible concepts will have no principled reason to deny that there are indefinitely extensible relations, including indefinitely extensible functions. If sortal indeterminacy coincides with indefinite extensibility, then the point applies to indefinite concepts quite generally. But it is not clear either that sortal indeterminacy is just indefinite extensibility under another name, or that, if it is not, there can anyway be no difficulty in principle with the idea of functions from a sortally indeterminate concept. In our only putative example of sortal indeterminacy—self-identity under F for some sortal F—the source
of indeterminacy lies in the indeterminacy of the range of the second-order quantifier, and is to that extent a higher-order matter, in contrast with indefinite extensibility, which consists in the fact that no definite first-level concept can have all instances of an indefinitely extensible concept in its extension. Perhaps it could be shown that this makes no essential difference, so that we may have sortally indeterminate functions just as we can have indefinitely extensible ones.

I shall not here try to determine whether the concerns aired in the last few paragraphs are, in the end, well-founded. Anyone in sympathy with them ought, it seems, to view the free-wheeling, unrestricted talk of double smallness involved in our original formulation of Small2 V with some suspicion, at least unless she can view the unqualified statement that F < G < H as a merely heuristically useful way of expressing the idea that F is a definite (i.e. not indefinite) concept. But even one not moved by those concerns ought to be able to discern the shape of a possible restriction on the application of Small2 V, to the effect that we may take the fact that for some concepts G and H, F < G < H as entitling us to conclude that there is a set of Fs only when G, and hence F, is a definite concept. If such a restriction can be imposed, it will straightforwardly block the paradoxes (both Cantor’s and the Burali-Forti) that threaten the proposal sketched in the preceding section.

However, if the paradoxes are to be blocked by imposing a restriction to definite concepts in the application of the set-theoretic abstraction, it is no longer clear that the shift to interpreting Good as Small2 is doing useful work. The original point of that shift was to block the derivation of Global Well-Ordering which convicted New V of a violation of Wright’s first conservativeness constraint.
But restricting Good concepts to definite ones would seem by itself enough to achieve that result, since the derivation of the Burali-Forti from the assumption that ordinal is definite would then force only the conclusion that ordinal is indefinite, which does not yield Global Well-Ordering.

The second of the two suggested restrictions I want to mention, by contrast, leaves Small2 V playing a significant role in the enterprise, and can be stated rather more briefly. This exploits two thoughts. The first is that the notion of what it is for one concept to be smaller than another, involved in the definition of Small2, need not be taken as fixed in advance and independently of the neo-Fregean enterprise. The neo-Fregean is, so far as I can see, perfectly free to stipulate a meaning for it that suits his purposes. The second is that, on the neo-Fregean approach to set theory which I have been exploring in the last few sections, there is no aspiration to develop that theory as a freestanding theory, based exclusively on distinctively set-theoretic abstraction principles. On the contrary, we are already embracing the idea that much of the ontology of the theory, and hence much of its power, is to be provided by other abstraction principles, such as Hume’s Principle and Cut principles, which do not specifically concern sets at all, but objects of other kinds. Crucially, these other abstractions are, when acceptable, to be conceived of as in good
standing independently of the development of any abstractionist set theory. Their acceptability is to be thought of, rather, as a matter of their compliance with whatever constraints (some of which we have touched on above) govern legitimate abstraction in general. 34 In the context of these two thoughts, a natural proposal is that the neo-Fregean may go a step further, and take the sortal concepts introduced via independently acceptable abstraction principles as his basis for the identification of a privileged class of concepts which may serve to anchor, as it were, a restricted < relation for the purposes of Small2 V. In a little more detail, the idea would be that F < G < H holds, in the relevantly restricted sense, only when G and H are concepts independently in good standing courtesy of other acceptable abstraction principles, or power concepts of such concepts, or power concepts of power concepts of such concepts, and so on.

Obviously both of the suggestions canvassed here are merely directions for further investigation, without which one can have little confidence that either of them will withstand closer scrutiny or result in a satisfactory and agreeably powerful abstractionist theory of sets. And there may, of course, be other possible ways to impose the restriction(s) on Small2 V which we have seen to be needed. I must leave that work for another occasion. I hope, at least, that the present discussion will have served to identify some of the difficulties facing an abstractionist development of set theory, and perhaps some strategies for dealing with them worth further thought. 35
34 For further discussion of which, see, as well as Wright (1997), Hale & Wright (2000).
35 I am especially grateful to Adam Rieger, Stewart Shapiro, and Crispin Wright for critical and constructive discussion of the ideas in this paper.

References
Boolos, George (1987) “Saving Frege from Contradiction” Proceedings of the Aristotelian Society 87, pp. 137–51, reprinted in Boolos (1998), pp. 171–82.
Boolos, George (1989) “Iteration Again” Philosophical Topics 17, pp. 5–21, reprinted in Boolos (1998), pp. 88–104.
Boolos, George (1997) “Is Hume’s Principle Analytic?”, in Heck (1997), pp. 245–62, reprinted in Boolos (1998), pp. 301–15.
Boolos, George (1998) Logic, Logic, and Logic Cambridge, MA: Harvard University Press.
Cook, Roy (2001) “The State of the Economy: Neo-logicism and Inflation”, in Philosophia Mathematica (3) 10, pp. 43–66.
Field, Hartry (1980) Science without Numbers Oxford: Blackwell.
Field, Hartry (1989) Realism, Mathematics & Modality Oxford: Blackwell.
Frege, Gottlob (1884) Die Grundlagen der Arithmetik Breslau: W. Koebner; reprinted with English translation by J. L. Austin as The Foundations of Arithmetic Oxford: Blackwell 1950.
Hale, Bob (2000) “Reals by Abstraction”, Philosophia Mathematica (3) 8, pp. 100–23, reprinted in Hale & Wright (2001), pp. 399–420.
Hale, Bob & Wright, Crispin (2000) “Implicit Definition and the A Priori” in Paul Boghossian & Christopher Peacocke, eds. New Essays on the A Priori Oxford: Clarendon Press, 2000, reprinted in Hale & Wright (2001), pp. 117–50.
Hale, Bob & Wright, Crispin (2001) The Reason’s Proper Study: Essays towards a Neo-Fregean Philosophy of Mathematics Oxford: Clarendon Press.
Heck, Richard, Jr. (1997) Language, Thought and Logic Oxford: Oxford University Press.
Shapiro, Stewart (1999) “New V, ZF and Abstraction”, Philosophia Mathematica 7(3), pp. 293–321.
Wright, Crispin (1997) “The Philosophical Significance of Frege’s Theorem” in Heck (1997), pp. 201–45, reprinted in Hale & Wright (2001), pp. 272–306.
Wright, Crispin (1998) “Is Hume’s Principle Analytic?” Notre Dame Journal of Formal Logic 40(1), reprinted in Hale & Wright (2001), pp. 307–32.
PROLEGOMENON TO ANY FUTURE NEO-LOGICIST SET THEORY: ABSTRACTION AND INDEFINITE EXTENSIBILITY 1
Stewart Shapiro
1. Background. What and why?
This paper is a contribution to an ongoing program in the philosophy of mathematics that began with Crispin Wright [1983], was bolstered by Bob Hale [1987], and now continues through many extensions, objections, and replies to objections. The neo-logicist plan is to develop branches of established mathematics using abstraction principles in the form:

∀a∀b(Σ(a) = Σ(b) ≡ E(a, b)),

where a and b are variables of a given type, typically ranging over either individual objects or properties; Σ is a higher-order operator, denoting a function from items in the range of the given type to objects in the range of the first-order variables; and E is an equivalence relation over items of the given type.

The main thesis of neo-logicism concerns the epistemic status of some abstraction principles. The neo-logicist claims that certain abstraction principles are, or are like, implicit definitions, and true by stipulation. Thus, the program provides an epistemological foundation for the consequences of those principles. The collection Hale and Wright [2001] contains detailed articulations of the goals and purposes of the neo-logicist program, and of the proposed status of certain abstraction principles (see especially Wright [1997]).

We need not broach the exegetical question of the extent to which the arch-logicist Gottlob Frege accepted the orientation of neo-logicism. Frege [1884, 1893] did employ at least three abstraction principles. One of them, used for illustration, comes from geometry:

The direction of l₁ is identical to the direction of l₂ if and only if l₁ is parallel to l₂.

1 This paper first appeared in the British Journal for the Philosophy of Science 54, [2003], pp. 59–91. Reprinted by kind permission of the editor and Oxford University Press.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 353–382. © 2007 Springer.
Frege’s second abstraction principle was dubbed N= in Wright [1983] and is now called Hume’s principle:

(Nx : Fx = Nx : Gx) ≡ (F ≈ G),

where F ≈ G is an abbreviation of the second-order statement that there is a relation mapping the F’s one-to-one onto the G’s. Hume’s principle thus states that the number of F is identical to the number of G if and only if the F’s are equinumerous with the G’s. Unlike the principle concerning directions, this abstraction is second-order, since the relevant variables, F, G range over concepts or properties of whatever is in the range of the first-order variables. Hume’s principle also differs from the direction principle in that the equivalence on the right hand side contains only logical terminology (assuming that second-order variables and quantifiers are logical). Frege [1884] contains the essentials of a derivation of the Peano postulates from Hume’s principle. This deduction, now called Frege’s theorem, reveals that Hume’s principle entails that there are infinitely many natural numbers. On the neo-logicist program, then, Hume’s principle provides an epistemological foundation for arithmetic.

Frege’s third abstraction is the infamous Basic Law V:

Ext(F) = Ext(G) ≡ ∀x(Fx ≡ Gx).

Like Hume’s principle, Basic Law V is a second-order, logical abstraction, but unlike Hume’s principle, it is inconsistent.

An essential part of the ongoing neo-logicist program is to articulate principles that indicate which abstraction principles serve as legitimate epistemic foundations for mathematical theories. A proposed abstraction principle must be consistent, of course, but consistency is not sufficient. There are abstraction principles which are individually consistent, but are mutually incompatible (see Heck [1992] and Weir [2003]). We do not enter into the subtle details of this issue here.

An important item on the neo-logicist agenda is to extend the success of Frege’s theorem to other, richer branches of mathematics.
The idea is to formulate acceptable abstraction principles that recapture those branches, in much the same way that Hume’s principle recaptures arithmetic. Hale [2000] attempts this for real analysis, as does Shapiro [2000], which includes a brief account of complex analysis. The purpose of this article is to assess the prospects for a neo-logicist recapture of set theory. I suggest that set theory is a particularly important case, if neo-logicism is to dovetail with the full range of contemporary and historical mathematics. The notion of ‘set’ plays a central role within many branches, and set theory itself has come to enjoy a foundational significance. Since virtually every extant mathematical structure can be modeled in the set-theoretic hierarchy, set theory provides a natural setting for comparing and relating different mathematical structures. Moreover, the set-theoretic hierarchy is the de facto arena
for resolving existence, or consistency issues within mathematics. Thus, if the neo-logicist fails to capture a reasonably rich set theory, then the program has left out a crucial part of contemporary mathematics, one with special foundational significance.

On the other hand, someone might think that set theory is already implicit in the neo-logicist systems. The extant developments, including Frege’s theorem deriving the Peano postulates from Hume’s principle, make essential use of second-order logic. Officially, the monadic higher-order variables range over properties or propositional functions of whatever is in the range of the first-order variables, but properties have structural affinities with sets. Bertrand Russell ([1903], p. 13), for example, wrote that ‘. . . the study of propositional functions appears to be strictly on a par with that of classes, and indeed scarcely distinguishable therefrom’. During his later no-class period, Russell ([1993], Chapter 18) proposed to eliminate talk of classes by replacing variables ranging over classes with higher-order variables. In another context, W. V. O. Quine ([1986], Chapter 5) famously argues that second-order logic is not logic at all, but is set theory in disguise, a ‘wolf in sheep’s clothing’. So one might think that some set theory has already been smuggled into the neo-logicist program, lying disguised in the higher-order logic. Notice, for example, that Hume’s principle is equiconsistent not with classical arithmetic, but with classical analysis, the theory of natural numbers and sets of natural numbers (Boolos [1987]).

We need not broach the border dispute concerning whether second-order logic is properly part of logic or is mathematics in disguise. The theme of both logicism and neo-logicism is that mathematics and logic are intertwined. The underlying issues here are epistemic.
Our Quinean might argue that Frege’s theorem only shows how to derive the Peano postulates from Hume’s principle plus some set theory. When put this way, the result has no deep philosophical significance, since we already know that set theory is mathematically richer than arithmetic, and can serve as a foundation for it. Frege’s theorem merely recounts what we already know, that we can recapture arithmetic from the basic principles of set theory. This particular charge can be rebutted with a study of the particular axioms and rules of second-order logic invoked in Frege’s theorem. Do those particular principles presuppose a substantial set theory? Typically, the study focuses on the instances of the comprehension scheme, and the use of the abstraction operators, often focusing on how impredicative those are (see, for example, Wright [1998]). We can safely bracket this foundational issue here. Even if second-order logic is a disguised set theory, it is a rather weak one. Let us call the items in the range of the first-order variables ‘objects’. Monadic second-order variables range over properties—or perhaps sets—of those objects. We cannot take all of these properties or sets to be objects in the range of the first-order variables,
on pain of contradiction. Cantor’s theorem is that there are more properties or sets than objects. General second-order variables range over analogues of sets of n-tuples of objects. But that is the limit of second-order logic. If the neo-logicist ventures into third-order logic, then she has analogues of sets of sets of objects, and fourth-order logic brings analogues of sets of sets of sets of objects. Perhaps the neo-logicist would be well-advised to keep the ‘order’ low, to minimize the charge of smuggling the mathematics in. However, even if the neo-logicist climbs all the way to ω-order logic, or beyond, there is still no principle of infinity, showing there to be infinitely-many objects. In contrast to higher-order logic, set theories have sets within the intended range of the first-order variables. So in set theory, sets are objects. The theories typically have powerful existence axioms, such as infinity, union, powerset, and replacement. The focus of the present paper is whether such a powerful set theory can be captured with neo-logicist abstraction principles. We want to explore some assumptions that yield the usual strong existence principles. Along the way, we do not pay much explicit attention to the invoked axioms and rules of second-order logic. Presumably, that would come later, after the neo-logicist has provided us with a complete set theory. In the end, a neo-logicist might eschew such a powerful set theory, and try to get by with only a second- or higher-order logic in developing other branches of mathematics. The purpose of this article is to assess the necessity of this retrenchment.
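The cardinality point made above, that there are more properties than objects, can be checked by brute force on small finite domains. The following Python sketch is only an illustration under stated assumptions: properties are treated extensionally as subsets of a finite domain, and the function names are hypothetical, not drawn from the text. It searches for an assignment Ext from properties to objects that satisfies unrestricted Basic Law V, and finds none, since a domain of n objects carries 2^n properties.

```python
from itertools import chain, combinations, product

def properties(domain):
    """All extensionally individuated properties (subsets) of a finite domain."""
    items = list(domain)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

def satisfies_basic_law_v(ext, props):
    """Ext(P) = Ext(Q) if and only if P and Q apply to the same objects."""
    return all((ext[p] == ext[q]) == (p == q) for p in props for q in props)

def has_model_of_basic_law_v(domain):
    """Try every assignment of an object of the domain to each property."""
    domain = list(domain)
    props = properties(domain)
    for values in product(domain, repeat=len(props)):
        ext = dict(zip(props, values))
        if satisfies_basic_law_v(ext, props):
            return True
    return False

for n in (1, 2, 3):
    print(n, len(properties(range(n))), has_model_of_basic_law_v(range(n)))
```

The search only covers finite domains; the general result for arbitrary domains is Cantor's theorem itself, which this sketch does not prove.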
2. Framework

Recall Frege’s Basic Law V:

∀P∀Q[Ext(P) = Ext(Q) ≡ ∀x(Px ≡ Qx)],

which, of course, is inconsistent. The plan here is to restrict this principle, somehow. This much, of course, is not new. Restricted versions of Basic Law V were broached by Bertrand Russell [1906] and John von Neumann [1925], for example. One might even think of Ernst Zermelo’s axiom of separation as a restriction of Basic Law V. Here, we will not get pinned down as to what the restriction is. Rather, we only insist that there be such a restriction, and then explore the consequences of the resulting principle. Our underlying issue concerns assumptions on the restriction needed to sanction the various axioms. So we introduce a new primitive GOOD(P), which is a (third-order) predicate of monadic predicates. The intended interpretation of ‘GOOD(P)’ is something like ‘the P’s constitute a set’, or ‘P has a non-trivial extension’. Define BAD(P) to be ¬GOOD(P).

One preliminary, quasi-terminological issue concerns the locution ‘Ext(P)’, when P does not have a non-trivial extension (i.e., P is BAD). The question concerns what to make of the phrase ‘the extension of P’, when the P’s do not constitute a set. One option would be to invoke a free logic, so that such
expressions do not denote anything. If there is no set of P’s, then ‘Ext(P)’ is akin to ‘the President’s twelfth child’ and ‘the present King of France’. Instead, we introduce dummy ‘extensions’ here. When the expression ‘Ext(P)’ is well-formed, it denotes something, but the ‘extensions’ of BAD properties need not obey the right hand side of Basic Law V. If one can put up with an apparent oxymoron, the ‘extensions’ of BAD properties do not satisfy extensionality.

In other neo-logicist contexts, the issue of free logic looms important. For example, Shapiro and Weir [2000] show that Frege’s theorem cannot be derived from Hume’s principle (alone) in a background free second-order logic. The usual non-free context of Frege’s theorem has an assumption (or consequence) that for any property F, the number of F’s (Nx : Fx) exists, and all such numbers obey the right hand side of Hume’s principle. In that context, perhaps, the assumption ought to be made explicit. There is, however, no analogous issue here concerning the present choice to eschew free logic. In the present context, the ‘hidden assumption’ that there is an ‘extension’ for each property is innocuous. Since we are restricting Basic Law V anyway, what matters is not which properties have ‘extensions’, but which ‘extensions’ obey extensionality, the right hand side of Basic Law V. Whether we invoke a free logic or not, the right hand side of Basic Law V is restricted to GOOD properties, and that is what matters here. From now on, we drop the scare quotes around ‘extension’. The reader is asked to keep it in mind that only the extensions of GOOD properties are extensional, and so only those extensions act like sets.

Georg Cantor [1899] famously called the totality of all sets an ‘inconsistent multitude’, since one cannot conceive of this collection as ‘one finished thing’. This is another way of saying that some properties do not have extensions that obey the right hand side of Basic Law V. Russell ([1903], pp.
102–3) wrote that the basic question is ‘to determine which propositional functions define classes which are single terms as well as many, and which do not’. In present terms, Russell’s question is to determine which properties are GOOD.

I submit that if the neo-logicist is to recapture set theory, then a restricted version of Basic Law V is at least part of its proper foundation. Extensionality is analytic of the notion of set. Whatever else sets are, it is part of the meaning of ‘set’ that sets with the same members are identical. The restricted version of Basic Law V focuses on this aspect of sets. For this reason, if someone believes in sets already, and adopts a sufficiently abundant account of properties, then a restricted version of Basic Law V is a necessary truth, either analytic or all but analytic. Let p be a set. Then define the property M_p of being-a-member-of-p thus: M_p x ≡ x ∈ p. For someone who believes in sets, the properties M_p are the quintessential GOOD properties. More formally, let us say that a property Q is GOOD if there is a set p such that for each object a, Qa if and only if a ∈ p. That is, Q is GOOD if the Q’s constitute a set. Then a restricted version of Basic Law V follows from this. So every set theorist should accept a restricted version of Basic Law V.
There is an analogous situation concerning Hume’s principle. The neo-logicist uses that principle to introduce cardinal numbers as abstract objects. But someone who already believes in numbers still holds that Hume’s principle is a (necessary) truth, all but analytic. After Frege himself abandoned Hume’s principle as a foundation for arithmetic, he gave explicit definitions of ‘number’ and ‘number of’ (in terms of extensions), and then proved Hume’s principle as a theorem. Similarly, in typical treatments of set theory, we define the cardinal number of a set s to be the smallest ordinal that is equinumerous with s. Then one proves a (first-order) version of Hume’s principle, restricted to sets. The neo-logicist project for arithmetic is to use this necessary truth about number to introduce the requisite objects, via abstraction. In the present context, the neo-logicist uses a necessary truth about sets to introduce the requisite objects, the sets.

There are several ways to formulate a restricted version of Basic Law V:

∀P∀Q[(GOOD(P) & GOOD(Q)) → (Ext(P) = Ext(Q) ≡ ∀x(Px ≡ Qx))],
∀P∀Q[(GOOD(P) ∨ GOOD(Q)) → (Ext(P) = Ext(Q) ≡ ∀x(Px ≡ Qx))],
∀P∀Q[Ext(P) = Ext(Q) ≡ [(BAD(P) & BAD(Q)) ∨ ∀x(Px ≡ Qx)]].

The first of these leaves the individuation of the extensions of BAD properties completely open. It is consistent with the first principle that some BAD properties have the empty set as their extension, other BAD properties have ω as their extension, and still other BAD properties have the White House as their extension. In contrast, the second version entails that if P is GOOD and Q is BAD, then Ext(P) is different from Ext(Q). So if the empty property is GOOD, and there is an empty set, then this set is not also the extension of any BAD property. However, the principle says nothing else concerning how the extensions of BAD properties are to be individuated. Some of them can have the White House as extension, and others can have the Eiffel Tower as extension.
The third restricted version of Basic Law V entails that all BAD properties have the same extension, and that this extension is not also the extension of any GOOD property. I prefer this version, on grounds of convenience, but nothing turns on this. Let us call the form

∀P∀Q[Ext(P) = Ext(Q) ≡ [(BAD(P) & BAD(Q)) ∨ ∀x(Px ≡ Qx)]]

(RV), for Restricted-V. We dub the common extension of all BAD properties ‘the bad thing’.
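The differences among the three formulations can be made concrete on a tiny finite model. The Python sketch below is an illustration only, under assumptions that are not part of the text: the domain has two objects, "a" and "b"; properties are treated as subsets of the domain; and GOOD is stipulated, purely for the example, to hold of the empty property alone. All function and variable names are hypothetical.

```python
from itertools import chain, combinations

def properties(domain):
    """All extensionally individuated properties (subsets) of a finite domain."""
    items = list(domain)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

def rv_first(ext, good, props):
    """(GOOD(P) & GOOD(Q)) -> (Ext(P) = Ext(Q) iff P, Q coextensive)."""
    return all((ext[p] == ext[q]) == (p == q)
               for p in props for q in props if good(p) and good(q))

def rv_second(ext, good, props):
    """(GOOD(P) or GOOD(Q)) -> (Ext(P) = Ext(Q) iff P, Q coextensive)."""
    return all((ext[p] == ext[q]) == (p == q)
               for p in props for q in props if good(p) or good(q))

def rv_third(ext, good, props):
    """Ext(P) = Ext(Q) iff (both BAD, or P, Q coextensive). This is (RV)."""
    return all((ext[p] == ext[q]) == ((not good(p) and not good(q)) or p == q)
               for p in props for q in props)

def only_empty_is_good(p):
    """Illustrative stipulation: a property is GOOD iff nothing falls under it."""
    return len(p) == 0

domain = ["a", "b"]
props = properties(domain)

# "a" plays the empty set; "b" plays the bad thing, shared by all BAD properties.
separated = {p: ("a" if only_empty_is_good(p) else "b") for p in props}
# Every property, GOOD or BAD, is sent to the same object "a".
collapsed = {p: "a" for p in props}

for name, ext in (("separated", separated), ("collapsed", collapsed)):
    print(name,
          rv_first(ext, only_empty_is_good, props),
          rv_second(ext, only_empty_is_good, props),
          rv_third(ext, only_empty_is_good, props))
```

On this model the `separated` assignment satisfies all three formulations, while `collapsed` satisfies only the first, which matches the observation that the first version leaves the extensions of BAD properties entirely unconstrained.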
3. GOOD candidates, indefinite extensibility
As noted, in the present study we officially take GOOD to be a primitive property of properties. We are exploring the available possibilities. An actual
neo-logicist treatment of set theory would delineate what it is for a property to be GOOD. Ideally, GOOD should be defined using only logical terminology, perhaps supplemented with other legitimate abstraction principles. This section provides some examples for GOOD. I do not claim that all of the resulting versions of (RV) are legitimate neo-logicist abstractions, and in some cases, the resulting set theory is not very interesting. Some of the examples are serious philosophical candidates for (RV); others are used later to refute certain inferences.

For a completely trivial, toy example, we might declare that no property is GOOD:

∀P(¬GOOD(P)).

Then it follows from (RV) that there is a single extension, the bad thing. For every property P, the extension of P is the bad thing. For a slightly less trivial toy example, suppose that only ‘empty’ properties are GOOD:

∀P(GOOD(P) ≡ ∀x(¬Px)).

Then it follows from (RV) that there are exactly two distinct extensions, the empty set Ø (the extension of empty properties) and the bad thing (the extension of every other property). This is the exact opposite of the ‘Aristotelian’ principle broached by Shapiro and Weir [2000].

Third, we might declare that a property is GOOD if and only if it applies to only finitely many objects. One can state this in a second-order language using only logical terminology. It follows from (RV) that every finite property has an extension in the ordinary sense, and the extension of every infinite property is the bad thing. Next, we might declare that a property is GOOD if and only if it is countable, in which case the extension of every uncountable property is the bad thing. In general, for any cardinal number κ, there is an instance of (RV) in which a property is GOOD if and only if it applies to fewer than κ-many objects. According to one of these instances, if a property holds of fewer than κ-many objects, then it has an extension in the ordinary sense, and extensionality applies.
If a property applies to at least κ-many objects, its extension is the bad thing. This version of (RV) is satisfiable on any domain of size V_λ, so long as the cofinality of λ is at least κ. Weir [2003] dubs these instances of (RV) ‘distraction principles’. In many of these cases, GOOD can be defined in a second-order language, using only logical terminology (see Shapiro [1991], §5.1.2).

One instance of (RV) of note follows a suggestion once made by von Neumann [1925]. The idea is that a property is GOOD if and only if it is not equinumerous with the universe. This can be formulated rigorously, using only second-order logical terminology:

GOOD(P) ≡ ¬∃R(∀x∃!y(Py & Rxy) & ∀y(Py → ∃!xRxy)).

George Boolos [1989] called the resulting instance of (RV) ‘New V’, and developed the resulting set theory in some detail (comparing it to the now more common iterative approach). New V captures one manifestation of what is sometimes called the ‘limitation of size’ conception of set. Wright [1997]
calls New V ‘VE’, for ‘V enlightened’, and suggests that it might serve as part of a foundation for real analysis. New V is satisfiable in the countable set V_ω, and so it does not entail that there are uncountably many abstract objects. Wright suggests, however, that New V might be supplemented with other abstraction principles to provide the requisite ontology. New V would help delineate the structure of the real numbers.

As a neo-logicist abstraction principle, New V is fraught with problems. It entails that there is a well-ordering of the universe, and thus runs afoul of Wright’s [1997] conservativeness requirements on acceptable abstraction principles. Its fate seems to be bound up with the generalized continuum problem. It is consistent with Zermelo–Fraenkel set theory that there are no uncountable models of New V (see Shapiro and Weir [1999]).

Nevertheless, it is plausible that some version of the ‘limitation of size’ conception of set might serve the needs of neo-logicism. The rough idea is that a property is GOOD if it does not apply to too many objects. The problem is to say what is ‘too many’. Setting ‘too many’ at a fixed cardinal seems ad hoc and puts a bound on the ontology. Moreover, this move seems to beg the question, since set theory is supposed to be our theory of cardinals. Instead, we rely on second-order logic for our account of cardinality.

Michael Dummett’s notion of indefinite extensibility makes for a plausible ‘limitation of size’ interpretation. This notion has considerable historical interest, and makes for a serious candidate for a neo-logicist set theory considered within the framework presented here. Russell ([1906], p. 144) begins with an examination of the standard paradoxes, and concludes:

. . . the contradictions result from the fact that . . . there are what we may call self-reproductive processes and classes.
That is, there are some properties such that, given any class of terms all having such a property, we can always define a new term also having the property in question. Hence we can never collect all of the terms having the said property into a whole; because, whenever we hope we have them all, the collection which we have immediately proceeds to generate a new term also having the said property.
Citing this passage, Dummett writes that an ‘indefinitely extensible concept is one such that, if we can form a definite conception of a totality all of whose members fall under the concept, we can, by reference to that totality, characterize a larger totality all of whose members fall under it’ (Dummett [1993], p. 441, emphasis mine). According to Dummett, an indefinitely extensible property P has a ‘principle of extension’ that takes any definite totality t of objects each of which has P, and produces an object that also has P, but is not in t (see also Dummett [1991], pp. 316–319). Let us say that a property P is Definite if it is not indefinitely extensible. I hope I can be forgiven for a brief rehearsal of familiar material, in order to establish the pattern that Russell and Dummett discern. Consider, first, the Burali-Forti paradox. Let O be any Definite collection of ordinal numbers. Let O′ be the collection of all ordinals α such that there is a β ∈ O and α ≤ β. Let γ be the order type of O′. Note that γ is itself an ordinal. Let γ′ be the order type of O′ ∪ {γ}. Then γ′ is an ordinal number, and γ′ is not a member of O. So the property of being an ordinal is indefinitely extensible. As Dummett ([1991], p. 316) puts it, if we have a clear grasp of any totality of ordinals, we thereby have a conception of what is intuitively an ordinal number greater than any member of that totality. Any [D]efinite totality of ordinals must therefore be so circumscribed as to forswear comprehensiveness, renouncing any claim to cover all that we might intuitively recognise as being an ordinal.
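The Burali-Forti construction can be traced on a deliberately small case. The sketch below assumes that finite ordinals are modelled as non-negative integers and that the collection O is finite; the function name `principle_of_extension` is hypothetical, chosen only for illustration. It follows the steps in the text (form O′, take its order type γ, then the order type γ′ of O′ ∪ {γ}) and says nothing about the transfinite case, where the philosophical weight of the argument lies.

```python
def principle_of_extension(collection):
    """Return gamma', an ordinal outside the given finite collection O.

    Assumption of this sketch: finite ordinals are modelled as
    non-negative ints, so the order type of a finite initial segment
    of the ordinals is simply its size.
    """
    # O': every ordinal alpha with alpha <= beta for some beta in O.
    top = max(collection, default=-1)
    closure = set(range(top + 1))
    gamma = len(closure)                   # order type of O'
    gamma_prime = len(closure | {gamma})   # order type of O' united with {gamma}
    return gamma_prime


O = {2, 5}
print(principle_of_extension(O))        # 7
print(principle_of_extension(O) in O)   # False
```

However the finite collection is chosen, the value returned exceeds every member of it, which is the pattern the principle of extension is meant to capture.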
Next consider Russell’s paradox. Let R be a set of sets that do not contain themselves; so if r ∈ R then r ∉ r. Then R does not contain itself. So the property of being a set that does not contain itself is indefinitely extensible. Our third example is Cantor’s paradox. Let C be a collection of cardinal numbers. Let C′ be the union of the result of replacing each κ ∈ C with a set of size κ. The cardinal of the powerset of C′ is larger than any cardinal in C. So the property of being a cardinal number is indefinitely extensible. To be sure, one can challenge the set-theoretic principles (union, replacement, powerset, etc.) that are invoked in the above constructions, or one can tinker with the logic, but it is natural to agree with Russell and Dummett that the properties in question are indefinitely extensible. Russell ([1906], p. 144) wrote that it ‘is probable’ that if P is any property which demonstrably does not have an extension (that obeys extensionality) then ‘we can actually construct a series, ordinally similar to the series of all ordinals, composed entirely of terms having the property’ P. In present terms, Russell’s conjecture is that if P is BAD, then there is a one-to-one function from the ordinals into P. If ‘BAD’ is ‘indefinitely extensible’, then we can provide an argument for Russell’s conjecture: Let α be an ordinal and assume that we have a one-to-one function f from the ordinals smaller than α to objects that have the property P. Consider the collection {fβ | β < α}. Since P is indefinitely extensible, there is an object a such that P holds of a, but a is not in this set. Set fα = a.
We can only speculate if Russell had something like this argument in mind. The argument uses transfinite induction on ordinals, a version of replacement (that if a totality t is equinumerous with an ordinal, then t is Definite), and perhaps most notably, a global choice principle, or at least a choice function on sub-totalities of the given indefinitely extensible property. Russell proposed three ways to avoid the antinomies: the zigzag theory, the theory of limitation of size, and the no classes theory. The middle one is suggested by his treatment of indefinitely extensible properties. Russell notes that such properties come with ‘processes’ which ‘seem essentially incapable of terminating’. These processes seem to be Dummett’s principles of extension. Russell suggests that ‘it is natural to suppose that the terms generated by such a
process do not form a’ set. In other words, such properties are BAD. So ‘there will be (so to speak) a certain limit of size which no [set] can reach; and any supposed [set] which reaches or surpasses this limit is . . . improper . . . , i.e., is a non-entity’. Russell concludes that every set ‘must always be capable of being arranged in a well-ordered series ordinally similar to a segment of the series of ordinals in order of magnitude’ ([1906], p. 152). This much holds in ZFC, thanks to Zermelo’s theorem that every set can be well-ordered. Notice, however, that this last depends only on a local axiom of choice. In present terms, Zermelo’s theorem only requires that each Definite property have a choice function. We will return to this distinction in the discussion of the choice and powerset axioms in Section 5. The proposal here is to express this manifestation of the limitation of size with an instance of (RV), so that being Definite is both necessary and sufficient for a property to have an extension that obeys extensionality. For his part, Russell ([1906], pp. 153–154) dismisses this limitation of size conception, almost as soon as he raises it: A great difficulty of this theory is that it does not tell us how far up the series of ordinals it is legitimate to go. It might happen that ω was already illegitimate: in that case all proper [sets] would be finite. For, in that case, a series ordinally similar to a segment of the ordinals would necessarily be a finite series. Or it might happen that ω² was illegitimate, or ω^ω, or ω₁ or any other [limit] ordinal . . . [O]ur general principle does not tell us under what circumstances [a property is GOOD]. It is no doubt intended by those who advocate this theory that all ordinals should be admitted which can be defined, so to speak, from below, i.e., without introducing the notion of the whole series of ordinals. Thus, they would admit all of Cantor’s ordinals, and they would only avoid admitting the maximum ordinal.
But it is not easy to state such a limitation precisely: at least I have not succeeded in doing so.
Russell is certainly correct that anyone who wishes to propose a set theory developed along present lines faces the conceptual problem of delimiting just how ‘many’ objects a property must apply to, in order for the property to be too big and thus BAD. Dummett ([1991], p. 317) seems to acknowledge this, writing that the ‘principle of extendibility constitutive of an indefinitely extensible concept is independent of how lax or rigorous the requirement for having conception of a totality is taken to be, although that will of course affect which concepts are acknowledged to be indefinitely extensible’. We can get a feel for the problem if we consider some of the philosophers and mathematicians who invoke notions like indefinite extensibility. Dummett occupies one extreme. First, he famously argues that bivalence and excluded middle legitimately apply to quantified statements only if the range of the quantifiers is Definite. According to Dummett, the proper logic for a theory of an indefinitely extensible concept is intuitionistic. He adopts Russell’s supposedly flippant suggestion that even ω is too big to be Definite. In other
words, Dummett claims that the property of being a finite ordinal is already indefinitely extensible. He notes that it is common for mathematicians to concede that concepts like ‘set’ and ‘ordinal’ are indefinitely extensible, but most hold that domains like the natural numbers and the real numbers are perfectly Definite. He argues that this last belief begs the question: We have a strong conviction that we do have a clear grasp of the totality of natural numbers; but what we actually grasp with such clarity is the principle of extension by which, given any natural number, we can immediately cite one greater than it by 1. A concept whose extension is intrinsically infinite is thus a particular case of an indefinitely extensible one. Assuming its extension to constitute a [D]efinite totality . . . may not lead to inconsistency; but it necessarily leads to our supposing that we have provided definite truth-conditions . . . for statements that cannot legitimately be so interpreted. (Dummett [1991], p. 318, see also [1993], pp. 442–443)
Dummett here seems to follow Leibniz, who invokes a notion much like indefinite extensibility: It could . . . well be argued that, since among any ten terms there is a last number, which is also the greatest of those numbers, it follows that among all numbers there is last number, which is also the greatest of all numbers. But I think that such number implies a contradiction . . . When it is said that there are infinitely many terms, it is not being said that there is some specific number of them, but that there are more than any specific number. (Letter to Bernoulli, Leibniz [1863], III 566, translated in Levey [1998], pp. 76–77, 87) . . . we conclude . . . that there is no infinite multitude, from which it will follow that there is not an infinity of things, either. Or [rather] it must be said that an infinity of things is not one whole, or that there is no aggregate of them. (Leibniz [1980], 6.3, 503, translated in Levey [1998], p. 86) Yet M. Descartes and his followers, in making the world out to be indefinite so that we cannot conceive of any end to it, have said that matter has no limits. They have some reason for replacing the term ‘infinite’ by ‘indefinite’, for there is never an infinite whole in the world, though there are always wholes greater than others ad infinitum. As I have shown elsewhere, the universe cannot be considered to be a whole. (Leibniz [1996], p. 151)
For the other extreme, we turn to Zermelo [1930], who presents a version of second-order ZFC with urelements, in pretty much its contemporary form. We restrict attention here to models that have no urelements. Each such model is isomorphic to a rank Vκ , in which κ is a strong inaccessible. Zermelo ([1930], p. 1233) proposes an axiom stating the existence of ‘an unbounded sequence’ of such models. Each such model Vκ has subsets (like κ, the collection of ordinals in the model) which are not members of the model. Within the model Vκ , these subsets are proper classes, and act as BAD properties. However, [w]hat appears as an ‘ultra-finite non- or super-set’ in one model is, in the succeeding model, a perfectly good, valid set with both a cardinal number and an ordinal type . . . To the unbounded series of Cantor ordinals there corresponds a similarly unbounded . . . series of essentially different set-theoretic
models . . . This series reaches no true completion in its unrestricted advance, but possesses only relative stopping points . . . Thus the set-theoretic ‘antinomies’, when correctly understood, . . . lead . . . to an, as yet, unsurveyable unfolding and enriching of that science.
In present terms, then, Zermelo’s proposed axiom is that the series of models of second-order ZFC (and so the series of strongly inaccessible cardinals) is itself indefinitely extensible. Each strong inaccessible is a Definite collection, but each set of inaccessibles gives rise to further, larger strongly inaccessible sets, cardinals, and ordinals. So there is no set of all such models or all such cardinals. Like Russell, I have no concrete suggestions on how to further articulate the notion of indefinite extensibility, and so I cannot settle whether ω, or the real numbers, or the umpteenth inaccessible cardinal, or the umpteenth supercompact cardinal, is Definite. The present plan is to take the notion of a GOOD property as primitive, and to explore what must be true of it, in order to have a viable set theory based on (RV). If the best candidate for BAD is indefinite extensibility, then the project will determine what properties that notion needs to have in order to recapture extant mathematics. The notion of indefinite extensibility has already appeared in the neo-logicist literature, in response to a criticism formulated by Boolos [1997]. Boolos pointed out that Hume’s principle entails that every property has a cardinal number. In particular, it follows from Hume’s principle that the property of being self-identical has a cardinal number. This would be the number of all objects whatsoever. Similarly, Hume’s principle entails that there is a number of all cardinal numbers, there is a number of all ordinal numbers, and there is a number of all sets. Boolos notes that prima facie, this presents a conflict with ordinary Zermelo–Fraenkel set theory: [I]s there such a number as [the number of all objects whatsoever?] According to [ZF] there is no cardinal number that is the number of all the sets there are. The worry is that the theory of number [based on Hume’s principle] is incompatible with Zermelo–Fraenkel set theory plus standard definitions. (Boolos [1997], p. 260)
Wright [1999] accepts the force of this objection. He seems to grant ‘the plausible principle . . . that there is a determinate number of F’s just provided that the F’s compose a set’. Then he points out that ‘Zermelo–Fraenkel set theory implies that there is no set of all sets. So it would follow that there is no number of sets’. Wright’s response is to restrict the second-order variables in Hume’s principle, so that some properties do not have numbers. This is where the Russell–Dummett notion of indefinite extensibility appears: I do not know how best to sharpen [the notion of indefinite extensibility], still less how its best account might show that Dummett is right, both to suggest that the proof-theory of quantification over indefinitely extensible totalities should be uniformly intuitionistic and that the fundamental classical mathematical
domains, such as those of the natural numbers, or the reals, should also be regarded as indefinitely extensible. But Dummett could be wrong about both those points and still be emphasizing an important insight concerning certain very large totalities—ordinal number, cardinal number, set, and indeed “absolutely everything”. If there is anything at all in the notion of an indefinitely extensible totality . . . one principled restriction on Hume’s Principle will surely be that [cardinal numbers] not be associated with such totalities. (Wright [1999], pp. 13–14)
Thus, Wright suggests that the second-order variables in Hume’s principle be restricted to Definite properties. He conceded that he does not have a more rigorous articulation of the notion of indefinite extensibility, but unlike Russell, Wright does not despair of a further articulation. This programmatic suggestion provoked the present project. If we can restrict Hume’s principle in the way Wright suggests—to avoid saying that there is a number of all ordinals, a number of all sets, etc.—then why not restrict Basic Law V similarly, and perhaps resurrect set theory along neo-logicist lines? Presumably, or hopefully, the ultimate notion of indefinite extensibility will end up closer to Zermelo’s conception than to Dummett’s or Leibniz’s. In response to Wright’s proposal, Peter Clark [2000] argued that the best candidate for ‘Definite’ is ‘set sized’, where ‘set’ is the notion given by Zermelo–Fraenkel set theory. That is, Clark argues that ‘Definite’ just means something like ‘equinumerous with a member of the iterative hierarchy’. So if the notion of indefinite extensibility is indeed needed for the neo-logicist program, then that program is hopeless. It requires that we articulate the iterative hierarchy before we can give the proper foundation even for arithmetic. The project of this paper sheds light on that debate, by outlining properties that the notion of a BAD property should enjoy in order for (RV) to produce a mathematically viable set theory. The burden on the neo-logicist is to give a characterization of BAD that has the requisite properties, does not beg any questions or presuppose prior mathematics, and hopefully can be formulated using only logical terminology.
4. The framework of (RV) alone, or almost alone
Recall the basic form to be used to develop set theory here:
(RV) ∀P∀Q[Ext(P) = Ext(Q) ≡ [(BAD(P) & BAD(Q)) ∨ ∀x(Px ≡ Qx)]].
The principle states that the extensions of GOOD properties are individuated extensionally, and that all BAD properties have the same extension. We refer to this common extension of all BAD properties as the bad thing. Some features of BAD and GOOD properties follow from (RV) alone. Notice, first, that (RV) plus a statement, ∀P(GOOD(P)), that all properties are GOOD entails the original Basic Law V, and so leads directly to Russell’s paradox. Contraposing, it follows from (RV) that some properties are BAD.
It follows from any abstraction principle that the relation on its right hand side is an equivalence. So it follows from (RV) that (BAD(P) & BAD(Q)) ∨ ∀x(Px ≡ Qx) is an equivalence relation on properties. Reflexivity and symmetry are immediate, but the relation is transitive if and only if (GOOD(P) & ∀x(Px ≡ Qx)) → GOOD(Q). So (RV) entails that GOOD is a congruence on co-extensive properties. If two properties apply to exactly the same objects, then either both are GOOD or neither is. Our neo-Fregean must make sure that GOOD has this feature. Let U be the ‘universal property’, λx(x = x), so that ∀xUx holds; and let E be the empty property, λx(x ≠ x), so that ∀x¬Ex holds. It does not follow from (RV) that U is BAD, nor does it follow that E is GOOD. To see this, consider the following model: the domain is the natural numbers and GOOD(P) iff P is co-finite. We can interpret the ‘Ext’ operator so that (RV) holds under this interpretation. On this interpretation, U is GOOD and E is BAD. Of course, on the preferred interpretations of (RV), a property is GOOD if it is not indefinitely extensible, or is otherwise not too big. In those cases, the empty property E is GOOD if anything is, and the universal property U is BAD. In what follows, we make those two assumptions (which accounts for the word ‘almost’ in the title of this section). Boolos [1989] shows how to develop the apparatus of set theory in the context of New V, the version of (RV) in which GOOD is ‘not equinumerous with the universe U’. It turns out that much of the framework does not depend on that particular instance of ‘GOOD’, and can be invoked for any instance of (RV). That is the business at hand here. The first item is to define membership: x ∈ y : ∃P(y = Ext(P) & Px). That is, x is a member of y if y is the extension of a property of which x holds. Recall that ‘the bad thing’ names the common extension of all BAD properties. From the running assumption that the universal property U is BAD, we have that Ext(U) is the bad thing.
It follows that every object is a member of the bad thing: ∀x(x ∈ Ext(U)). The following is straightforward: ∀P[GOOD(P) → ∀x(x ∈ Ext(P) ≡ Px)]. In words, the members of the extension of a GOOD property P are just those objects that P holds of. So a property P is GOOD if and only if ∃x(x ∉ Ext(P)). Thus, the running assumption that U is BAD allows explicit definitions of GOOD and BAD. We could rewrite (RV): ∀P∀Q[Ext(P) = Ext(Q) ≡ [(∀x(x ∈ Ext(P)) & ∀x(x ∈ Ext(Q))) ∨ ∀x(Px ≡ Qx)]],
or more fully, ∀P∀Q[Ext(P) = Ext(Q) ≡ [(∃R(∀xRx & Ext(P) = Ext(R)) & ∃R(∀xRx & Ext(Q) = Ext(R))) ∨ ∀x(Px ≡ Qx)]]. Of course, it would be quite unnatural to start with these formulas. It is straightforward that the principle of extensionality holds: ∀x∀y[(∃P(x = Ext(P)) & ∃Q(y = Ext(Q)) & ∀z(z ∈ x ≡ z ∈ y)) → x = y]. Notice that the range of the second-order quantifiers includes the BAD properties, and the range of the first-order quantifiers includes the bad thing as well as any non-extensions. We have that all extensions (including the bad thing) are individuated extensionally by their members. Let Ø be Ext(E), the extension of the empty property. If E is GOOD then Ø is distinct from the bad thing. If x is any object, then let S_x be the property that holds of x alone, so that ∀z(S_x z ≡ z = x). Define {x} to be Ext(S_x). It does not follow from (RV) and the running assumptions that S_x is GOOD. Indeed, in one of the toy examples from Section 3, we defined ‘BAD’ to be ‘non-empty’. The resulting version of (RV) is satisfiable on, say, the natural numbers. In that model, for any object x, S_x is BAD, and so {x} is the bad thing. For another example, suppose that there is some object b that ‘infects’ any property that it has. That is, if Pb then P is BAD. In this case, S_b is BAD, and {b} is the bad thing. Next we define a set to be the extension of a GOOD property: set(x) : ∃P(GOOD(P) & x = Ext(P)). It follows from (RV) that the bad thing is the only extension that is not a set. Notice that we are not assuming that every object is an extension, and thus we allow urelements. One of our running assumptions is that if there are any GOOD properties, then the empty property E is GOOD. It follows that if there are any sets at all, then Ø is one of them. From the running assumption that the universal property U is BAD, we have that Ext(U) is the bad thing. It follows from the definition of membership that every object is a member of the bad thing. In particular, the bad thing is a member of itself. So the membership relation is not well-founded. Consider the property S that holds of the bad thing alone (so ∀x(Sx ≡ x = Ext(U))).
It seems reasonable to hold that S is GOOD, since it holds of only one thing. If so, then the singleton of the bad thing, Ext(S), is a set. Of course, the bad thing is a member of this singleton. But since every object is a member of the bad thing, the singleton is in turn a member of the bad thing. So if S is GOOD, as seems reasonable, then even the sets are not well-founded. Again, on most of the examples broached so far, the singleton of the bad thing is a set, but under the running assumption that the universal property U is BAD, everything is a member of a member of this singleton, and so the property of being a member of a member of it is BAD. Thus, the extension of this property is the bad thing, which is not a set. So there is a set whose union is not a set.
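The remark that the ‘BAD is non-empty’ version of (RV) is satisfiable can be checked by a counting argument on small finite domains. The sketch below assumes full second-order semantics, so that the properties on a domain are just its subsets, and it introduces the hypothetical helper names `properties`, `extensions_needed` and `rv_satisfiable`. Under (RV), each GOOD property needs its own extension and all BAD properties share one further object, so an interpretation of Ext exists on a finite domain just in case the number of objects required does not exceed the size of the domain.

```python
from itertools import combinations


def properties(domain):
    """All properties on a finite domain, represented as frozensets."""
    items = sorted(domain)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def extensions_needed(domain, good):
    """Number of distinct values Ext must take for (RV) to hold."""
    props = properties(domain)
    n_good = sum(1 for p in props if good(p))
    any_bad = any(not good(p) for p in props)
    # One extension per GOOD property, plus one shared by all BAD ones.
    return n_good + (1 if any_bad else 0)


def rv_satisfiable(domain, good):
    """True if Ext can be interpreted on this finite domain."""
    return extensions_needed(domain, good) <= len(domain)


domain = range(3)
print(rv_satisfiable(domain, lambda p: len(p) == 0))   # True:  BAD is 'non-empty'
print(rv_satisfiable(domain, lambda p: len(p) <= 1))   # False: GOOD is 'at most one object'
print(extensions_needed(domain, lambda p: True))       # 8: Basic Law V on three objects
```

On this count, the interpretation on which every singleton property is GOOD demands more extensions than any finite domain supplies, which fits the later observation that such interpretations yield infinitely many sets.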
To have a chance of (re-)capturing the iterative hierarchy, we need the notion of a pure set: a set “built” up from the empty set. It corresponds to a member of the iterative hierarchy, with no urelements. Boolos [1989] showed how to define this notion, using a Frege-style ancestral. Define a property F to be closed if: ∀y[[∃P(y = Ext(P)) & ∀z(z ∈ y → Fz)] → Fy]. In words, F is closed if, whenever it holds of the members of an extension, then it holds of that extension. We sometimes write ‘closed(P)’ for ‘P is closed’. Now we define purity: x is pure if and only if ∀F(closed(F) → Fx). In words, an object is pure if every closed property holds of it. We will sometimes write ‘pure(x)’ for ‘x is pure’. Suppose that the empty property E is GOOD. Let F be closed. Since F holds of every element of the empty set Ø, we have FØ. Thus Ø is pure. Let us illustrate the notion with a couple of our toy examples. Suppose we define a property P to be GOOD if and only if P holds of at most one object. The resulting set theory is not very interesting, but we do have that Ø and {Ø} are both sets on that interpretation. Let T be the property that holds of Ø, {Ø}, and nothing else. Then T is closed. On this interpretation, the only pure objects are Ø and {Ø} (exercise). Now define GOOD to be ‘finite’. Then the property of being hereditarily finite is closed, and the only pure objects are the hereditarily finite sets (exercise). Returning to the general case, let T be the property of being an extension, so that ∀x(Tx ≡ ∃P(x = Ext(P))). It is immediate that T is closed. So every pure object is an extension. Now let S be the property of being non-universal: ∀x(Sx ≡ ¬∀w(w ∈ x)). Suppose that S is not closed. Then there is an object y such that ∃P(y = Ext(P)) and ∀z(z ∈ y → ¬∀w(w ∈ z)), but ∀w(w ∈ y). From the first and third of these clauses, y is an extension and everything is a member of y. So y is the bad thing. Putting the bad thing for z in the middle clause, we have that not everything is a member of the bad thing, which contradicts a property of the bad thing.
So S is closed. Thus every pure object is a non-universal extension. In other words, every pure object is a set. We will thus sometimes write ‘pure set’ for ‘pure object’. Let R be the property of not being a member of itself, so that ∀x(Rx ≡ x ∉ x). Let y be any object, and suppose that y is an extension and ∀z(z ∈ y → z ∉ z). So if y ∈ y then y ∉ y. Thus, y ∉ y. Therefore, R is closed. So we have that if x is pure, then x ∉ x. Boolos proves a couple of theorems concerning New V, which also hold here in the more general context:
Theorem 1: If y is an extension and every member of y is pure, then y is pure. In symbols: (∃P(y = Ext(P)) & ∀x(x ∈ y → pure(x))) → pure(y). Proof: Suppose that y is an extension and that every member of y is pure. Let F be closed. We need to show that Fy. We have that for every z ∈ y, z is pure. Thus for every z ∈ y, Fz. Since F is closed, we have Fy. Theorem 2: Suppose that y is pure. Then y is a set and all members of y are pure. Proof: Let G be the property of being a set all of whose members are pure. So ∀x[Gx ≡ (x is a set & ∀z(z ∈ x → pure(z)))]. We show that G is closed. Suppose that w is an extension and ∀z(z ∈ w → Gz). We need to show that Gw. Suppose, first, that w is not a set. Since w is an extension, we have that w is the bad thing, and so the bad thing is a member of w. So G holds of the bad thing, and thus the bad thing is a set. Contradiction. So w is a set. Now suppose that z ∈ w. Then Gz. So z is a set and every member of z is pure. So by Theorem 1, z itself is pure. Therefore, Gw, and so G is closed. Now suppose that y is pure. Then Gy, and so every member of y is pure. In sum, then, an object y is pure if and only if y is a set and all members of y are pure. We saw above that the principle of extensionality holds for all extensions. Since the members of pure sets are themselves pure sets, we have that extensionality holds when the quantifiers are restricted to pure sets. Boolos [1989] proves a theorem that New V entails that a strong form of the axiom of foundation holds for the pure sets. His remarks suggest a proof of the same result in the general context here. Theorem 3: Let G be any property. If G holds of at least one pure set, then there is a pure set x such that Gx and G does not hold of any member of x. In symbols: ∃x(pure(x) & Gx) → ∃x(pure(x) & Gx & ∀y(y ∈ x → ¬Gy)). Proof 1: Suppose ∃x(pure(x) & Gx). Consider the following form: (∗) ∀x(∀y(y ∈ x → Fy) → Fx). If (∗) holds, then F is closed, and so ∀x(pure(x) → Fx). Now substitute ¬(pure(x) & Gx) for Fx. Suppose that (∗) holds in this case.
Then ∀x(pure(x) → Fx), i.e., ∀x(pure(x) → ¬(pure(x) & Gx)). But this contradicts our hypothesis, ∃x(pure(x) & Gx), that G holds of at least one pure set. So the negation of (∗) holds in this case. With some rewriting, this is: ∃x(pure(x) & Gx & ∀y(y ∈ x → ¬(pure(y) & Gy))).
Theorem 2 above entails that if x is pure, then all members of x are pure. So we have ∃x(pure(x) & Gx & ∀y(y ∈ x → ¬Gy)), which was to be shown. The axiom of foundation is:
∀x[∃y(y ∈ x) → ∃y(y ∈ x & ¬∃z(z ∈ x & z ∈ y))]. We saw above that this is false in general, but it follows from Theorem 3 that foundation holds if the quantifiers are restricted to pure sets: Given a nonempty pure set x, substitute the property of being a member of x for G in Theorem 3. Our theorems also entail that the property of being pure is BAD. Indeed, suppose that the property of being pure is GOOD. Let g be the extension of this property. It follows that g is a set and all members of g are pure. So, by Theorem 1, we have that g is pure, and so g ∈ g. This contradicts foundation, since we saw above that for every pure set x, x ∉ x. We can illustrate this with the two toy examples above. In the first, the only pure sets are Ø and {Ø}. But the property of being either Ø or {Ø} has two distinct instances, and so it is BAD under that interpretation of (RV). In the second example, the property of being hereditarily finite holds of infinitely many objects, and is thus BAD. Define an object x to be transitive if every member of every member of x is itself a member of x: ∀y∀z((z ∈ y & y ∈ x) → z ∈ x), and define x to be an ordinal if x is pure, x is transitive, and every member of x is transitive. It is an exercise to show that if x is an ordinal and y ∈ x, then y is an ordinal (or see Boolos [1989], p. 101). Boolos shows that the ordinals are strongly well-ordered by ∈, and so each ordinal is well-ordered by ∈. It is straightforward that this carries over to the present, more general case. Burali-Forti type reasoning shows that the property of being an ordinal is BAD. Indeed, suppose that the property of being an ordinal is GOOD. Then let o be the extension of this property. It follows that o is pure, and so o is itself an ordinal. So o ∈ o, which contradicts foundation. We would like to have one ordinal for each GOOD well-order type, but this does not follow from (RV) alone, even with the running assumptions. The issue must wait until the axiom of replacement is put on the table.
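The definitions of ‘transitive’ and ‘ordinal’ can be tried out on a small fragment of the hereditarily finite sets, which are the pure sets on the toy interpretation where GOOD is ‘finite’. The sketch below models sets as Python frozensets, which are automatically pure and well-founded, and builds the first four stages of the cumulative hierarchy. The names `stage`, `is_transitive` and `is_ordinal` are hypothetical, and the purity clause of the definition is left implicit because every frozenset built this way satisfies it.

```python
from itertools import combinations


def powerset(s):
    """All subsets of s, each as a frozenset."""
    items = list(s)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}


def stage(n):
    """The n-th stage of the hierarchy: apply powerset n times to the empty set."""
    v = frozenset()
    for _ in range(n):
        v = frozenset(powerset(v))
    return v


def is_transitive(x):
    """Every member of a member of x is a member of x."""
    return all(z in x for y in x for z in y)


def is_ordinal(x):
    """x is transitive and every member of x is transitive (purity is assumed)."""
    return is_transitive(x) and all(is_transitive(y) for y in x)


ordinals = [x for x in stage(4) if is_ordinal(x)]
print(sorted(len(x) for x in ordinals))   # [0, 1, 2, 3]
```

Among the sixteen sets in the fourth stage, exactly four pass the test, and they are linearly ordered by membership, in line with the exercise that every member of an ordinal is itself an ordinal.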
5. The axioms
It is interesting how much of the set theory based on New V follows from (RV) alone, and does not depend on the particular interpretation of BAD as ‘equinumerous with the universe’. The axiom of extensionality (for
extensions) holds in general, and foundation holds for the pure sets. Our running assumption is that the empty property is GOOD if any properties are. It follows that if there are any GOOD properties, then the null set axiom holds. Moreover, the null set Ø is pure (if it is a set). In this section, we examine what assumptions must be made about GOOD in order for the other common set-theoretic axioms to hold, either in general or when the variables are restricted to sets or to pure sets. Our stalking horse is the interpretation of ‘GOOD’ as Definite (not indefinitely extensible). Pairs: ∀x∀y∃z∀w(w ∈ z ≡ (w = x ∨ w = y)). Let a be any object in the domain. The axiom of pairs entails that there is an extension {a} whose sole member is a. If no properties are GOOD, then there is only one extension, the bad thing. So {a} is the bad thing. But every object is a member of the bad thing, and so the bad thing is a member of {a}. So a is the bad thing. Since a is arbitrary, if no properties are GOOD, then the axiom of pairs entails that the bad thing is the only object in the universe. Of course, we do not get an interesting set theory from (RV) if there are no GOOD properties, for the simple reason that there would be no sets. If there is at least one GOOD property, then the axiom of pairs is a statement that for any objects x and y, there is a GOOD property that holds of x, y, and nothing else. The extension {x, y} of this property is a set. Thus, the axiom of pairs, restricted to sets, follows from unrestricted Pairs. Moreover, if x and y are pure, then {x, y} is also pure (Theorem 1). Thus, the axiom of pairs, restricted to pure sets, follows. Above, we introduced an interpretation of (RV) in which some object b ‘infects’ any property that it has. That is, if Pb then P is BAD. If there is at least one GOOD property, the axiom of pairs rules this interpretation out. For any object a, the property S_a that applies to a alone is GOOD. In set theory, the axiom of pairs is fairly innocuous.
Notice, however, that if at least one property is GOOD, then (RV) and the axiom of pairs entail that there are infinitely many sets (under the running assumption that the universal property U is BAD). First, if at least one property is GOOD, then there are at least two objects in the universe: a set and the bad thing, Ext(U). We have that Ext(U) and {Ext(U)} are distinct, since the former has at least two members and the latter has only one. Moreover, both of those are distinct from {{Ext(U)}}. We can show, by induction, that the elements of the sequence Ext(U), {Ext(U)}, {{Ext(U)}}, {{{Ext(U)}}}, . . . are pairwise distinct. Separation: ∀P∀x∃y∀z(z ∈ y ≡ (z ∈ x & Pz)). If the universal property is BAD, then unrestricted separation is inconsistent with (RV). Let P be any property and let x be the bad thing. Recall that every object is a member of the bad thing. Separation and (RV) thus entail that there is an extension whose members are all and only the objects z such that Pz. Since P is arbitrary, this leads to Russell’s paradox. From now on, we use the phrase ‘axiom of
separation’ for the version of separation in which the first-order variables are restricted to sets. The second-order variable remains unrestricted. Thus construed, the axiom of separation amounts to a thesis that if a property Q is GOOD, then for any property P (whether P is GOOD or BAD), the property λx(Px & Qx) is GOOD. In other words, if Q is GOOD, then every sub-property of Q is GOOD. It follows from this that the version of separation restricted to pure sets also holds (see Theorem I). That is, if x is pure and P is any property, then there is a pure set whose members are exactly the members of x such that Px.

Separation clearly holds for those instances of (RV) in which GOOD(Q) and BAD(Q) are a matter of the size of the Q’s. On these conceptions, a property Q is GOOD just in case it does not apply to too many objects. Clearly, λx(Px & Qx) does not apply to any more objects than Q does. So if Q is not too big, then neither is λx(Px & Qx). In particular, the axiom of separation holds for those instances of (RV) in which a property is GOOD if and only if it applies to fewer than κ-many objects (for some fixed cardinal number κ). It also holds for New V, the instance of (RV) in which ‘BAD’ is ‘equinumerous with the universe’.

On the primary instance of (RV) in which ‘BAD’ is ‘indefinitely extensible’, separation is a statement that there are no indefinitely extensible properties that are sub-properties of Definite properties. This sounds plausible enough. As noted above, Russell ([1906], p. 152) called the thesis that a totality is a set just in case it is Definite the ‘limitation of size theory’. He glossed it as follows: ‘there will be (so to speak) a certain limit of size which no [set] can reach; and any supposed [set] which reaches or surpasses this limit is . . . improper . . . , i.e., is a non-entity’. The axiom of separation is a converse to this. It says that if a totality does not exceed the limit, then it determines a set.
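For the size-based instances, the reasoning behind separation can be written out as a short cardinality comparison. This is a sketch under stated assumptions: the notation $|\cdot|$ for the number of objects a property applies to, and the abbreviation $R$ for $\lambda x(Px \,\&\, Qx)$, are introduced here for illustration, and the New V case relies on the Cantor–Schröder–Bernstein theorem.

```latex
% Case 1: GOOD(Q) iff Q applies to fewer than \kappa objects.
\[
\{x : Px \wedge Qx\} \subseteq \{x : Qx\}
\;\Longrightarrow\;
|\{x : Px \wedge Qx\}| \le |\{x : Qx\}| < \kappa .
\]
% Case 2 (New V): BAD means equinumerous with the universe U.
% Suppose R = \lambda x(Px \wedge Qx) were BAD. Then the universe injects
% into R, hence into Q, while Q injects into the universe by inclusion.
\[
|U| \le |R| \le |Q| \le |U|
\;\Longrightarrow\;
|Q| = |U| \quad \text{(Cantor--Schr\"oder--Bernstein)},
\]
% so Q would be BAD too. Contrapositive: if Q is GOOD, then so is R.
```

Nothing comparable is available off the shelf for ‘indefinitely extensible’, which is why that case needs separate argument.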
This is consonant with Russell’s framework. In Section 3, we saw that Zermelo invoked something like this interpretation and, of course, Zermelo accepted separation. For his part, Dummett ([1991], p. 317) explicitly endorses the principle that underlies separation: ‘it must be allowed that every concept defined over a [D]efinite totality determines a [D]efinite subtotality’. The same idea is expressed in Dummett ([1993], p. 441): ‘A concept whose application to a determinate totality is itself determinate must pick out a determinate subtotality of elements that fall under it’. In the context of this passage, Dummett begins with a given Definite totality of objects, and then applies the indefinitely extensible property of ‘being a class that is not a member of itself’, concluding that the resulting sub-totality is Definite. Although we seem to have reached a consensus, recall that Dummett’s notion of Definite is highly attenuated. He holds that every infinite totality is indefinitely extensible. So for Dummett, separation is only a thesis that if a property is finite, then so is every sub-property. If we relax Dummett’s notion of Definite—as we must if we are to obtain a viable, rich set theory—but retain
the rest of Dummett’s framework, we lose separation. Dummett introduced the notion of indefinite extensibility in [1963], where he argued that the notion of arithmetic truth is indefinitely extensible. The argument for this turns on the incompleteness theorem, and does not depend on his (later) view that every infinite totality is indefinitely extensible, nor does it turn on Dummett’s thesis that only intuitionistic logic applies to theories of indefinitely extensible totalities. It follows from Dummett’s 1963 conclusion that the property of being the Gödel code of an arithmetic truth is indefinitely extensible. Yet each such Gödel code is also a natural number. So if the natural numbers are a Definite totality (contra Dummett), but Dummett is right with his earlier argument that ‘arithmetic truth’ is indefinitely extensible, then Separation fails. There is a set of natural numbers, but no set of Gödel numbers of arithmetic truths. In short, if our neo-logicist is going to invoke (RV) with ‘BAD’ as ‘indefinitely extensible’, then she must either give up on the existence of infinite sets, give up on separation, or undermine Dummett’s claim that the arithmetic truths are indefinitely extensible. The first of these is a bitter pill to swallow, and given the central role of separation in contemporary set theory, we do not get a viable theory with the second option. Of course, this does not amount to an argument for the third horn of the Dummettian dilemma. The semantic antinomies lead to similar conundrums for indefinite extensibility and separation. Consider, for example, the property D of ‘being a natural number that can be uniquely characterized in English with fewer than three thousand characters’. Suppose that D is Definite. Since there are only finitely many combinations of three thousand Latin letters, it is not the case that there are infinitely many natural numbers x such that Dx. Let n be the least number x such that Dx is false. So Dn is false. 
But we just defined n, and used fewer than three thousand characters to do so. So we should have Dn, a contradiction. This suggests that D is indefinitely extensible. And D is not even infinite! This, of course, is familiar ground. I leave it to the neo-logicist who is tempted by the present approach to articulate the notion of indefinite extensibility so as to avoid these undesirable results and obtain a viable set theory.

Replacement: ∀R[∀z∀y∀w((Rzy & Rzw) → y = w) → ∀x∃y∀z∀w((z ∈ x & Rzw) → w ∈ y)].

In words, replacement says that if a relation R is many-one (i.e., is functional) then for every x there is an extension y whose members include every w such that there is a z ∈ x with Rzw. Unrestricted, the axiom of replacement is a trivial consequence of (RV), under the running assumption that the universal property U is BAD. For any x, just let y be the bad thing. So from now on, we use the phrase ‘axiom of replacement’ for the version of replacement in which the first-order variables are restricted to sets. It says that if a relation R is
many-one then for every set x there is a set y whose members include every w such that there is a z ∈ x with Rzw. Replacement and separation together entail that if R is many-one, then for every set x there is a set y such that w ∈ y if and only if there is a z ∈ x with Rzw. In other words, if P is GOOD and Q is equinumerous with a sub-property of P, then Q is also GOOD.

Replacement and separation together pretty much express the ‘limitation of size’ conception of set theory. If P is not too big and Q is equinumerous with a sub-property of P, then Q is not too big, for the simple reason that Q is no bigger than P. Replacement holds on those instances of (RV) in which a property is GOOD if and only if it applies to fewer than κ-many objects (for some fixed cardinal number κ), and it also holds for New V. One would think that replacement should also hold when ‘BAD’ is ‘indefinitely extensible’. If R is many-one, and if a totality P is Definite, then the result of replacing each object that has P with its correlate under R (if it has one) should be Definite as well.

Suppose that R is many-one, and that if z is pure and Rzw then w is pure. Let P be any GOOD property of pure sets. Replacement and separation entail that λw(∃z(Pz & Rzw)) is GOOD and so its extension is a set s. By hypothesis, every member of s is pure, and so from Theorem I above, s is itself pure. So replacement and separation entail the version of the axiom of replacement in which the first-order variables are restricted to pure sets.

It is well-known that replacement does not follow from the other axioms. This applies even when (RV) is added as an axiom. In effect, we can define the Ext operator on models of Zermelo set theory to produce models of (RV). For example, let c be any object that is not a set. Let C_0 = {c}. For each ordinal α, let C_{α+1} be the powerset of C_α, and if β is a limit ordinal, then let C_β be the union of the C_α for α < β. Let C be C_{2ω}.
In other words, C is the iterative hierarchy up to level 2ω, beginning with a single urelement c. On C define a property P to be GOOD if its extension is a member of C. That is, P is GOOD if and only if {x | Px} ∈ C. Now, if P is GOOD, then let Ext(P) = {x | Px}; and if P is BAD, then let Ext(P) = c. That is, we interpret c as the bad thing. It is straightforward that (RV) holds under this interpretation. All of the axioms of Zermelo set theory, restricted to sets, hold on this interpretation, but replacement fails.

Union:
∀x∃y∀z(z ∈ y ≡ ∃w(z ∈ w&w ∈ x))
In the present context, this axiom is probably the most technically interesting. If the first two quantifiers are restricted to sets, then Union says that if a property P is GOOD, then the property λx(∃Q(P(Ext(Q)) & Qx)) is also GOOD. We saw in Section 4 that this fails in every serious interpretation of (RV). If the universal property is BAD and singletons are GOOD, then the singleton of the bad thing is a set, but there is no set whose members are the members of the members of that singleton, since these are just the members of the bad thing, which is to say all objects.
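The failure of Union at the singleton of the bad thing can be checked directly. The following is a sketch; $\beta$ is an assumed label for the bad thing (not the chapter's notation), and the two facts used are ones the chapter states: every object is a member of the bad thing, and extensionality holds for extensions.

```latex
% \beta: assumed label for the bad thing.
% \{\beta\} is a set when singletons are GOOD.
\begin{align*}
z \in \textstyle\bigcup\{\beta\}
  &\iff \exists w\,(z \in w \;\&\; w \in \{\beta\}) \\
  &\iff z \in \beta \\
  &\iff z = z \qquad \text{(every object is a member of } \beta\text{)}.
\end{align*}
% So any y with exactly these members has every object as a member.
% By extensionality for extensions, y = \beta, and \beta is not a set.
```

So the only candidate for the union is the bad thing itself, which is why the version of Union with the first two quantifiers restricted to sets fails here.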
This unpleasant fact occurs because the bad thing can be a member of an otherwise decent set. Let us define set-union to be the result of restricting all of the quantifiers in Union to sets. So set-union says that if x is a set then there is a set whose members are the set-members of the set-members of x. Similarly, define pure-union to be the result of restricting all of the quantifiers in Union to pure sets. Suppose that x is pure. Then the members of the members of x are all pure. So pure-union amounts to a thesis that if x is pure, then the property of being a member of a member of x is GOOD. Pure-union follows from set-union.

There are models of (RV) which satisfy the other axioms, restricted to pure sets, but in which pure-union (and thus set-union) fails. Indeed, recall that in Section 3, we noted that for any cardinal number κ, there is an instance of (RV) in which a property is GOOD if and only if it applies to fewer than κ-many objects. This version of (RV) is satisfiable on a rank V_λ if the cofinality of λ is at least κ. Suppose that the cofinality of λ is at least κ, and let c be any object that is not a set. Let C_0 = {c}. As above, for each ordinal α, let C_{α+1} be the powerset of C_α, and if β is a limit ordinal, then let C_β be the union of the C_α for α < β. Let C be C_λ. On C define a property P to be GOOD if it applies to fewer than κ-many objects in C. Since the cofinality of λ is at least κ, P is GOOD only if {x | Px} ∈ C. So if P is GOOD, then let Ext(P) = {x | Px}. If P is BAD, then let Ext(P) = c. Again, we interpret c as the bad thing. It is straightforward that (RV) holds on this interpretation. A set is pure if it is a member of V_κ and its cardinality is less than κ.

As usual, let ℶ_0 = ℵ_0; for each ordinal α, let ℶ_{α+1} be the cardinality of the powerset of ℶ_α; and if β is a limit ordinal, then let ℶ_β be the union of the ℶ_α for α < β. Now let κ be ℶ_ω and let λ be any cardinal whose cofinality is greater than ℶ_ω.
Then C_λ is a model of (RV) in which ‘GOOD’ is ‘holds of fewer than ℶ_ω-many objects’. Notice, however, that the property of ‘being ℶ_n, for some n ∈ ω’ is GOOD, since it holds of only countably many objects. So there is a set b = {ℶ_0, ℶ_1, ℶ_2, . . .}. However, the property of being a member of a member of b holds of each member of ℶ_ω. So this property is BAD. Thus b has no union in C, under this version of (RV). Thus, pure-union and set-union both fail in C. This example also refutes the unrestricted version of the union axiom. There is no object in C whose members are the members of the members of b.

Notice that the other axioms of Zermelo–Fraenkel set theory, restricted to sets, hold in C under this version of (RV), as do the other axioms restricted to pure sets. Pairs, separation, replacement, infinity, and choice are immediate, or follow from previous considerations. That leaves powerset. If x is a set then the cardinality of x is less than ℶ_ω, and so there is a natural number n such that the cardinality of x is less than ℶ_n. Thus, the cardinality of the powerset of x is less than or equal to ℶ_{n+1}, and so the cardinality of the powerset of x is less than ℶ_ω. So the property of being a subset of x is GOOD, and so the powerset of x is a set in this model.
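The cardinal arithmetic behind this counterexample is compact enough to display. This is a sketch, assuming (as the text's phrase ‘each member of’ the limit cardinal suggests) that cardinals are taken as initial ordinals; the notation $\wp(x)$ for the powerset and $|\cdot|$ for cardinality is introduced here for illustration.

```latex
% Beth numbers, as defined in the text:
\[
\beth_0 = \aleph_0, \qquad
\beth_{\alpha+1} = 2^{\beth_\alpha}, \qquad
\beth_\beta = \sup_{\alpha<\beta} \beth_\alpha \ \ (\beta \text{ a limit}).
\]
% GOOD means: applies to fewer than \kappa = \beth_\omega objects.
\begin{align*}
b &= \{\beth_n : n \in \omega\}, & |b| &= \aleph_0 < \beth_\omega
   && \text{(so } b \text{ is a set)}\\
\textstyle\bigcup b &= \sup_{n} \beth_n = \beth_\omega, &
   |\textstyle\bigcup b| &= \beth_\omega \not< \beth_\omega
   && \text{(so the union property is BAD)}
\end{align*}
% Powerset is nevertheless preserved:
\[
|x| < \beth_\omega
\;\Rightarrow\; \exists n\,\bigl(|x| < \beth_n\bigr)
\;\Rightarrow\; |\wp(x)| = 2^{|x|} \le 2^{\beth_n} = \beth_{n+1} < \beth_\omega .
\]
```

The asymmetry is the whole point: a countable union climbs to the limit of the series, while a single powerset step never does.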
Of course, the counterexample here is artificial. It turns on κ being a singular limit in the ℶ-series, and λ having sufficiently large cofinality. The point was just to show that union, pure-union, and set-union do not follow from (RV) plus the other axioms of set theory. Boolos [1989] shows how to adapt a lovely demonstration from Azriel Lévy [1968] to establish pure-union for New V, the instance of (RV) in which ‘BAD’ is ‘equinumerous with the universe’.

What of set-union and pure-union on the preferred interpretation of ‘BAD’ as ‘indefinitely extensible’? Set-union amounts to a statement that if t is a Definite totality of Definite totalities, then the property of being a member of a member of t is itself Definite. As we just saw, the size of the union of a set s can be (much) larger than the size of s and larger than the size of any member of s. The issue here concerns whether this union can be so much larger that it becomes indefinitely extensible. Pending further articulation of the notion of indefinite extensibility, I do not know how to adjudicate the question.

Infinity: As we saw above, the axiom of pairs entails that the universe is infinite, but (RV) and the other axioms together (restricted to sets and/or to pure sets) do not entail that there is an infinite set. That is, it does not follow from those axioms that there is a GOOD property that applies to infinitely many objects. To see this, interpret ‘BAD’ as ‘applies to infinitely many objects’ and interpret the ‘Ext’ operator so that (RV) and the other axioms, restricted to sets, hold on V_ω, the universe of hereditarily finite sets. This model is, of course, infinite, but in it every ‘set’ is finite.

Our next order of business is an axiom asserting the existence of an infinite set. Let x be any object. Let sx be the extension of λy(y ∈ x ∨ y = x). It follows from Pairs and set-union that if x is a set, then so is sx. Also, if x is pure and sx is a set, then sx is pure.
Infinity: ∃x(Ø ∈ x & ∀y(y ∈ x → sy ∈ x))

Unrestricted, this trivially follows from (RV), together with our running assumption that the universal property is BAD. Just let x be the bad thing. Of course, this ‘axiom’ has nothing to do with the size of the universe: it holds if the bad thing is the only object in the domain. However, if the opening quantifier of Infinity is restricted to sets, the axiom states that there is a set that contains Ø and is closed under s. With separation, this version of Infinity entails that the property of being a finite ordinal is GOOD. Thus, ω is a set. In light of replacement, this axiom is equivalent to a statement that there is a GOOD property that applies to infinitely many things.

If we define BAD as ‘indefinitely extensible’, then the requisite thesis is that there is a Definite property that applies to infinitely many objects. As noted, Dummett rejects this, perhaps with Leibniz, but most others who invoke indefinite extensibility accept it. Again, Dummett must lose this debate, if
there is to be a strong, viable neo-logicist set theory developed along present lines.

Choice: ∀x[(∀y ∈ x ∃w(w ∈ y) & ∀y ∈ x ∀z ∈ x(y ≠ z → ¬∃w(w ∈ y & w ∈ z))) → ∃v∀y ∈ x ∃!z(z ∈ y & z ∈ v)]

This says that if every member of x is non-empty, and if the members of x are pairwise disjoint, then there is a v that contains exactly one member of each member of x. Under the prevailing assumption that the universal property U is BAD, there is no need to restrict this axiom. If x is the bad thing, then the antecedent of the conditional is false (unless the bad thing is the only object), and if x is not an extension, then the consequent is vacuously true. The present version is sometimes called local choice, since it only entails the existence of ‘choices’ on sets (i.e., the extensions of GOOD properties).

If we follow Zermelo [1930] and assume global choice as a general logical principle (perhaps in the form ∀R(∀x∃y(Rxy) → ∃f∀x(Rxfx)) from Shapiro [1991], p. 67), then local choice follows from set-union and separation. Global choice follows from New V: we noted above that it follows from (RV) that the property of being an ordinal is BAD. So from New V, it follows that the ordinals are equinumerous with the universe. This yields a well-ordering on the universe, and global (and thus local) choice follows from that (see Shapiro and Weir [1999]).

In the general setting, no version of Choice follows from (RV) and the other axioms (unless it is assumed in the meta-theory as in Zermelo [1930]). The neo-logicist who follows the present program should either explicitly assume choice, defend it, or drop the principle. The local axiom of choice is weaker than global choice. Perhaps our neo-logicist may find local choice more defensible than global choice, since some BAD properties may be too ill-behaved for choice to hold (of which more momentarily). Local choice, at least, is now widely accepted among mathematicians. The underlying reason seems to be pragmatic.
After several decades of intense study, we know how crippled the most central branches of mathematics (real analysis, topology, etc.) would be without this principle (see Moore [1982]). I do not know how much weight such pragmatic considerations can carry in the neo-logicist quest for an a priori foundation for mathematics, based on acceptable abstraction principles.

Powerset: ∀x∃y∀z(z ∈ y ≡ ∀w(w ∈ z → w ∈ x)).

There is no need to restrict this principle either. If x is the bad thing, then the relevant y is also the bad thing. If x is not an extension, then its ‘powerset’ is the same as that of the empty set. The only cases of interest are where x is the extension of a GOOD property. Let P, Q be properties. Call Q a sub-property of P if ∀x(Qx → Px). Separation entails that if P is GOOD and Q is a sub-property of P, then Q is GOOD. And if x is pure and y ⊆ x, then y is pure. Now define the
power-property of P to be the property of being the extension of a sub-property of P. That is, x has the power-property of P just in case ∃Q(x = Ext(Q) & ∀y(Qy → Py)). The powerset axiom amounts to a thesis that if P is GOOD, then the power-property of P is GOOD. This principle entails the version of powerset where the variables are restricted to pure sets. With Infinity, and the usual set-theoretic definitions, Powerset entails that the property of being a real number is GOOD.

On the preferred interpretation of ‘BAD’ as ‘indefinitely extensible’, Powerset may be the most difficult axiom for our neo-logicist to justify. Some set-theorists conclude from the wealth of independence results that a certain indeterminacy holds of the powerset of any infinite set, and this indeterminacy has at least the flavor of indefinite extensibility. For example, given any set x, it is determinate whether or not x is a subset of ω, for this is just for every member of x to be a finite ordinal. But, the argument continues, there is a certain indefiniteness about the totality of the subsets of ω. In present terms, this suggests that even if the natural numbers form a Definite totality, the real numbers may not. These set theorists thus hold that Dummett and Leibniz may be wrong about countable infinities (those are Definite all right) but right about the continuum, and any other supposedly larger collections.

The first place where Zermelo’s well-ordering theorem strains intuitions is the real numbers or, equivalently, the powerset of ω. Zermelo showed that the existence of such a well-ordering follows from a choice function on the real numbers (i.e., a function that picks out a member of each non-empty set of reals). The aforementioned set theorists, or our Dummettian, might respond by insisting on the letter of the official axiom of choice above, namely that Definite totalities have choice functions.
We cannot conclude that the real numbers are well-ordered until we have shown that they constitute a Definite totality. If this is correct, it tells against the thesis underlying Powerset on this version of (RV). Of course, most contemporary set-theorists do not balk at the powerset axiom (nor at the well-ordering theorem as applied to infinite powersets). Without at least some infinite powersets, we do not have a viable set theory. So far as I know, however, these set-theorists and mathematicians are not neo-logicists, and do not attempt to justify the axioms in terms of acceptable abstraction principles. The theory is accepted on pragmatic grounds. Again, the burden on a neo-logicist intent on capturing a viable set theory is to provide an a priori justification for the powerset principle.
6. Brief closing
To summarize, assume that the empty property is GOOD and the universal property is BAD. Extensionality follows from (RV) alone. Foundation fails,
but when the quantifiers are restricted to pure sets, foundation follows from (RV). Null set is the assumption that the empty property is GOOD. We gloss the other axioms in the context of (RV):

Pairs: for any x, y, the property that holds of x, y, and nothing else is GOOD.
Separation (restricted to sets): for any GOOD property Q of sets, every sub-property of Q is GOOD.
Replacement (restricted to sets): for any GOOD property Q of sets, every property of sets equinumerous with a sub-property of Q is GOOD.
Union (restricted to sets): for any GOOD property Q, the property of being a set that is a member of a set that has Q is GOOD.
Infinity: there is a GOOD property that applies to infinitely many objects.
Choice: for any GOOD property Q, if every x such that Qx is non-empty, and any two distinct x, y such that Qx and Qy have no members in common, then there is a GOOD property P such that for each x such that Qx there is a unique z such that Pz and z ∈ x.
Powerset: for any GOOD property Q, the property of being the extension of a sub-property of Q is GOOD.

The route above was long, but our conclusion is short. I argued in Section 2 that something like (RV) must be part of a neo-logicist foundation for set theory. Indeed, (RV) is little more than a statement that sets are individuated extensionally, and extensionality is an essential property of sets, if anything is. The neo-logicist who wants to develop set theory needs to articulate the notion of a BAD property. If she can show that the resulting instance of (RV) is an acceptable abstraction, and can sustain the theses that underlie the various axioms above, then she has a theory as powerful as ZFC. I will leave it to the reader to decide whether to do a modus ponens or a modus tollens at this juncture.

Suppose that the neo-logicist articulates ‘BAD’ as something like ‘indefinitely extensible’, and succeeds in justifying the axioms of ZFC, as above.
We have that the property of being a pure set is BAD and thus indefinitely extensible. This suggests another axiom, which has further ontological consequences:

Reflection: For any sentence (in the language of second-order set theory) that holds on the ‘totality’ of pure sets, there is a GOOD property P such that the sentence holds when the quantifiers are restricted to P.

This scheme is sometimes called a ‘principle of plenitude’. The idea is that the iterative hierarchy (the ‘totality’ of pure sets) is so indefinite that it is not possible to characterize it uniquely. Any sentence that is true of the iterative hierarchy is true in some set. Something like this seems to underlie Zermelo [1930]. The instances of Reflection in which the sentence is first-order are themselves theorems of ZFC. However, the higher-order instances have consequences concerning the ‘size’ of the iterative hierarchy. Since second-order ZFC is
finitely axiomatized, Reflection entails that there is a set s that satisfies the axioms of second-order ZFC. Let c be the collection of cardinals in s. By separation, c is a set. Since powerset holds in s, c is larger than the powerset of any of its members. Since replacement holds in s, no cardinal in c is cofinal in c. Thus, c is a strong inaccessible. That is, second-order ZFC plus Reflection entails that there is a strongly inaccessible cardinal. Indeed, Reflection entails that the property of being a strong inaccessible is itself BAD—the ‘totality’ of strong inaccessibles is too large to be a set. This is only the beginning. The consequences of Reflection include the existence of many so-called ‘small large cardinals’, such as Mahlo cardinals and hyper-Mahlo cardinals. Reflection is equivalent to one of Lévy’s [1960, 1960a] ‘strong axioms of infinity’ (see also Shapiro [1987] or [1991], Chapter 6, §6.3). In sum, things are wide open concerning the extent of indefinite extensibility. Some folks, like Dummett and perhaps Leibniz, hold that only finite totalities are Definite. Others hold that there are Definite totalities that are unfathomably large—at least to those of us who have trouble thinking of Mahlo cardinals as small large cardinals. I do not know where the neo-logicist should fall on this continuum, but a theory in the neighborhood of ZFC requires some distance from the Dummett–Leibniz end.
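The inaccessibility argument given earlier in this passage lines up with the usual definition clause by clause. The display below is a sketch using the standard definition of a strongly inaccessible cardinal; the symbols $\mu$ and $\mathrm{cf}$ are introduced here, and the first clause (uncountability, credited to Infinity holding in s) is an assumption added for completeness rather than something the text spells out.

```latex
% c: the collection of cardinals in a set s that satisfies second-order ZFC.
\[
c \text{ is strongly inaccessible}
\iff
\underbrace{c > \aleph_0}_{\text{Infinity in } s \text{ (assumed)}}
\;\wedge\;
\underbrace{\forall \mu < c\,\bigl(2^{\mu} < c\bigr)}_{\text{Powerset in } s}
\;\wedge\;
\underbrace{\mathrm{cf}(c) = c}_{\text{Replacement in } s}.
\]
```

The second clause is the text's ‘c is larger than the powerset of any of its members’, and the third is its ‘no cardinal in c is cofinal in c’.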
Acknowledgments

I would like to thank the Arché working group at the University of St. Andrews for devoting several sessions to this project. Special thanks to Peter Clark, Roy Cook, Bob Hale, Jeffrey Ketland, Fraser MacBride, Graham Priest, Agustin Rayo, Crispin Wright, and two anonymous referees.
References

Boolos, G. [1987]: ‘The consistency of Frege’s Foundations of arithmetic’, in Judith Jarvis Thomson (ed.), On being and saying: Essays for Richard Cartwright, Cambridge, Massachusetts: The MIT Press, pp. 3–20.
Boolos, G. [1989]: ‘Iteration again’, Philosophical Topics, 17, pp. 5–21.
Boolos, G. [1997]: ‘Is Hume’s principle analytic?’, in Richard Heck, Jr. (ed.), Language, thought, and logic, Oxford: Oxford University Press, pp. 245–261.
Cantor, G. [1899]: ‘Letter to Dedekind’, in van Heijenoort (ed.) [1967], pp. 113–117.
Clark, P. [2000]: ‘Indefinite extensibility and set theory’, talk to Arché workshop on abstraction, University of St. Andrews.
Dummett, M. [1963]: ‘The philosophical significance of Gödel’s theorem’, Ratio, 5, pp. 140–155.
Dummett, M. [1991]: Frege: Philosophy of mathematics, Cambridge, Massachusetts: Harvard University Press.
Dummett, M. [1993]: The seas of language, Oxford: Oxford University Press.
Frege, G. [1884]: Die Grundlagen der Arithmetik, Breslau: Koebner; The foundations of arithmetic; translated by J. Austin, second edition, New York: Harper, 1960.
Frege, G. [1893]: Grundgesetze der Arithmetik 1, Hildesheim: Olms.
Hale, Bob [1987]: Abstract objects, Oxford: Basil Blackwell.
Hale, Bob [2000]: ‘Reals by abstraction’, Philosophia Mathematica 8(3), pp. 100–123; reprinted in Hale and Wright [2001], pp. 399–420.
Hale, Bob, and Crispin Wright [2001]: The reason’s proper study: essays toward a neo-Fregean philosophy of mathematics, Oxford: Oxford University Press.
Heck, R. [1992]: ‘On the consistency of second-order contextual definitions’, Noûs, 26, pp. 491–494.
Leibniz, G. [1863]: Mathematische Schriften von Gottfried Wilhelm Leibniz, edited by C. I. Gerhardt, Berlin: A. Asher, H. Halle, W. Schmidt, 1849–1863.
Leibniz, G. [1980]: Sämtliche Schriften und Briefe: Philosophische Schriften. Series 6, Volumes 1–3, Berlin: Akademie-Verlag, 1923–1980.
Leibniz, G. [1996]: New essays on human understanding, edited and translated by P. Remnant and J. Bennett, New York: Cambridge University Press.
Levey, Samuel [1998]: ‘Leibniz on mathematics and the actually infinite division of matter’, Philosophical Review, 107, pp. 49–96.
Lévy, A. [1960]: ‘Principles of reflection in axiomatic set theory’, Fundamenta Mathematicae, 49, pp. 1–10.
Lévy, A. [1960a]: ‘Axiom schemata of strong infinity in axiomatic set theory’, Pacific Journal of Mathematics, 10, pp. 223–238.
Lévy, A. [1968]: ‘On von Neumann’s axiom system for set theory’, American Mathematical Monthly, 75, pp. 762–763.
Moore, G. H. [1982]: Zermelo’s axiom of choice: Its origins, development, and influence, New York: Springer-Verlag.
Quine, W. V. O. [1986]: Philosophy of logic, second edition, Englewood Cliffs, New Jersey: Prentice-Hall.
Russell, B. [1903]: The principles of mathematics, London: Allen and Unwin.
Russell, B. [1906]: ‘On some difficulties in the theory of transfinite numbers and order types’, Proceedings of the London Mathematical Society, 4, pp. 29–53.
Russell, B. [1993]: Introduction to mathematical philosophy, New York: Dover (first published in 1919).
Shapiro, S. [1987]: ‘Principles of reflection and second-order logic’, Journal of Philosophical Logic, 16, pp. 309–333.
Shapiro, S. [1991]: Foundations without foundationalism: A case for second-order logic, Oxford: Oxford University Press.
Shapiro, S. [2000]: ‘Frege meets Dedekind: a neo-logicist treatment of real analysis’, Notre Dame Journal of Formal Logic, 41, pp. 335–364.
Shapiro, S. and A. Weir [1999]: ‘New V, ZF, and abstraction’, Philosophia Mathematica, 7(3), pp. 293–321.
Shapiro, S. and A. Weir [2000]: ‘“Neo-logicist” logic is not epistemically innocent’, Philosophia Mathematica, 8(3), pp. 163–189.
Van Heijenoort, J. (ed.) [1967]: From Frege to Gödel, Cambridge, Massachusetts: Harvard University Press.
Von Neumann, J. [1925]: ‘Eine Axiomatisierung der Mengenlehre’, Journal für die reine und angewandte Mathematik, 154, pp. 219–240; translated in van Heijenoort [1967], pp. 393–413.
Weir, A. [2003]: ‘Neo-Fregeanism: an embarrassment of riches?’, Notre Dame Journal of Formal Logic, 44, pp. 13–48.
Wright, C. [1983]: Frege’s conception of numbers as objects, Aberdeen: Aberdeen University Press.
Wright, C. [1997]: ‘On the philosophical significance of Frege’s theorem’, in Richard Heck, Jr. (ed.), Language, thought, and logic, Oxford: Oxford University Press, pp. 201–244; reprinted in Hale and Wright [2001], pp. 272–306.
Wright, C. [1998]: ‘On the harmless impredicativity of N= (Hume’s principle)’, in Mathias Schirn (ed.), The philosophy of mathematics today, Oxford: Oxford University Press, pp. 339–368; reprinted in Hale and Wright [2001], pp. 229–255.
Wright, C. [1999]: ‘Is Hume’s Principle analytic?’, Notre Dame Journal of Formal Logic, 40, pp. 6–30; reprinted in Hale and Wright [2001], pp. 307–332.
Wright, C. and Bob Hale [2000]: ‘Implicit definition and the a priori’, in P. Boghossian and C. Peacocke (eds.), New essays on the a priori, Oxford: Oxford University Press, pp. 286–319; reprinted in Hale and Wright [2001], pp. 117–150.
Zermelo, E. [1930]: ‘Über Grenzzahlen und Mengenbereiche: Neue Untersuchungen über die Grundlagen der Mengenlehre’, Fundamenta Mathematicae, 16, pp. 29–47; translated as ‘On boundary numbers and domains of sets: new investigations in the foundations of set theory’, in William Ewald (ed.), From Kant to Hilbert: a source book in the foundations of mathematics, Oxford: Oxford University Press, 1996, Volume 2, pp. 1219–1233.
NEO-FREGEANISM: AN EMBARRASSMENT OF RICHES 1 A. Weir
1 This paper first appeared in the Notre Dame Journal of Formal Logic 44, [2004], pp. 13–48. Reprinted by kind permission of the editor and the University of Notre Dame.
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 383–420. © 2007 Springer.

1. Introduction

In the last decade or two there has been a revival of interest in logicism, recast not as the doctrine that mathematics is logic but rather as the claim that mathematical truths have something like the status assigned to them by the logicists. The ‘neo-logicist’ contention is that mathematical truths are known, where they are, neither by some mysterious form of direct intuition nor by empirical confirmation, even of an indirect and holistic kind via the scientific theories they contribute to. Rather, mathematical knowledge arises on the basis solely of the understanding of the basic mathematical and logical concepts which anyone who grasps the mathematical truths has. This view might be interpreted as saying that mathematical truths are analytic, that is, true by virtue of meaning, and similarly that fundamental mathematical inference rules are meaning-constitutive. Since the notion of analyticity is still under a cloud in some quarters, a more broadly acceptable goal for the neo-logicist might be to try to establish that mathematical axioms are implicit definitions since, prima facie anyway, this does not commit one to the notion of analyticity; this, indeed, is the direction which recent work has taken (see Hale and Wright [11]).

Having “something like the status assigned them by the logicists” is a vague notion, one which the neo-logicists need to clarify if their view is to be assessed fully. But I take it to be clear enough to be going on with. In particular, if it could be established that mathematical truths, or even just a substantial proportion of them, are known by a process different from that proposed by the neo-Kantian platonists or by Quinean empiricists, a process whose materials essentially involve only appeal to grasp of mathematical language, then this would constitute a major advance in the epistemology of mathematics. It would establish the ‘epistemic innocence’ (cf. Shapiro and
Weir [26] p. 160) of mathematics. So in what follows I will assume that the idea of “epistemic innocence” is clear enough for a fruitful debate with the neo-logicist to take place, while noting that the neo-logicist owes the wider philosophical community a fuller account of what it amounts to. The main proponents of this program, philosophers such as Wright and Hale, 2 take it that they are continuing and developing a program initiated by Frege and so characterise the program as neo-Fregean as well as neo-logicist. Neo-Fregeans also want to uphold Frege’s platonism at least to the extent of holding that truth in pure mathematics is as objective as truth in the empirical sciences, however, exactly one wishes to analyse the notion of objectivity in the sciences. They reject, moreover, any form of relativism in mathematics (cf. Wright [36] p. 293). They also reject the idea that there is a plurality of mathematical domains—of different set theories, or different domains of sets, numbers and categories, and so forth—which cannot all be accumulated into a single mega-universe. Clearly, the neo-Fregean position is a highly attractive one for anyone sympathetic to the traditional view that mathematics is a system of objective truths knowable a priori but who is also sensitive to the usual epistemological problems raised against platonistic mathematics, most notably the puzzle as to how we could gain knowledge of a world of causally inert abstract objects. But how can grasp of mathematical language yield knowledge of the existence of a rich realm of abstract entities which exist independently of our language or conceptual system? To try to explain this, the neo-Fregean focuses on abstraction principles. Abstraction principles are principles of the form: 3 αx(ϕx) = αx(ψ x) ↔ (ϕ, ψ) where α is some term-forming variable-binding operator which forms singular terms from open sentences and is an equivalence relation over properties. 
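The general form can be made concrete on small finite domains. What follows is only a minimal sketch, not part of the original argument: it identifies properties with subsets of a domain {0, ..., n-1}, treats the abstraction operator as a map from subsets to individuals, and searches exhaustively for a map that satisfies the biconditional for a given equivalence relation. The function names (`subsets`, `has_model`, `count_classes`) and the three sample relations are illustrative assumptions.

```python
from itertools import combinations, product

def subsets(n):
    """All subsets of the domain {0, ..., n-1}, as frozensets."""
    return [frozenset(c) for r in range(n + 1)
            for c in combinations(range(n), r)]

def has_model(n, equiv):
    """Search every map from subsets to individuals for one satisfying
    abst(X) == abst(Y)  <->  equiv(X, Y)   for all subsets X, Y."""
    subs = subsets(n)
    for values in product(range(n), repeat=len(subs)):
        abst = dict(zip(subs, values))
        if all((abst[X] == abst[Y]) == equiv(X, Y)
               for X in subs for Y in subs):
            return True
    return False

def count_classes(n, equiv):
    """Number of equivalence classes of `equiv` on the subsets of the domain."""
    reps = []
    for X in subsets(n):
        if not any(equiv(X, R) for R in reps):
            reps.append(X)
    return len(reps)

# Three sample equivalence relations on subsets (illustrative only).
same_size = lambda X, Y: len(X) == len(Y)   # equinumerosity on a finite domain
coextensive = lambda X, Y: X == Y           # holding of exactly the same things
trivial = lambda X, Y: True                 # everything equivalent to everything

if __name__ == "__main__":
    for n in range(1, 4):
        for name, eq in [("same_size", same_size),
                         ("coextensive", coextensive),
                         ("trivial", trivial)]:
            print(n, name, count_classes(n, eq), has_model(n, eq))
```

On a finite domain such a map exists exactly when the relation has no more equivalence classes than there are individuals. Equinumerosity yields n + 1 classes on n individuals and co-extensiveness yields 2^n, so neither has a finite model in this sketch, whereas the trivial relation always does.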
One key example is “Hume’s Principle” (HP): 4
∀X ∀Y ((nxXx = nxYx) ↔ X 1–1 Y)
with X 1–1 Y the second-order sentence which expresses the existence of a one–one correspondence between the X’s and the Y’s. An important impetus to the revival of neo-Fregeanism has been the detailed sketch, by Wright (Wright [35]), of what has become known as Frege’s Theorem (cf. Boolos [2] p. 209, Wright [36] p. 273): the derivability of second-order arithmetic from second-order logic plus Hume’s Principle. For example, from this principle one can derive in standard second-order logic 5 a theory even stronger than (though equi-consistent with) the usual Peano–Dedekind formulation of second-order arithmetic. 6
2 See the papers in Hale and Wright [12] particularly Hale [13], Wright [36], [37] and [38] (page references are to Hale and Wright [12] not to the original articles). See also Hale [10] and Wright [35]. For a different, more constructivist neo-logicism, see Neil Tennant [28], [29] and [30].
3 The neo-Fregeans concern themselves mostly with second-order abstraction principles, in which the right hand side specifies an equivalence relation over the domain of properties, rather than first-order abstraction principles specified by an equivalence relation over individuals, of which Frege’s abstraction of identity for directions from parallelism for lines (Grundlagen §§64–65, Frege [8] pp. 74–7) is a well-known example.
4 The term is George Boolos’ in Boolos [1] p. 171 following Frege’s rather honorific reference to the Treatise Book I, III.i in Grundlagen §63, [8] p. 73.
The more general neo-Fregean goal, following on from this result, is to show that there is an abstraction principle, or set of principles, A such that A plus second-order logic yields all mathematical truths, or at any rate all those truths we need to do empirical science and metamathematics. In addition to the formal goal, the neo-Fregean seeks to convince us that both the abstraction principle A and second-order logic are epistemically innocent. One way to cash this out would be to claim that anyone who grasped a proof of a mathematical result R from A and second-order logic is thereby in a position to know that R is true in something like an a priori fashion. 7 This means that our mathematician must be able to know a priori (or in an epistemically innocent way) the truth of A, and similarly know in some innocent fashion the truth of the axioms and soundness of the inference rules used in the derivation.
It is with the abstraction principles, and the problems which arise if one holds both that they are objectively true and that they are epistemically innocent, that this paper concerns itself. There are, of course, a number of other steep challenges which neo-Fregeanism faces, for example, the anti-Anselmian insistence that one cannot prove objective existence claims a priori; or the challenge of showing that the standard second-order logic used in the derivation of Frege’s Theorem, and presumably in any stronger mathematical results, is a system of epistemically innocent truths.
For this logic is a classical “non-free” logic which includes the full impredicative axiom schema of comprehension whereby the second-order quantifiers are interpreted as including in their range every subset S of the domain of individuals (or include a property for each such extension). 8 However, in this paper I will focus solely on the innocence or guilt of the abstraction principles which the neo-Fregean appeals to.
The structure of the paper is as follows. The next section sets out the main objection I will be concerned with, the “Embarrassment of Riches” (ER) objection, whilst Section 3 looks at Wright’s first main response, the appeal to conservativeness principles. Section 4 shows that this response on its own is inadequate while Section 5 looks at a couple of additional criteria that Wright sketches, arguing that they too will not do. In Section 6 I develop a third notion into a stronger criterion, stability, which seems to provide the best hope for the neo-Fregean of providing an answer to the embarrassment of riches objection. In Section 7, though, I argue that an analogue of the ER objection simply recurs at a meta-theoretic level. In the final section, Section 8, I argue that this shows the full neo-Fregean program is fatally flawed but that there may be less ‘Fregean’ variants which can survive these objections.
5 See Shapiro [24] for an account of standard second-order logic which I take to include Axiom Schemata of Comprehension for predicate formulae of any adicity.
6 Richard Heck [16] shows that Hume’s Principle generates a stronger theory than the usual formulation of second-order Peano arithmetic (with 0 and successor or predecessor), relative to standard bridge principles defining the notions of the one theory in terms of the other. Burgess, Hazen and Hodes also noted the consistency of the system (Heck [16] fn. 12). Wright notes ([36] p. 273 fn. 4) that Charles Parsons first pointed out in 1964, what Wright [35] later showed in some detail, namely that Hume’s Principle yields second-order arithmetic.
7 Important questions arise concerning the relationship between the knowledge of the mathematical logician who derives R from A in this way and the “ordinary” mathematician (in cases of simple arithmetic, this can be any individual with a basic competence in counting and so forth) who knows R without using any formal logic. However, I leave those questions to one side in this paper.
8 For objections to the claim that the second-order logic needed to gain substantial results from Hume’s Principle is epistemically innocent see Shapiro and Weir [27].
2. Embarrassment of riches
Since my focus is on the abstraction principles, not the logic, I shall assume for the purposes of the present argument that a priori existence proofs cannot be ruled out of court, that there are analytic or meaning-constitutive, or more broadly, perhaps, epistemically innocent, principles or rules and that the system of such principles and rules includes standard second-order logic. The embarrassment of riches objection is to the effect that more principles than can possibly all hold true together can be validated by the neo-Fregean methods. 9 I will introduce the ER objections by starting with a related one which Crispin Wright calls the “Bad Company” objection, one raised by George Boolos, Michael Dummett and Hartry Field: the objection is that Hume’s Principle is formally very similar to the naïve rules for class, embodied, in one form, in Frege’s notorious Axiom V: 10
∀X ∀Y ({x : Xx} = {x : Yx} ↔ ∀z(Xz ↔ Yz))
If the former is analytic, so is the latter. But since the latter is inconsistent, it cannot, surely, be analytic, hence neither is Hume’s Principle, nor any similar abstraction principle.
9 So there is a link with Anselm since the objections bear a structural resemblance to objections made against Anselm’s ontological proof of the existence of God. Gaunilo of Marmoutier famously objected to Anselm that his proof could be adapted to prove that the most excellent island exists and objectors following Gaunilo claimed that Anselmian arguments could be used to generate existence proofs for too many types of things.
10 George Boolos [2] p. 214; see also [3]. For Field see [5] p. 158. Dummett also criticises Wright on similar lines. He objects to the use of a method which is known, he alleges, to lead to disaster when one has given no principled explanation of the difference between the legitimate and illegitimate uses, a principled explanation amounting to more than just saying that no contradiction seems to follow in the legitimate case; see [4] pp. 188–9, 208.
As it stands, this Bad Company objection is not all that strong. One possible response is to deny that Axiom V is inconsistent (or at least deny that the theory of Axiom V is trivial). This would involve fairly extensive restriction of classical logic, of course: Axiom V is inconsistent not only in intuitionistic logic but also in relevant logics such as T or RWX, these being weaker than the well-known E and R. However, one increasingly common strategy with non-classical logics is to so set up the division between operational rules (e.g. in natural deduction systems introduction and elimination rules for the logical constants) and structural rules that the operational rules yield classical logic given classical structural rules but then block antinomy by allowing classical structural rules only in special cases (ideally cases involving all of standard mathematics). Wright, indeed, shows some sympathy with some fairly heterodox lines of thought by entertaining seriously the possibility of rejecting the applicability of Cantor’s Powerset Theorem to the domains of interesting abstraction theories (Wright [36] p. 294). 11 (Cantor himself thought the theorem did not apply to ‘inconsistent multiplicities’.) Still, accepting as true unadulterated Axiom V is a very radical response to take.
But one need not be so radical in order to respond effectively to ‘Bad Company’. For even if one accepts that Axiom V is trivially inconsistent, though formally speaking an abstraction principle with the same overall structure as Hume’s Principle, this still does not tell very heavily against neo-Fregeanism. The neo-Fregean can deny that Axiom V is epistemically innocent simply by laying down consistency as a criterion on epistemic innocence, so still affirming that Hume’s Principle is innocent. Since hidden inconsistency could lurk in many other abstraction principles, the neo-Fregean will have to concede that analyticity or epistemic innocence is not a purely formal matter, nor a decidable one (cf. Wright [36] p. 213, fn. 27). But this is a plausible position to adopt on independent grounds: analytic rules, in that sense, need not be transparently analytic to those who follow them. After all the neo-Fregean will want to hold that indefinitely many currently undecided mathematical theses are a priori true, even though it may take a genius to come up with a proof of some of them.
There is, however, a related but far stronger point: there are indefinitely many consistent but pairwise inconsistent abstraction principles. If all consistent abstraction principles are analytic, then both of two such principles are analytic and presumably true, which is absurd. 12 This style of objection is what I mean by the Embarrassment of Riches or ER objection. This point is made by Richard Heck (Heck [14]) utilising abstraction principles of the form:
∀X ∀Y (αxXx = αxYx ↔ (P ∨ ∀x(Xx ↔ Yx)))
where P contains no occurrences of the α abstraction operator. This principle is satisfiable iff P is. 13 Hence for incompatible values of P (e.g. P0 = the universe is of size ℵ0 versus P1 = the universe is of size ℵ1), we get satisfiable but incompatible principles, indeed provably incompatible principles where, as in the two examples just given, we have two principles Pi, Pj such that Pi, Pj ⊢ ⊥.
11 There are set theories with classical background logics in which this holds too, for example, those of Church and Mitchell for which see Forster [7] especially Chapter 4. A more natural such theory, arguably, is Arnold Oberschelp’s ‘Set theory over classes’ [20].
12 More generally one can find sets, or proper classes of principles such that taken singly or in pairs they are consistent but the set or class as a whole is unsatisfiable. See Fine [6] p. 514.
13 If P is unsatisfiable the principle is logically equivalent to Axiom V; conversely, any model in which P holds can be expanded to satisfy the principle by assigning some one object to αxXx, for every X, so that αxXx = αxYx holds universally.
A case which will be of particular interest in what follows occurs when P takes the form Bad(X) & Bad(Y) where Badness is a second-order property of properties for which equinumerosity is a congruence (a “cardinality property”). We then get disjunctivised generalisations of Axiom V of the form:
∀X ∀Y ({x : Xx} = {x : Yx} ↔ ((Bad(X) & Bad(Y)) ∨ ∀x(Xx ↔ Yx))).
I will call disjunctivised Axiom V principles of this type distraction principles. For example, as instances of Bad we could choose finite, infinite, uncountably infinite or Big, where a property is Big iff there is a bijection from it onto the universe. This latter version is in fact George Boolos’ New V (Boolos [1]) and the idea of distraction principles is simply a generalisation of his notion. Further instantiations of the schematic ‘Bad’ include properties such as being of size at least ℵn, or at least n or at least θn, where θn is the nth-inaccessible cardinal (here n is finite). We could also set exact cardinality limits on Bad, e.g. countably infinite or exactly ℵn/n/θn, or weaken these clauses to at most ℵn/n/θn and so on. All these concepts, and more, are definable in second-order logic (Garland [9]).
The set theories which result are interesting in that they embody a limitation of size principle, widely seen as a non-ad hoc method of restricting naïve set theory and avoiding paradox. If two properties are the same size then either both are Bad, or both are Good (= ∼Bad). Define a set to be the extension of a Good Property:
Set x iff ∃X (Good(X) & x = {x : Xx}).
Then it is easy, using the definition of ∈ by:
x ∈ y iff ∃X (y = {x : Xx} & Xx)
to prove from the abstraction principle in question a comprehension principle for sets (equivalently Good properties):
∀X (Good(X) → ∀x(x ∈ {x : Xx} ↔ Xx))
and it will follow that if the extension of X is a set and X is equinumerous with Y then Y too determines a set as its extension.
However, Heck’s problem arises even in this particular case: there are incompatible distractions. In particular, let ϕ and ψ be cardinality properties which are (a) provably incompatible in that ⊢ ∼∃X (ϕ(X) & ψ(X)) 14 and (b) such that both are provably properties for which equinumerosity is a congruence, that is, ⊢ ∀X, Y ((ϕX & X 1–1 Y) → ϕY), likewise for ψ.
14 Here ‘⊢’ represents provability in standard pure second-order logic with the full impredicative Axiom Schema of Comprehension.
Examples of a pair of properties which satisfy both (a) and (b) are the pair ‘exactly of size ℵ0’ and ‘at least of size ℵ1’. Consider, then, two distraction principles D1, in which the badness property is Bad1 = (Big & ϕ), and D2, in which Bad2 = (Big & ψ), with ϕ and ψ as in (a) and (b).
Theorem 2.1: D1, D2 ⊢ ⊥.
Proof: Note firstly that for any distraction principle Di we have Di ⊢ ∃X (Badi(X)), for the badness property featuring in that principle. The argument is a reductio: if not, the principle collapses into Axiom V. So by existential instantiations (or by assumptions for existential elimination) from the two such existential generalisations derivable from D1 and D2 we have:
Big(F) & ϕ(F) & Big(G) & ψ(G).    (1)
From (1) we get F 1–1 G by composition of functions, hence from (1) again together with property (b) above we get ϕ(F) & ψ(F), contradicting (a) above.
So the neo-Fregean needs to discriminate further among the distraction principles. One further constraint might be that such principles must not only be proof-theoretically consistent (i.e. we do not have D ⊢ ⊥, for the distraction principle D) but satisfiable in a full standard second-order model. But here again it is easy to produce incompatible principles each of which is satisfiable, for example, DFin with Bad as (Dedekind) finite versus DInf with Bad as (Dedekind) infinite. Both of these principles have models. For models for distraction principles exist iff we can biject the class of good properties into a proper subset Sets of the domain of individuals D0, the range being the good classes, or sets; all the bad properties being mapped into a ‘bad guy’, a dummy ‘proper class’ object ♠ ∈ D0 − Sets. (For simplicity and with no commitment to a nominalistic metaphysics, I will identify properties over a domain D of individuals with subsets of D.) On various assumptions, we can find models for all the variants for Bad given above. It is straightforward to show that all and only the (non-empty) finite models satisfy Bad as finite, though the models are a degenerate case in which there are no sets, no good classes, and all properties are mapped to the one dummy object ♠. 15 Using ZFC we can show that Bad as infinite has models in all and only the infinite cardinalities, because the number of good, that is finite, subsets of a universe of cardinality ℵα is just ℵα. So we select an ℵα-sized proper subset of D0, biject all finite properties into it, and the rest to ♠ ∈ D0 − Sets. But clearly no standard model can satisfy both these principles simultaneously.
15 Similarly distraction principles with Bad as ‘at most α’ have models in which there is just one abstract given by the principle, the proper class abstract, at all cardinalities ≤ α. Where α is finite and non-zero, there is also a bizarre model of size α + 1 for Bad = [at most α] in which there are two classes, the bad proper class and the good universal set. These two are co-extensional, with x ∈ y defined by ∃F(y = {x : Fx} & Fx), so the axiom of extensionality fails for this distraction principle, though it holds for any distraction principle (trivially in this case) when relativised to sets.
And of course there are other abstraction principles which hold only in infinite models and so are incompatible with DFin, for instance Boolos’ New V (Boolos [1]) in which Bad is Big. This has models at ℵ0, and also at all “nearly strong” inaccessible cardinalities, 16 if we add to ZFC the assumption that such cardinals exist.
Theorem 2.2: The number of smaller subsets of a set S of nearly strong inaccessible cardinality θ is just θ.
Proof: Without loss of generality we can consider the cardinal θ itself rather than S. Since θ is regular, every small subset of θ is also a subset of some λ < θ. There are at most 2^λ ≤ θ subsets of each such λ and there are exactly θ such cardinals λ. Thus there are at most θ × θ = θ small subsets of θ (and obviously there are at least that many).
So this time we select a θ-sized proper subset of the domain of individuals and biject the properties of cardinality < θ onto it, and the rest to the dummy proper class. By similar techniques, ZFC plus the axiom of inaccessibles proves the existence of models for abstraction principles with Bad as ‘Inaccessible’, Bad as ‘at least/exactly the size of the nth-inaccessible’ and so on. These models are particularly interesting, of course, since, Sets being of inaccessible size, we get a ZF-ish theory (which can be derived from the abstraction principle given a standard proof theory). 17 We do not in general get the Axiom of Choice but we do get it for some instantiations of Bad, for example, as ‘exactly size α’ or with Bad = Big as in the New V distraction principle (see Shapiro and Weir [26] §3). Neither do we get foundation. To see this, partition (working in ZFC) Sets into two disjoint θ-sized sets S1 and S2 such that our bijection of the properties into the individuals maps {α} to α, for all α ∈ S1, and maps the remaining small properties into S2.
Then there will exist a Big, universe-sized sub-domain of Sets in which each member is equal to its unit set; that is, the term {x : x = y}, in an assignment to variables in which α is assigned to y, is itself assigned α since λx(x = y) is interpreted by {α}. Nonetheless we can define the well-founded classes (relative to Ø as base) by
WC x ≡df. ∀X ((XØ & ∀y(∀z(z ∈ y → Xz) → Xy)) → Xx)
(i.e. the inductive closure from the empty set under the membership operation) and prove from this that the WC classes form a well-founded hierarchy. So in these theories with Bad as exactly the nth-inaccessible we can get, by restricting to well-founded sets, ZFC (and a bit more, for n > 1).
16 Here I am defining a strong inaccessible to be a regular limit cardinal ℵλ, with λ a non-zero limit, which is such that ℵλ > 2^κ for all κ < ℵλ. Define ℵλ to be nearly strong iff the above holds with the last clause amended to ℵλ ≥ 2^κ for all κ < ℵλ, that is, ℵλ can be “caught” (but not overtaken) from below using 2^x, i.e. the powerset operation. See [26] p. 316.
17 Cf. Weir [36] Appendix I. If we let Bad be inaccessible then subsets will not hold in general: e.g. if the cardinality of the model is θ1, that is, the second inaccessible, then there will be sets of size < θ1 but > θ0 which have θ0 sub-extensions which are not sets. We can get round this by letting Bad be inaccessible & Big; alternatively we could use exactly inaccessible_n for Bad. If the generalised continuum hypothesis is true then Bad as exactly α, where α is a regular cardinal (e.g. ℵn+1), will have models too as there will be exactly α smaller subsets of α.
Overall, then, the class of Distraction principles yields a rich and interesting set of theories, an important sub-class of abstraction principles. The problem for the neo-Fregean is precisely that it is too rich, that we have an Embarrassment of Riches. Even on fairly weak assumptions, there are incompatible distraction principles (e.g. Bad as finite versus Bad as infinite) such that both are satisfiable; if both can be known in epistemically innocent fashion both of them must be true, which is absurd. Commenting on Heck’s examples of consistent but pairwise inconsistent abstractions, Boolos asserts forthrightly:
His article seems to me to do in, once and for all, the idea that “contextual definitions” like Hume’s principle or Basic Law V, have, in general, any privileged logical status. (Boolos [3] p. 231)
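The finite cases of the model-existence claims in this section can be checked mechanically. The sketch below is an illustration under stated assumptions rather than anything from the original paper: properties are identified with subsets of {0, ..., n-1}, each good subset is sent to its own individual, and every bad subset is sent to one dummy individual. All names (`build_model`, `satisfies`, `members`, and the `bad_*` predicates) are hypothetical helpers introduced for the example.

```python
from itertools import combinations

def subsets(n):
    """All subsets of the domain {0, ..., n-1}, as frozensets."""
    return [frozenset(c) for r in range(n + 1)
            for c in combinations(range(n), r)]

def build_model(n, bad):
    """Try to interpret the class-abstraction operator on a domain of size n.

    Each good subset needs its own individual and all bad subsets share one
    dummy individual, so by counting a model exists only if there is room
    for both. Returns a dict from subsets to individuals, or None.
    """
    subs = subsets(n)
    good = [X for X in subs if not bad(X, n)]
    needed = len(good) + (1 if len(good) < len(subs) else 0)
    if needed > n:
        return None
    ext = {X: i for i, X in enumerate(good)}   # injective on good subsets
    dummy = len(good)                          # the 'proper class' object
    for X in subs:
        if bad(X, n):
            ext[X] = dummy
    return ext

def satisfies(n, bad, ext):
    """Check ext(X) = ext(Y) <-> (Bad(X) & Bad(Y)) or X = Y for all X, Y."""
    subs = subsets(n)
    return all((ext[X] == ext[Y]) == ((bad(X, n) and bad(Y, n)) or X == Y)
               for X in subs for Y in subs)

def members(ext, y):
    """x is a member of y iff some subset X has extension y and contains x."""
    return frozenset(x for X, v in ext.items() if v == y for x in X)

# Sample badness properties; on a finite domain every subset is finite.
bad_finite = lambda X, n: True
bad_infinite = lambda X, n: False
bad_big = lambda X, n: len(X) == n      # in bijection with the whole universe

def bad_at_most(k):
    """Badness property 'of size at most k'."""
    return lambda X, n: len(X) <= k

if __name__ == "__main__":
    for n in range(1, 5):
        print(n,
              build_model(n, bad_finite) is not None,
              build_model(n, bad_infinite) is not None,
              build_model(n, bad_big) is not None)
```

Under these assumptions, Bad as finite has a model on every non-empty finite domain, while Bad as infinite and Bad as Big have none. `build_model(3, bad_at_most(2))` gives the two-class model of size α + 1 described in footnote 15: the universal set and the dummy object have the same members but are distinct individuals, so extensionality fails there.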
3. Conservativism
Wright, however, does not accept that he has been “done in”. He considers principles which, like DFin, are true in only finite models but he finds the incompatibility of, e.g. Hume’s Principle with such principles no more worrying than the formal resemblance all abstraction principles have to Axiom V ([36] pp. 295–7). But given the satisfiability of both principles, is this attitude justified? Have we not here pairs of principles each with an equal title to be classed as epistemically innocent but such that at least one must be false? Certainly neo-Fregeans cannot withhold the title of analytic or innocent from DFin on the grounds that they know via intuitive acquaintance with the world of mathematical objects that this world, hence the universe as a whole, is infinite. Nor can they rule out DFin on the grounds that as a basis for empirical science it appears to be somewhat unfruitful, to say the least. If appeal to intuition or pragmatic utility is allowed to determine which abstraction principles are legitimate and which are not then neo-Fregeans can have no principled objection to the mathematical epistemology of the Kantians or the empiricist epistemology of Quine or Putnam; and if that type of empiricist epistemology is acceptable then the use of abstraction principles, rather than, say, axiom systems such as ZFC, in the development of mathematics would be largely a matter of taste and convenience. We must remember in this connection that mathematics is not neutral with respect to logic, certainly not with respect to second-order logic at any rate. 18
18 In fact, the semantics of classical first-order logic is only left “unscathed” by mathematics if one accepts some theory such as ZF and uses it to provide the model-theoretic semantics for the logic. Thus intuitionists criticise classical first-order logic for mathematical reasons, taking mathematics to be more fundamental than logic, and radical finitists may hold that ‘there are no more than n things’ is a logical truth, for sufficiently high n. Similarly a “finitistic neo-neo-Fregean(!)” who held to the distraction principle Pω with Bad as [exactly ℵ0-sized] would reject as unintelligible much of first-order model theory (the compactness theorem, for example) since only finite sets of wffs, only finite models exist, and so forth.
Thus CH ⊨ ⊥ holds for anyone who defines ⊨ set-theoretically and who believes that the second-order formulation CH of the continuum hypothesis is false. Similarly the implication fails for those who hold CH to be true. For the former theorist, the continuum hypothesis is not a logical possibility in the semantic sense, where this somewhat obscure notion is precisified model-theoretically. It is not a logical possibility since a set-theoretic universe in which the continuum hypothesis holds is mathematically unavailable, on this view, hence presumably mathematically impossible, though the statement that the continuum hypothesis holds is consistent proof-theoretically, if ZF is (given standard, finitistic notions of proof). Similarly WO ⊨ ⊥ fails for a platonist who accepts WO, but holds for any platonist who accepts the axiom of determinacy, which entails the falsity of the well-ordering theorem WO, and so forth. For this latter theorist, mathematics rules out the existence of a structure which represents the logical possibility of WO. Since logical consequence is such a rich and structurally complex notion, it is inevitable that any position which moves beyond blind acceptance of some system of primitive rules will have to use mathematics in the investigation of the properties of logical systems, whether currently favoured or disfavoured. There can be no neutral, non-aligned mathematics, as far as logic is concerned.
Moreover, mathematics is no less fundamental than logic, for the neo-Fregean. It does not seem to make much sense to say that the principles of logic are more meaning-constitutive, more analytic, more epistemically innocent, than mathematical principles applicable to term-forming, rather than sentential, operators. How, then, can the neo-Fregeans rule out laying down that all structures, hence all empirical structures, must be finite? Why is that illegitimate, if it is legitimate to require that other mathematically impossible ‘structures’ which can be specified without proof-theoretic inconsistency be ignored?
Wright, however, has a general and principled objection to principles such as DFin and the whole range of abstraction principles in which Bad takes the form ‘exactly size α’ or ‘at most size β’. It is that such principles are non-conservative because they place upper bounds on the size of the total universe of individuals and hence on the range of acceptable empirical theories, and this is something which no genuine mathematical theory can do, granted that mathematics is a priori. So the telling objection to DFin is not that it is useless for empirical science but that it has a property which no theory, not even a theory which happens in the actual world to be empirically fruitful, can have if it is a genuinely mathematical theory. This is Wright’s primary response to the Embarrassment of Riches objection.
To evaluate this response we have to look at conservativeness more closely. The formulation above is too loose of course: mathematics does place constraints on acceptable empirical theories: it rules out theories which say that the number of space time regions is both continuum-sized and of size ℵ0, for example. Rather the idea is that mathematical theory should be compatible with any natural possibility, otherwise we would need to know, presumably a posteriori, that the physical world is not structured in one of the possible ways which are inconsistent with mathematics in order to know that mathematics is actually true. And that would conflict with the a priori status of mathematical truth. 19 Hence adding a true mathematical theory to an empirical theory T should not enable us to prove any more physical conjectures than those which already followed from T. If T does not entail C, if there is a possibility of T being true and C false, then mathematics should not conflict with that possibility. So there should likewise be a possibility of T holding together with any body of mathematical truths and yet C still being false. Perhaps, then, if we admit only conservative abstraction principles the set of such conservative principles will be a consistent set and we will knock out at least one of each of the warring pairs of consistent but pairwise inconsistent abstraction principles.
There are a number of natural ways to characterise the above notion of conservativeness, all of them making use of relativisations of well-formed formulas (wffs) by formulas. Let P^A represent the result of restricting quantifiers in the wff P by a formula in one free individual variable Ax. For example:
1. P^A = P for atomic P
2. The relativisation transformation distributes over the sentential operators
3. (∀yϕ)^A = ∀y(Ay → ϕ^A)
   (∃yϕ)^A = ∃y(Ay & ϕ^A)
   (∀Xϕ)^A = ∀X(∀y(Xy → Ay) → ϕ^A)
   (∃Xϕ)^A = ∃X(∀y(Xy → Ay) & ϕ^A)
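The three clauses above amount to a recursive transformation on formulas, which can be sketched in a few lines. The tuple encoding of formulas, the function name `relativise` and the bound-variable name `u` are assumptions made for this illustration and do not come from the text; the restricting formula is taken to be a single predicate letter.

```python
def relativise(phi, A, bound="u"):
    """Restrict every quantifier in `phi` to the predicate named by `A`.

    Formulas are nested tuples (an encoding assumed for this sketch):
      ("atom", pred, var)                      atomic, e.g. Fx
      ("not", p), ("and", p, q), ("or", p, q), ("imp", p, q), ("iff", p, q)
      ("forall", y, p), ("exists", y, p)       first-order quantifiers
      ("forall2", X, p), ("exists2", X, p)     second-order quantifiers
    """
    op = phi[0]
    if op == "atom":                                   # clause 1
        return phi
    if op in ("not", "and", "or", "imp", "iff"):       # clause 2
        return (op,) + tuple(relativise(p, A, bound) for p in phi[1:])
    if op not in ("forall", "exists", "forall2", "exists2"):
        raise ValueError(f"unknown formula: {phi!r}")
    var, body = phi[1], relativise(phi[2], A, bound)   # clause 3
    if op == "forall":
        return ("forall", var, ("imp", ("atom", A, var), body))
    if op == "exists":
        return ("exists", var, ("and", ("atom", A, var), body))
    # Second-order cases: X may hold only of things satisfying A.
    inside = ("forall", bound,
              ("imp", ("atom", var, bound), ("atom", A, bound)))
    if op == "forall2":
        return ("forall2", var, ("imp", inside, body))
    return ("exists2", var, ("and", inside, body))

if __name__ == "__main__":
    # (forall y)(exists X) Xy, relativised to A
    print(relativise(("forall", "y", ("exists2", "X", ("atom", "X", "y"))), "A"))
```

Atomic formulas are returned unchanged, so the only effect of the transformation is on the ranges of the first-order and second-order quantifiers.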
In the discussion of conservativeness which follows, I will assume we start from a language L which we expand to L+ by the addition of class abstracts: for example, L+ is closed under the operation of applying class brackets to wffs ϕx to form new singular terms {x : ϕx}. Since I will be considering abstraction principles which yield a theory of classes sufficient to define ordered pairs subject to the law of ordered pairs:
(⟨x, y⟩ = ⟨w, z⟩ ↔ (x = w & y = z))
we can consider only monadic second-order logic, with relations represented by properties of ordered pairs.
19 Such an argument will not impress a Quinean empiricist about mathematics, of course.
Probably the most natural form of conservativeness principle is the type utilised by Field (see [5] pp. 96–97, fn. 21, 125–6). A syntactic version of his criterion is: Let $T$ be a theory in $L$ and $A$ an abstractionist theory in $L^+$. $T, A$ need not be consistent. Let $\sim Mx$ be $\sim\exists X(x = \{x : Xx\})$ so that the extension of $Mx$ comprises the abstracts of theory $A$. Then if $T^{\sim M}, A \vdash C^{\sim M}$ then $T \vdash C$.
Thus we relativise our theory and consequence to the $\sim M$s, the non-mathematical, concrete sub-universe. Replace $\vdash$ by $\models$ and we get the semantic version of Field’s criterion. However, Wright had initially utilised the following conservativeness principle: Let $\theta$ be any theory with which $\Sigma$ is consistent. Then $\Sigma$ is conservative with respect to $\theta$ just in case, for any $T$ expressible in the language of $\theta$, $\theta \cup \{\Sigma\}$ entails the $\Sigma$-restriction of $T$ only if $\theta$ entails $T$. ([36] p. 297, fn. 49)
(A $\Sigma$-restricted formula restricts the first-order quantifiers, in the intended interpretation, to the members of the domain of individuals which are not referents of the abstraction terms.) However, let $\theta$ be: If there is an infinite property then Clinton is not an adulterer. Hume’s Principle plus $\theta$ deductively and hence also semantically entails that Clinton is not an adulterer, but $\theta$ on its own does not entail this, deductively or semantically (even if Clinton himself thinks that ‘Clinton is not an adulterer’ is true by virtue of meaning alone). And ‘Clinton is not an adulterer’ is the $\Sigma$-restriction of ‘Clinton is not an adulterer’.²⁰ Thus even the finite version of Hume’s Principle, in which the initial second-order universal quantifiers are restricted to finite properties, is not conservative on Wright’s criterion. For finite Hume, like HP, also entails that there are infinitely many individuals in the domain of the first-order quantifiers (Heck [16]). For this sort of reason, Wright abandons his initial notion of conservativeness in favour of a Field-type criterion.²¹

²⁰ Or if one is unhappy with the use of proper names, replace the consequent ‘Clinton is not an adulterer’ with, for example, ‘everything has zero mass’. HP + $\theta$ entails the $\Sigma$-restriction of the conclusion, namely every non-abstract is of zero mass, but $\theta$ alone does not entail that everything is of zero mass.

²¹ See [38] fn. 21 p. 319. Wright’s amended requirement, however, seems to me to be too restrictive: he limits unnecessarily the criterion to theories which are consistent with the abstraction principle. But suppose our theory $T$ is an ultraquantised scientific theory which holds that the universe contains exactly $10^{50}$ objects, so one inconsistent with HP. Nonetheless the relativisation $T^{\sim M}$ is perfectly consistent with HP; it holds only that there are exactly $10^{50}$ non-mathematical objects.

I will look at one further group of conservativeness principles which arise by letting the restriction predicate be a simple unary predicate $Ex$. The criteria are then the same as Field’s except that we relativise to $E$ rather than to $\sim\exists X(x = \{x : Xx\})$, so that the $L^+$ abstractionist theory $A$ is conservative iff $T^E, A \vdash C^E$ only if $T \vdash C$. The main idea here is that $E$ picks out the empirical or physical items (and the restricted second-order quantifiers range over properties or sets of physical items) but we remain neutral as to whether mathematical items are part of the physical world or not. Thus, with $T$ our empirical theory above, models of $T^{\sim M}$ embed the original physical structure in a sub-structure disjoint from the sub-structure satisfying the mathematical theory, but $T^E$ is compatible both with overlap and with disjointness. This type of principle I will call Caesar-neutral since such a principle applies equally well whether or not mathematical abstracts are necessarily disjoint from empirical items. But note also that $L$ may already contain mathematical language, may contain some abstraction operators (numerical operators, say) and in $L^+$ we introduce a new one (a set-theoretic operator, for example). Here again the ‘Caesar-neutral’ principle seems reasonable since it allows us to be neutral as to whether the new abstracts overlap the old ones or not, whether some numbers are also sets, it may be.

As introduced above, these two conservativeness principles, the Field and the Caesar-neutral, come themselves in two sub-brands: syntactic and semantic.²² Have we any grounds for preferring one to the other? Certainly there is something uncomfortable for the neo-Fregean in appealing to semantic consequence as part of a program designed to show that mathematics is analytic. For Frege’s original idea was that mathematics should be provable from logic plus definitions, not that it should be a semantic consequence of it. Moreover, more recent attempts to legitimise the notion of analyticity appeal to such ideas as meaning-constitutive inference rules, so there seems to be a close link between notions of analyticity and those of proof and derivation.
On the other hand, unless one is prepared to accept (with Zermelo and a number of other prominent logicians of the earlier part of the last century; see Moore [18] and [19]) that infinitary proofs are as legitimate an idealisation of actual inferential practice as proofs with $10^{10^{300}}$ steps, then proofs must satisfy the restrictions in the Gödelian theorems. In that case, a neo-Fregean who utilises a proof-theoretic notion of entailment must give up on the completeness of analytically true mathematics.²³ A further problem with a syntactic notion of conservativeness is that it is heavily dependent on proof architecture; on some proof systems even logic is not syntactically conservative. Thus in standard natural deduction systems, adding the negation rules to the $\to$ fragment of propositional logic yields new theorems in the old language, e.g. Peirce’s law, $(((P \to Q) \to P) \to P)$, while in others, for example Gentzen’s LK, the negation rules are conservative with respect to the negation-free fragment. But we surely do not want our notions of what is conservative, if questions of which mathematical principles are true or false are to hinge on them, to depend on (arguably) aesthetic qualities such as the neatness, by this or that group’s lights, of a particular proof architecture (cf. Weir [31]).

²² Syntactic conservativeness and semantic conservativeness are independent of one another since both are formulated in terms of conditionals of the form: if $T^{\sim M}, P$ entails $C^{\sim M}$ then $T$ entails $C$, for the appropriate notion of entailment; and to get from one to the other one needs a completeness result for at least one component of the conditional, a result which fails for standard second-order consequence.

²³ Perhaps though the neo-Fregean can claim that only those sentences provable from analytic principles can be known, so that the Gödel sentence for at least one formal proof system must be unknown (but perhaps reasonably believed?) by us. See Shapiro [25] for more on the problems incompleteness results pose for neo-Fregeans who accept the Dummett/Prawitz programme of harmony constraints on introduction and elimination rules in acceptable proof systems. See also [12] fn. 5 pp. 4–5.

Moreover, there is an even more troublesome prospect for conservativeness defined syntactically in a variant of the Caesar-neutral form. Consider for the moment, for simplicity, only second-order abstractions. Suppose we add to the language not only a simple first-order predicate $E$ which we envisage as picking out the empirical domain, but also a second-order predicate $F$ which we use to relativise yet further the second-order quantifiers thus:
$(\forall X\,\varphi)^{E,F} = \forall X((\forall y(Xy \to Ey) \mathbin{\&} F(X)) \to \varphi^{E,F})$,
$(\exists X\,\varphi)^{E,F} = \exists X((\forall y(Xy \to Ey) \mathbin{\&} F(X)) \mathbin{\&} \varphi^{E,F})$.
There seem to be few grounds for the neo-Fregean to object to second-order predicates which take first-order predicates as arguments. The abstraction operator, after all, is a second-order functional expression which takes first-order predicates as arguments. One motivation for this modification of the Caesar-neutral criterion is to accommodate those who believe that not all predicates stand for genuine properties. Many scientific realists, for example, do not believe that all extensions determine a property; only some are the extension of genuine natural kinds which “cut reality at the joints”. If we think of $F$, then, as picking out in our intended interpretation the genuine properties, then in relativising a theory $T$, in a language extended by the introduction of an abstraction operator, to $T^{E,F}$ we are ensuring that the quantifiers in $T^{E,F}$ range over only the original empirical domain and the empirical properties of the items in that domain. Thus we should expect abstraction principles to be conservative here too.
They should not enable us to prove anything about the original empirical domain or the empirical properties of items in that domain that we could not already prove before we added that principle. If we construe this conservativeness principle syntactically, we get: IF $T^{E,F}, A \vdash C^{E,F}$ THEN $T \vdash C$, with $E$ and $F$ as above. (Here $T$, $C$ are wffs of $L$.)
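The dependence of syntactic conservativeness on proof architecture, noted above for Peirce’s law, can be illustrated semantically. The sketch below is an addition, not part of the original argument. It checks that Peirce’s law is a two-valued tautology, and that it fails on the three-element Heyting (Gödel) chain. As background taken for granted here rather than from the text, the introduction and elimination rules for $\to$ alone are sound for that three-valued semantics, so the law is not derivable from them without further rules. The function names are hypothetical.

```python
from itertools import product

def imp_classical(a, b):
    """Two-valued implication on truth values 0 and 1."""
    return max(1 - a, b)

def imp_heyting(a, b):
    """Implication on the three-element chain 0 < 1 < 2 (2 is 'true')."""
    return 2 if a <= b else b

def peirce(imp, p, q):
    """Value of ((P -> Q) -> P) -> P under the given implication."""
    return imp(imp(imp(p, q), p), p)

# Classical truth table: every row takes the designated value 1.
classically_valid = all(
    peirce(imp_classical, p, q) == 1 for p, q in product((0, 1), repeat=2)
)

# Three-valued table: collect the assignments where the top value 2 is missed.
countermodels = [
    (p, q) for p, q in product((0, 1, 2), repeat=2)
    if peirce(imp_heyting, p, q) != 2
]

print(classically_valid, countermodels)
```

The single countermodel assigns the intermediate value to P and the bottom value to Q.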
Theorem 3.1: Any principle which, for some given infinite cardinality $\alpha$, holds in all domains of size $\alpha$,²⁴ is syntactically conservative on the modified Caesar-neutral criterion.²⁵

Proof: Suppose $\sim[T \vdash C]$; hence by Henkin completeness and Löwenheim–Skolem for Henkin models (see Shapiro [24] §4.3) there is a countable Henkin model $H$ satisfying each instance of the axiom scheme of comprehension, with countable individual domain $d$, countable property domain $D$, $D$ a set of subsets of $d$, such that $H \models T, \sim C$. Suppose principle $A$ is true in all domains of size $\alpha$, for some infinite $\alpha$. Add in enough individuals to $d$ to get a size $\alpha$ domain $d^*$ and expand the predicate domain to a full second-order domain $\mathcal{P}(d^*)$. $A$ is true in the new model $H^*$ (in which we interpret all constants from the original language just as they are interpreted in $H$). Now just because there is a Henkin model satisfying $T, \sim C$ it does not follow that there is a full second-order model satisfying it. But we are concerned with $T^{E,F}, \sim C^{E,F}$. Let the extension of $E$ (in $H^*$) be $d$, and that of $F$ be $D$, so that in $T^{E,F}, \sim C^{E,F}$ first-order quantifiers are relativised so that they range over $d$, and second-order relativised quantifiers range over $D$. Then an induction on wff complexity establishes that each wff in $T^{E,F}, \sim C^{E,F}$ has the same value in $H^*$, relative to any assignment to the first- and second-order free variables of members of $d$ and $D$ respectively, as the corresponding wff in $T, \sim C$ has in $H$. It follows that each wff in $T^{E,F}, \sim C^{E,F}$ is true in $H^*$ iff the corresponding wff in $T, \sim C$ is true in $H$. Thus $\sim[T^{E,F}, A \models C^{E,F}]$. Hence by soundness $\sim[T^{E,F}, A \vdash C^{E,F}]$.

Thus although the Caesar-neutral criterion is a very reasonable-looking requirement to place on a mathematical principle $A$ in a language with second-order predicate constants, there is no problem in finding conservative, in this sense, but syntactically incompatible principles (e.g. distraction principles with Bad as exactly $\aleph_0$ versus Bad as at least $\aleph_1$).

²⁴ Logical abstraction principles, containing no non-logical vocabulary on the right-hand side of the equivalence, are of this nature: if they hold in one domain of cardinality $\alpha$ they hold in all domains of that cardinality. See Fine [6] pp. 509, 552.

²⁵ This result shows that New V is deductively conservative on the modified Caesar-neutral criterion; however, it is deductively non-conservative on the Field criterion since one can derive global well-ordering WO from it (see [26] §3). For on the Field criterion we restrict WO to $\mathrm{WO}^{\sim M}$ by restricting the domain to the non-mathematical individuals; but we can still prove from New V, what cannot be proven outright, that there is a well-ordering over that domain, namely the restriction of the well-ordering over the universe. But on the modified Caesar-neutral criterion, we restrict to $\mathrm{WO}^{E,F}$ and this now states that there is a well-ordering $R$ which satisfies the second-order property $F$, and which well-orders the domain $E$; and this we cannot prove from New V. The proof of WO from New V shows that New V is semantically non-conservative on the Field criterion, if we suppose the falsity of WO, another example of the non-neutrality of mathematical consequence.

4. Inconsistent conservatives

So I will focus in this section on semantic conservativeness (omitting the qualification ‘semantic’ unless specifically wishing to contrast with syntactic conservativeness). I list now some apposite results.

Theorem 4.1: Hume’s Principle is conservative in the Field and Caesar-neutral senses. (Pure ZFC)

The meaning of the parenthetical reference to Pure ZFC is that, assuming ZFC in our metalanguage, we can prove that Hume’s Principle is conservative in
the Field and Caesar-neutral senses.²⁶ This of course is music to neo-Fregean ears. We can get a more general result using the notion of an unbounded abstraction principle, defining this as a principle such that for every cardinal $\kappa$, there is a larger cardinal $\lambda$ such that the principle is satisfiable in all domains of cardinality $\lambda$.

Theorem 4.2: All unbounded principles are conservative (in both Field and Caesar-neutral senses). (ZFC)

Proof of 4.1: Suppose $\sim(T \models C)$. Then there is a full set model $M$, of cardinality $\kappa$, in which all of $T$ are true and $C$ is false. Expand $M$ to $M^*$ by adding an $\aleph_\alpha$-sized, $\aleph_\alpha \geq \kappa$, set $N$ of new members (the numbers) to the individual domain $D_0$ of $M$ and take the full power set $\mathcal{P}(D_0^*)$ of $D_0^* = D_0 \cup N$ as the property domain (with all non-logical constants assigned the same interpretation in $M^*$ as in $M$). Each set $X$ in $\mathcal{P}(D_0^*)$ has a cardinal $\mathrm{card}\,X$ and the number of these cardinals is the number $\beta$ of cardinals $\gamma$, $0 \leq \gamma \leq \aleph_\alpha$, and $\beta \leq \aleph_\alpha$. We can thus map the cardinals of the sets in $\mathcal{P}(D_0^*)$ into $N \subseteq D_0^*$ by a function $f$, interpreting $nx\,\varphi x$ by $f(\mathrm{card}\,X)$ where $X$ is the extension of $\varphi x$; this yields an interpretation in which Hume’s Principle is true in $M^*$. Define $Nx$ by $\exists X(x = nx(Xx))$ so that the extension of $\sim Nx$ is the set of non-numbers of the domain, hence a subset of $D_0$. Call an assignment $\sigma$ to free variables an $M$-assignment iff $\sigma$ assigns only members of $D_0$ to individual free variables and only members of $\mathcal{P}(D_0)$ to predicate free variables. A proof by induction on wff complexity then establishes that for all $P \in L$, $P^{\sim N}$ is satisfied in model $M^*$ by an $M$-assignment $\sigma$ iff $P$ is satisfied by that same assignment $\sigma$ in $M$. It follows that $P^{\sim N}$ is true in $M^*$ iff $P$ is true in $M$, hence $M^*$ is a counterexample model for the entailment $[T^{\sim N}, \mathrm{HP} \models C^{\sim N}]$. A variant of this argument establishes the result for the Caesar-neutral criterion.

It will be useful here to generalise beyond second-order abstractions to the higher-order case.
Rather than start from a base language of simple type theory, a cumulative type theory will suit our purposes better. Here we suppose that for every finite order we have predicates and variables of that order (countably many in the latter case) and that an atom $F(G)$ is well-formed iff the order of $F$ is greater than that of $G$. In the semantics, a (standard) model is a pair $\langle d, I\rangle$ where $d$ is the individual domain, the range of the 0th-order variables. The cardinality of the model is the cardinality of $d$. The second component $I$ of a model is an interpretation of all the constants in the appropriate domains. The $n$th-order quantifiers range over the $n$th-order domain (as remarked, we will get by with monadic predication since we can introduce ordered pairs once we have some set theory). The $(n+1)$th-order domain $D_{n+1}$ is $D_n \cup \mathcal{P}(D_n)$, the union of $D_n$ with its power set. More generally, where $s \subseteq d$, define $s_0 = s$, $s_{n+1} = s_n \cup \mathcal{P}(s_n)$ and define the cumulative hierarchy generated by $s$ by $\bigcup_{i \in \omega} s_i$.

This language $L_C$ of cumulative type theory is then expanded to a language $L^+$ by adding an $(i+1)$th-order abstraction operator. We can think of $L_C$ as the base language $L_0$ of a hierarchy. At $L_{n+1}$ we apply the operator in question, for example class brackets, to one-place open sentences $\varphi^{i+1}X^i$ of $L_n$ to get the new singular terms $\{X^i : \varphi^{i+1}X^i\}^{i+1}$ of $L_{n+1}$ and expand the set of atoms to include these. $L^+$ is then $\bigcup_{n<\omega} L_n$.²⁷ The interpretation of these class operators, then, will be that they represent functions from the $i$th-order properties into the individuals. Thus a fourth-order abstraction will take the form:
$\forall X^3 \forall Y^3(ox\,X^3x = ox\,Y^3x \leftrightarrow E(X^3, Y^3))$
where $E$ is an equivalence relation over third-order predicates. In the semantics we show by induction that interpretations are stable through the hierarchy, in that a wff has the same value (relative to an assignment) in all sub-languages in which it occurs; therefore it can be assigned a unique value in $L^+$.

A further useful extension is to add abstractor quantifiers.²⁸ Abstraction operators of order $n+1$ are formally functional terms which take $n$th-order open sentences as arguments and yield singular terms as outputs.²⁹ We can thus add, at each order of the language, quantifiers over such terms. In the language thus augmented, there can therefore occur sentences such as:
$\exists f^4(\forall X^3 \forall Y^3(f^4X^3 = f^4Y^3 \leftrightarrow E(X^3, Y^3)))$
The range of an $(n+1)$th-order abstraction quantifier $f$ is the set of all functions from $D_n$ into $D_0$. Finally we can iterate this whole process by adding a further abstraction operator to generate a language $L^{++}$ and so forth.

²⁶ Using Scott’s “trick” of defining the cardinal $|x|$ of $x$ as the set of all sets of least rank equinumerous with $x$, a first-order form of Hume’s Principle can be derived from ZF, though Levy [17] proved that Hume’s Principle is not syntactically conservative vis à vis first-order ZF minus foundation, or ZF plus arbitrarily many urelements.
²⁷ Wherever possible I will omit the superscripts and subscripts, which are meta-theoretic notation indicating order.

²⁸ See [28], §4.2.

²⁹ Probably the neatest way to do this, and to handle variable-binding, is by use of $\lambda$ terms, but to avoid even more clutter I will forbear from adding those.

In order to prove Theorem 4.2 we need to generalise the notion of the relativisation of a formula to our more complex languages. Define by recursion the meta-theoretic terms $A_n[X]$, where $X$ is an $n$th-order variable, by:
$A_1[X^1] \equiv_{df} \forall y(X^1y \to Ay)$
$A_{n+1}[X^{n+1}] \equiv_{df} \forall Y^n(X^{n+1}Y^n \to A_n[Y^n])$
and then generalise the predicate quantification clauses in the definition of $\varphi^A$ to:
$(\forall X^n\varphi)^A = \forall X^n(A_n[X^n] \to \varphi^A)$
$(\exists X^n\varphi)^A = \exists X^n(A_n[X^n] \mathbin{\&} \varphi^A)$
adding, for abstractor quantification:
$(\forall f^{n+1}\varphi)^A = \forall f^{n+1}\forall X^n\forall y((f^{n+1}X^n = y \to (A_n[X^n] \to Ay)) \to \varphi^A)$
$(\exists f^{n+1}\varphi)^A = \exists f^{n+1}\forall X^n\forall y((f^{n+1}X^n = y \to (A_n[X^n] \to Ay)) \mathbin{\&} \varphi^A)$
Thus the relativised abstractor quantifiers generalise over functions whose value, for any property in the cumulative hierarchy generated from the subset $d_A$ of $d$ which satisfies $A$, is also a member of $d_A$.

Now we can return to the proof of Theorem 4.2 (all unbounded principles are conservative, in both Field and Caesar-neutral senses), in which we are considering abstraction principles of arbitrary order in a language $L^+$ which extends, by the addition of the operator, a language $L$ which may itself contain other abstraction operators and other non-logical names and predicates.

Proof: Suppose, then, that the $i$th-order abstraction principle $A$ is unbounded and that $\sim(T \models C)$ where all wffs in $T$, $C$ belong to $L$. Let $M$ be a counterexample model to the entailment with individual domain $d$ of size $\alpha$. Since $A$ is unbounded, it is true in all models of some cardinality $\beta \geq \alpha$. Expand, if need be, the individual domain of $M$ to create a size $\beta$ domain $d^*$ of a new standard model $M^*$. The interpretation function $I^*$ of $M^*$ agrees with $I$ on all name and predicate constants. Furthermore, each $n$th-order abstraction operator in $L$ is interpreted in the same way in $L^+$, for inputs from $D_n$. For members of $D^*_n - D_n$ we let all the operators map to some dummy object in $d$. This means that the abstraction principles other than $A$ may fail in $L^+$. However, let $|D^*_n|$, where $D^*_n$ is the range of the $n$th-order predicate variables in $M^*$, be the partition of $D^*_n$ effected by the equivalence relation on the right-hand side of $A$. Since $A$ holds in all models of cardinality $\beta$, there exists a function $g$ from $|D^*_n|$ into $d^*$.³⁰ Interpreting the abstraction operator $\{x : \varphi x\}$ of $A$ by $g$, $A$ is true in $M^*$.
Where $S \subseteq D^*_n$ is the interpretation of $\varphi x$ in $L_j$ and $s$ the member of $|D^*_n|$ to which $S$ belongs, we assign $g(s)$ as the referent of $\{x : \varphi x\}$ in $L_{j+1}$ and show that semantic values are stable as we go through the hierarchy. Finally we prove by induction on the complexity of arbitrary wff $P$ that $P$ has the same value in $M$ relative to $M$-assignment $\sigma$ as $P^*$, the Field or Caesar-neutral relativisation of $P$, has in $M^*$. In the Caesar-neutral case, we assign $d$ as the extension of $Ex$. An $M$-assignment assigns to each variable an item of the appropriate order from the cumulative hierarchy generated by $d$. The proof is a relatively straightforward generalisation of the analogous stage in the proof of Theorem 4.1. For example, if we assume the theorem holds for $\varphi x$ and we are considering the inductive case for an individual universal quantification, then for the Caesar-neutral criterion (the argument for the Field case is similar) we argue:

$\forall x\varphi x$ is true in $M$ relative to $\sigma$
iff for all $x$-variant ($M$-)assignments $\sigma[x/\alpha]$, $\varphi x$ is true in $M$ relative to $\sigma[x/\alpha]$
iff (Inductive Hypothesis) $\varphi^E x$ is true in $M^*$ relative to $\sigma[x/\alpha]$, for all $\alpha \in d$
iff $\forall x(Ex \to \varphi^E x)$ is true in $M^*$ relative to $\sigma$ (since any non-$M$-assignment $x$-variant $\sigma[x/\beta]$ satisfies $Ex \to \varphi^E x$ vacuously, as $\beta$ does not satisfy $Ex$).

Hence $P$ is true in $M$ iff $P^*$ is true in $M^*$ so that $\sim(T^*, A \models C^*)$.³¹

³⁰ In the Field criterion case we add $\beta$ new individuals to create $d^*$ and the function $f$ maps $|D^*_n|$ into $d^* - d$.
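The induction at the heart of this proof can be checked by brute force in a deliberately small case. The sketch below is only a first-order, finite toy, whereas the theorem concerns higher-order languages and infinite domains. The tuple encoding of wffs, the sample model and the names `holds` and `rel` are assumptions made for illustration. It expands a three-element model with two new individuals, lets $E$ pick out the old domain, and compares each sentence in the old model with its $E$-relativisation in the expanded one.

```python
def holds(p, dom, interp, env=None):
    """Evaluate a first-order wff in a finite model with domain dom."""
    env = env or {}
    tag = p[0]
    if tag == "atom":                       # ("atom", "R", "x", "y")
        return tuple(env[v] for v in p[2:]) in interp[p[1]]
    if tag == "not":
        return not holds(p[1], dom, interp, env)
    if tag == "and":
        return holds(p[1], dom, interp, env) and holds(p[2], dom, interp, env)
    if tag == "imp":
        return (not holds(p[1], dom, interp, env)) or holds(p[2], dom, interp, env)
    if tag == "all":
        return all(holds(p[2], dom, interp, {**env, p[1]: a}) for a in dom)
    if tag == "ex":
        return any(holds(p[2], dom, interp, {**env, p[1]: a}) for a in dom)
    raise ValueError(f"unknown connective: {tag}")

def rel(p):
    """Relativise every first-order quantifier in p to the predicate E."""
    tag = p[0]
    if tag == "atom":
        return p
    if tag == "not":
        return ("not", rel(p[1]))
    if tag in ("and", "imp"):
        return (tag, rel(p[1]), rel(p[2]))
    guard = ("atom", "E", p[1])
    if tag == "all":
        return ("all", p[1], ("imp", guard, rel(p[2])))
    if tag == "ex":
        return ("ex", p[1], ("and", guard, rel(p[2])))
    raise ValueError(f"unknown connective: {tag}")

# Original model M, and the expansion M* with two new individuals.
d = [0, 1, 2]
I = {"P": {(0,)}, "R": {(0, 1), (1, 2)}}
d_star = d + ["n0", "n1"]
I_star = {**I, "E": {(a,) for a in d}}      # E picks out the old domain

sentences = [
    ("all", "x", ("imp", ("not", ("atom", "P", "x")),
                  ("ex", "y", ("atom", "R", "y", "x")))),
    ("ex", "x", ("all", "y", ("not", ("atom", "R", "y", "x")))),
    ("all", "x", ("ex", "y", ("atom", "R", "x", "y"))),
]

agree = [holds(s, d, I) == holds(rel(s), d_star, I_star) for s in sentences]
print(agree)
```

The first sentence is true in the small model but false in the expanded one when left unrelativised, which is why the comparison has to be made with the relativised sentence.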
We have seen that there are at least some unbounded abstraction principles (Hume’s Principle is one) and it is easy to see that there are also unbounded distraction principles, for instance $D_{\mathrm{Inf}}$ where Bad = Dedekind-infinite. But are there too many unbounded abstraction principles? The answer is yes. For instance, suppose the generalised continuum hypothesis (GCH) is true. Then Bad as ‘is the size of a successor cardinal’ is satisfied at every successor cardinal $\aleph_{\alpha+1}$, since the number of small subsets is $\aleph_{\alpha+1}$; a fortiori the number of subsets not the size of a successor cardinal is $\leq \aleph_{\alpha+1}$ (cf. [26] p. 315). But by similar reasoning the distraction principle with Bad as ‘the size of an “odd” successor’, with an odd successor a cardinal of the form $\aleph_{\alpha+2n+1}$, is true at all odd successor cardinals; and similarly the distraction principle with Bad as ‘the size of an even successor cardinal’ is true at all even successor cardinals, granted GCH. These last two principles are unbounded, hence conservative, but clearly are not simultaneously satisfiable in a full standard model (cf. Fine [6] p. 514). Or, dropping GCH in the background meta-theory but adding the strong axiom of inaccessibles (for every cardinal $\kappa$ there is a larger (strong) inaccessible), we can show that taking Bad as ‘has the size of a successor in the series of inaccessibles’ yields an unbounded (hence conservative) distraction principle incompatible with taking Bad as ‘has the size of a limit in the series of inaccessibles’, though the latter is similarly unbounded ([26] p. 319).

The neo-Fregean may well refuse to accept the truth of GCH and might not accept the axiom of inaccessibles (though the latter is very widely accepted amongst set-theorists). But the embarrassment of riches arises on weaker assumptions. Take any predicate $\varphi$ such that the $\varphi$’s and the non-$\varphi$’s are unbounded: e.g. $\varphi x$ might be ‘$x$ is a successor cardinal’. There are infinitely many such predicates.
Next take any ‘at least κ’ distraction principle D (i.e. in the principle D, Bad is ‘at least of size κ’) which holds at arbitrarily high ϕ cardinals and also at arbitrarily high non-ϕ cardinals. (Since it is a logical principle, D will hold in all models of cardinality κ if it holds in at least one.) ‘At least countably infinite’ will always satisfy these conditions. Now consider:
³¹ Note in particular that if $B$ is any abstraction principle of $L$ true in $M$, $B^*$ will be true in $M^*$.
$D_1$: $\mathrm{Bad}_1(X)$ = $X$ is of size at least $\kappa$ and there is some $Y$ with $X \subseteq Y$ such that $\mathrm{card}(Y)$ is a $\varphi$ cardinal.
$D_2$: $\mathrm{Bad}_2(X)$ = $X$ is of size at least $\kappa$ and there is some $Y$ with $X \subseteq Y$ such that $\mathrm{card}(Y)$ is a non-$\varphi$ cardinal.

Theorem 4.3 (ZFC): $D_1$ and $D_2$ are unbounded, pairwise unsatisfiable principles.

Proof: For any cardinal, we can find a larger $\varphi$ cardinal $\aleph_\alpha$ such that $D$ holds at $\aleph_\alpha$; it follows that the number of subsets of size $< \kappa$ of any $\aleph_\alpha$-sized domain must be $\leq \aleph_\alpha$. All subsets of size $\beta$, $\kappa \leq \beta \leq \aleph_\alpha$, are, however, $\mathrm{Bad}_1$ since they are of size at least $\kappa$ and a subset of a property, $\lambda x(x = x)$, whose cardinality satisfies $\varphi$. Hence all these $\mathrm{Bad}_1$ properties can be mapped onto the dummy proper class and the rest, the $\mathrm{Good}_1$ properties, bijected into the domain of individuals. Thus $D_1$ is satisfiable in every model of size $\aleph_\alpha$. But $D_2$ is satisfiable in no $\varphi$ cardinal-sized model. For any universe-sized subset of the domain is $\mathrm{Good}_2$, since it is not a subset of a set whose size is a non-$\varphi$ cardinal. But there are $2^{\aleph_\beta}$ such universe-sized $\mathrm{Good}_2$ subsets, where $\aleph_\beta$ is the size of the domain, so $D_2$ is not satisfiable in such a model. Similarly $D_2$ is satisfiable at arbitrarily high non-$\varphi$ cardinals but $D_1$ is satisfiable at none of them.

So once again we see that not only are there consistent but pairwise inconsistent duos of abstraction principles; there can also be pairwise incompatible but semantically conservative abstraction principles. Wright’s first criterion for winnowing out good from bad abstractions, conservativeness, cannot do the filtering job on its own. Perhaps the neo-Fregean will reject even this meta-theoretic argument establishing the existence of jointly incompatible but conservative theories, though it would seem that to do so the neo-Fregean would need to have no truck with ZFC set theory and all its works and pomps.
Certainly, there would be a pragmatic inconsistency or self-refutation if the neo-Fregean relied, in a meta-theoretic validation of his or her position, on results which could not be derived from abstraction principles which the neo-Fregean found acceptable and it is possible that the neo-Fregean will settle on abstraction principles incompatible with ZFC. On the other hand, ZFC is a very fruitful mathematical theory which is accepted by most set-theorists. This does not preclude the possibility of root and branch criticism of the theory from the philosophers but, unless the theory can be shown to be inconsistent, the more of this standard mathematical theory the neo-Fregeans reject, the less plausible their position becomes. Hence the neo-Fregeans, though they ought to aim at eventually throwing away the ladder of ZFC and similar set theories, will want to land on a spot from which a large body of that theory (certainly enough to do contemporary physics and to yield conservativeness
results for proper parts of the theory, such as Hume’s Principle) can be recaptured.
5. Modest conservatives
Can we get round the problem raised in the previous section if we tighten further the conditions on the acceptability of abstraction principles? Wright proposed in [36] a second conservativeness criterion which he later characterises thus:

Distractions entail conditionals of the form: $\neg(\exists F)(\phi F) \to (\forall F)(\forall G)(\Sigma F = \Sigma G \leftrightarrow (\forall x)(Fx \leftrightarrow Gx))$. The immediate intent of the proposed constraint is that anything derivable by the reductio of the antecedent of such a conditional afforded by its paradoxical consequent should be in independent good standing. . . . So, an abstraction is good only if any entailed conditional whose consequent is Basic Law V (or, therefore, any other inconsistency) is such that all further consequences which can be obtained by discharging the antecedent are in independent good standing, as may be attested by their derivation in pure higher-order logic (like the case of New V) or their independent derivability from the abstraction in question (like the case of Hume’s Principle). (Wright [38] p. 326)
So let $A$ be any abstraction and $C$ any consequence of $A$. Classically, $C$ is equivalent to $\sim C \to \bot$, so Wright requires that $\sim\sim C$, which can be obtained by discharging the antecedent, is of ‘independent good standing’, and hence requires (granted the classical equivalence of $C$ and $\sim\sim C$) that anything derivable from an abstraction be derivable “independently”. Wright acknowledges the need for clarification here:

But what does that mean? In particular, how might it be characterized so as not to outlaw any proof by reductio ad absurdum? ([38] p. 327)
Wright suggests that an ‘independent derivation’ must not be ‘paradox-exploitative’ and gives the following account of the latter notion:

a derivation from a conservative abstraction is paradox-exploitative just if there is a representation of its form of which any instance is valid and of which some instance amounts to a proof of the non-conservativeness of another abstraction. For instance, the derivation of the successor-inaccessibility of the universe from the Distraction canvassed above is paradox-exploitative because it may be schematised under a valid form of which another instance is a derivation, from the appropriately corresponding Distraction, that the universe contains 144 objects. ([38] p. 327)
The corresponding distraction is presumably the distraction with Bad = ‘has exactly 144 instances’ but that distraction is unsatisfiable. But perhaps Bad = ‘is of size ℵ0 ’ will do the job just as well for Wright, since this puts a cap on the physical universe, contrary to conservativeness.
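The unsatisfiability claim can be spot-checked in finite full models by counting. The sketch below is an addition resting on two labelled assumptions: that a distraction principle identifies the abstracts of all Bad properties and otherwise behaves like Basic Law V (the form defined earlier in the paper and presupposed in the proof of Theorem 4.3), and that Bad depends only on the size of a property’s extension. The function names are hypothetical. Only finite domains are examined here; for infinite domains the relevant fact is that a set has more subsets than members.

```python
from math import comb

def classes_needed(n, is_bad_size):
    """Number of distinct abstracts required in a full model with n individuals.

    Each Good subset needs its own abstract; all Bad subsets share one.
    """
    good = sum(comb(n, i) for i in range(n + 1) if not is_bad_size(i))
    any_bad = any(is_bad_size(i) for i in range(n + 1))
    return good + (1 if any_bad else 0)

def satisfiable(n, is_bad_size):
    """Abstracts are individuals, so the classes must inject into the domain."""
    return classes_needed(n, is_bad_size) <= n

def exactly_144(size):
    return size == 144

def at_least_1(size):
    return size >= 1

# No finite domain of up to 399 individuals satisfies the 'exactly 144' distraction.
finite_witnesses = [n for n in range(1, 400) if satisfiable(n, exactly_144)]

# Contrast: with Bad = 'non-empty', every domain of at least two individuals will do.
contrast = [n for n in range(1, 6) if satisfiable(n, at_least_1)]

print(finite_witnesses, contrast)
```

In the contrast case only the empty property is Good, so two abstracts suffice: one for the empty property and one shared by all the others.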
All this is surely rather odd. It cannot be that an axiom or principle $P$ is suspect because there is a proof $\pi$ of $C$ from $P$ which shares a form with a proof $\pi^*$ of $D$ from $Q$, $D$ and $Q$ instantiating the relevant schematic forms of $C$ and $P$, and where $D$ is something we would reject: this kind of “sharing form” is not criminal. Wright requires that $Q$ is not just any old formula: it must pass the first criterion of conservativeness. And it is true that the most obvious proof that the universe is at least successor inaccessible from the distraction with Bad = ‘at least successor inaccessible’ shares form with a similarly obvious proof (utilising the collapse of the distractions into Axiom V if there is no Bad property) that the universe is of size exactly $\aleph_0$, a proof whose premiss is the distraction with Bad = ‘is of size $\aleph_0$’. But then there is a similar proof that the universe is infinite from the distraction $D_{\mathrm{Inf}}$ in which Bad = ‘Dedekind infinite’. Is this distraction to be rejected because of structural similarities between proofs of results from $D_{\mathrm{Inf}}$ as premiss and proofs of dodgy results, such as that the universe is of size exactly $\aleph_0$, from other abstractions? For $D_{\mathrm{Inf}}$ is satisfiable at all infinite cardinalities, just like Hume’s Principle.

Wright may say that there are “independent proofs” of the infinity of the universe from $D_{\mathrm{Inf}}$, ones which in a clear sense appeal only to properties of the abstracts themselves, for instance by proving that there are infinitely many natural numbers, that is, set-theoretic surrogates for natural numbers defined in the usual Zermelo or Von Neumann ways. But ‘paradox-exploitation’ was supposed to give sense to the notion of ‘independent derivability’; we cannot then require the latter notion to make sense of the former. Moreover, it is not true that, as Wright says, the only resources they [“roguish distractions”] have to show . . .
that the universe is limit-inaccessible or successor inaccessible, or whatever, are those furnished by the inconsistency of Basic Law V and the consequent modus tollens on the relevant conditional. ([38] p. 326)
For any proof of a result C from premiss P there are infinitely many other proofs of that result from that premiss. Consider the following proof schema, applicable to any distraction, that there are a Bad number of abstracts, specifically sets: Take $r = \{x : \mathrm{Set}\,x \;\&\; x \notin x\}$. If r is a Set, that is, if the property $\lambda x(\mathrm{Set}\,x \;\&\; x \notin x)$ is Good, then comprehension holds of r. That is, $\forall y\,(y \in r \leftrightarrow (\mathrm{Set}\,y \;\&\; y \notin y))$, so in particular $r \in r \leftrightarrow (\mathrm{Set}\,r \;\&\; r \notin r)$, from which it follows that $\sim\mathrm{Set}\,r$. So $\mathrm{Set}\,r \rightarrow\, \sim\mathrm{Set}\,r$, hence $\sim\mathrm{Set}\,r$; that is, from our definition of Set, the property of being a non-self-membered Set is Bad. So, if Bad is some cardinality concept, we can prove in the above fashion that there are a Bad number of sets, namely the sets which do not belong to themselves.
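Laid out step by step, the schema runs as follows. This is only a restatement of the argument just given, with the intermediate step (3) made explicit; the numbering is added here for illustration.

```latex
\begin{align*}
&(1)\quad \mathrm{Set}\,r \rightarrow \forall y\,\bigl(y \in r \leftrightarrow (\mathrm{Set}\,y \;\&\; y \notin y)\bigr)
  && \text{comprehension, available when } r \text{ is a Set}\\
&(2)\quad \mathrm{Set}\,r \rightarrow \bigl(r \in r \leftrightarrow (\mathrm{Set}\,r \;\&\; r \notin r)\bigr)
  && \text{instantiating } y \text{ by } r \text{ in (1)}\\
&(3)\quad \mathrm{Set}\,r \rightarrow \bigl(r \in r \leftrightarrow r \notin r\bigr)
  && \text{from (2), using the antecedent } \mathrm{Set}\,r\\
&(4)\quad \mathrm{Set}\,r \rightarrow\, \sim\mathrm{Set}\,r
  && \text{the consequent of (3) is contradictory}\\
&(5)\quad \sim\mathrm{Set}\,r
  && \text{from (4)}
\end{align*}
```

Nothing in steps (1) to (5) mentions the cardinality of the whole universe; the cardinality conclusion enters only afterwards, via the definition of Set in terms of Bad.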
This proof seems as set-theoretic as any. Yet we can use it to show that the universe must be at least of the cardinality given by the Badness concept, since the sub-universe of sets has that cardinality, without appealing to any result about the cardinality of the whole universe. (To be sure, since the Bad cardinal is infinite so that all singleton properties determine unit sets, we can conclude further that the universe is exactly of the cardinality given by Bad, but this way of proving the result ‘originates in a requirement that the distraction imposes on its own abstracts’ to paraphrase Wright [38] p. 329.) Is the above proof paradox-exploitative? If so, what of the standard proofs that there is no Russell set, that there is no universal set (else by Subsets there would be a Russell set), or that the powerset of x is larger than x—why are the standard proofs of these results not also paradox-exploitative? If so, is this exploitation such a bad thing? Another constraint which Wright suggests adding to conservativeness is “modesty”: an abstraction is Modest if its addition to any theory with which it is consistent results in no consequences—whether proof- or model-theoretically established—for the ontology of the combined theory which cannot be justified by reference to its consequences for its own abstracts. And again, justification is the crucial point: an abstraction may fail this constraint even though every consequence it has for the ontology of a combined theory may be seen to follow from things it entails about its proper abstracts; in particular, it will not count if, as in the case of the Limit-inaccessible Distraction, a consequence for the combined ontology is needed as a lemma in the proof that the abstracts have a property from which that very consequence follows. ([38] p. 330, Wright’s emphasis)
Wright’s emphasis on justification is indeed essential here. For suppose we drop all reference to issues of justification. What is left seems to be a reflection principle which I will call Modest Reflection. Let L be an abstraction-operator free language, $L^{+}$ the extension of L resulting from adding an abstraction operator governed by a logical abstraction A, and P a sentence of L.

Modest Reflection: If $A \models P$ then $A \models P^{M}$
(and of course there is a syntactic version in which $\models$ is replaced by $\vdash$). That is, if a thesis holds in all A universes, the abstract sub-universe reflects that thesis; the consequence P for the combined ontology holds only when the restricted version of P holds for the abstracts. In such a case let us say that A reflects modestly; the principle is a sort of negative converse of conservativeness:

If $A, T^{\sim M} \models P^{\sim M}$, then $A \models P$,

where $\sim M$ restricts to the nonabstracts. Wright’s text, in particular the reference to “no consequences” for the combined theory which “cannot be justified by reference to its consequences
for its own abstracts” (emphasis mine), suggests the stronger principle:

If $A, T \models P$ then $A \models P^{M}$ (for any T consistent with A).
But this constraint seems too strong. Suppose we take HP and add it to ZF (but the point will also hold for empirical theories); so far as we know HP is consistent with ZF. Or equivalently add HP to (HP → ZF), with ZF a second-order finite axiomatisation. The pair HP, (HP → ZF) entails that there is an uncountable infinity of individuals. But HP on its own does not entail that there is an uncountable infinity of numbers, so HP comes out as immodest on this reading. Perhaps the constraint is rather:

If $A, T \models P$ then $A, T \models P^{M}$.
But this just is Modest Reflection where T has a finite axiomatisation or where we allow infinitary wffs, for then we just consider T → P.

Theorem 5.1: Every logical distraction in which unit properties are Good reflects modestly.

Proof: Suppose there is a counterexample model M to ‘A entails $P^{M}$’, one with domain d. Let n ⊆ d be the set of all referents of individual constants in P and a ⊆ d be the set of all abstracts in M. Since unit properties are Good, A holds only in infinite domains and a, d, and n ∪ a all have the same cardinality. Construct the model M* by letting its domain be n ∪ a, so its variables range over the cumulative hierarchy $CH_{n,d}$ generated by n ∪ a; interpret its individual constants as in M and its predicate constants by the restriction of the M-interpretation to $CH_{n,d}$. This is a counterexample model to P (proof by induction over wff complexity). Since the distraction is logical and since M* is the same size as M, A holds in M* too. Hence A does not entail P.

So Wright needs the clause about all consequences being “justified” and not merely “following” from “things it entails about its proper abstracts”. But what on earth does this mean? It suggests some tight proof-theoretic notion, as when a classicist might hold to classical semantic consequence but pay special attention to consequences derivable in relevant logic or some such. Even if something could be made of this, what on earth does it have to do with “meaning-constitutive” or “a priori” or “epistemically innocent” principles? One can see how simple rules such as &E or ∨I are meaning-constitutive (if, at any rate, one is not rabidly Quinean to an extent that the later Quine himself shied away from). But it is very hard to see what proof-theoretic modesty or the complex definition of paradox-exploitation has to do with this. The whole approach exudes a strong whiff of ad hocery, the epicycles which are being
generated give out strong signals that we are in the presence of a degenerating research strategy, if not program,³² as Wright himself seems to acknowledge:

That is apt to seem uneasily complex and less clearly motivated than one would wish. ([38] p. 327)
6. Stability
However, just as the neo-Fregean program seems to be in deep trouble, Wright comes up with a much more powerful, simple and intuitive idea: any epistemically kosher abstraction must not only be conservative, it must be compatible with all other conservative abstractions: it is not clear that any purpose is served by the continuing insistence on derivations of a given valid form. Why not just say that pairwise incompatible but individually conservative abstractions are ruled out—however, the incompatibility is demonstrated—and have done with it? ([38] p. 328)
32. In “Implicit Definition and the A Priori” [11], the authors assimilate abstraction principles not to primitive inference rules, but to implicit definitions, for instance, of scientific terms. The claim that concerns of ‘justificatory modesty’ and ‘paradox-exploitation’ have a role to play here is not as implausible as it would be in the case of primitive inference rules; but is still, I think, implausible. I discuss the appeal to implicit definition a little further below in Section 7.

Are there any abstractions which are both conservative and compatible with any other conservative abstraction (i.e. there is a model in which both are true)? Call any such abstraction irenic; and say that an abstraction is stable if, for some cardinal κ, it is true at all and only models of cardinalities ≥ κ (cf. [6] p. 511).

Theorem 6.1: The stable abstractions are the irenic ones.

Proof: (Left to right) Suppose A is stable; by Theorem 4.2 it is conservative, being unbounded. Since A is stable it holds in all models of size ≥ κ, for some κ (remember conservative simpliciter means semantically conservative). Consider now a “Ramsified” version of A in which we replace each constant term (name, predicate, abstraction operator) by a variable of appropriate type and preface the result $A[x_1, \ldots, x_n]$ (where the variables need not all be individual variables) by the corresponding string of existential quantifiers to get $\exists(x_1, \ldots, x_n)A[x_1, \ldots, x_n]$, a purely logical formula I will represent by $\exists[A]$. This formula cannot be true in a model M of size less than κ, else by interpreting each constant c by the object, property or operator function assigned to the corresponding variable $x_i$ in the assignment which satisfies $A[x_1, \ldots, x_n]$ we would generate a model of size < κ which satisfies A. Now let B be another conservative principle introduced by a new abstraction operator and take the language of principle A to be the base language L for the new principle, so that by adding the abstraction operator of B to L we get
our new language $L^{+}$. We cannot have $B, (\exists(x_1, \ldots, x_n)A[x_1, \ldots, x_n])^{\sim B} \models \bot$, else by conservativeness we would have $\exists(x_1, \ldots, x_n)A[x_1, \ldots, x_n] \models \bot$ and hence $A \models \bot$, contrary to the stability of A. So there is a model N of $B, (\exists(x_1, \ldots, x_n)A[x_1, \ldots, x_n])^{\sim B}$. Moreover, if we reduce this to a model $N^{\sim B}$ with individual domain the non-B’s, the result will be a model of $\exists(x_1, \ldots, x_n)A[x_1, \ldots, x_n]$ since this is a purely logical sentence. By interpreting each constant c (name, predicate or operator) in A by the item assigned to the variable which instantiates c in $A[x_1, \ldots, x_n]$ by an assignment σ which verifies $\exists(x_1, \ldots, x_n)A[x_1, \ldots, x_n]$ we get a model $N^{*\sim B}$ in which A is true. Hence $N^{*\sim B}$, and thus $N^{\sim B}$ and so N, must be of size λ ≥ κ. But N is a model of B. By the definition of stability, A is true in N together with B.

(Right to left) Every irenic abstraction is stable. Suppose nth-order abstraction A:

$\forall X \forall Y(\alpha x Xx = \alpha x Yx \leftrightarrow E(X, Y))$

is unstable so that for each cardinal κ, there is a higher cardinal λ such that A fails at some model of size λ. In such a model the (n+1)th-order formula $\exists f\, \forall X \forall Y(fX = fY \leftrightarrow E(X, Y))$ fails. Consider now the abstraction B:

$\forall W \forall Z(\beta x Wx = \beta x Zx \leftrightarrow [\sim\exists f\, \forall X \forall Y(fX = fY \leftrightarrow E(X, Y)) \vee \forall x(Wx \leftrightarrow Zx)])$.

The right-hand side is an equivalence relation since whenever the left disjunct is true (and so abstraction A false) every property bears the relation to every other, while when the left disjunct is false the whole formula is co-extensive with the equivalence relation $\forall x(Wx \leftrightarrow Zx)$. But when the left disjunct is true, principle B is trivially satisfied by letting $\beta x Wx = \beta x Zx$ for any assignment to W and Z, that is, by having a single abstract, while principle A is unsatisfied. On the other hand, when the left disjunct is false so is B, because it is equivalent in those contexts to Axiom V, though abstraction A is true.
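The case analysis for B can be illustrated on toy finite domains. What follows is a sketch only, under loud assumptions: sizes are restricted to 1 through 6, a principle is counted as holding at size n when its equivalence classes can be injected into an n-element domain (standard semantics), and `equiv_A` is a hypothetical relation chosen so that A fails at every odd size. The theorem itself concerns unboundedly many infinite cardinals, which no finite enumeration can reach.

```python
from functools import lru_cache
from itertools import combinations

def subsets(n):
    """All first-order properties (subsets) of the domain {0, ..., n-1}."""
    return [frozenset(c) for k in range(n + 1) for c in combinations(range(n), k)]

def num_classes(n, equiv):
    """Count the equivalence classes of equiv(n, X, Y) over all subsets."""
    reps = []  # one representative per class found so far
    for p in subsets(n):
        if not any(equiv(n, p, q) for q in reps):
            reps.append(p)
    return len(reps)

def holds(n, equiv):
    """The abstraction holds at size n iff its abstracts fit into the domain."""
    return num_classes(n, equiv) <= n

def equiv_A(n, x, y):
    # Hypothetical toy relation E: universal on even-sized domains,
    # coextensiveness on odd-sized ones, so A fails at every odd size.
    return n % 2 == 0 or x == y

@lru_cache(maxsize=None)
def a_holds(n):
    """Whether some abstraction function for A exists at size n."""
    return holds(n, equiv_A)

def equiv_B(n, w, z):
    # Right-hand side of B: (A has no abstraction function here) or W, Z coextensive.
    return (not a_holds(n)) or w == z

sizes = range(1, 7)
spectrum_A = [n for n in sizes if holds(n, equiv_A)]
spectrum_B = [n for n in sizes if holds(n, equiv_B)]
print(spectrum_A, spectrum_B)
```

Within the sampled sizes, B holds exactly where A fails: where A has no model, B's relation is universal and a single abstract suffices; where A has a model, B collapses into Axiom V and cannot be satisfied.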
Since $\sim\exists f\, \forall X \forall Y(fX = fY \leftrightarrow E(X, Y))$ holds at models of arbitrarily high cardinalities, B is unbounded and so conservative.³³ But as we have seen, A is semantically incompatible with B so A is not irenic.

33. Thus we have a recipe for creating trivial abstractions which are stable from cardinality κ up, where κ is such that there is a formula ϕ of our language (which will play the left disjunct role) true in all and only models of size ≥ κ.

What the neo-Fregean needs, then, are (non-trivial) stable principles, best of all stable principles which do not hold below the continuum but “kick in” a few beths further up. For in that case, stable abstraction principles will suffice for the derivation of the mathematics needed for modern science, they will
provide abstract ontologies of sufficient size to construct the reals, complex numbers, functions over reals and so forth.³⁴ Now Shapiro and Weir showed ([26] p. 319) that “at least κ” distraction principles, κ > ω, are unstable (there stability is called ‘the strong unbounded condition’, cf. p. 318), every such distraction failing at each of an unbounded series of singular limit cardinals. But in the context of our cumulative type theory, we can find fairly natural distraction principles which are stable. For example, start either from Hume’s Principle or the comparable but in some respects more useful distraction $D_{\mathrm{Inf}}$:

$\forall X \forall Y(\alpha x Xx = \alpha x Yx \leftrightarrow ((\mathrm{Infinite}(X) \;\&\; \mathrm{Infinite}(Y)) \vee \forall x(Xx \leftrightarrow Yx)))$

(where ‘Infinite’ is ‘Dedekind Infinite’, for example: there is a bijection from the property into a proper sub-property). Using AC we can prove $D_{\mathrm{Inf}}$ is true in all infinite cardinalities (at $\aleph_\kappa$ there are $\aleph_\kappa$-many finite sets; map the others to the dummy proper class). Moreover from this, from the fact that all finite properties determine sets, it is clear that semantically it is at least as strong as SF (ZF minus the axiom of infinity) restricted to pure sets (to exclude the ill-founded ones). Classing our initial principle as $D_1$ we now add a further second-order Distraction principle in which Bad, or rather $\mathrm{Bad}_2$, is $\sim\mathrm{Num}_2(X^1)$, where Num x is our definition of the finite numbers or their set-theoretic surrogates and $\mathrm{Num}_2 X \equiv_{df} \forall x(Xx \rightarrow \mathrm{Num}\,x)$. $D_2$ is:

$\forall F^1 \forall G^1(\{x : F^1 x\}_1 = \{x : G^1 x\}_1 \leftrightarrow ((\sim\mathrm{Num}_2(F^1) \;\&\; \sim\mathrm{Num}_2(G^1)) \vee \forall x(F^1 x \leftrightarrow G^1 x)))$.

By dint of the occurrences of the numerical or zero-order class operator on the right-hand side (when we unpack ‘$\mathrm{Num}_2$’), this is a non-logical abstraction. The Bad first-order properties, as specified by this distraction, are those which are non-numerical$_2$, that is, are not subsets of the set of finite numbers of the domain.³⁵

34. Neo-Fregean approaches to real analysis are to be found in [13].

35. Recall that for simplicity I am identifying properties with extensions: first-order properties are simply subsets of the domain of individuals, and so on.

Next add the third-order Distraction principle in which $\mathrm{Bad}_3$ is $\sim\mathrm{Num}_3(F^2)$, where $\mathrm{Num}_3 X^2 \equiv_{df} \forall Y(X^2 Y \rightarrow \mathrm{Num}_2 Y)$. This third-order distraction $D_3$ is then

$\forall F^2 \forall G^2(\{X : F^2 X\}_2 = \{X : G^2 X\}_2 \leftrightarrow ((\sim\mathrm{Num}_3(F^2) \;\&\; \sim\mathrm{Num}_3(G^2)) \vee \forall X(F^2 X \leftrightarrow G^2 X)))$.
The Bad second-order properties, as specified by this distraction, are those which are non-numerical$_3$, that is, not all of the first-order properties which instantiate them are numerical$_2$ properties. Continue further by adding a fourth-order distraction $D_4$ with $\mathrm{Bad}_4$ defined in terms of $\mathrm{Num}_4$ (having only $\mathrm{Num}_3$ instances): $\mathrm{Num}_4 X^3 \equiv_{df} \forall Y(X^3 Y \rightarrow \mathrm{Num}_3 Y)$, and so on through all the finite types.³⁶

36. We could extend this into the transfinite by introducing predicates of all ordinal type < α, for some fixed ordinal α, and letting $F^{\beta}(G^{\gamma})$ be well-formed where γ < β.

Theorem 6.2: The set of all these principles is satisfied in all and only models of size $\geq \beth_\omega$. It is stable and irenic.

Proof: Take any standard model M with individual domain d of cardinality $\geq \beth_\omega$. This will satisfy $D_{\mathrm{Inf}}$ (or HP) by assigning some countable subset as the extension |Num| of Num. Again in every standard model, the continuum-sized powerset of |Num| is the extension $|\mathrm{Num}_2|$ of $\mathrm{Num}_2$, the $\beth_2$-sized powerset of $|\mathrm{Num}_2|$ is the extension of $\mathrm{Num}_3$, and so forth. Since there are continuum-many $\mathrm{Good}_2$ (i.e. $\mathrm{Num}_2$) first-order properties, $D_2$ is satisfiable by mapping these into a continuum-sized subset of d and all other properties into a dummy class and using that map to interpret the operator $\{x : F^1 x\}_1$. Note that $D_2$ could not be satisfied in any domain smaller than the continuum. Similarly we interpret $D_3$ by means of a map from the $\beth_2$-many $\mathrm{Good}_3$ properties into the domain, and likewise through all the principles $D_i$ for $i \in \omega$. Hence $\bigcup_{i \in \omega} D_i$ is satisfied by M, though in any domain smaller than $\beth_\omega$, for some k, all principles $D_j$ for j ≥ k will fail to be satisfied.

Moreover, we can show that $\bigcup_{i \in \omega} D_i$ is irenic by essentially the same argument as used in Theorem 6.1. We ‘Ramsify’ each $D_i$ to yield a purely logical sentence $\exists(x_1, \ldots, x_n)D_i[x_1, \ldots, x_n]$. Where B is any conservative abstraction, the set $\{B, (\exists(x_1, \ldots, x_n)D_i[x_1, \ldots, x_n])^{\sim B}\ (i \in \omega)\}$ is satisfiable in a model N, else $\exists(x_1, \ldots, x_n)D_i[x_1, \ldots, x_n]\ (i \in \omega) \models \bot$, contrary to the satisfiability of $\bigcup_{i \in \omega} D_i$. By shrinking N down to the sub-model $N^{\sim B}$ with individual domain the non-B’s we get a model which satisfies all of the $\exists(x_1, \ldots, x_n)D_i[x_1, \ldots, x_n]$ and so a variant model of the same size which satisfies $\bigcup_{i \in \omega} D_i$. This shows as before that N must be of size $\geq \beth_\omega$; hence, by the stability of $\bigcup_{i \in \omega} D_i$, the set of sentences $\bigcup_{i \in \omega} D_i$, B is satisfied by N.

This theory $\bigcup_{i \in \omega} D_i$, call it $\mathrm{BETH}_{<\omega}$, is thus immune from the embarrassment of riches problem and gives us a slice of the cumulative hierarchy up to $V_\omega$ albeit in a rather restrictive form. We have the natural numbers, all sets of natural numbers, all subsets of the powerset of the set of natural numbers
and so on. Ontologically, then, we have all the pure structures we need for the applied mathematics for contemporary science, numbers, reals, functions over reals, etc. However, quite simple set-theoretic principles fail. Thus if {x : ϕx} is a set of order n + 1 then there is no guarantee that its unit set exists (as a set) because there is no guarantee that {x : ϕx} is also a set of order n. Nor is it clear how the neo-Fregean could actually apply this ontology in science since there are no sets of urelements, just sets of numbers, sets of sets of numbers and so forth. Perhaps she could introduce a further ‘impure’ set operator, for instance one governed by a distraction principle with Bad as ‘at least $\beth_\omega$’. This principle is not stable and neither is the result of augmenting $\mathrm{BETH}_\omega$ with it. But perhaps the neo-Fregean could accept this: there is no a priori applied set theory but there is, she might claim, an a priori pure mathematical theory, $\mathrm{BETH}_\omega$. And if we need more things in our heaven and earth than provided for by $\mathrm{BETH}_\omega$ we can extend the type theory into the transfinite and thereby force the size of the universe up even higher.

This prospect raises a worry. If there is the possibility of adding stronger and stronger such principles, how big is the universe? Might there not be a proper class of stable principles, in which case, if the lower limits which each principle forces the universe to have are unbounded, there will be no (set-theoretic) model of the whole set of principles (cf. [6] p. 514)? But this situation is not so different from that which faces the ZF theorist, who cannot prove that a set-theoretical model for her intended interpretation of the theory exists. It is consistent with ZF that there are no inaccessible cardinals, in which case the set of ZF axioms holds in no set-sized standard model. Moreover, the “intended model” has a domain (the universe of sets) which is provably, in the theory itself, not a set.
This shows that stability cannot be a necessary condition on acceptability of a theory. One might, though, try for a more disjunctive criterion: a principle A is acceptable iff it is either stable or true (or necessarily true) in the intended interpretation. Or, to avoid adding in a primitive truth predicate or ascending up a further order in the type theory in order to define truth, we could define acceptability, relative to an abstractionist theory A, by [P is stable or P is provable from A]. If the abstractionist theory A suffices for sufficient proof theory to let us represent the relation ‘provable in second-order logic from A’ then the abstractionist theory will be able to prove its own acceptability.
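As a recap of the lower limits that the finite-type principles of this section force on the universe, the cardinality bookkeeping behind Theorem 6.2 can be set out as follows. This is a sketch under the usual definition of the beth numbers ($\beth_0 = \aleph_0$, $\beth_{n+1} = 2^{\beth_n}$, $\beth_\omega = \sup_{n<\omega}\beth_n$); writing $\mathrm{Num}_1$ for Num and $D_1$ for the initial principle is a notational assumption made here for uniformity.

```latex
\begin{align*}
|\mathrm{Num}_1| &= \aleph_0 = \beth_0, \\
|\mathrm{Num}_{n+1}| &= 2^{|\mathrm{Num}_n|} = \beth_n
  && (n \ge 1), \\
D_n \text{ holds in a standard model with domain } d
  &\;\Longrightarrow\; |d| \ge |\mathrm{Num}_n| = \beth_{n-1}
  && (n \ge 2), \\
\textstyle\bigcup_{i \in \omega} D_i \text{ holds in a standard model with domain } d
  &\;\Longrightarrow\; |d| \ge \sup_{n < \omega} \beth_n = \beth_\omega .
\end{align*}
```

So $D_2$ needs at least continuum-many individuals, $D_3$ at least $\beth_2$-many, and the whole family at least $\beth_\omega$-many, which is the bound stated in Theorem 6.2.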
7. ER II
Has the neo-Fregean hit the jackpot, then? One cause for concern surfaced earlier in connection with the criteria of paradox-exploitation and justificatory modesty. It is not enough for the neo-Fregean to find a criterion which characterises a consistent set of abstraction principles which together yields as much mathematics as we think we need (for application in science, for example). The neo-Fregean also needs an argument which shows that all the principles
satisfying the criterion are analytic or meaning-constitutive or implicit definitions which in some interesting sense are epistemically innocent. We could come to know their truth without resort to mysterious intuition or appeal to pragmatic criteria of utility in science. But what has the acceptability, in the sense of the previous section, of an abstraction got to do with its being meaning-constitutive or an implicit definition?

This objection can be given more force by considering the following worry, analogous to the original embarrassment of riches objection. Consider a bunch of theorists, each taking a distraction principle as the basis for their pure mathematics, but a different one, utilising a different definition of Bad, in each case. Angus is a finitist who accepts as his sole second-order abstraction principle the distraction $D_{\mathrm{CInf}}$ with Bad = Countably Infinite. He holds that the only properties one can generalise over in abstraction principles are numerically definite ones and maintains that only finite properties are numerically definite. Indeed he might hold that only such properties exist. Bronagh, however, takes as her principle the distraction with Bad = $\beth_\omega$-sized,³⁷ while for Calum Bad = the size of the first inaccessible, $\theta_0$. Dervla defines Bad(F) by Big(F) & F is the size of a Mahlo cardinal & ∼GCH so that, since Dervla can prove the universe is Bad, Dervla can prove the General Continuum Hypothesis (GCH) is false. Finally Ewan, who thinks that all the others are wimps, defines Bad(F) by Big(F) & F is the size of a measurable cardinal & GCH so that Ewan can prove the GCH.

Suppose, now, we agree with Calum. Then we can rule out Angus’s theory, since it places a cap on the universe at $\aleph_0$ and we know that the universe is bigger than that; indeed we might believe the empirical universe has more individuals than that, has continuum-many space-time points, perhaps. Angus’s theory is unstable and non-conservative.
Where P is the claim that there are at least $\aleph_1$ things and where Ax picks out the abstracts of $A = D_{\mathrm{CInf}}$, we have $P^{\sim A}, A \models \bot$ but not $P \models \bot$ (we believe). Indeed Angus’s theory is provably false, from our perspective, since the universe is provably not countable; his theory is unacceptable. Similarly Bronagh’s theory is non-conservative since it caps the universe at $\beth_\omega$. Both Dervla and Ewan have massively non-conservative theories: there are no (standard) models of either, since there are no Mahlo-sized or measurable sets. Again both theories are disprovable. Calum’s theory, however, is trivially provable and so acceptable.

37. Or rather with ‘Bad’ replaced by a formula which ZFC theorists can translate as ‘has cardinality $\beth_\omega$’. Bronagh herself might well reject that translation because she might deny that the universe has a cardinality.

The obvious difficulty here is that Angus, Bronagh, Dervla and Ewan can all tell similar stories. They can all take over the same definition of stability and each can define ‘acceptable’ in the same way but relative to provability
from their own abstraction principle. Moreover, from the standpoint of any one theory, each of the others is unstable either because it places a cap on the universe at some unacceptably low cardinality or because it has no set models at all. And since the five distractions are pairwise inconsistent, each can prove that every other is unacceptable.

The finitist Angus, to be sure, might have problems accommodating contemporary science since it seems, to most, to be steeped in commitment to continuum-sized and larger universes. But of course if he insists that intellectual integrity requires us to write off standard physics as an intellectual incoherence which, inexplicably for the moment, works well (compare Berkeley on infinitesimals), the neo-Fregean is in no position to reject this argument on pragmatic grounds of utility for empirical theory lest the Quinean seize on the admission as acceptance of a Quinean epistemology of mathematics.

Note, moreover, that though Dervla and Ewan will think that Angus, Bronagh and Calum place a non-conservative cap on the size of the universe, that is not how that trio will see things. Assuming that cardinals are sets, in all three of those theories, it is provably the case that for every cardinal size there is a larger one. All three theorists can deny that the universe as a whole has a size: for Angus, the notion of $\aleph_0$ as a legitimate number is a myth, it represents rather the absolute infinite; Bronagh holds the same view of $\beth_\omega$, Calum of $\theta_0$.

Do we, then, have an analogue of Embarrassment of Riches returning to haunt us at the meta-theoretic level? It might seem not. Even the notions of consistency and consequence are essentially contested. We might find that logics $L_1$ and $L_2$ both have proponents; each claims their own logic as a legitimate formalisation of the notion of entailment but denies that the other logic is.
We could also find that a widely accepted mathematical theory T entails existential consequence E in logic $L_1$ but not in logic $L_2$. If a theorist duly deduces E from T using $L_1$, can she not be said to know E unless she can further prove that there is a distinction between correct and incorrect conceptions of entailment and that $L_1$ is an explication of the correct notion? Clearly not, this sets an impossibly high standard for justification and knowledge. It cannot, therefore, be held that Calum can only know, innocently, the mathematical consequences he derives from his distraction principle unless he can somehow refute, to everyone’s satisfaction, the claims of Angus, Bronagh, Dervla and Ewan to be providing rival, legitimate positions. To be justified in one’s claims regarding some topic one does not have to be able to knock out all other contenders to knowledge in a contest held in some Archimedean arena.

Nonetheless, even in the case of consistency and logical consequence, there is a legitimate concern the neo-Fregean has to answer. If $L_2$ is not a correct logic then from the neo-Fregean perspective there must be something in the practices of those who use it, or attempt to use it, which prevents its rules from being analytic, meaning-constitutive or otherwise epistemically innocent. Similarly the rules and principles of $L_1$ must have this favoured epistemically innocent status. The users of $L_1$ need not be able to demonstrate this is the
case. Nor indeed is it necessary that we, the meta-theorists, be able to do so either. But if we cannot offer some account of what it is for one theory to be correct, the other not, then the idea that the existential consequences of T in $L_1$ limn the true structure of mathematical reality but the rival ontology extracted from T by $L_2$ does not has no plausibility at all.

Only radical Quineans are likely to hold to the thesis that no logical practices can be said to be analytic or meaning-constitutive and that none can be ruled out as devoid of a coherent meaning. However, the claim that the full second-order logic invoked by neo-Fregeans is a body of analytic rules or axioms is, as remarked in Section 1, much more contentious. The move from second-order logic to abstraction principles is yet more contentious still. The neo-Fregean who cashes out ‘a priori’ as something like analytic or meaning-constitutive has to persuade us that it is reasonable to think that among rival abstractionist theorists such as those found in the Angus to Ewan group, at most one principle is analytic or meaning-constitutive. Supposing Ewan does limn the true structure of reality, it must be the case that his inferential practices (in inferring instances of the right-hand side of his distraction principle from the left-hand side and vice versa, for example) are analytic whilst those of the others are not. The neo-Fregean has to reject the notion that the inferential practices of Angus, Bronagh, Calum and Dervla are every bit as analytic of their notions of class as Ewan’s is of his. This, I would argue, is hugely implausible.

The neo-Fregean might, then, construe the a priori nature of mathematical knowledge not in terms of analyticity but in terms of implicit definition. In empirical science we can have two perfectly consistent but pairwise inconsistent theories both satisfiable by (non-isomorphic) abstract structures.
Yet only one of them might implicitly define a system of physical magnitudes (and perhaps explicitly define it, if the conditions for Beth’s theorem are met) because of the brute empirical fact that a real structure answering to the one exists but not to the other. Can the neo-Fregean hold that, for example, Calum might know, in brute external fashion, that his sets exist and Angus et al. fail to know the same of theirs for no other reason than that Calum’s universe is the actual universe of sets, none of the other theorists’ universes is? The danger here is obvious. How does the neo-Fregean position differ from Quinean holistic empiricism in which mathematical theories are posits which, like the rest of theoretical science, are to be confirmed or disconfirmed only indirectly to the extent that they contribute to a well-confirmed overall theory of the world? In what sense is Calum’s knowledge a priori? Had the mathematical universe been different, his mathematical beliefs would have been false, though they would have arisen in exactly the same way. From a traditional platonistic perspective, of course, this counterfactual is an empty one with an impossible antecedent: the same mathematical universe exists in all possible circumstances. This suggests a possible response by the neo-Fregean. The neo-Fregean might respond by rejecting the claim that
acceptability, because it depends on notions of provability and model-theoretic consequence, depends on mathematical notions which stand in need of further justification. The neo-Fregean might, for example, interpret these notions modally. In so doing, one could argue against Angus, Bronagh and Calum and so on, on the grounds that they all limit mathematical reality: there could be more than a finite, or $\beth_\omega$, or accessible number of things, and any theory which says otherwise cannot be conservative.³⁸

But there are evident problems with this response: if one appeals to a principle of modal maximality, “whatever size could exist, does actually exist in mathematical reality”, how on earth is one to represent this mathematically? What abstraction principle will one use? One might demur from providing a single principle and appeal instead to an infinite set of principles: add as many abstractions as one can till one reaches a maximal acceptable set. But why think there will be a unique such set? Even if one eschews uncountable languages and supposes we have a neutral notion of what size a set (or perhaps proper class) of abstraction principles could be, it is not the case that there is a neutral linear ordering of abstraction principles in terms of the size of the universe they permit, as the cases of Dervla and Ewan show.

Most fundamentally of all, though, this modal response owes us an explanation of our knowledge of modality and more generally an account of the nature of modality. How do we know that there could be infinite sets? If we do not know this, how can we rule that Angus’ finitary theory places illegitimate bounds on the size of the mathematical ontology? Clearly the neo-Fregean making this modal reply cannot analyse possibility as the existence of set-theoretic models since then our supposed knowledge that there could be infinite sets becomes knowledge of the actual existence of sets containing infinite sets and we are back with the problem we started with.
38 A variant on this response is to appeal to Dummett’s notion of “indefinite extensibility”. We should see the total mathematical universe as an “indefinitely extensible totality” so that no fixed set of abstraction principles captures it. I am very sceptical about the possibility of putting Dummett’s notion to any such use: see [32] §3.i.
Perhaps the neo-Fregean will take modality as primitive. But if she adopts a realist account of modality, we are owed an explanation of how we acquire our knowledge of what is possible and what is not. A Lewisian type of modal realism would once again bring us back to the very same problems: how do we know there exist causally and spatially isolated possible worlds containing infinite sets? Not by intuition, surely. Nor is it obvious that rival accounts of modal realism, possibilities as properties of actual reality, and so forth, have any better answer to these epistemological problems than Lewis has. Or the neo-Fregean might analyse necessity and possibility (at least of the type in question in mathematics) in terms of analyticity or kindred notions. A proposition is necessary if it can be derived using only analytic or meaning-constitutive inference rules, or some such. But once again we move from frying pan to fire. I think it is plausible that abstraction principles such as HP, when formulated in rule form, yield rules which are meaning-constitutive of the operators they introduce, just as certain types of introduction and elimination rules are arguably analytic for logical operators. But there is nothing to discriminate among abstraction principles in this regard (at least where they are all consistent); they can all be regarded as analytic, in this sense, of the operators they introduce. And to say that some are not genuinely possible, in the analytic sense of possibility, because they conflict with the real analytic abstraction principles which partially determine what is possible and what is not, once again involves us in a vicious regress.
The neo-Fregean may say that all intellectual argument and discussion must start from some framework of assumptions, even when revising, after the fashion of Neurath in his boat, those assumptions. In our case, the starting point of most philosophers of mathematics is that of a ZFC-like theory, so we are justified in interpreting stability and acceptability using that theory, even if the theory is a ladder which we kick away when moving to acceptance of an abstraction principle. 39 But if, as the foregoing considerations suggest, any reasonable abstractionist theory we arrive at will itself provide a vantage point from which we can see that many different theories will validate themselves as stable and acceptable and others as unacceptable, how can we justify hewing to the one we have arrived at? Not, surely, because of its closeness to ZFC. How could it be that a theory is a priori true because it fits well with a historically dominant theory which was developed by theorists who almost all rejected neo-Fregeanism and its account of a priori truth?
These considerations, then, while they cannot in the nature of the case amount to a conclusive proof that no satisfactory criterion for winnowing out acceptable from unacceptable abstraction principles will emerge, strongly indicate that there is no such criterion which can do the job the neo-Fregean needs it to do: roughly, to single out as a priori or epistemically innocent a consistent set of principles which can be interpreted in a semantically homogeneous fashion with respect to the empirical part of the physical theories they form part of, and which yield classical analysis and the mathematics needed for science.
8. Final remarks
Even if this is so, however, it does not follow that the neo-Fregean program has accomplished nothing. There may, for example, be significant partial successes. For there may be ways to blunt the above difficulties which capture much of what the neo-Fregean set out to achieve—some less ambitious but recognisably similar program may be one which can be carried through.
39 Though of course the favoured abstraction principle may entail that ZFC is a correct theory of pure sets, as far as it goes.
Among the possible revisions of the neo-Fregean program, the most radical move is to stand one’s ground right at the outset of the sequence of difficulties sketched above, and refuse to concede that some abstraction principles are unacceptable. That is, one embraces all abstraction principles, including Axiom V, as meaning-constitutive truths. As remarked at the beginning of Section 2, one must then blame the triviality of classical naïve set theory not on Axiom V but on the logic which generates triviality; and since triviality ensues in fairly weak logics, this option involves quite a radical breach with Frege’s thoroughly classical approach to logic. But that in itself is not a refutation. The most developed form of the naïve approach is that to be found in the dialetheism of Priest [22] and [23]. Priest accepts that Axiom V yields contradiction but concludes that, since it is analytically true, so are some contradictions, and adopts a paraconsistent logic in order to avoid triviality. But it is not necessary (or at least not obviously necessary) that one embrace true contradictions if one embraces Axiom V: radical enough revisions to the logic will block the derivation of contradiction (cf. Weir [33] and [34]). In both cases, however, one has to show that the revisions are not so radical as to block the derivation of standard mathematics from Axiom V. If either of these ‘naïve’ approaches could be made to work, it would help towards validating at least one major aspect of the neo-Fregean program, namely the idea that mathematics follows from meaning-constitutive truths. There is, however, a less radical way to circumvent the embarrassment of riches objection by embracing equally and without discrimination all abstraction principles, and that is to abandon any claim that second-order formal calculi, at least with the full impredicative axiom scheme of comprehension, are logics.
Rather one restricts logic to classical first-order logic, or perhaps predicative second-order systems, and combines logic thus circumscribed with abstraction schemata such as first-order Axiom V:
{x : ϕx} = {x : ψx} ↔ ∀x(ϕx ↔ ψx).
This, as Parsons has shown, is consistent, and Richard Heck has extended the result to predicative Axiom V in a setting of predicative second-order logic (Parsons [21], Heck [15]). The strategy can be extended to show that the set of all first-order abstraction principles is consistent. 40 The drawback here is that the resulting system is rather weak: certainly much weaker than second-order Peano Arithmetic, far less analysis or even the lower reaches of set theory. Nonetheless a theorem of infinity is provable in the system; indeed, as Richard Heck shows (op. cit.), the predicative theory is stronger than the arithmetic theory Q. A neo-Fregean amending her views in this way could no longer claim that all mathematical truths are analytic or epistemically innocent. She would have to adopt a two-tiered approach. There exists an a priori proof that there are infinitely many (presumably abstract) objects with the properties described in a theory around the strength of Q. As to their further properties, as to whether there are continuum-sized domains of abstract objects, for example with the structural properties characterised in analysis: here one can only put forward conjectures to be tested by the “fruitfulness of their consequences”. As against this one may say that if pragmatic justification is permitted for parts of mathematics, why not everywhere? But perhaps conjectures regarding a realm of abstract objects are on a better footing when one has an independent (in this case a priori) justification that the realm of objects itself exists. Nonetheless this type of revision undoubtedly also takes us far from the usual neo-Fregean conception.
40 The basic strategy here is to order the terms of the language and assign each predicate ϕx its own eigen object in a countably infinite domain; for classes one then assigns {x : ϕx} that object unless ϕ is co-extensive with an earlier term ψ, in which case {x : ϕx} is assigned the same referent as {x : ψx}. For any other abstraction principle with its operator [x : θx] one proceeds in the same way but substitutes “ϕ bears R to ψ” for “ϕ is co-extensive with ψ”, where R is the equivalence relation on properties generated by the (logical) right-hand side of the abstraction principle for [x : θx]; so long as one’s logic satisfies an extensionality principle, as pure second-order logic plus logical abstraction principles does (cf. [6], p. 555), R cannot be more fine-grained than co-extensionality.
A different response is to maintain that all consistent abstraction principles are true but to avoid incoherence not by radical change of logic but by relativising truth. If one can interpret the abstraction principles, and the existential claims following from them, as true in some sort of mind-dependent fashion, then one can accept each of the principles as analytic and as generating a notional universe, different such universes for different principles.
One cannot, classically, amalgamate these universes; but then many anti-realists hold that there can be a plurality of mind-dependent domains, domains which are incompatible or incommensurable in some way and so cannot be accumulated or subsumed into a single all-encompassing domain. If one was a realist in general but an anti-realist about mathematics in particular then this would yield exactly the right metaphysical position for a classicist who wishes to maintain that all (consistent) abstraction principles (of whatever order) are analytic. No mathematical domains exist in reality but a plurality of often incompatible such domains exist, virtually (whatever exactly that could mean; clearly there are enormous problems for the view being mooted in explicating this). Here then we divorce the two strands of neo-Fregeanism—the epistemological and the ontological—distinguished by Hale and Wright (cf. the introduction to [12]). The resulting anti-platonist neo-Fregeanism is less vulnerable to any ontological argument jibe since there is no commitment to the derivability of objective existence claims from concepts alone. The real universe is not at all the same as the notional universes which humans construct; on this view, the question as to the cardinality of the real universe is an absolute one to be answered not by mathematical theory but rather by empirical, non-analytic theories.
Wright himself toys with something like this non-realist line of thought: we shall have to say that how many objects there are, and hence which objects of which kinds there are, is something which is relative to the scheme of concepts we happen to employ; so that in the abstract realm, our adoption of a particular conceptual scheme affects not merely which objects we shall recognize to exist, as in the concrete case, but which objects actually exist. That is not perhaps an incoherent view. ([36] p. 293)
He goes on to say, though, that this position “is utterly foreign to the Fregean spirit which the new logicism was supposed to safeguard”. Certainly it is foreign to the platonistic strands in Fregean thought; but it may be the only way to safeguard the idea that our justification for our mathematical theories rests not with intuition nor with any indirect, and somewhat precarious, assessment of their utility in science but flows rather from the meaning of the mathematical operators which figure in our theories. The resulting view would perhaps be close to Dummett’s in his Frege: Philosophy of Mathematics (Dummett [4]): reference for mathematical terms is a “softer” notion than for non-mathematical terms. Whether this is a reasonable move for a neo-Fregean to make will depend on how dearly held the ontological aspect of Fregeanism is, compared to the epistemological. However that may be, the conclusion I draw overall is that, in the form in which it is presented by its leading exponents (as vindicating, in non-empiricist, non-Kantian fashion, mathematics platonistically construed), neo-Fregeanism is critically wounded by the embarrassment of riches objection; however, the neo-Fregean program has yielded rich insights into mathematical truth and epistemology, and less platonistic variants of the program may yet bear fruit. 41
41 An early version of this paper was read at the first Abstraction Day meeting at St. Andrews University in November 1998; a later version was read at the same venue in December 2000 under the auspices of the University’s Arché Centre. I am grateful to all the participants at those talks and in particular to Bob Hale, Stewart Shapiro and Crispin Wright for discussions then and on many other occasions. Further comments on later drafts by Julian Cole and an anonymous referee have also proved extremely helpful.
References
[1] Boolos, G., “Saving Frege from Contradiction”, in Boolos [4], pp. 171–82; first published in the Proceedings of the Aristotelian Society, 1986/87.
[2] Boolos, G., “The Standard of Equality of Numbers”, in Boolos [4], pp. 202–19; first published in G. Boolos (ed.), Meaning and Method: Essays in Honour of Hilary Putnam, Cambridge University Press, Cambridge, England, 1990.
[3] Boolos, G., “Whence the Contradiction”, in Boolos [4], pp. 220–36; first published in Proceedings of the Aristotelian Society, Supplementary Volume LXVII, 1993.
[4] Dummett, M., Frege: Philosophy of Mathematics, Duckworth, London, 1991.
[5] Field, H., Realism, Mathematics and Modality, Blackwell, Oxford, 1989.
[6] Fine, K., “The Limits of Abstraction”, in M. Schirn (ed.), Philosophy of Mathematics Today, Clarendon Press, Oxford, 1998, pp. 503–629.
[7] Forster, T.E., Set Theory with a Universal Set, Clarendon Press, Oxford, 1992.
[8] Frege, G., Grundlagen, translated by J. L. Austin as The Foundations of Arithmetic, Second Edition, Blackwell, Oxford, 1980.
[9] Garland, S., “Second-order Cardinal Characterizability”, Proceedings of the Symposia in Pure Mathematics, 113 (II) (1974), pp. 127–46.
[10] Hale, B., Abstract Objects, Blackwell, Oxford, 1987.
[11] Hale, B. and Wright, C., “Implicit Definition and the A Priori”, in P. Boghossian and C. Peacocke (eds.), New Essays on the A Priori, Clarendon Press, Oxford, 2000, pp. 286–319.
[12] Hale, B. and Wright, C., The Reason’s Proper Study, Clarendon Press, Oxford, 2001.
[13] Hale, B., “Reals by Abstraction”, in Hale and Wright [14], pp. 399–420; first published in Philosophia Mathematica (III), 8 (2000).
[14] Heck, R., “On the Consistency of Second-Order Contextual Definitions”, Noûs, 26 (1992), pp. 491–4.
[15] Heck, R., “The Consistency of Predicative Fragments of Frege’s Grundgesetze der Arithmetik”, History and Philosophy of Logic, 17 (1996), pp. 209–20.
[16] Heck, R., “Finitude and Hume’s Principle”, Journal of Philosophical Logic, 26 (1997), pp. 589–617.
[17] Levy, A., “The Definability of Cardinal Numbers”, in J. Bulloff (ed.), Foundations of Mathematics, Springer, Berlin, 1969, pp. 15–38.
[18] Moore, G.H., “Beyond First-Order Logic: The Historical Interplay between Mathematical Logic and Axiomatic Set Theory”, History and Philosophy of Logic, 1 (1980), pp. 95–137.
[19] Moore, G.H., “The Emergence of First-Order Logic”, in History and Philosophy of Modern Mathematics (Minnesota Studies in the Philosophy of Science No. 11), University of Minnesota Press, Minneapolis, 1988, pp. 95–135.
[20] Oberschelp, A., Set Theory over Classes, Dissertationes Mathematicae, CVI, Państwowe Wydawnictwo Naukowe, Warsaw, 1973.
[21] Parsons, T., “On the Consistency of the First-Order Portion of Frege’s Logical System”, Notre Dame Journal of Formal Logic, 28 (1987), pp. 161–68.
[22] Priest, G., In Contradiction, Nijhoff, Dordrecht, 1987.
[23] Priest, G., “What’s so Bad about Contradictions”, Journal of Philosophy, 95 (1998), pp. 410–26.
[24] Shapiro, S., Foundations without Foundationalism, Clarendon Press, Oxford, 1991.
[25] Shapiro, S., “Induction and Indefinite Extensibility: The Gödel sentence is true but did somebody change the subject?”, Mind, 107 (1998), pp. 597–624.
[26] Shapiro, S. and Weir, A., “New V, ZF and Abstraction”, Philosophia Mathematica, 7 (1999), pp. 293–321.
[27] Shapiro, S. and Weir, A., “‘Neo-logicist’ Logic is not Epistemically Innocent”, Philosophia Mathematica (III), 8 (2000), pp. 160–89.
[28] Tennant, N., Anti-realism and Logic, Clarendon Press, Oxford, 1978.
[29] Tennant, N., “On the Necessary Existence of Numbers”, Noûs, 31 (1997), pp. 307–36.
[30] Tennant, N., The Taming of the True, Clarendon Press, Oxford, 1997.
[31] Weir, A., “Classical Harmony”, Notre Dame Journal of Formal Logic, 27 (1986), pp. 459–82.
[32] Weir, A., “Dummett on Impredicativity”, Grazer Philosophische Studien, 55 (1998), pp. 65–101.
[33] Weir, A., “Naïve Set Theory is Innocent!”, Mind, 107 (1998), pp. 763–98.
[34] Weir, A., “Naïve Set Theory, Paraconsistency and Indeterminacy I”, Logique et Analyse, 161–163 (1998), pp. 219–66.
[35] Wright, C., Frege’s Conception of Numbers as Objects, Aberdeen University Press, Aberdeen, 1983.
[36] Wright, C., “On the Philosophical Significance of Frege’s Theorem”, in Hale and Wright [14], pp. 272–306; first published in R. Heck (ed.), Language, Thought and Logic: Essays in Honour of Michael Dummett, Clarendon Press, Oxford, 1997.
[37] Wright, C., “On the Harmless Impredicativity of N= (Hume’s Principle)”, in Hale and Wright [14], pp. 229–55; first published in M. Schirn (ed.), Philosophy of Mathematics Today, Clarendon Press, Oxford, 1998.
[38] Wright, C., “Is Hume’s Principle Analytic”, in Hale and Wright [14], pp. 307–34; first published in the Notre Dame Journal of Formal Logic, 40, 1999.
ITERATION ONE MORE TIME 1
Roy T. Cook
1. Motivation
There are (at least) two reasons for investigating abstraction principles for set theory. The first concerns the technical feasibility of a neo-logicist foundation for all of mathematics. The second concerns the connection between the theory of Fregean extensions (as codified in various restrictions of Basic Law V) and the mathematical notion of set (as codified in various axiomatic set theories, such as ZFC). 2 Neo-logicists argue that we can reproduce (the most important parts of) mathematics using abstraction principles. An abstraction principle is any second-order formula 3 of the form:
(∀P)(∀Q)[@(P) = @(Q) ↔ E(P, Q)]
“@” here is a function from properties (or relations) to objects, and E is an equivalence relation on the properties (or relations). Abstraction principles are intended, in some sense, to be implicit definitions of the abstraction operator @ occurring on the left-hand side of the biconditional, and as a result allow us to take, as objects, characteristics that the properties (or relations) have in common. Frege’s Basic Law V is:
BLV: (∀P)(∀Q)[EXT(P) = EXT(Q) ↔ (∀x)(Px ↔ Qx)]
Frege derives all of arithmetic from BLV plus second-order logic, but Russell’s discovery that BLV is inconsistent with the second-order comprehension
1 This paper first appeared in the Notre Dame Journal of Formal Logic 44 [2004], pp. 63–92. Reprinted by kind permission of the editor and the University of Notre Dame.
2 This paper is intended, among other things, to further the comparison of the iterative and limitation of size conceptions of set begun by George Boolos in “The Iterative Conception of Set” [1971] and “Iteration Again” [1989].
3 I assume standard set theoretic semantics for second-order logic, where the second-order predicate variables range over the full powerset of the domain, and which contains the second-order axiom of choice and the full comprehension scheme. For details see Shapiro [1991].
Roy T. Cook (ed.), The Arché Papers on the Mathematics of Abstraction, 421–454. © 2007 Springer.
axiom renders this result less noteworthy. The resurrection of logicism stems from the observation that Frege’s only ineliminable use of BLV occurs in his derivation of Hume’s Principle:
HP: (∀P)(∀Q)[NUM(P) = NUM(Q) ↔ P ≈ Q]
(P ≈ Q is the second-order formula asserting that there is a 1–1 correspondence between the P’s and the Q’s.) 4 The “NUM” operator is, in effect, a number-generating function, mapping properties onto the number corresponding to the cardinality of the extension of the property. Unlike BLV above, HP is consistent. Frege’s derivation of arithmetic in the Grundgesetze can be reconstructed in second-order logic plus HP, thereby avoiding the troublesome BLV. This result, quite remarkable as a mathematical fact independent of any philosophical implications, has come to be called Frege’s Theorem. Given the success of Hume’s Principle, neo-logicists have attempted to extend this treatment to more powerful mathematical theories. Although the results are somewhat promising in the case of real analysis (see Hale [2000]), the attempts to capture set theory within the neo-logicist framework have so far been disappointing (see Shapiro and Weir [1999]). The purpose of this paper is to further investigate such neo-Fregean treatments of set theory.
Two issues arise when one is reconstructing mathematical theories within the neo-logicist framework, one purely mathematical and one purely (or primarily) philosophical. First, one has to formulate abstraction principles which provide one with what is recognizably the mathematical theory in question. Second, one needs to defend these principles as neo-logicistically acceptable, where the notion of ‘acceptable’ might be fleshed out in terms of analyticity, implicit definition, stipulation, and so on. I shall have little to say here with regard to the second issue, and that only in passing. It is the first issue that is addressed by the results below.
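As an illustration of how HP, as stated above, constrains the size of its models, the following is a minimal sketch (not taken from the paper) of a counting check on small finite domains. It assumes full second-order semantics, so that monadic properties are just subsets of the domain; the function names `equinumerosity_classes` and `hp_has_model_on` are hypothetical. On a domain of n objects there are n + 1 equinumerosity classes of subsets, and HP requires NUM to send distinct classes to distinct objects, so the check fails for every finite n.

```python
from itertools import combinations


def equinumerosity_classes(domain):
    """Partition all subsets of a finite domain by equinumerosity.

    On a finite domain two subsets are in 1-1 correspondence exactly
    when they have the same number of elements, so size serves as the key.
    """
    items = list(domain)
    classes = {}
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            classes.setdefault(size, []).append(frozenset(subset))
    return classes


def hp_has_model_on(domain):
    """True if NUM could map distinct classes to distinct objects of the domain.

    HP forces NUM to be constant on each class and to differ across
    classes, so a model exists on the domain only if there are at least
    as many objects as classes.
    """
    return len(equinumerosity_classes(domain)) <= len(set(domain))


if __name__ == "__main__":
    for n in range(1, 6):
        classes = equinumerosity_classes(range(n))
        print(n, len(classes), hp_has_model_on(range(n)))
```

The check is only a finite sanity test; it says nothing about which infinite domains satisfy HP.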
Even if one is not amenable to the philosophy of mathematics espoused by neo-Fregeans, the framework provided by neo-logicist style variants of Basic Law V nevertheless provides an elegant and powerful setting within which to study and compare various intuitive notions of set (or of collection). George Boolos’ NewV [1989] was formulated in order to capture one popular idea underlying attempts to provide a foundation for set theory (and thus for all of mathematics), the limitation of size conception of set. NewerV, the abstraction principle introduced below, is intended to codify its main rival, the 4 Every abstraction principle considered in this paper can be expressed using only the resources of second-order logic (plus, in some cases, previously defined abstraction operators). I give the formal expressions in the notes as necessary. The second-order formula expressing that there is a one-to-one correspondence between the P’s and the Q’s is:
(∃R)((∀x)(Px → (∃!y)(Qy ∧ Rxy)) ∧ (∀z)(Qz → (∃!x)(Px ∧ Rxz)))
where (∃!x)Φ(x) is an abbreviation for: (∃x)(Φ(x) ∧ (∀y)(Φ(y) → y = x)).
iterative conception of set. As we shall see, NewV and NewerV provide quite different theories of Fregean extensions (i.e. set theories), and neither provides an account of sets as strong as second-order ZFC. As a result, we seem forced to accept that the notion of set and the accompanying formal set theory accepted and studied by mathematicians and philosophers outstrips the content of both the limitation of size doctrine and the iterative conception of set.
2. Two notions of set
Historically, there are (at least) two competing notions of set that have motivated mathematicians and philosophers studying the foundations of mathematics, the iterative conception, and the limitation of size conception. 5 Boolos, in his insightful comparison of the two notions in “Iteration Again” [1989], characterizes two versions of the limitation of size notion: On a stronger version of limitation of size, objects form a set if and only if they are not in one–one correspondence with all the objects there are. On a weaker, there is no set whose members are in one–one correspondence with all objects, but objects do form a set if they are in one–one correspondence with the members of a given set. (Under certain natural conditions, this last hypothesis can be weakened to: if there are no more of them than there are members of a given set.) The difference between the two versions is that the weaker does not guarantee that objects will always form a set if they are not in one–one correspondence with all objects. (p. 90)
5 There have been a number of careful and thorough studies of the historical development of, and interactions between, these two notions of set, Hallett [1984] being one of the best. I do not propose to make any contribution to this historical project here, but intend rather to examine the technical merits of the iterative conception as formulated within the neo-logicist framework.
Boolos’ NewV, which will be examined briefly in the next section, corresponds to a neo-logicist reconstruction of the stronger version of the limitation of size conception of set. The iterative notion of set, founded on the idea that each set is built up from other sets or objects that are simpler, or at least prior, is characterized by Boolos as follows:
According to the iterative, or cumulative, conception of sets, sets are formed at stages; indeed, every set is formed at some stage of the following “process”: at stage 0 all possible collections of individuals are formed . . . The sets formed at stage 1 are all possible collections of sets formed at stage 0, . . . The sets formed at stage 2 are all possible collections of sets formed at stages 0 and 1. The sets formed at stage 3 are all possible collections of sets formed at stages 0, 1, and 2 . . . The sets formed at stage 4 . . . In general, for any natural number n, the sets formed at stage n are all possible collections of sets formed at stages earlier than n, i.e. stages 0, 1, . . . , n − 1. Immediately after all stages 0, 1, 2, . . . there is a stage, stage ω. The sets formed at stage ω are, similarly, all possible collections of collections of sets formed at stages earlier than ω, i.e., stages 0, 1, 2, . . . After stage ω comes stage ω + 1: at which . . . In general, for each α, the sets formed at stage α are all possible collections of sets formed at stages earlier than α. There is no last stage: each stage is immediately followed by another. Thus there are stages ω + 2, ω + 3, . . . Immediately after all of these, there is a stage ω + ω, alias ω · 2. Then ω · 2 + 1, ω · 2 + 2, etc. Immediately after all ω, ω · 2, ω · 3, . . . comes ω · ω, alias ω². Then ω² + 1 . . . and so it goes. ([1989], p. 88)
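The finite part of the stage process Boolos describes can be made concrete with a short sketch. This is an illustration only, not Boolos' axiomatization: the function names are hypothetical, and the code assumes that individuals, if any are supplied, remain available as members at every later stage (the quoted passage elides this detail). With no individuals, the numbers of sets formed at stages 0 to 3 are 1, 2, 4 and 16.

```python
from itertools import combinations


def all_collections(items):
    """Return every subcollection of `items` as a frozenset."""
    items = list(items)
    return {
        frozenset(chosen)
        for size in range(len(items) + 1)
        for chosen in combinations(items, size)
    }


def sets_formed_at_stages(last_stage, individuals=()):
    """List, for stages 0..last_stage, the sets formed at each stage.

    At each stage, every collection of the things available so far
    (the individuals plus all sets formed at earlier stages) is formed.
    Assumption: individuals stay available at all later stages.
    """
    available = set(individuals)
    formed = []
    for _ in range(last_stage + 1):
        new_sets = all_collections(available)
        formed.append(new_sets)
        available |= new_sets
    return formed


if __name__ == "__main__":
    # Stage 4 already yields 65536 sets, so only small stages are listed.
    for stage, sets in enumerate(sets_formed_at_stages(3)):
        print(stage, len(sets))
```

The sketch covers only finite stages; nothing in it produces a stage ω, which is the point at issue in the discussion of Inf that follows.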
Boolos gives a formal axiomatization of stages, and sets formed at stages, and investigates which set theoretic axioms follow from this characterization. In many respects we will end up agreeing with Boolos’ conclusions. There is one major point of (possible) disagreement, however, concerning the axiom of infinity. Thus, a brief look at Boolos’ discussion of infinity is in order. Boolos argues that the axiom of infinity follows from the iterative conception of sets, but this is only because, in characterizing sets as formed in stages, he assumes that there is a limit stage, that is, a stage ω. After providing an axiom called Inf he writes: Inf states that there is a “limit” stage, a stage later than some stage but not immediately later than any stage earlier than it: the existence of stage ω and hence of such a stage as Inf claims to exist is a notable feature of the conception we have described. Inf is too weak to capture the full strength of the claims about the existence of infinite stages made in the rough description; a further axiom would be needed to guarantee the existence of a stage ω + ω, for example. It suffices, however, for the derivation of the sentence of set theory customarily called “the axiom of infinity”. Inf, it should be noted, is used only in the derivation of the axiom of infinity. ([1989], p. 92)
Even if too weak to capture all of the iterative conception as described in the passage quoted earlier, Inf is still, as Boolos puts it, quite ‘notable’, since it amounts to nothing less than assuming the truth of the axiom of infinity. This is not to say that Boolos has given an incorrect description of (the intuitions behind) the iterative conception, rather, he has described one conception of set, which we might call Boolos-iterative set theory, that is codified by something like ZFC−replacement. In what follows, a more general conception of iteration (based on abstraction) will be presented, one that does not itself imply the axiom of infinity. Within this framework we can isolate additional principles of varying strength that imply (among other things) the existence of an infinite set. In particular, we will see exactly what assumptions are needed in order to arrive at a theory akin to Boolos-iterative set theory.
3. NewV
As the first step towards a neo-logicist account of the limitation of size conception of set theory, a variation of BLV due to George Boolos [1989] called NewV has been proposed (where “P is ‘Big’” is an abbreviation for the second-order formula asserting that the P’s are equinumerous with the entire domain): 6
NewV: (∀P)(∀Q)[EXT(P) = EXT(Q) ↔ ((∀x)(Px ↔ Qx) ∨ (Big(P) ∧ Big(Q)))]
A set is the extension of a small property:
Set(x) ↔ (∃P)[x = EXT(P) ∧ ¬Big(P)]
The membership relation is defined in terms of the EXT operator:
x ∈ y ↔ (∃P)[Px ∧ y = EXT(P)]
Restricting the relevant quantifiers to sets, NewV entails the second-order extensionality, separation, empty set, pairing, and replacement axioms. 7 Oddly, however, NewV proves the negation of the union axiom:
Union: (∀x)(Set(x) → (∃y)(Set(y) ∧ (∀z)(z ∈ y ↔ (∃w)(z ∈ w ∧ w ∈ x))))
The reason for this failure is that the singleton of the ‘Bad’ extension (i.e. the singleton of the extension of all ‘Big’ properties) is a set, but its union, the ‘Bad’ extension itself, is not. We can reformulate the union axiom so that, for any set, the axiom asserts the existence of another set that contains exactly the elements of every set that is contained in the original set; that is, we ignore any elements of the original set that are not sets themselves:
Union∗: (∀x)(Set(x) → (∃y)(Set(y) ∧ (∀z)(z ∈ y ↔ (∃w)(Set(w) ∧ (z ∈ w ∧ w ∈ x)))))
NewV entails this variant of the axiom, since ∪{∅} = ∪{EXT(x = x)} = ∅. Here I do not wish to get embroiled in debates regarding which of these is the “correct” formulation of the union axiom, so in what follows we shall examine the behavior of both principles.
6 (∃f)((∀x)(∀y)(f(x) = f(y) → x = y) ∧ (∀x)(∃y)(Py ∧ f(y) = x)). In light of the Schröder–Bernstein theorem, which can be proved in second-order logic (see Shapiro [1991], pp. 102–103), this can be simplified to (∃f)(∀x)(∃y)(Py ∧ f(y) = x).
7 The axiom of choice for sets follows immediately from the fact that we have assumed that choice holds in second-order logic.
There are a number of ways that we might further restrict the notion of set. First, we lay down two conditions concepts might satisfy:
BoolosClosed(F) ↔ (∀y)((Set(y) ∧ (∀z)(z ∈ y → Fz)) → Fy)
Transitive(F) ↔ (∀y)((Set(y) ∧ Fy) → (∀z)(z ∈ y → Fz))
We can then define several useful conditions that sets might satisfy:
BoolosPure(x) ↔ (∀F)(BoolosClosed(F) → Fx)
Transitive(x) ↔ (∀F)((∀y)(Fy ↔ y ∈ x) → Transitive(F))
Hereditary(x) ↔ (∃F)(Transitive(F) ∧ (∀y)(Fy → Set(y)) ∧ Fx)
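Returning to the failure of the original union axiom under NewV: the reasoning can be spelled out step by step. The following is a sketch of one way to fill in the argument, using only NewV and the definitions of Set and ∈ given above; the labels b and s are introduced here for illustration, and step (3) assumes the domain contains at least two objects, so that a singleton property is not Big.

```latex
% Sketch (not from the source): why Union fails under NewV.
% b abbreviates the 'Bad' extension EXT(x = x); s is its singleton.
% Assumption: the domain has at least two objects, so (x = b) is small.
\begin{enumerate}
  \item If $\neg\mathrm{Big}(Q)$, then
        $\forall z\,\bigl(z \in \mathrm{EXT}(Q) \leftrightarrow Qz\bigr)$:
        by NewV, $\mathrm{EXT}(P) = \mathrm{EXT}(Q)$ with $Q$ small forces
        $\forall x\,(Px \leftrightarrow Qx)$.
  \item $\forall z\,(z \in b)$, since $z = z$ holds of every $z$ and
        $b = \mathrm{EXT}(x = x)$.
  \item $s = \mathrm{EXT}(x = b)$ is a set, and by (1)
        $\forall w\,(w \in s \leftrightarrow w = b)$.
  \item Suppose $\mathrm{Set}(y)$ and
        $\forall z\,\bigl(z \in y \leftrightarrow
        \exists w\,(z \in w \wedge w \in s)\bigr)$.
        By (3) and (2), $\forall z\,(z \in y)$.
  \item Write $y = \mathrm{EXT}(Q)$ with $\neg\mathrm{Big}(Q)$.
        By (1), $\forall z\,Qz$, so $Q$ is coextensive with $x = x$
        and hence Big. Contradiction.
\end{enumerate}
```

Under Union∗ the argument is blocked at step (4), because b is not a set and so contributes no members to the union.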
Intuitively, Boolos-pure 8 sets are those that we can “build up” from the empty set. A set is hereditary if its members are sets, and the members of its members are sets, and the members of the members of its members are sets . . . ad infinitum. We can straightforwardly prove that NewV (in fact, any consistent restriction of Basic Law V) implies that all Boolos-pure sets are hereditary. The possibility, within NewV set theory, of hereditary sets that are not Boolos-pure has been extensively studied in Uzquiano and Jané [2004]. If we restrict the quantifiers to Boolos-pure sets or hereditary sets we can still derive the axioms of extensionality, separation, empty set, pairing, union (the original formulation, in addition to union∗), and replacement. NewV also proves the axiom of foundation when relativized to the Boolos-pure sets, although foundation may fail for the hereditary sets (see Uzquiano and Jané [forthcoming]). The failure of foundation for hereditary sets will become important in the discussion of the iterative conception below. Neither the axiom of infinity nor the powerset axiom (nor either of them relativized to any of the restrictions discussed above) follows from NewV alone, however. 9 We should note that the failure of these axioms does not depend on the particular way in which we interpreted “set” and “∈” within the theory of NewV, since the conditions relevant to the satisfaction of infinity and powerset can be formulated independently of these definitions (namely, that the universe contain a non-‘Big’ extension holding of infinitely many extensions on the one hand, and that the collection of extensions must be either countably infinite or of size α for a limit α on the other). Thus, if we wish to formulate a neo-logicist set theory of a strength similar to that of ZFC, NewV is mathematically inadequate.
4. The basic formal theory
The first step in formulating the iterative conception of set within the neo-logicist framework is to generate, in some neo-logicistically acceptable way, an ordering of some definite collection of objects that can serve to enumerate stages. We achieve this by utilizing a variant of the Order-Type Abstraction Principle: 10
OAP: (∀R)(∀S)[OT(R) = OT(S) ↔ R ≅ S]
Of course, OAP is inconsistent – the Burali-Forti Paradox can be derived from it. Consider, however, the Size-Restricted Ordinal Abstraction Principle: 11
SOAP: (∀R)(∀S)[ORD(R) = ORD(S) ↔ (((¬WO(R) ∨ Big(R)) ∧ (¬WO(S) ∨ Big(S))) ∨ (WO(R) ∧ WO(S) ∧ R ≅ S ∧ ¬Big(R) ∧ ¬Big(S)))]
We first note that SOAP is satisfiable (our metatheory throughout the paper will be first-order ZFC-foundation):
Theorem 4.1: SOAP can be satisfied on any infinite set.
Proof: Given an infinite set X, we can construct a model of SOAP with X as domain: Let κ be the cardinality of X. Then there is a 1–1 mapping f from κ onto X. For each non-Big well-ordering R on X, ORD(R) is f(γ + 1) where γ < κ is the ordinal such that R is isomorphic to γ. For any relation R on X where R either is not a well-ordering or is Big, ORD(R) is f(0).
Additionally, SOAP is only satisfied on infinite models:
Theorem 4.2: Any model of SOAP has an infinite domain.
Proof: Assume that M is a model of SOAP with domain D where |D| = n for some finite n. Then there are distinct objects given by SOAP for each of the well-ordering types 0, 1, . . . , n – 1, and there is an object that is the value of ORD(R) for any R that is Big or not a well-ordering. Since this latter object is distinct from the objects given by SOAP for each of the non-Big ordering types, D contains at least n + 1 distinct objects. Contradiction.
8 Boolos-closed and Boolos-pure are the conditions Boolos calls “closed” and “pure” on p. 100 of Boolos [1989]. The formulation of the notion of hereditary set, and the observation that the class of hereditary sets need not be coextensive with the class of Boolos-pure sets, appears for the first time in Uzquiano and Jané [2004] (as does the terminology “Boolos-pure”).
9 To see that the axiom of infinity does not follow from NewV we need merely note that the model <V{⊗}(ω), I> satisfies NewV where (for ⊗ an arbitrary object, not a hereditarily finite set):
V{⊗}(0) = {⊗}
V{⊗}(n + 1) = V{⊗}(n) ∪ ℘(V{⊗}(n))
V{⊗}(ω) = ∪V{⊗}(n) (n ∈ ω)
I(EXT(P)) = {x ∈ V{⊗}(ω) : x ∈ I(P)} if {x ∈ V{⊗}(ω) : x ∈ I(P)} is finite.
I(EXT(P)) = ⊗ otherwise.
Demonstrating the independence of the powerset axiom is a bit more complicated (see Shapiro and Weir [1999]).
10 R ≅ S abbreviates the claim that R and S are isomorphic. We can give the second-order formula for R and S being isomorphic as follows. First, the following abbreviations:
A(x) =df (∃y)(R(x, y) ∨ R(y, x))
B(x) =df (∃y)(S(x, y) ∨ S(y, x))
We then have:
(∃f)((∀x)(A(x) → B(f(x))) ∧ (∀x)(B(x) → (∃y)(f(y) = x)) ∧ (∀x)(∀y)(f(x) = f(y) → x = y) ∧ (∀x)(∀y)(R(x, y) ↔ S(f(x), f(y)))).
11 WO(R) abbreviates the claim that R is a well-ordering. Define A(x) as in note 10 above. Then:
(∀x)(¬R(x, x)) ∧ (∀x)(∀y)(∀z)((R(x, y) ∧ R(y, z)) → R(x, z)) ∧ (∀P)(((∃x)(Px) ∧ (∀x)(Px → A(x))) → (∃y)(Py ∧ (∀z)(Pz → (z = y ∨ R(y, z))))).
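For finite relations, the conditions in notes 10 and 11 can be checked mechanically. The following Python sketch is an illustration only and is not part of the formal development: it represents a relation as a set of ordered pairs, and the helper names `field`, `is_well_ordering`, and `isomorphic` are hypothetical. `field` plays the role of the abbreviation A(x), `is_well_ordering` transcribes the three conjuncts of WO(R), and `isomorphic` searches for a bijection f between fields such that R(x, y) ↔ S(f(x), f(y)).

```python
from itertools import combinations, permutations

def field(rel):
    """Objects related to something by rel, i.e. the x satisfying A(x)."""
    return {x for pair in rel for x in pair}

def is_well_ordering(rel):
    """Finite transcription of WO(R): irreflexive, transitive, least members."""
    fld = field(rel)
    irreflexive = all((x, x) not in rel for x in fld)
    transitive = all(
        (x, z) in rel
        for (x, y) in rel
        for (y2, z) in rel
        if y == y2
    )
    # Every nonempty subcollection of the field has an R-least member.
    least_member = all(
        any(all(z == y or (y, z) in rel for z in sub) for y in sub)
        for size in range(1, len(fld) + 1)
        for sub in combinations(fld, size)
    )
    return irreflexive and transitive and least_member

def isomorphic(r, s):
    """Brute-force search for a bijection between fields that carries r onto s."""
    dom, cod = list(field(r)), list(field(s))
    if len(dom) != len(cod):
        return False
    for image in permutations(cod):
        f = dict(zip(dom, image))
        if {(f[x], f[y]) for (x, y) in r} == set(s):
            return True
    return False

three = {(0, 1), (1, 2), (0, 2)}                 # the ordering 0 < 1 < 2
letters = {("a", "b"), ("b", "c"), ("a", "c")}   # the ordering a < b < c
cycle = {(0, 1), (1, 0)}                         # fails transitivity

print(is_well_ordering(three), is_well_ordering(cycle))
print(isomorphic(three, letters))
```

The brute-force search is exponential in the size of the field, so it is only suitable for very small examples; it is meant to make the quantifier structure of the two formulas visible, not to be efficient.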
The following abbreviation will be useful:
ON(α) ↔ (∃R)(α = ORD(R) ∧ ¬Big(R) ∧ WO(R))
It is important to emphasize that ordinal numbers (i.e. the objects in the range of the ORD operator), upon which we will be building our neo-logicist account of set theory, are not (or are not necessarily) identical to the sets that we usually call ordinals (i.e. ∅, {∅}, {∅, {∅}}, . . ., ω, etc.). Thus, in what follows we will be careful to distinguish between ordinal numbers (i.e. ORD(R) for some R) and ordinals (i.e. transitive pure sets well-ordered by membership). Nevertheless, there is a clear correspondence between the class of ordinals and the class of ordinal numbers, and we shall, for convenience, use lower-case Greek letters for both.
We define the ordering on the ordinal numbers generated by SOAP in the usual way:
ORD(R) < ORD(S) ↔ (∃f)((∀x)(∀y)((R(x, y) → S(f(x), f(y))) ∧ (∃z)(∀w)((∃v)(R(v, w) → S(f(w), z)))))
The common theorems about well-orderings can be proved to hold of the ordinal numbers generated by SOAP by standard proofs and are assumed in what follows. Most important is the fact that Theorems 4.1 and 4.2 imply the following corollary:
Corollary 4.3: For any model of SOAP, the collection of ordinal numbers, ordered by <, is isomorphic to an infinite ZFC cardinal number.
This implies that, in any model of SOAP, there is no last ordinal number.
Next, we have a principle telling us what objects we have access to prior to “applying” the iterative operation of set formation. This Basis Axiom will be some instance of the following schema:
BA: BASE(x) ↔ Φ(x)
The strength of our iterative set theory will depend greatly on what formula we select for Φ, as we shall see in Sections 8 and 11 when we examine some particular candidates. There is no restriction that the members of the basis are not sets, and we allow that the ‘Bad’ extension (‘Bad’ within the present context is defined below) might be contained in the basis. We now define the notion of ‘stage’.
(In what follows, three different membership symbols will appear. ∈S is used when defining our notion of stage. ∈N is the set-theoretic membership relation defined within the neo-logicist set theory. Finally, ∈ without subscripts is to be understood as the membership relation of first-order ZFC-foundation, used when we are working in the metatheory. Subscripts (or the lack thereof) will also be used to label notions defined in terms of the various notions of membership.)
For all α such that α is an ordinal number and all x,
x ∈S Stg(α) ↔ ON(α) ∧ BASE(x) ∨ (∃P)(x = EXT(P) ∧ (∃β)(ON(β) ∧ β < α ∧ (∀y)(Py → y ∈S Stg(β)))).
The first stage consists of the elements of the basis, and each succeeding stage contains the basis (if any) plus the extension of every property all of whose instances are contained in some prior stage. This definition guarantees that if x ∈S Stg(α) for some ordinal number α, then x ∈S Stg(β) for all β > α.
The following abstraction principle “generates” extensions of properties within the iterative hierarchy:
NewerV: (∀P)(∀Q)[EXT(P) = EXT(Q) ↔ (¬(∃α)(ON(α) ∧ (∀x)(Px → x ∈S Stg(α))) ∧ ¬(∃α)(ON(α) ∧ (∀x)(Qx → x ∈S Stg(α)))) ∨ (∀x)(Px ↔ Qx)]
To clarify things, we can reword NewerV along the lines of Boolos’ NewV:
NewerV: (∀P)(∀Q)[EXT(P) = EXT(Q) ↔ ((∀x)(Px ↔ Qx) ∨ (Bad(P) ∧ Bad(Q)))]
where:
Bad(P) ↔ ¬(∃α)(ON(α) ∧ (∀x)(Px → x ∈S Stg(α)))
Boolos’ definitions of set and membership can be reformulated in the present context to obtain similar notions within NewerV set theory (NewerV set theory should be understood to denote the theory that follows from the conjunction of NewerV and SOAP): 12
Set(x) ↔ (∃P)[x = EXT(P) ∧ ¬Bad(P)]
x ∈N y ↔ (∃P)[Px ∧ y = EXT(P)]
Along these lines, we can define notions of Boolos-pure sets, hereditary sets, and so on, just as was done for NewV above.
12 Although it is tempting to identify ∈N and ∈S, the reader should be careful not to. “x ∈S Stg(α)” is a defined binary relation holding between an object x and an ordinal number α, and asserts, intuitively, that x is ‘formed’ by the αth stage. In other words, ∈S and Stg(α) are not separable pieces of vocabulary. On the other hand, “∈N” expresses a relation holding of an object and an extension. As a result “x ∈N Stg(α)” is not well-formed in the language given above. Nevertheless:
(∀x)(∀α)(α is an ordinal number → (x ∈S Stg(α) ↔ x ∈N EXT(y ∈S Stg(α))))
is a theorem.
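The definition of stage can be made concrete for finite stages over a finite basis. The Python sketch below is a toy illustration under explicit assumptions: extensions are modelled as `frozenset`s, basis elements as strings, only the finite ordinal numbers 0, 1, 2, . . . are considered, and the names `powerset` and `stage` are hypothetical. It transcribes the clause that x is in Stg(α) if x is in the basis or x is the extension of a property all of whose instances lie in Stg(β) for some β < α.

```python
from itertools import combinations

def powerset(xs):
    """All sub-collections of xs, each modelled as a frozenset."""
    xs = list(xs)
    return {frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)}

def stage(base, alpha):
    """Stg(alpha) for a finite ordinal alpha over a finite basis (toy version)."""
    members = set(base)                      # the clause BASE(x)
    for beta in range(alpha):                # the clause "for some beta < alpha"
        members |= powerset(stage(base, beta))
    return members

empty_basis = set()
one_urelement = {"a"}                        # hypothetical basis with one object

print([len(stage(empty_basis, n)) for n in range(5)])
print([len(stage(one_urelement, n)) for n in range(4)])
# Cumulativity: whatever is in an earlier stage is in every later one.
print(all(stage(one_urelement, n) <= stage(one_urelement, n + 1) for n in range(3)))
```

The recursion recomputes earlier stages and grows exponentially, so it is usable only for the first few stages; the point is to show how the basis and the "some prior stage" clause interact.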
A clarification at this point is useful to avoid confusion. We can define the notion of urelement in the standard way as follows:
UR(x) ↔ ¬Set(x)
Again, there is no guarantee that the elements of the basis are all urelements, or vice versa. If we restrict the relevant quantifiers to sets, then NewerV entails the extensionality, empty set, separation, union∗ (but not union), pairing, and powerset axioms. Derivations are given in the appendix. NewerV, like NewV, proves the axiom of foundation if restricted to Boolos-pure sets, although foundation may fail to hold of the hereditary sets or sets in general. Similarly, the union axiom holds when restricted to Boolos-pure or hereditary sets.
5. NewerV and abstraction
NewerV, as formulated above, is circular – it contains reference to stages on the right-hand side of the biconditional, yet our definition of stage contains explicit use of the extension-forming operator supposedly being defined. On the face of it this objection does not seem overly compelling – as is well known, for the implicit definitions codified in abstraction principles such as Hume’s Principle and NewV to do the work intended, the quantifiers on the right-hand side of the biconditional must range over all objects, including the abstracts being introduced (and defined) on the left. Once this is accepted, there seems little reason not to explicitly refer to extensions in our definition of the identity conditions for extensions, since we are already forced to quantify over them in such a definition. Nevertheless, a method by which to avoid this outright circularity would no doubt be welcomed, and fortunately such a method exists.
In order to avoid such circularity, we have our extension-forming operator EXT apply, not to concepts, but to pairs (P, α) where P is a concept and α is an ordinal number. We then define the notion of stage∗ as follows:
For all α such that α is an ordinal number and all x,
x ∈∗S Stg(α) ↔ ON(α) ∧ BASE(x) ∨ (∃β)(∃P)(β < α ∧ x = EXT∗(P, β) ∧ (∀y)(Py → y ∈∗S Stg(β)))
The appropriate abstraction principle would be:
NewerV∗: (∀P)(∀α)(∀Q)(∀β)[EXT∗(P, α) = EXT∗(Q, β) ↔ ((∀x)(Px ↔ Qx) ∨ (¬(ON(α) ∧ (∀x)(Px → x ∈∗S Stg(α))) ∧ ¬(ON(β) ∧ (∀x)(Qx → x ∈∗S Stg(β)))))]
and we would define set and membership as:
Set∗(x) ↔ (∃P)(∃α)[x = EXT∗(P, α) ∧ ¬Bad(P, α)]
x ∈N y ↔ (∃P)(∃α)[Px ∧ y = EXT∗(P, α)]
where:
Bad(P, α) ↔ ¬(ON(α) ∧ (∀x)(Px → x ∈∗S Stg(α)))
In order to ensure that this method works, we need to verify that:
(∀P)(∀α)(∀β)((¬(ON(α) ∧ (∀x)(Px → x ∈∗S Stg(α))) ∧ ¬(ON(β) ∧ (∀x)(Px → x ∈∗S Stg(β)))) → EXT∗(P, α) = EXT∗(P, β))
that is:
(∀P)(∀α)(∀β)((¬Bad(P, α) ∧ ¬Bad(P, β)) → EXT∗(P, α) = EXT∗(P, β))
This can be straightforwardly derived from SOAP + NewerV∗. Essentially, we have replaced the circular definition of extensions with a recursive definition, where at each “level” we introduce new extensions defined in terms of the ones at previous levels.
To make things more intuitive, we can think of the recursive formulation NewerV∗ as schematic for infinitely many non-circular definitions of infinitely many extension-forming operators. First, we obtain the level-0 extensions:
NewerV0: (∀P)(∀Q)[EXT0(P) = EXT0(Q) ↔ ((∀x)(Px ↔ Qx) ∨ ((∃x)(Px ∧ ¬BASE(x)) ∧ (∃x)(Qx ∧ ¬BASE(x))))]
That is, level-0 extensions correspond to those collections whose members are members of the basis. We then define level-1 extensions in terms of level-0 extensions:
NewerV1: (∀P)(∀Q)[EXT1(P) = EXT1(Q) ↔ ((∀x)(Px ↔ Qx) ∨ ((∃x)(Px ∧ ¬BASE(x) ∧ ¬(∃F)(x = EXT0(F))) ∧ (∃x)(Qx ∧ ¬BASE(x) ∧ ¬(∃F)(x = EXT0(F)))))]
where level-1 extensions correspond to those collections whose members are either members of the basis or are level-0 extensions. We can continue in this way, explicitly defining more general extension operators, where the level-n extensions correspond to those sets whose members are either members of the basis or are level-m extensions for some m < n. While this method will only take us (at best, assuming both enough objects and enough abstraction principles) as far as level-α extensions for α < 0 , we should note that each instance of NewerVN is, within the neo-logicist framework, an abstraction principle implicitly defining an abstraction operator
EXTN in terms of previously defined operators. 13 NewerV∗ is a generalization of this process, allowing us to handle all cases simultaneously (including ranks numbered by ordinals for which we might not have names) and thus does not, perhaps, deserve the title of “abstraction principle” in a literal sense. Nevertheless, NewerV∗ is a natural generalization of such a piecemeal process of abstraction, and seems well within the spirit, if not the letter, of the neo-logicist approach.
With NewerV∗ in place, we can define the notion of extension simpliciter:
x = EXT(P) ↔ ((∃α)(ON(α) ∧ (∀y)(Py → y ∈∗S Stg(α)) ∧ x = EXT∗(P, α)) ∨ ((∀α)(ON(α) → (∃x)(Px ∧ x ∉∗S Stg(α))) ∧ x = EXT∗(x = x, 0)))
In other words, the extension of a concept P is the extension∗ of P and any ordinal α such that (P, α) is not Bad, and is the Bad extension (EXT∗(x = x, 0)) if there is no such ordinal. We can then define stages in terms of extensions as before:
For all α such that α is an ordinal number and all x,
x ∈S Stg(α) ↔ ON(α) ∧ BASE(x) ∨ (∃P)(x = EXT(P) ∧ (∃β)(ON(β) ∧ β < α ∧ (∀y)(Py → y ∈S Stg(β))))
The resulting derivation of NewerV (using this definition) from NewerV∗ is left to the reader. This way of proceeding, using bounded quantification, accomplishes what the simpler formulation NewerV does, and in addition makes the recursive nature of iterative extensions more explicit: EXT∗(P, α) is defined in terms of ∈∗S Stg, and ∈∗S Stg is defined in terms of EXT∗(Q, β) for β < α.
Of course, it is possible that neo-Fregeans (or their opponents) might find NewerV∗ as objectionable as the explicitly circular NewerV. At this point I have no additional positive argument for the acceptability of NewerV∗ other than its intuitive plausibility. What can be noted, however, is that if the neo-Fregean refuses to accept both NewerV and NewerV∗, then he will most likely find himself unable to formulate any version of the iterative conception of set. There seems to be no means by which one can formulate a general iterative principle for abstractions within the neo-logicist framework other than by providing identity criteria for the two extensions occurring on the left-hand side of the biconditional in terms of conditions being imposed (on the right-hand side of the biconditional) on other extensions that might be members of the original pair of extensions.
The reader should note that the first abstraction principle, NewerV, is meant to be the ‘official’ formulation of the neo-logicist iterative conception of sets, and will be used below. The ‘recursive’ reformulation NewerV∗ is provided only to assuage worries regarding circularity.
13 Providing abstraction principles whose right-hand sides contain abstraction operators defined previously is (at least accepted as) neo-logicistically acceptable methodology, as the literature on reconstructing the reals attests. See, e.g., Hale [2000].
6. Standard models
Given a particular BASE, we construct ranks within standard second-order ZFC-foundation as follows:
VBASE(0) = {x : BASE(x)}
VBASE(α + 1) = VBASE(α) ∪ ℘(VBASE(α))
VBASE(γ) = ∪λ<γ VBASE(λ) (γ a limit ordinal)
(The intuitive idea is that the members of the basis that we do not want to be sets in the model based on VBASE(κ) can be represented, in the model, by sets of cardinality greater than the cardinality of VBASE(κ).) Letting ⊗ be an arbitrary set not in VBASE(κ), to serve as the Bad extension, we can now construct what we will call standard models (consisting of domain and interpretation function) of NewerV set theory: 14
M(BASE,κ) = <VBASE(κ) ∪ {⊗}, I>
where for all relation symbols R:
I(ORD(R)) = α if α is an ordinal in VBASE(κ) and <α, ∈> is isomorphic to I(R)
I(ORD(R)) = ⊗ otherwise;
and for any predicate P:
I(EXT(P)) = {x ∈ VBASE(κ) : x ∈ I(P)} if {x ∈ VBASE(κ) : x ∈ I(P)} ∈ VBASE(κ)
I(EXT(P)) = ⊗ otherwise.
Note that it might be the case that ⊗ ∈ BASE, in which case VBASE(κ) ∪ {⊗} = VBASE(κ).
The fact that models of our neo-logicist iterative set theory “look” like the standard iterative hierarchy is unsurprising. Let us call any two structures M and N extension-isomorphic if there is a one–one onto function f from the domain of M to the domain of N such that f is an isomorphism with respect to EXT (but not necessarily ORD).
Theorem 6.1: Any model of SOAP + NewerV of cardinality κ contains a substructure that is extension-isomorphic to M(∅,κ).
Proof: Given a model M of cardinality κ with domain D and interpretation function I, let O ⊂ D be the domain of the “ORD” operator under I. Since D is of cardinality κ, and M is a model of SOAP, O (with its ordering) is isomorphic to κ. We can then construct the copy of V∅(κ) (and thus M(∅,κ)) recursively using O, where the αth rank is just the collection containing the extension of every property all of whose instances occur in ranks less than α (where V∅(0) = ∅).
The following result is useful in what follows:
Theorem 6.2: A standard model M(BASE,κ) is a model of SOAP + NewerV iff |VBASE(κ)| = κ and κ is infinite.
Proof: (→) Assume M(BASE,κ) is a model of SOAP + NewerV. If M(BASE,κ) is a model of SOAP, then M(BASE,κ) must be infinite, but again by SOAP, there must be infinitely many ordinal numbers, so κ must be infinite. By an easy induction, |VBASE(γ)| ≥ |γ| for any γ. Assume |VBASE(κ)| > κ. Then, by SOAP, there would be |VBASE(κ)| many ordinal numbers, but then M(BASE,κ) could not be a model of NewerV (since NewerV entails a rank for each ordinal number). So |VBASE(κ)| = κ.
(←) Assume |VBASE(κ)| = κ and κ infinite. For any κ, M(BASE,κ) is a model of NewerV. If |VBASE(κ)| = κ then SOAP generates κ many ordinal numbers, the right amount for κ ranks, so M(BASE,κ) is a model of SOAP.
Restricting our attention further, to models with empty basis, we have the following theorem:
Theorem 6.3: For infinite cardinals κ, |V∅(κ)| = κ iff either κ = ℶ_κ or κ = ω.
Proof: Evident from the fact that, first, V∅(ω) is the hereditarily finite sets, and second, for κ > ω, |V∅(κ)| = ℶ_κ.
In other words, M(∅,κ) is a model of SOAP + NewerV if and only if V∅(κ) is the collection of hereditarily-less-than-κ sets.
14 Note that “∈N Stg(α)” is defined in terms of “EXT”, so we need not include an additional clause.
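To see the shape of the rank construction and of the interpretation of EXT, a finite toy version may help. This Python sketch is not a model of SOAP + NewerV (Theorem 6.2 requires κ to be infinite); it only mirrors the clauses VBASE(α + 1) = VBASE(α) ∪ ℘(VBASE(α)) and the two cases for I(EXT(P)) over finitely many ranks. The names `ranks`, `ext`, and `BAD` are hypothetical, properties are modelled as Python predicates, and `BAD` stands in for ⊗.

```python
from itertools import combinations

BAD = "⊗"  # stand-in for the Bad extension; assumed not to occur in any rank

def powerset(xs):
    """All sub-collections of xs, each modelled as a frozenset."""
    xs = list(xs)
    return {frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)}

def ranks(base, n):
    """Return [V(0), ..., V(n)] with V(0) = base and V(k+1) = V(k) ∪ ℘(V(k))."""
    levels = [set(base)]
    for _ in range(n):
        current = levels[-1]
        levels.append(current | powerset(current))
    return levels

def ext(prop, universe):
    """Toy I(EXT(P)): P's instances as a set if that set is in the universe, else BAD."""
    instances = frozenset(x for x in universe if prop(x))
    return instances if instances in universe else BAD

levels = ranks(set(), 4)
v3 = levels[3]
empty = frozenset()

print([len(v) for v in levels])
print(ext(lambda x: x == empty, v3) == frozenset({empty}))
print(ext(lambda x: x != empty, v3) == BAD)
```

In the toy universe V∅(3), the property of being the empty set has an extension inside the universe, while the property of being nonempty collects three objects that do not form a member of V∅(3), so it is sent to the stand-in for ⊗. This is the same case split that the interpretation function I makes in a standard model.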
7. The axiom of replacement
Our next step is to examine the axiom of replacement: 15
Replacement: (∀x)(∀f)(∃y)(∀z)(z ∈N x → f(z) ∈N y)
In the specific case of standard models of SOAP + NewerV where BASE = ∅ we have:
Theorem 7.1: For any cardinal κ, if M(∅,κ) is a model of SOAP + NewerV, then M(∅,κ) satisfies the axiom of replacement iff κ is regular.
Proof: (→) Assume that M(∅,κ) is a model of SOAP + NewerV and the axiom of replacement and assume, for reductio, that κ is not regular, i.e. cf(κ) < κ. Then there is an ordinal γ < κ and a function f such that f maps γ unboundedly into κ. Let S be the range of f restricted to γ. Then there is no ordinal number α such that, for all x in S, x ∈S Stg(α), so S is Bad, that is, not a set. Contradiction, so cf(κ) = κ and κ is regular.
(←) Assume that M(∅,κ) is a model of SOAP + NewerV and that κ is regular. Let x be any set and f any function on V∅(κ), and let S be the range of f restricted to x. Clearly, |S| ≤ |x|, x ∈ V∅(κ), and, since M(∅,κ) is a model of SOAP + NewerV, ℘(x) ∈ V∅(κ), so it follows that ℘(x) ⊆ V∅(κ). Thus, |x| < |V∅(κ)|, and, since M(∅,κ) is a model of SOAP + NewerV, |x| < κ. So |S| < κ. Thus, since κ is regular and there is no function from S into κ whose range is unbounded in κ, there must be a γ < κ such that, for all y in S, y ∈S Stg(γ). Thus, S is not Bad, so EXT(S) is a set.
This allows us to prove that the axiom of replacement does not follow from SOAP + NewerV. Define π as follows: 16
π_0 = ω
π_{n+1} = ℶ_{π_n}
π = sup{π_i : i < ω}
M(∅,π) is a model of SOAP + NewerV since π = ℶ_π 17 (in fact, it is the least such cardinal) but π is not regular, since cf(π) = ω. Thus, the axiom of replacement does not follow from SOAP + NewerV.
The failure of replacement is, in this context, equivalent to the failure of the following Same Size Principle:
SSP: (∀P)(∀Q)((¬Bad(P) ∧ P ≈ Q) → (¬Bad(Q)))
On the other hand, the following size restriction principle does hold:
SRP: (∀P)(¬Bad(P) → ¬(∃Q)((∀x)(Qx) ∧ P ≈ Q))
i.e. no sets are equinumerous with the entire domain.
The failure of replacement should not come as too much of a surprise on the iterative conception of set theory, however. Boolos, in “The Iterative Conception of Set”, writes that:
There is an extension of the stage theory from which the axioms of replacement could have been derived. We could have taken as axioms all instances . . . of a principle which may be put, “If each set is correlated with at least one stage (no matter how), then for any set z there is a stage s such that for each member w of z, s is later than some stage with which w is correlated”. This bounding or cofinality principle is an attractive further thought about the interrelation of sets and stages, but it does seem to us to be a further thought, and not one that can be said to have been meant in the rough description of the iterative conception . . . Thus the axioms of replacement do not seem to us to follow from the iterative conception. ([1971], p. 26–27) 18
15 (∀x)(Set(x) → (∀f)(∀y)((y ∈N x → Set(f(y))) → (∃z)(Set(z) ∧ (∀w)(w ∈N x → f(w) ∈N z)))).
16 Note that M(∅,ω+ω), although failing to satisfy replacement, also fails to satisfy SOAP.
17 This construction shows that we can prove, in second-order ZFC, that SOAP + NewerV has uncountable models, since π, and thus M(∅,π), can be constructed within second-order ZFC. This is significant since Shapiro and Weir [1999] prove that the claim that NewV has uncountable models is independent of second-order ZFC.
In the later [1989] paper, after considering ways in which the iterative conception might be strengthened to secure replacement, he writes that: Whether some such strengthening . . . can be plausibly thought not to involve a new principle that is not really part of the iterative conception seems doubtful. (p. 97)
Although Boolos’ examination is not conducted within the neo-logicist framework, his comments regarding replacement agree both with the results obtained here and with intuition. Unlike the limitation of size conception, the iterative conception regards the size of a collection as irrelevant to whether or not it receives the honorific “set”. All that matters is whether the objects contained in the collection are formed at some point in the hierarchy.
8. The basis and infinity
SOAP + NewerV plus the following Basis Axiom:
BAØ: BASE(x) ↔ x ≠ x
that is, NewerV set theory with an empty basis, or what we might call pure NewerV set theory, is extremely weak. Of the standard axioms, the only ones that hold here are those proved above. In addition to the failure of the axiom of replacement, the axiom of infinity: 19
Infinity: (∃x)(ØN ∈N x ∧ (∀y)(y ∈N x → y ∪N {y}N ∈N x))
fails:
Theorem 8.1: M(Ø,ω) is a model of SOAP + NewerV but fails to satisfy infinity.
Proof: M(Ø,ω) is just the collection of hereditarily finite sets (plus the Bad extension), which is countable, so |VØ(ω)| = ℵ0, and M(Ø,ω) is a model of SOAP + NewerV. There are no infinite sets in M(Ø,ω), however, so the axiom of infinity is not satisfied.
Theorem 8.1, combined with the results of the previous section, suffices to show that, relative to SOAP + NewerV, the axiom of infinity and the axiom of replacement are independent, since M(Ø,ω) satisfies replacement but fails to satisfy infinity, and M(Ø,π) (with π defined as in the last section) satisfies infinity but fails to satisfy replacement. As far as providing the infinite sets needed to construct real and complex analysis, NewerV set theory is no better off than NewV was.
18 Boolos writes of “axioms of replacement” in the plural since he is considering first-order set theory.
19 That is, (∃x)(Set(x) ∧ (ØN ∈N x ∧ (∀y)(Set(y) → (y ∈N x → y ∪N {y}N ∈N x)))).
There seems to be no principled reason why we should not allow ourselves access to some preliminary collection of objects which we can then collect into sets, sets of sets, etc., however. The elements of the basis are, on more traditional approaches to set theory, often ignored since they do not add anything substantially new to the theory. From the present perspective, however, this is not the case. In order to guarantee that we have an infinite set, we need only assume that the basis contains infinitely many objects. Consider:
BAω: BASE(x) ↔ (ON(x) ∧ x < ω)
We might justify this axiom by noting that the finite ordinal numbers were guaranteed to exist by SOAP alone, prior to any set-theoretic theorizing, so there is no reason why we cannot form sets of these, or sets of sets, and so on. If we let FO be the (countable) collection of finite ordinal numbers, then:
Theorem 8.2: For infinite κ, |VFO(κ)| = κ iff κ = ℶ_κ.
Proof: Similar to that of Theorem 7.1 above.
With π defined as before, the smallest model 20 of SOAP + NewerV + BAω is M(FO,π):
Theorem 8.3: SOAP + NewerV + BAω implies the axiom of infinity.
Proof: For any finite ordinal number α (i.e. any object in the range of ORD), α ∈S Stg(0). So for any collection of finite ordinal numbers y, y ∈S Stg(1). In particular, ØN ∈S Stg(1). Since the ordinal numbers are infinite (see Theorem 4.2), the collection of collections of finite ordinal numbers (and thus Stg(1)) is uncountably infinite, and therefore so is the universe (this can be expressed in second-order logic, see Shapiro [1991], p. 104). Thus, all countably infinite well-orderings are not Big, so there must be a limit ordinal number β (i.e. an ordinal number β such that, for every γ < β, there is a δ such that γ < δ < β). So there is a set z containing all sets formed before β. Since β > 1, ØN ∈N z, and, since β is a limit ordinal number, for any w, if w ∈N z then w ∪N {w}N ∈N z (since if w ∈S Stg(n), then w ∪N {w}N ∈S Stg(n + 2)).
Note that we did not construct the set that is intuitively associated with the axiom of infinity, i.e. ω, the set containing exactly the finite ordinals. The axiom of infinity does not assert the existence of this particular (infinite) set, however, but asserts the existence of a particular kind of infinite set, specifically, some set containing ØN and closed under the operation mapping x onto x ∪N {x}N. Here we have constructed a set (much) larger than ω satisfying the relevant constraints. ω can be obtained immediately, however, by an application of separation.
There are other variants of the axiom of infinity, and the fact that SOAP + NewerV + BAω implies these variants is non-trivial. Gabriel Uzquiano [1999] has shown that, if the axiom of infinity is formulated as above, then second-order Zermelo set theory 21 does not imply the following variant of the axiom of infinity: 22
Zermelo Infinity: (∃x)(∅N ∈N x ∧ (∀y)(y ∈N x → {y}N ∈N x))
If we formulate Zermelo set theory using Zermelo Infinity to express the idea that there is an infinite set, then the original axiom of infinity does not follow. One can easily prove that Zermelo Infinity follows from SOAP + NewerV + BAω, however. Thus, SOAP + NewerV + BAω is stronger than Zermelo set theory if the axiom of infinity is formulated in either of these ways.
Once we accept SOAP + NewerV + BAω, and see that every model contains at least π ordinal numbers (again with π defined as before), we might be tempted to argue that, since we are guaranteed π ordinal numbers, why not allow all ordinal numbers less than π to be elements of the basis, adopting the following Basis Axiom:
BAπ: BASE(x) ↔ (ON(x) ∧ x < π)
Of course, all models of SOAP + NewerV + BAπ will be much larger than M(FO,π). We could continue, formulating stronger and stronger set theories by allowing more and more of the ordinal numbers into the basis. Once we have started down this route it becomes difficult to know when to stop. Additionally, it seems reasonable to require that we only add instances of the Basis Axiom of the form:
BAβ: BASE(x) ↔ (ON(x) ∧ x < β)
where the ordinal β is definable in terms of purely logical vocabulary supplemented, if need be, by the abstraction operators ORD and EXT. Unlike the case of the finite ordinal numbers, however, the proof that π exists cannot be formulated within NewerV set theory (even when supplemented by BAω), since it relies on replacement.
20 The statement that M(FO,π) is the smallest model of SOAP + NewerV + BAω should not be understood to imply that there are no other, non-isomorphic models of the same cardinality, since M(FO∪{⊗},π) clearly is such a model. Instead, all models have domains at least as large as that of M(FO,π).
Thus, BAπ does not seem to be a promising candidate for a Basis Axiom. 23
There is another option, however – since the ordinal numbers are generated by SOAP which is, in some sense, theoretically prior 24 to NewerV, why not just allow all ordinal numbers to be contained in the basis? In other words:
BAORD: BASE(x) ↔ ON(x)
We can, in this case, derive a version of the Burali-Forti paradox, however:
Theorem 8.4: SOAP + NewerV + BAORD is inconsistent.
Proof: If every ordinal number is contained in the basis, and therefore contained in Stg(0), then every collection of ordinal numbers is a set and is in Stg(1). Let us call the set of ordinal numbers O. Let X = {S : S is a set of ordinal numbers and (∀n)(n ∈N S → (∀m)(m < n → m ∈N S))}. X is a set by the powerset and separation axioms. There is an isomorphism between {O, <} and {X, ⊆}, f : O → X where, for n ∈N O, f(n) = {m : m < n}. Thus, X ordered by ⊆ is a well-ordering, and, since X is a set, this relation is not Big, so there is an ordinal number corresponding to it. But this ordinal number must be greater than any in O, since f provides a mapping of the order type of each ordinal number in O into, but not onto, {X, ⊆}. Contradiction.
Corollary 8.5: SOAP + NewerV implies “ON(x)” is Bad.
Thus, the best that we can do (if, in fact, BAω is acceptable) is to accept SOAP + NewerV + BAω, and as a result we obtain all of second-order ZFC except for the axiom of replacement. It is worth noting that SOAP + NewerV + BAω seems to provide us with a good approximation to the iterative conception of set attributed to Boolos in Section 2.
21 Zermelo set theory is just first-order ZFC without replacement and choice. Second-order Zermelo set theory is second-order ZFC without replacement.
22 (∃x)(Set(x) ∧ (ØN ∈N x ∧ (∀y)(Set(y) → (y ∈N x → {y}N ∈N x))))
23 This does not rule out the possibility that π could be definable within some extended, neo-logicistically acceptable language.
24 One needs to be careful here, since the number of ordinal numbers generated by SOAP depends on the number of objects that exist, which could depend on NewerV.
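The closure step in the proof of Theorem 8.3 can be checked by brute force at small finite ranks. The Python sketch below is an illustration under explicit assumptions: it uses the empty basis rather than BAω, models extensions as `frozenset`s, and the names `ranks`, `successor`, and `singleton` are hypothetical. It confirms, for the first few ranks, that if w appears by rank n then w ∪ {w} appears by rank n + 2 and {w} appears by rank n + 1, which is the reason a set collecting everything formed before a limit stage is closed under both the operation in Infinity and the operation in Zermelo Infinity.

```python
from itertools import combinations

def powerset(xs):
    """All sub-collections of xs, each modelled as a frozenset."""
    xs = list(xs)
    return {frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)}

def ranks(base, n):
    """Return [V(0), ..., V(n)] with V(0) = base and V(k+1) = V(k) ∪ ℘(V(k))."""
    levels = [set(base)]
    for _ in range(n):
        current = levels[-1]
        levels.append(current | powerset(current))
    return levels

def successor(w):
    """The operation w ↦ w ∪ {w} used in the axiom of infinity."""
    return w | frozenset({w})

def singleton(w):
    """The operation w ↦ {w} used in Zermelo Infinity."""
    return frozenset({w})

levels = ranks(set(), 5)  # V(5) has 2**16 members, still small enough to enumerate

succ_ok = all(successor(w) in levels[n + 2] for n in range(4) for w in levels[n])
sing_ok = all(singleton(w) in levels[n + 1] for n in range(4) for w in levels[n])
print(succ_ok, sing_ok)
```

A finite check of this kind is of course no substitute for the proof; it only makes visible why the bound n + 2 is safe at each finite rank.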
9. NewV + NewerV
Neither NewV nor SOAP + NewerV alone suffices to capture enough of second-order ZFC for us to claim unequivocally that either provides a mathematically adequate abstractionist account of contemporary set theory. Our next task is to determine whether combining the two principles suffices to provide the neo-logicist with such a reconstruction. We will consider the conjunction SOAP + NewerV + NewV where we understand occurrences of “EXT” in each principle to be distinct occurrences of the same operator.
Before looking at the formal attributes of SOAP + NewerV + NewV, however, we should take note of an obvious line of objection. For the neo-Fregean, at least, an abstraction principle for extensions is meant to provide something like an implicit definition of the abstraction operator “EXT”. In the investigation above, NewV was understood as providing one candidate for such a definition, and NewerV as providing an alternative such definition. In considering a theory containing both NewV and NewerV, however, we are faced with a situation in which we have, in effect, simultaneously accepted two such definitions. We might legitimately question whether this is coherent, much less desirable. More pointedly, we might wonder which of
440
The Arché Papers on the Mathematics of Abstraction
the two principles is responsible for truly defining the abstraction operator in question, that is, for introducing the new piece of mathematical language and providing its meaning. If one of the abstraction principles amounts to such an implicit definition, what is the role of the other within the neo-logicist framework? It is tempting to think that we can avoid the objection by replacing the conjunction of NewV and NewerV with a single abstraction principle that combines the formal features of both. For example, we might conclude that sets are extensions that are both reachable in the iterative hierarchy and not too ‘Big’. Let “BadNewV ” be the predicate asserting that a property is Big, that is: BadNewV (P) ↔ (∃ f )((∀x)(∀y)(( f (x) = f (y) → x = y) ∧(∀x)(∃y)(Py ∧ f (y) = x))) and let “BadNewerV ” be the corresponding condition for the iterative conception as developed above, that is: BadNewerV (P) ↔ ¬(∃α)(ON(α) ∧ (∀x)(Px → x ∈S Stg(α))) with ∈S Stg defined as before. Then the idea that sets are the extensions of concepts that are neither too Big nor unreachable in the iterative hierarchy can be formulated as: NewestV : (∀P)(∀Q)[E XT(P) = E XT(Q) ↔ ((∀x)(Px ↔ Qx) ∨ ((BadNewV (P) ∨ BadNewerV (P)) ∧ ((BadNewV (P) ∨ BadNewerV (P))))] Unfortunately, this principle is no more powerful than NewerV alone: Theorem 9.1: Any model of NewerV is a model of NewestV Proof: A consequence of the fact that NewerV implies the size restriction principle from Section 7, i.e. in any model of NewerV, every Big concept is Bad. Similar problems plague: NewestV∗ : (∀P)(∀Q)[E XT(P) = E XT(Q) ↔ ((∀x)(Px ↔ Qx) ∨ (BadNewV (P) ∧ BadNewerV (P) ∧ BadNewV (Q) ∧ BadNewerV (Q)))] Thus, some other strategy must be adopted in order to obtain a theory combining the strength of NewV + NewerV. 
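The reasoning behind Theorem 9.1 can be spelled out as a short derivation. The sketch below is a reconstruction rather than the author's own presentation: it assumes that the second conjunct of NewestV is read as the condition on Q, it takes the size restriction principle of Section 7 as given, and the step labels (1)–(3) are introduced here for illustration only.

```latex
% Sketch of why NewestV collapses into NewerV in any model of NewerV.
% Assumptions: the size restriction principle of Section 7, and the second
% conjunct of NewestV read as a condition on Q. Requires amsmath.
\begin{align*}
\text{(1)}\quad & \mathrm{Bad}_{\mathrm{NewV}}(P) \rightarrow \mathrm{Bad}_{\mathrm{NewerV}}(P)
  && \text{every Big concept is Bad}\\
\text{(2)}\quad & \bigl(\mathrm{Bad}_{\mathrm{NewV}}(P) \lor \mathrm{Bad}_{\mathrm{NewerV}}(P)\bigr)
  \leftrightarrow \mathrm{Bad}_{\mathrm{NewerV}}(P)
  && \text{from (1)}\\
\text{(3)}\quad & \bigl[(\forall x)(Px \leftrightarrow Qx) \lor
  \bigl((\mathrm{Bad}_{\mathrm{NewV}}(P) \lor \mathrm{Bad}_{\mathrm{NewerV}}(P)) \land
        (\mathrm{Bad}_{\mathrm{NewV}}(Q) \lor \mathrm{Bad}_{\mathrm{NewerV}}(Q))\bigr)\bigr]\\
  & \quad \leftrightarrow
  \bigl[(\forall x)(Px \leftrightarrow Qx) \lor
  (\mathrm{Bad}_{\mathrm{NewerV}}(P) \land \mathrm{Bad}_{\mathrm{NewerV}}(Q))\bigr]
  && \text{from (2), for } P \text{ and } Q
\end{align*}
```

Line (3) says that, in any model of NewerV, the right-hand sides of NewestV and NewerV are equivalent, so such a model satisfies NewestV as well.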
Another option might be simultaneously to adopt the two principles but to treat them as implicit definitions of two distinct abstraction operators, EXTNewV and EXTNewerV, that is, we rewrite NewV and NewerV as:
NewV∗: (∀P)(∀Q)[EXTNewV(P) = EXTNewV(Q) ↔ ((∀x)(Px ↔ Qx) ∨ (BadNewV(P) ∧ BadNewV(Q)))]
NewerV∗: (∀P)(∀Q)[EXTNewerV(P) = EXTNewerV(Q) ↔ ((∀x)(Px ↔ Qx) ∨ (BadNewerV(P) ∧ BadNewerV(Q)))]
This approach will, from a formal perspective, accomplish what we want, since the constraint on the cardinality of the domain imposed by NewV∗ will guarantee that the replacement axiom holds when interpreted in terms of EXTNewerV, and the constraints imposed by NewerV∗ will imply that the powerset axiom will hold when interpreted in terms of EXTNewV. 25 There is a philosophical problem, however. Do we identify sets as NewV∗ extensions or as NewerV∗ extensions? The problem is exacerbated by the following:
Theorem 9.2: There is a model M such that M is a model of NewV∗, NewerV∗ and:
(∀P)(EXTNewV(P) ≠ EXTNewerV(P))
Proof: Let f : ω → V{⊗1,⊗2}(ω) be any enumeration of V{⊗1,⊗2}(ω). Then (extending the notion of standard model in the obvious way) M = <V{⊗1,⊗2}(ω), I> where:
I(EXTNewV(P)) = I(P) if I(P) ∈ V{⊗1,⊗2}(ω)
I(EXTNewV(P)) = ⊗1 otherwise.
and:
I(EXTNewerV(P)) = f(2n + 1) if I(P) ∈ V{⊗1,⊗2}(ω) and I(P) = f(2n)
I(EXTNewerV(P)) = f(2n) if I(P) ∈ V{⊗1,⊗2}(ω) and I(P) = f(2n + 1)
I(EXTNewerV(P)) = ⊗2 otherwise. 26
What we have here is a particularly vicious version of the Caesar problem: given two distinct abstraction principles for two distinct extension operators, we cannot even determine whether the empty extension (EXTNewV(x ≠ x)) arising from NewV∗ is identical to the empty extension (EXTNewerV(x ≠ x)) provided by NewerV∗. What we need are necessary and sufficient conditions for the identity of two abstracts that are generated by different abstraction principles. In The Limits of Abstraction Kit Fine discusses this problem at length, and his solution offers us a way out of our present difficulty. After considering and rejecting the idea that abstracts provided by different abstraction principles must be distinct (since, for example, the finite numbers generated by Hume’s Principle and restrictions of it, such as Finite Hume, 27 ought to be identical), he suggests that we:
. . . face the possibility that the criteria of identity [of distinct abstraction principles] might be different in a way that is not relevant to the identities of the abstracts in question. And this might lead one to . . . take two abstracts to be the same when their associated equivalence classes are the same, regardless of the means of abstraction by which they were obtained. 28 ([2002], p. 49)
25 This is a result of the fact that whether or not powerset (replacement) holds in the context of NewV (NewerV) is a function of the cardinality of the domain.
26 In order to get the result in its full generality we require that there be two Bad objects, one for each principle.
The idea is simple: each abstraction principle divides up the collection of concepts on the domain into one or more equivalence classes, and each one of these corresponds to an abstract. If we think of the abstracts as going proxy for the equivalence classes, then any two abstracts that correspond to the same collection of concepts should be identical regardless of what abstraction principle was used to “generate” them. 29 We can formalize this as the General Abstract-Identity Schema:
GAS: For any two (legitimate) abstraction operators @1 and @2:
(∀P)(∀Q)(@1(P) = @2(Q) ↔ (∀F)(@1(F) = @1(P) ↔ @2(F) = @2(Q))) 30
Of interest at present is the instance governing our two extension operators:
(∀P)(∀Q)(EXTNewV(P) = EXTNewerV(Q) ↔ (∀F)(EXTNewV(F) = EXTNewV(P) ↔ EXTNewerV(F) = EXTNewerV(Q)))
If we have GAS, we can define set-hood and membership in terms of those abstracts that are the EXTNewV and EXTNewerV of the same (dually) non-‘Bad’ concept:
Set(x) ↔ (∃P)[x = EXTNewV(P) ∧ x = EXTNewerV(P) ∧ ¬BadNewV(P) ∧ ¬BadNewerV(P)]
x ∈ y ↔ (∃P)[Px ∧ y = EXTNewV(P) ∧ y = EXTNewerV(P)]
On these definitions, the conjunction of NewV∗, NewerV∗ and the relevant instance of GAS entails the axioms of extensionality, separation, empty set, pairing, union∗ (but not union), powerset, and replacement. Thus GAS provides us with a means for combining two abstraction principles for extensions that is consistent with neo-Fregean ideas about implicit definition yet delivers the desired results. In what follows, however, we shall as a matter of convenience use NewV + NewerV, assuming that the same abstraction operator occurs in both principles, since this allows us to straightforwardly adopt results that were proved earlier (or elsewhere) for one or the other of these two principles. The reader should keep in mind that this conjunction of ‘definitions’ of the extension operator “EXT” can be replaced by a richer (but formally less tractable) account of identity conditions across distinct abstraction principles. 31
27 Finite Hume is a restricted version of Hume’s Principle which provides each finite positive integer and a single pre-Cantorian infinite number (the abstract of all infinite concepts). For the exact formulation see Section 10.
28 Fine immediately rejects this analysis in favor of one that identifies abstracts whose equivalence classes are necessarily identical. He spends the remainder of the chapter exploring the difficulties such a modalized account must face. Fortunately, the simpler non-modal formulation suffices for our present purposes. For further discussion of this aspect of the Caesar problem, see Cook and Ebert [2004].
29 I do not intend this to be read as a defense of this particular route to solving (this variant of) the Caesar problem but am content merely to briefly sketch how such a solution might proceed.
30 While acceptance of this principle has no impact on the standard version of the Caesar problem, which concerns identities between abstracts and non-abstracts, its acceptance completely solves the analogous problem of determining when two abstracts generated by different abstraction principles are identical. We can call this latter problem the C-R problem, since determining whether the real numbers are a subcollection of the complex numbers is a particular case of the problem, and the term “C-R” has a convenient similarity to the word “Caesar”.
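To see why these definitions pick out the intended objects, it may help to work through the relevant GAS instance for a concept P that is non-Bad in both senses. The derivation below is a sketch under that assumption and is not part of the original text; the subscripted operators stand for the two extension operators of NewV∗ and NewerV∗, and the step labels are introduced here.

```latex
% Sketch: for a dually non-Bad concept P, GAS identifies its two extensions.
% Assumption: P is neither Bad_NewV nor Bad_NewerV. Requires amsmath.
\begin{align*}
\text{(1)}\quad & \mathrm{EXT}_{\mathrm{NewV}}(F) = \mathrm{EXT}_{\mathrm{NewV}}(P)
   \leftrightarrow (\forall x)(Fx \leftrightarrow Px)
   && \text{NewV}^{*},\ \neg\mathrm{Bad}_{\mathrm{NewV}}(P)\\
\text{(2)}\quad & \mathrm{EXT}_{\mathrm{NewerV}}(F) = \mathrm{EXT}_{\mathrm{NewerV}}(P)
   \leftrightarrow (\forall x)(Fx \leftrightarrow Px)
   && \text{NewerV}^{*},\ \neg\mathrm{Bad}_{\mathrm{NewerV}}(P)\\
\text{(3)}\quad & (\forall F)\bigl(\mathrm{EXT}_{\mathrm{NewV}}(F) = \mathrm{EXT}_{\mathrm{NewV}}(P)
   \leftrightarrow \mathrm{EXT}_{\mathrm{NewerV}}(F) = \mathrm{EXT}_{\mathrm{NewerV}}(P)\bigr)
   && \text{from (1), (2)}\\
\text{(4)}\quad & \mathrm{EXT}_{\mathrm{NewV}}(P) = \mathrm{EXT}_{\mathrm{NewerV}}(P)
   && \text{GAS instance with } Q := P \text{, from (3)}
\end{align*}
```

So every dually non-Bad concept yields a single object that is both its NewV∗ extension and its NewerV∗ extension, which is what the definition of Set(x) requires.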
10. NewV + NewerV and infinity
With SOAP + NewV + NewerV in place, we define notions such as set, membership, ordinal, Boolos-pure set, and hereditary set as before. First off, we note that SOAP + NewerV + NewV is consistent: Theorem 10.1: SOAP + NewerV + NewV is satisfied by M({⊗},ω) Proof: Straightforward, left to the reader.
This also shows that the axiom of infinity fails to follow from SOAP + NewerV + NewV. The results of previous sections suffice to show that SOAP + NewerV + NewV proves the axioms of extensionality, separation, empty set, pairing, union∗ , powerset, and replacement. As one might expect, interesting consequences follow from the conjunction of these two ‘definitions’ of set that do not follow from either alone. As an example we have the following: Lemma 10.2: SOAP + NewerV + NewV implies that, for all P, P is Big if and only if P is Bad. Proof: Straightforward, left to the reader.
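Since the proof of Lemma 10.2 is left to the reader, here is one way the argument might go. It is a reconstruction, not the author's proof: the universal concept U (that is, x = x) and the step labels are introduced here, the monotonicity step relies on Lemma A.1 of the Appendix (sub-concepts of non-Bad concepts are non-Bad), and the converse direction (every Big concept is Bad) is simply taken from the size restriction principle of Section 7.

```latex
% Sketch of the Bad-to-Big direction of Lemma 10.2 (a reconstruction).
% U is the universal concept x = x, an auxiliary symbol not used in the text.
% Requires amsmath.
\begin{align*}
\text{(1)}\quad & \mathrm{Bad}(P) && \text{assumption}\\
\text{(2)}\quad & \mathrm{Bad}(U)
  && \text{Lemma A.1 contraposed, since } (\forall x)(Px \rightarrow Ux)\\
\text{(3)}\quad & \mathrm{EXT}(P) = \mathrm{EXT}(U) && \text{NewerV, from (1), (2)}\\
\text{(4)}\quad & (\forall x)(Px \leftrightarrow Ux) \lor (\mathrm{Big}(P) \land \mathrm{Big}(U))
  && \text{NewV, from (3)}\\
\text{(5)}\quad & \mathrm{Big}(U) && \text{witnessed by the identity function}\\
\text{(6)}\quad & \mathrm{Big}(P)
  && \text{from (4), (5): either disjunct yields } \mathrm{Big}(P)
\end{align*}
```

In step (6), the second disjunct of (4) gives Big(P) directly, while the first makes P coextensive with U, so the function witnessing Big(U) also witnesses Big(P).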
31 In addition, we can in the present context eliminate SOAP altogether, reformulating NewerV (or NewestV) so that the stages are ordered by the ordinals (i.e. the transitive pure sets well-ordered by ∈) provided by NewV (or NewestV). Although this would be a bit more elegant than the methods employed in the text, ironing out the details of the relevant reformulation of NewerV (or NewestV) would add considerable length to this paper without any significant gain.
More significantly, SOAP + NewerV + NewV proves that every object is either a set or is in the basis (although as we shall see some objects might be both):
Theorem 10.3: SOAP + NewerV + NewV implies (∀x)(UR(x) → BASE(x))
Proof: Assume for arbitrary a that a is an urelement, i.e. a is not a set. Since both NewV and NewerV are only satisfied on infinite models, the property corresponding to “x = a” is not ‘Big’, and thus not ‘Bad’, that is (∃α)(ON(α) ∧ (∀x)(x = a → x ∈S Stg(α))). Let β be the least ordinal such that (∀x)(x = a → x ∈S Stg(β)), that is, β is the least ordinal such that a ∈S Stg(β). Assume that β > 0. Then by the definition of stages (∃P)(a = EXT(P) ∧ (∃δ)(δ < β ∧ (∀y)(Py → y ∈S Stg(δ)))). So (∃P)(a = EXT(P) ∧ (∃δ)(δ is an ordinal number ∧ (∀y)(Py → y ∈S Stg(δ)))). Thus, a is a set. Contradiction, so β = 0 and a is in the basis.
As a result of this we have the following:
Corollary 10.4: SOAP + NewerV + NewV implies the Urelement Axiom:
(∃x)(Set(x) ∧ (∀y)(UR(y) ↔ y ∈N x))
Corollary 10.5: Every model of SOAP + NewerV + NewV is isomorphic to M(BASE,κ) for some cardinal κ such that either κ = ω and BASE is finite or κ is inaccessible and κ > |BASE|.
Corollary 10.6: If M(BASE,κ) is a model of SOAP + NewerV + NewV, then ⊗ ∈ BASE.
Thus, SOAP + NewerV + NewV captures all of ZFC except for the axiom of infinity and the axiom of foundation. 32 We set aside foundation until Section 11. As already noted, SOAP + NewerV plus the claim that there are infinitely many elements in the basis implies the axiom of infinity. NewV plus the claim that there are uncountably many objects implies the axiom of infinity. Here we consider a principle that is independent of each of these assumptions: 33
InfNonSets: “There are infinitely many urelements”.
NewV has models with uncountably many sets but only finitely many non-sets, and models with infinitely many non-sets but only a countable infinity of sets.
Similarly, SOAP + NewerV has models with infinitely many non-sets but only finitely many objects in the basis and, as we shall see in Section 11, it also has models with finitely many non-sets but infinitely many objects in the basis. Thus, the following is not trivial, even if its proof is:
Theorem 10.7: SOAP + NewerV + NewV + InfNonSets implies the axiom of infinity.
Proof: Assume that there are infinitely many urelements. Then, by Theorem 10.3, there are infinitely many objects in BASE. NewerV plus the claim that BASE has infinitely many members implies the axiom of infinity.
This provides the following:
Corollary 10.8: SOAP + NewerV + NewV + InfNonSets is satisfied by M({⊗},κ) iff κ is an inaccessible cardinal.
Thus, every model of SOAP + NewerV + NewV + InfNonSets is also a model of second-order ZFC-foundation. Of course, the crucial question is not what principles can be added to SOAP + NewerV + NewV in order to derive the axiom of infinity, but what additional neo-logicistically acceptable principle will provide the axiom of infinity. Provided that GAS provides the correct account of identity between abstracts arising from distinct abstraction principles, however, a neo-logicist justification of InfNonSets is straightforward. The neo-logicist need only accept, in addition to SOAP + NewerV + NewV, a restricted version of Hume’s Principle such as Finite Hume: 34
FHP: (∀P)(∀Q)[NUM(P) = NUM(Q) ↔ (P ≈ Q ∨ (Infinite(P) ∧ Infinite(Q)))]
(“Infinite(P)” is an abbreviation of the second-order formula asserting the existence of a 1–1 correspondence from the P’s into, but not onto, the P’s.) 35
If we combine FHP with SOAP + NewerV + NewV and the relevant instance of GAS:
(∀P)(∀Q)(NUM(P) = EXT(Q) ↔ (∀F)(NUM(F) = NUM(P) ↔ EXT(F) = EXT(Q)))
we obtain the following theorem:
Theorem 10.9: FHP + SOAP + NewerV + NewV + GAS implies InfNonSets.
Proof: Combine the standard proof of (part of) Frege’s Theorem (i.e. the claim that there are infinitely many numbers) with the fact that, since for each FHP number x where x ≠ 0, (∃P)(∃Q)(x = NUM(P) = NUM(Q) ∧ ¬(∀y)(Py ↔ Qy)), all numbers (other than 0) are not extensions. 36
This provides the necessary:
Corollary 10.10: FHP + SOAP + NewerV + NewV + GAS implies the axiom of infinity.
32 In fact, SOAP + NewerV + NewV is strictly stronger than second-order ZFCU minus the axioms of infinity and foundation since the former, but not the latter, implies that there is a set containing all urelements. Thanks go to an anonymous referee for pointing this out.
33 That is, (∃P)((∀x)(Px → ¬Set(x)) ∧ (∃f)((∀x)(∀y)(f(x) = f(y) → x = y) ∧ (∀x)(P(x) → P(f(x))) ∧ (∃x)(Px ∧ (∀y)(f(y) ≠ x)))).
34 One drawback to this general approach is that Hume’s Principle or even Small Hume:
(∀P)(∀Q)[NUM(P) = NUM(Q) ↔ ((P is ‘Big’ ∧ Q is ‘Big’) ∨ P ≈ Q)]
is inconsistent with NewV + NewerV + GAS: If all the numbers (or all the small numbers) other than 0 are not extensions, then they must be urelements, but then there must be a set of all numbers, and further, the powerset of this set must exist. But, since the universe must be the size of a strong inaccessible, there must be exactly as many numbers as there are objects in the universe. Contradiction.
35 That is, (∃f)((∀x)(Px → Pf(x)) ∧ (∀y)(∀z)(f(y) = f(z) → y = z) ∧ (∃w)(Pw ∧ (∀n)(Pn → ¬f(n) = w))).
11. Foundation and non-well-founded sets
In this section we will examine the status of the second-order axiom of foundation:
Foundation: (∀P)((∀x)(Px → Set(x)) → ((∃y)(Py) → (∃y)(Py ∧ ¬(∃z)(Pz ∧ z ∈N y))))
and the prima facie weaker axiom of regularity:
Regularity: (∀x)((Set(x)) → ((∃y)(y ∈N x) → (∃y)(y ∈N x ∧ ¬(∃z)(z ∈N x ∧ z ∈N y))))
within NewerV + NewV set theory. 37 As was the case with NewV or NewerV alone, SOAP + NewerV + NewV proves foundation when the quantifiers are restricted to Boolos-pure sets, but foundation, and the weaker axiom of regularity, can fail to hold of all sets, or even all hereditary sets. To show that regularity restricted to the hereditary sets fails to follow from SOAP + NewerV + NewV (and thus the unrestricted versions fail to follow as well), it suffices to show that the Ω Axiom:
Ω Axiom: (∃x)(∀y)(y ∈ x ↔ y = x) 38
can be consistently added to SOAP + NewerV + NewV.
To show this we will construct models within Aczel’s (first-order) Non-well-founded set theory (see Aczel [1988]). Since Aczel’s systems are all interpretable within first-order ZFC-foundation, the results below can be proven within ZFC-foundation directly, although the presentation is less straightforward. Thus, our adoption of non-well-founded set theory is a matter of convenience only, and our ‘official’ metatheory remains first-order ZFC-foundation. (It should be noted that none of the results used below depend on the particular formulation of the Anti-foundation Axiom – any of the variants discussed in the literature will suffice. See Rieger [2000] for a nice discussion of the popular variants.)
The following demonstrates that we can, in a sense, have arbitrarily many non-well-founded sets in NewV + NewerV set theory:
Theorem 11.1: SOAP + NewerV + NewV + InfNonSets is satisfied by M(BASE, κ) where BASE is the pairwise union of any transitive set of non-well-founded sets and {⊗} and κ is any inaccessible such that κ > |BASE|.
Proof: The transitivity of BASE guarantees that for any set that is also in the basis, all of its members are in the basis. The remainder is straightforward.
The consistency of the Ω Axiom is immediate:
Corollary 11.2: SOAP + NewV + NewerV + InfNonSets + Ω Axiom is consistent. 39
Proof: Since, letting Ω be a set such that Ω = {Ω}, {Ω} is a transitive set of non-well-founded sets, M({Ω, ⊗},κ) is a model of SOAP + NewerV + NewV + InfNonSets + Ω Axiom.
Since Ω is a hereditary set, we have the desired:
Corollary 11.3: SOAP + NewV + NewerV fails to imply foundation or regularity restricted to hereditary sets.
If we combine Theorem 10.9 with the following lemma we obtain a corollary promised in Section 10:
Lemma 11.4: SOAP + NewerV + NewV implies that if x ∈N x, then x is an element of the basis.
Proof: Assume for an arbitrary a that a ∈N a. Either a is a set or a is not a set. By Theorem 10.3, if a is not a set, then a is in the basis. So assume that a is a set. Thus “x ∈N a” is not ‘Bad’, so (∃α)(ON(α) ∧ (∀x)(x ∈N a → x ∈S Stg(α))). Let β be the least ordinal such that (∀x)(x ∈N a → x ∈S Stg(β)). Assume that β > 0. It follows from the definition of stages that (∃δ)(δ < β ∧ (∀y)(y ∈N a → y ∈S Stg(δ))). Since a ∈N a, we have (∃δ)(δ < β ∧ a ∈S Stg(δ)). Contradiction, so β = 0 and a is in the basis.
Corollary 11.5: SOAP + NewerV + NewV has models with finitely many non-sets but infinitely many members of the basis.
36 Interestingly, GAS implies that NUM(x ≠ x) = EXT(x ≠ x), i.e., 0 = ∅. Significantly, anti-zero, the number of the universe, which has been a topic of controversy since Boolos [1997] (p. 314), is (provably) not identical to the Bad extension ⊗.
37 Uzquiano [1999] proves that second-order Zermelo set theory with the axiom of regularity has models where second-order foundation fails. This provides us with another way in which NewerV is stronger than Zermelo set theory, since NewerV implies the equivalence of foundation and regularity.
38 That is, (∃x)(Set(x) ∧ (∀y)(y ∈N x ↔ y = x)).
Proof: Let BASE be any infinite transitive set of non-well-founded sets, and κ an inaccessible cardinal such that κ > |BASE|. Then M(BASE ∪ {⊗},κ) is a model of SOAP + NewerV + NewV.
39 Note that Ω is a member of the basis but not an urelement.
Thus, SOAP + NewerV + NewV does not rule out the existence of non-well-founded sets. SOAP + NewerV + NewV does rule out the simultaneous existence of all non-well-founded sets, however. The following is a theorem of Non-well-founded Set Theory:
(∀x)(∃y)(∀z)(z ∈ y ↔ (z = y ∨ z ∈ x))
In other words, given any set x there is a set y (not necessarily distinct) that contains exactly the members of x and itself. This can be expressed within NewV + NewerV set theory as:
WeakNWF: (∀P)(¬Bad(P) → (∃Q)(¬Bad(Q) ∧ (∀x)(Qx ↔ (Px ∨ x = EXT(Q)))))
This principle, far weaker than any of the popular formulations of the Anti-foundation Axiom, is nevertheless incompatible with NewerV + NewV set theory:
Theorem 11.6: SOAP + NewerV + NewV + WeakNWF is inconsistent.
Proof: There is an obvious correspondence between the ordinal numbers as provided by SOAP and the sets usually referred to as “ordinals” (i.e. the transitive pure sets well-ordered by ∈). By previous results, “x is an ordinal number” is Bad, and thus Big, so the collection of ordinals is Big. Consider a function f such that f(x) = y iff for all z, z ∈N x or z = y iff z ∈N y (the existence of such a function is guaranteed by WeakNWF and Choice). Note that if f(x) = y, it follows that y ∈N y. Assume that f(x) = y = f(z) for x, z ordinals. Then for all w, w ∈N x or w = y iff w ∈N y iff w ∈N z or w = y. So, for an arbitrary w, if w ∈N x then either w ∈N z or w = y. But since foundation holds of the ordinals, w ≠ y. Thus x ⊆N z, and similarly z ⊆N x. Thus, x = z, so f restricted to the ordinals is 1–1. So the image of the ordinals under f is Big. But for any y, if y is in the image of the ordinals under f, then y is in the basis. Thus, the property corresponding to “x ∈ BASE” is Big. Contradiction.
Thus, although set theory based on NewV plus NewerV is consistent with the existence of (some) non-well-founded sets, it cannot tolerate the addition of all of them. 40
40 It is worth noting that the following principle of Urelement-Basis:
Ur-Base: (∀x)(BASE(x) ↔ UR(x))
provides us with the following:
Fact: SOAP + NewerV + NewV + Ur-Base implies the axiom of foundation.
12. Philosophical lessons
There are four main areas of interest that arise in the comparison of the limitation of size conception of set (as codified in NewV) and the iterative conception of set (as codified in SOAP + NewerV). Each is associated with the status of one of the standard set theoretic axioms. The axioms at issue are the axioms of powerset, replacement, infinity, and foundation. As we have seen, NewV implies the replacement axiom but fails to secure the truth of powerset; alternatively SOAP + NewerV implies the powerset axiom but fails to secure replacement. Thus, if we are forced to choose a single abstraction principle that provides the “definition” of set, then we are left here with a dilemma – given a choice, would we rather have the powerset axiom and forego replacement, or have replacement and forego powerset? Of course, this choice is not just a matter of personal preference. The powerset axiom is necessary for the formalization of much of contemporary mathematics within set theory, taking us from the naturals (modelled, e.g., by the finite ordinals) to the reals (modelled by the sets of finite ordinals), from the reals to the theory of functions on the reals (modelled by sets of sets of finite ordinals), and so on. The axiom of replacement, however, is for the most part only used in rather obscure and esoteric branches of pure set theory such as accounting for the behavior of transfinite ordinals and cardinals. 41 If we are interested in using our set theory to provide a foundation for much or all of mathematics, including but not limited to the mathematics necessary for doing science, then when faced with the option of having powerset or replacement but not both, the appropriate choice seems clear – powerset is more crucial to formulating modern mathematics within set theory. Nevertheless, in failing to imply one or the other of these central axioms, neither abstraction principle seems to satisfy the demand for mathematical adequacy.
41 I am not arguing either for the claim that replacement is unneeded in mathematics (the fact that we cannot prove that every Borel game is determined without replacement rules this out) nor for the claim that powerset is necessary to reconstruct modern mathematics (the fact that many mathematicians, either out of constructivist scruples or mathematical curiosity, have formulated interesting versions of analysis and other theories that do not depend on uncountable infinities rules this out). The point is merely that, given the situation as it stands now, if the neo-logicist can have a set theory with one or the other but not both of these principles, choosing powerset over replacement seems well motivated given that powerset allows for elegant and natural reconstructions of the continuum and other central mathematical structures while there are comparatively fewer constructions and (currently identified) results that depend on replacement.
Once we adopt SOAP + NewerV + NewV (or, perhaps more plausibly, SOAP + NewerV∗ + NewV∗ + GAS as discussed in Section 9), the worries about replacement and powerset evaporate – both are derivable from the conjunction of these abstraction principles. Additionally, this approach dovetails nicely with the historical development of axiomatic set theory, motivated as it was by two competing conceptions of set each corresponding to one of the abstraction principles in question. Of course, we have not dissolved all problems for a neo-logicist set theory, or even all problems relating to replacement and powerset, in this paper. Nevertheless, the theory based on NewV + NewerV seems to be the most promising candidate so far for a neo-logicist account of sets. 42
The most pressing problem for a neo-logicist reconstruction of set theory, however, whether based on one abstraction principle or many, is the axiom of infinity. Unfortunately, the axiom of infinity does not follow from NewV, SOAP + NewerV, or SOAP + NewerV + NewV. Thus, the neo-logicist needs to find some additional principle that implies that there is an infinite set. Some possibilities have been explored above. Again, NewV + NewerV set theory comes out on top, as the axiom of infinity follows merely from the additional assumption that there are infinitely many non-sets (i.e. InfNonSets), which in turn follows from other neo-logicist principles plus our identity principle GAS. Even if GAS turns out not to be neo-logicistically acceptable, InfNonSets certainly seems less problematic than the assumptions needed to obtain the axiom of infinity when working within NewV set theory or the theory of SOAP + NewerV. SOAP + NewerV + NewV + InfNonSets is a promising candidate for a sufficiently powerful neo-logicist account of sets.
With regard to the foundation axiom, however, SOAP + NewerV + NewV is, in one sense, little better than the weaker theories. Foundation holds of the Boolos-pure sets, but can fail on the hereditary sets. Perhaps this is as it should be, since the axiom of foundation is of a different character than the other standard axioms of ZFC. Each of the other axioms is of one of two forms. First we have straight existential claims:
(∃y)(∀z)(z ∈ y ↔ Φ(z))
Infinity and empty set are axioms of this type. Second, we have conditional existence claims:
(∀x1)(∀x2) · · · (∀xn)(∃y)(∀z)(z ∈ y ↔ Φ(z, x1, x2, . . ., xn))
These axioms state that, given any sequence of sets (or objects), a second set with a certain relation to the given sequence of sets (objects) also exists.
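As a concrete illustration of the two forms, the block below gives familiar instances of each, writing Φ for the defining condition. The instances are stated in the usual first-order style, without the Set(x) relativization used in the Appendix, and are offered as examples only.

```latex
% Illustrative instances of the two axiom forms; Phi is the defining
% condition on the members of the new set. Requires amsmath.
\begin{align*}
\text{Empty set:}\quad & (\exists y)(\forall z)(z \in y \leftrightarrow z \neq z)
  && \Phi(z) := z \neq z\\
\text{Pairing:}\quad & (\forall x_1)(\forall x_2)(\exists y)(\forall z)
  (z \in y \leftrightarrow (z = x_1 \lor z = x_2))
  && \Phi(z, x_1, x_2) := z = x_1 \lor z = x_2\\
\text{Powerset:}\quad & (\forall x_1)(\exists y)(\forall z)
  (z \in y \leftrightarrow (\forall w)(w \in z \rightarrow w \in x_1))
  && \Phi(z, x_1) := (\forall w)(w \in z \rightarrow w \in x_1)
\end{align*}
```

The first instance is a straight existential claim; the other two are conditional existence claims with n = 2 and n = 1 respectively.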
Separation, powerset, replacement, union, and pairing are all conditional existence axioms. Foundation, on the other hand, is of a very different logical character, displaying the following logical form:
(∀x)Φ(x)
42 Of course, there are abstraction principles, such as the ‘distractions’ found in Shapiro and Weir [1999] and Weir [2004], that provide all of ZFC and more. For example, we can define “Bad(P)” as “P is the size of an inaccessible” (a notion definable in second-order logic) and then consider:
(∀P)(∀Q)[EXT(P) = EXT(Q) ↔ ((∀x)(Px ↔ Qx) ∨ (P is ‘Bad’ ∧ Q is ‘Bad’))]
This principle will give us all of ZFCU. The point, however, is that SOAP + NewerV + NewV is likely the best candidate for a theory based on abstraction principles that defines extensions in terms of conditions that are well-motivated and can be justified independently of an extensive prior knowledge of set theory.
Foundation does not imply the existence of any new sets, but instead imposes a restriction on what sorts of characteristics sets can have, and thereby restricts what sets can in fact exist. This restriction was originally motivated by a certain view about how sets should be structured, a view motivated by the idea that the set theoretic paradoxes were caused by circularity. Even if this restriction is motivated by an intuitive picture that also underlies the iterative conception of set, a conception that we have accepted in the form of NewerV, we need not feel forced to thereby accept a wholesale ban on circularity. Instead we can view the neo-logicist as replacing one response to the set theoretic paradoxes (an overwhelming fear and avoidance of anything circular) with another response (the idea that acceptable abstraction principles provide secure foundations for mathematical theories irrespective of circularity). In reconstructing contemporary set theory he is forced to adopt some of the restrictions that evolved from the prior view of the nature of sets (i.e. those following from the iterative picture of sets as codified by NewerV), but he is free to countenance circular sets to the extent that his new set theory allows. On this way of viewing things it is perfectly reasonable that neo-logicist set theory implies that certain sorts of sets obey the axiom of foundation, but leaves open the question of whether all do. Another way of making the same point is to note that the neo-logicist can construct a set theory that implies all of the axioms of second-order ZFC if he merely modifies his definition of “set”. Instead of having sets be the extensions of properties that are not Big (or Bad), let sets be those objects that are both extensions of properties that are not Big and contained in every Boolos-closed concept (i.e. the Boolos-pure sets). 
As we have seen, SOAP + NewerV + NewV + InfNonSets proves all of the axioms of second-order ZFC restricted to these objects. Thus, assuming that all of these principles are neo-logicistically acceptable, NewerV + NewV set theory (with a principle guaranteeing an infinite set) in a sense “contains” full second-order ZFC, but also leaves room for the existence of other, non-well-founded sets. To sum up: SOAP + NewerV + NewV + InfNonSets provides the neo-logicist with a set theory that is (roughly) as strong as full second-order ZFC. As already noted, detailed philosophical defense of the acceptability of these principles is still necessary. Nevertheless, the mathematical problem – determining whether there is a mathematically adequate neo-logicist set theory – seems to be solved.
Appendix
Although I have given these proofs (and the ones in the main body of the text) rather informally, each of them can be straightforwardly (though tediously) rewritten as a formal deduction within second-order logic.
Lemma A.1: (¬Bad(P) ∧ (∀x)(Qx → Px)) → ¬Bad(Q)
Proof: Assume that P is not Bad and that (∀x)(Qx → Px) holds. Then (∃α)(ON(α) ∧ (∀x)(Px → x ∈S Stg(α))). So (∃α)(ON(α) ∧ (∀x)(Qx → x ∈S Stg(α))). So Q is not ‘Bad’.
Lemma A.2: (¬Bad(P) ∧ EXT(P) = EXT(Q)) → ¬Bad(Q)
Proof: Assume P is not ‘Bad’ and EXT(P) = EXT(Q). Then either P is Bad and Q is Bad, or (∀x)(Px ↔ Qx), so, by Lemma A.1, since P is not Bad, Q is not Bad.
Lemma A.3: ¬Bad(P) → (∀x)(x ∈N EXT(P) ↔ Px)
Proof: Assume that P is not Bad. Given an arbitrary x, if x ∈N EXT(P) then (∃Q)[Qx ∧ EXT(P) = EXT(Q)]. Since P is not Bad, this implies that (∃Q)[Qx ∧ (∀y)(Py ↔ Qy)], that is, Px. Similarly, given an arbitrary x such that Px, it follows that [Px ∧ EXT(P) = EXT(P)], so (∃Q)[Qx ∧ EXT(P) = EXT(Q)], i.e., x ∈N EXT(P).
Theorem A.4: NewerV entails Extensionality:
(∀x)(Set(x) → (∀y)(Set(y) → ((∀z)(z ∈N x ↔ z ∈N y) → x = y)))
Proof: Let x and y be sets. Then x = EXT(P) and y = EXT(Q) where P and Q are not Bad. It follows, by Lemma A.3, that (∀z)(z ∈N EXT(P) ↔ Pz) and (∀z)(z ∈N EXT(Q) ↔ Qz). Assume that (∀z)(z ∈N x ↔ z ∈N y) holds, that is, (∀z)(z ∈N EXT(P) ↔ z ∈N EXT(Q)). Then (∀z)(Pz ↔ Qz), so EXT(P) = EXT(Q), or x = y.
Theorem A.5: NewerV entails Empty Set:
(∃x)(Set(x) ∧ (∀y)(y ∉N x))
Proof: Let x = EXT(y ≠ y). Then Set(x), since 1 is an ordinal number and (∀y)(y ≠ y → y ∈S Stg(1)). Additionally, (∀y)(y ∉N x), since for no z is it the case that z ≠ z.
Theorem A.6: NewerV entails Separation:
(∀P)(∀x)(Set(x) → (∃y)(Set(y) ∧ (∀z)(z ∈N y ↔ (z ∈N x ∧ Pz))))
Proof: Let P be a property and x a set. Then x = EXT(Q) where Q is not Bad. Let y = EXT(P ∧ Q). Then Set(y), since P ∧ Q is not Bad by Lemma A.1. Also, for any z, z ∈N y iff (Pz and Qz) iff Pz and z ∈N x.
Theorem A.7: NewerV entails Union∗:
(∀x)(Set(x) → (∃y)(Set(y) ∧ (∀z)(z ∈N y ↔ (∃w)(Set(w) ∧ (z ∈N w ∧ w ∈N x)))))
Proof: Let x be a set, so that x = EXT(P) where P is not Bad. Thus, (∃α)(ON(α) ∧ (∀u)(Pu → u ∈S Stg(α))). If w ∈N x, then by Lemma A.3, Pw, so w ∈S Stg(α). In other words, either w is in the basis (w ∈S Stg(0)), or (∃Q)(w = EXT(Q) ∧ (∃β)(β < α ∧ (∀y)(Qy → y ∈S Stg(β)))). So, if w is a set, then for any z ∈N w, Qz, so z ∈S Stg(β), and thus, for all z and w such that z ∈N w and w ∈N x, z ∈S Stg(α). Let S be the property holding of just the members of members of x. Then (∀y)(Sy → y ∈S Stg(α)), so S is not Bad, EXT(S) is a set, and it is the union of x.
Theorem A.8: NewerV entails Pairing: (∀x)(Set(x) → (∀y)(Set(y) → (∃z)(Set(z) ∧ (∀w)(w ∈N z ↔ (w = x ∨ w = y)))))
Proof: Let x and y be sets. Then there is a P such that x = EXT(P) where (∃α)(ON(α) ∧ (∀u)(Pu → u ∈S Stg(α))), and there is a Q such that y = EXT(Q) where (∃β)(ON(β) ∧ (∀u)(Qu → u ∈S Stg(β))). Let δ = max(α, β). Then x ∈S Stg(δ + 1) and y ∈S Stg(δ + 1). Let F be the property that holds of exactly x and y. Since every instance of F is in Stg(δ + 1), F is not Bad, so EXT(F) is a set, and it is the pair set of x and y.
Theorem A.9: NewerV entails Powerset: (∀x)(Set(x) → (∃y)(Set(y) ∧ (∀z)(z ∈N y ↔ (∀w)(w ∈N z → w ∈N x))))
Proof: Let x be a set, i.e., x = EXT(P) where (∃α)(ON(α) ∧ (∀u)(Pu → u ∈S Stg(α))), and let y be a subset of x, i.e., there is a Q where y = EXT(Q) and (∀u)(Qu → Pu). So (∀u)(Qu → u ∈S Stg(α)), and it follows that y ∈S Stg(α + 1). Let S be the property holding of exactly the subsets of x. Since every instance of S is in Stg(α + 1), S is not Bad, so EXT(S) is a set, and it is the powerset of x. 43
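The remark at the head of this appendix, that each proof can be rewritten as a formal deduction, can be illustrated for Lemma A.1 and for the non-Badness step in the proof of Theorem A.6. What follows is only a sketch in Lean 4, under simplifying assumptions that are not in the text: the type `U` stands in for the domain of objects, the type `Ord` for the ordinals (so the clause ON(α) is absorbed into the typing of α), and `Stg α x` for x ∈S Stg(α). The names `lemma_A1` and `separation_step` are likewise hypothetical labels, and "P is not Bad" is written out directly as the existential claim used in the proofs above.

```lean
-- Sketch only: `U`, `Ord`, and `Stg` are assumed stand-ins, not definitions
-- taken from the text. "P is not Bad" is rendered as
--   ∃ α, ∀ x, P x → Stg α x.

/-- Lemma A.1: if P is not Bad and every Q is a P, then Q is not Bad. -/
theorem lemma_A1 {U Ord : Type} (Stg : Ord → U → Prop) (P Q : U → Prop)
    (hP : ∃ α, ∀ x, P x → Stg α x)
    (hQP : ∀ x, Q x → P x) :
    ∃ α, ∀ x, Q x → Stg α x :=
  match hP with
  -- The stage α that bounds P also bounds Q.
  | ⟨α, hα⟩ => ⟨α, fun x hQ => hα x (hQP x hQ)⟩

/-- The step used for Theorem A.6: if Q is not Bad, neither is P ∧ Q. -/
theorem separation_step {U Ord : Type} (Stg : Ord → U → Prop) (P Q : U → Prop)
    (hQ : ∃ α, ∀ x, Q x → Stg α x) :
    ∃ α, ∀ x, (P x ∧ Q x) → Stg α x :=
  -- Every instance of P ∧ Q is an instance of Q, so Lemma A.1 applies.
  lemma_A1 Stg Q (fun x => P x ∧ Q x) hQ (fun _ h => h.2)
```

The second theorem simply instantiates the first, mirroring the appeal to Lemma A.1 in the informal proof of Separation. Formalising the theorems themselves would additionally require axiomatising EXT and NewerV, which this sketch does not attempt.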
43 This paper was written during the tenure of an AHRB research fellowship at Arché: The Centre for the Philosophy of Logic, Language, Mathematics, and Mind at the University of St. Andrews. A version of the paper was presented to the members of Arché and benefited greatly from the resulting discussion. Thanks go to Peter Clark, Roy Dychoff, Philip Ebert, Fraser MacBride, Jon Mayberry, Cyrus Panjvani, Nikolaj Pedersen, Agustín Rayo, Marcus Rossberg, Stewart Shapiro, Robbie Williams, and Crispin Wright for illuminating discussions on this topic. The paper was greatly improved due to extensive discussions with, and comments from, Gabriel Uzquiano, to whom a special debt is owed.
References
Aczel, P. [1988], Non-well-founded Sets, Stanford CA, CSLI.
Boolos, G. [1971], “The Iterative Conception of Set”, The Journal of Philosophy 68: 215–232, reprinted in Boolos [1998]: 13–29.
Boolos, G. [1989], “Iteration Again”, Philosophical Topics 17: 5–21, reprinted in Boolos [1998]: 88–104.
Boolos, G. [1997], “Is Hume’s Principle Analytic?”, in Heck [1997]: 245–261, reprinted in Boolos [1998]: 301–314.
Boolos, G. [1998], Logic, Logic, and Logic, Cambridge, Mass., Harvard University Press.
Cook, R. and P. Ebert [2004], review of The Limits of Abstraction, by K. Fine, The British Journal for the Philosophy of Science 55: 791–800.
Fine, K. [2002], The Limits of Abstraction, Oxford, Oxford University Press.
Frege, G. [1884], Die Grundlagen der Arithmetik, Breslau, Koebner; The Foundations of Arithmetic, tr. by J. Austin, 2nd ed., New York, Harper, 1960.
Frege, G. [1893], Grundgesetze der Arithmetik I, Hildesheim, Olms.
Hale, R. [2000], “Reals by Abstraction”, Philosophia Mathematica 3: 100–123.
Hallett, M. [1984], Cantorian Set Theory and Limitation of Size, Oxford, Clarendon Press.
Heck, R. [1997], Logic, Language, and Thought, Oxford, Oxford University Press.
Kunen, K. [1980], Set Theory: An Introduction to Independence Proofs, Amsterdam, North Holland.
Rieger, A. [2000], “An Argument for Finsler–Aczel Set Theory”, Mind 109: 241–253.
Shapiro, S. [1991], Foundations without Foundationalism: A Case for Second-Order Logic, Oxford, Oxford University Press.
Shapiro, S. [forthcoming], “Prolegomenon to any Future Neo-Logicist Set Theory: Abstraction and Indefinite Extensibility”, The British Journal for the Philosophy of Science.
Shapiro, S. and A. Weir [1999], “NewV, ZF, and Abstraction”, Philosophia Mathematica 7: 901–929.
Uzquiano, G. [1999], “Models of Second-Order Zermelo Set Theory”, The Bulletin of Symbolic Logic 5: 289–302.
Uzquiano, G. and I. Jané [2004], “Well- and Non-Well-Founded Extensions”, Journal of Philosophical Logic 33: 437–465.
Weir, A. [2004], “Neo-Fregeanism: An Embarrassment of Riches”, Notre Dame Journal of Formal Logic 44: 13–48, reprinted below as chapter 19.
Wright, C. [1983], Frege’s Conception of Numbers as Objects, Aberdeen, Aberdeen University Press.