Vagueness and Degrees of Truth
Vagueness and Degrees of Truth Nicholas J. J. Smith
Great Clarendon Street, Oxford Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries Published in the United States by Oxford University Press Inc., New York Nicholas J. J. Smith 2008 The moral rights of the author have been asserted Database right Oxford University Press (maker) First published 2008 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer British Library Cataloguing in Publication Data Data available Library of Congress Cataloging in Publication Data Data available Typeset by Laserwords Private Limited, Chennai, India Printed in Great Britain on acid-free paper by Biddles Ltd., King’s Lynn, Norfolk ISBN 978–0–19–923300–7 10 9 8 7 6 5 4 3 2 1
For my parents, Sybille and Vivian Smith, with love and gratitude
Acknowledgements
Thanks to Huw Price, who since I was an undergraduate has been an unfailing source of support, philosophical insight, and good advice. For valuable comments, discussions, and encouragement, I thank David Armstrong, Paul Benacerraf, David Braddon-Mitchell, Bob Brandom, John P. Burgess, Keith Campbell, Mark Colyvan, John Cusbert, Kit Fine, Alan Hájek, Amitavo Islam, Jenann Ismael, the late Dick Jeffrey, Mark Johnston, Ed Mares, Ken Perszyk, Graham Priest, Gideon Rosen, and Scott Soames. Thanks to Peter Momtchiloff and Kate Walker of Oxford University Press for their guidance, to the three readers for their very useful comments, and to Jean van Altena for her helpful copy-editing. Thanks to my wife, Cath Vidler—for everything. Parts of Smith (2003), (2004), and (2005b) show up in reworked form in §5.1, §§5.2 and 5.6, and Chapters 3 and 4 respectively. Thanks to Oxford University Press, Springer, and Taylor & Francis for permission to reuse this material.
Contents
Introduction
Part I. Foundations
1. Beginnings
  1.1 Toolkit
  1.2 The Classical Semantic Picture
2. The Space of Possible Theories of Vagueness
  2.1 Epistemicism (Deny Nothing)
  2.2 Additional Truth Values (Deny 1a)
  2.3 Truth Gaps (Deny 1b)
  2.4 Supervaluationism (Deny 1c)
  2.5 Plurivaluationism (Deny 2)
  2.6 Contextualism
  2.7 Intuitionism (Assert Nothing)
Part II. Vagueness
3. What is Vagueness?
  3.1 What Should We Want from a Definition of Vagueness?
  3.2 Existing Definitions of Vagueness
  3.3 Closeness
  3.4 Vagueness as Closeness
  3.5 The Advantages of Closeness
4. Accommodating Vagueness
  4.1 Epistemicism
  4.2 Additional Truth Values and Truth Gaps
  4.3 Supervaluationism
  4.4 Plurivaluationism
  4.5 Contextualism
Part III. Degrees of Truth
5. Who’s Afraid of Degrees of Truth?
  5.1 On the Very Idea of Degrees of Truth
  5.2 Classical Logic
  5.3 A Gated Community in Theory Space?
  5.4 Truth and Assertibility
  5.5 Truth-Functionality
  5.6 Denying Bivalence
  5.7 Different Senses of ‘Fuzzy Logic’
6. Worldly Vagueness and Semantic Indeterminacy
  6.1 Artificial Precision
  6.2 Sharp Boundaries
Conclusion
References
Index
Introduction
Ordinarily we say that persons are vague if they regularly misplace their car keys, stare blankly into the fridge having forgotten why they opened it, and so on. In philosophy, however, the term ‘vague’ has a different use. It applies primarily (although not exclusively) to predicates.¹ Amongst all predicates, the vague ones are usually singled out in one or more of three ways:
Borderline cases. There are some persons to whom the predicate ‘is tall’ clearly applies (e.g. most professional basketball players) and some to whom it clearly does not apply (e.g. most professional jockeys), but then there are other persons to whom it is unclear whether or not the predicate applies (I’m sure you know some of them). When asked whether such a person is tall, we tend to react with some sort of hedging response: “sort of”, a shrug and a certain sort of scowl or exhalation of breath, a blank look, etc. Call these persons borderline cases for ‘tall’. Other predicates—for example, ‘is a prime number’—do not have borderline cases. Julius Caesar is not a prime number, seven is a prime number, eight is not a prime number, and so on: there is nothing at all of which it is unclear whether it is a prime number.² One standard way of drawing the distinction between vague and non-vague (or precise) predicates is to say that vague predicates have borderline cases, while precise predicates do not.³
¹ Predicates are items of language—e.g. ‘is tall’, ‘is happy’, ‘is running’—which go together with names—e.g. ‘Bill’, ‘Ben’, ‘Alice’—or definite descriptions—e.g. ‘the tallest woman in the room’, ‘the inventor of post-it notes’, ‘the man who threw the egg’—to form sentences—e.g. ‘Bill is tall’, ‘The inventor of post-it notes is happy’, ‘Alice is running’.
² For large numbers, we may suppose that we can use a computer to help us determine the answer.
³ The borderline case characterization can be traced at least as far as Peirce 1902.
Blurred boundaries. Imagine a line drawn around all the things to which a given predicate applies. One typical characteristic of vague predicates is that we get a blurry, ill-defined line. For example, there is no sharp boundary between the things to which ‘is heavy’ applies and the things to which it does not apply. For a precise predicate, by contrast, we get a sharp line, cleanly separating the things to which the predicate applies from the rest.⁴
Sorites paradoxes. Consider a hirsute person. He is not bald. Were we to remove a single one of his hairs, he still would not be bald. So too if we removed another, and so on—because one hair cannot ever make the difference between baldness and lack thereof. But our man has only a finite number of hairs, so eventually he will have none left. By our reasoning, he will still not be bald—whereas of course a person with no hairs on his head is bald. So something has gone wrong somewhere. This is an instance of the Sorites paradox. Note that the problem does not arise for all predicates. Instead of ‘is bald’, take the predicate ‘has ten or more hairs on his head’. Consider a hirsute person. He has ten or more hairs on his head. Were we to remove a single one of his hairs, he would still have ten or more hairs on his head. So too if we removed another, and another—but not ‘and so on and on’, because one hair can make the difference between having ten or more hairs, and having nine or less. In this case we get no paradox. One standard way of drawing the distinction between vague and precise predicates is to say that vague predicates generate Sorites paradoxes, while precise predicates do not. That is, where ‘is P’ is vague, we will typically be able to imagine a so-called Sorites series of objects, ranging from one which is clearly P to one which is clearly not P, but where we also feel strongly inclined to say that for any object in the series, if it is P, then so are its neighbours.⁵
So we are interested in predicates which admit of borderline cases, which draw blurred boundaries, and which generate Sorites paradoxes. Why are we interested in such predicates?
For at least two reasons. First, vagueness is ubiquitous. Most of the predicates in our language are vague. Indeed, it is very hard to think of a predicate which is not at all vague. Some predicates have a wider class of borderline cases than others, but it is quite a challenge to think of predicates outside mathematics and the hard sciences which admit of no possibility of borderline cases—whose boundaries are not in the slightest bit blurry. (Try it now!) So vagueness is something we cannot avoid. If we want to understand our language and our world, we need to understand vagueness. Second, there is immense potential for many other research programmes to be significantly advanced by progress on vagueness. When a key concept in a certain area is vague—and thanks to the ubiquity of vagueness, key concepts in many areas are vague—understanding how its vagueness works can have crucial payoffs.
⁴ The blurred boundaries characterization can be traced at least as far as Frege’s statement that if we represent concepts in extension by areas on a plane, then vague concepts do not have sharp boundaries, but rather fade off into the background (Grundgesetze, ii, §56; trans. in Beaney 1997, 259).
⁵ More precise characterizations of Sorites series and Sorites paradoxes will be presented below in §3.5.4.
So we know which predicates we want to study, and why we want to study them. What exactly do we hope to gain from the study? What is our goal? What we ultimately want to gain from the study of vagueness is an account of the semantics of vague predicates—an account of what the meaning of such a predicate consists in, and of how (if at all) it differs from the meaning of a precise predicate.
The development of modern logic (also known as classical logic) and model theory in the nineteenth and twentieth centuries led to immense progress in our understanding of language. However, the pioneers of these developments—notably Gottlob Frege and Alfred Tarski—were primarily concerned with mathematical language, and they ignored vagueness. Frege in particular set vagueness to one side as not worthy—or not susceptible—of treatment. He considered vagueness a defect of natural language, to be banished from a logically perfect scientific language. The problem with this attitude is that mathematical language is a special case: it is completely precise. Outside mathematics, virtually all of our language is vague, to a greater or lesser extent.
Beginning in real earnest in the second half of the twentieth century, this realization that vagueness is ubiquitous in natural language led to a desire to approach ordinary language in the same spirit in which Frege approached mathematical language—in a bid not to eliminate vagueness, but to understand and adequately represent it. This led to the development of a large number of non-classical logics, and associated theories of vagueness (where by a theory of vagueness, I mean a formal logical core, together with a package of motivating and explanatory philosophical views). There are now many non-classical logics, and associated theories of vagueness, on the market. So in one sense we have advanced greatly since Frege’s time: we now have many systems of semantics designed specifically to accommodate vague language. In another sense, however, we are still no closer to our goal of obtaining an account of what the meaning of a vague predicate consists in—for we have
many accounts, and no apparent way of deciding which (if any) of them is correct.
Part I: Foundations
As a first step towards rectifying this situation, in Chapter 2—building on basic preliminary material which it is the purpose of Chapter 1 to present—I give an overview of the space of possible theories of vagueness, and show where existing theories live in this space. For we cannot begin to select a theory of vagueness until we fully understand how the different theories differ from one another: not just over particular matters of detail, but at a more fundamental and illuminating level of analysis. Accordingly, Chapter 2 provides a conceptual map of theories of vagueness, setting out the important axes along which different theories vary, and then giving the coordinates of existing and future theories on these axes. At the centre of the map is the classical view of vagueness, according to which vague language is—from a semantic point of view—exactly the same as precise language. This view applies the standard model theory for precise mathematical language to vague language. I begin by isolating the key elements of this classical model theory: these key elements give the axes of the space. With the classical view at the origin, other theories are located according to which of the key elements they modify, and then, at a finer level of analysis, according to how they modify them.
It is in this chapter that a distinction which plays a central role in this book first emerges: the distinction between worldly vagueness and semantic indeterminacy. Here is a quick sketch of how it emerges. Model theory involves two aspects: a part which represents a language, and a part—a model—which represents the world. The process whereby items of language gain meaning is represented by matching up a language with a model—which is called giving an interpretation of the language. The underlying thought is that our words gain meaning when we use them to talk about the world. But language is versatile, and words could mean many things.
Consider the sentence ‘The house is on the hill’. The symbol ‘house’ could have been used to mean what the symbol ‘cat’ actually means, and the symbol ‘hill’ could have been used to mean what the symbol ‘mat’ actually means, so there is an interpretation of the sentence on which it means
that the cat is on the mat. But the interpretation on which the sentence means that the house is on the hill is privileged: it corresponds to what the sentence actually means. Such an interpretation is called the intended interpretation. In isolating the key elements of the classical model theory for precise language, I make a broad distinction between the internal nature of classical models, and the external part of the classical story, according to which each discourse has a unique intended interpretation. The classical picture tells us on the one hand that the world is of such and such a sort (it has a structure which can be represented by a classical model, and classical models have such and such internal features), and on the other hand that the meaning of anything we say is correctly specified by giving a single such model. This distinction yields two broad ways in which a theory of vagueness can differ from the classical view: it can replace classical models with non-classical models with different internal features, or it can alter the external story according to which only one model comes into play in describing what some utterance actually means. This distinction between two ways of departing from classical model theory underlies the distinction between worldly vagueness and semantic indeterminacy. Vagueness and indeterminacy are ruled out of the classical picture in two places. First, there is no vagueness in the world. Classical models are entirely precise: they represent the world as a crisp set of objects and properties such that for each property and each object, the object either definitely possesses the property, or definitely does not possess it. Second, there is no semantic indeterminacy—no vagueness in the relationship between language and the world. Each sentence has a unique intended model: so there is no vagueness about what any sentence means. A view which differs from the classical picture in the internal way (i.e. 
replaces classical models with non-classical models with different internal features) opens up the possibility of worldly vagueness, while a view which differs in the external way (i.e. denies that only one model comes into play in describing what some utterance actually means) opens up the possibility of semantic indeterminacy. For example, the view of vagueness built on fuzzy set theory (see §2.2.1) countenances worldly vagueness. It tells us that when we utter a vague predicate, we pick out a unique property in the world—the extension of our predicate on the unique intended interpretation of our utterance—but this property is inherently vague, in the sense that some objects possess it, some do not possess it, and others possess it to various intermediate degrees. This is a
matter of how things are out there in the world (on this view)—it is not a matter of the relationship between language and the world. In contrast, semantic indeterminacy enters when we have a view which denies that an utterance has a unique intended interpretation. On this sort of view, when we utter a vague predicate, there are several properties we might be picking out—the extensions of this predicate on the various equally intended (or equally not-unintended) interpretations—and there is no fact of the matter as to which, in particular, we mean. Here we have indeterminacy in the relationship between language and the world.⁶
Some surprising and significant results emerge from the survey of theories of vagueness. One is that two quite different views have been conflated in the literature under the heading ‘supervaluationism’. I call these two views ‘supervaluationism’ and ‘plurivaluationism’. This distinction is particularly important, because ‘supervaluationism’ is widely regarded as the front runner amongst existing theories of vagueness—but once we clearly distinguish the two views that have been run together under this heading, we will be in a much better position to discern the real disadvantages of each.
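The two ways of departing from the classical picture can be illustrated with a toy example. The following is only a sketch under stated assumptions: the domain, the heights, the cutoffs, and the linear ramp are all hypothetical choices made for illustration, and none of the names come from the text.

```python
# Hypothetical toy domain: three people and their heights in cm (made-up data).
heights = {"alice": 190, "bob": 175, "carol": 150}

# Classical model: the extension of 'tall' is a crisp set, so each object
# either definitely has the property or definitely lacks it.
classical_tall = {name for name, h in heights.items() if h >= 180}

def classical_value(name):
    """Truth value (True/False) of '<name> is tall' in the classical model."""
    return name in classical_tall

# Internal departure (worldly vagueness): a single intended model whose
# extension for 'tall' is a fuzzy set, mapping each object to a degree in [0, 1].
def fuzzy_tall(name, low=160, high=185):
    """Assumed linear ramp between `low` and `high`, clamped to [0, 1]."""
    h = heights[name]
    return min(1.0, max(0.0, (h - low) / (high - low)))

# External departure (semantic indeterminacy): several classical models are
# equally acceptable, differing only in where they draw the line for 'tall'.
acceptable_cutoffs = [170, 175, 180]

def values_across_interpretations(name):
    """Truth values of '<name> is tall', one per acceptable interpretation."""
    return [heights[name] >= cutoff for cutoff in acceptable_cutoffs]
```

On the fuzzy model there is one answer per sentence, but it may be an intermediate degree; on the plurality of classical models each answer is true or false, but there may be no single answer.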
⁶ In order to draw substantive conclusions about the location of vagueness—in the world itself, or in the relationship between language and the world—from a system of model theory, we must take a literal attitude towards model theory. That is, we must regard a model theory for (part of) a language as giving a literal (although not necessarily complete) description of the relationship between that language and the world. I present reasons for taking such a literal attitude towards model theories for vague language (in the context of a study such as this one) in §2.1.3.1.
Part II: Vagueness
With a clear overview of theories of vagueness in hand at the end of Chapter 2, we still face the question of deciding which theory is correct. The literature abounds with objections to particular theories, and these have led to the outright rejection of a number of views—but still, several broad types of theory remain viable (for example, epistemicism and degree-theoretic approaches, to name just two). Within each type, there is general agreement that some particular versions are better than others—but there are no widely accepted criteria which can decide between the types themselves. Theorists of vagueness have thus found themselves divided into camps. This is a very unsatisfactory situation. Given that the different types of view present completely different pictures of the relationship between vague language and the world, they cannot all be correct. We need some way of deciding which type of theory is right. We need some criteria for determining what the general form of the correct theory of vagueness should be.
There is another loose end that also cries out to be tied up at this point. We have three informal characterizations of vague predicates: they give rise to borderline cases, their extensions have blurry boundaries, and they generate Sorites paradoxes. These three characterizations seem to be quite closely related to one another—yet not so closely related that they are merely three ways of saying essentially the same thing. So what is the relationship between them? Rather than three piecemeal characterizations of vagueness, it would be desirable to have a proper definition of vagueness: a crisp statement of what is of the essence of vagueness—of what vagueness ultimately consists in—given which, we can see why vague predicates have borderline cases, generate Sorites paradoxes, and draw blurred boundaries.
My strategy is to link these two tasks: the task of finding the correct theory of vagueness and the task of finding an adequate definition of vagueness. Chapter 3 is concerned with the latter task. In that chapter, I discuss what we should expect from a definition of vagueness, critically discuss existing definitions, and then present a definition of vagueness according to which—roughly—a predicate F is vague just in case for any objects a and b, if a and b are very similar in respects relevant to the application of F, then the sentences Fa and Fb are very similar in respect of truth. So, for example, ‘is tall’ will be vague just in case for any persons a and b, if a and b are very similar in height, then the sentences ‘a is tall’ and ‘b is tall’ are very similar in respect of truth.
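The rough definition just stated can be put schematically. The notation here is an assumption introduced only for illustration and is not taken from the text: $a \approx_F b$ abbreviates "a and b are very similar in respects relevant to the application of F", $[\![Fa]\!]$ stands for the truth value of the sentence Fa, and $\approx_T$ abbreviates "very similar in respect of truth".

```latex
% Assumed notation (illustrative only):
%   a \approx_F b   : a and b are very similar in F-relevant respects
%   [\![Fa]\!]      : the truth value of the sentence Fa
%   \approx_T       : very similar in respect of truth
\[
  F \text{ is vague} \quad\text{just in case}\quad
  \forall a\,\forall b\;\bigl(\, a \approx_F b \;\rightarrow\; [\![Fa]\!] \approx_T [\![Fb]\!] \,\bigr)
\]
```

For 'is tall', the antecedent would read "a and b are very similar in height", and the consequent says that 'a is tall' and 'b is tall' are very similar in respect of truth.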
The import of this definition can be grasped by comparing it with the claim that vague predicates are tolerant. A predicate F is tolerant with respect to φ if there is some positive degree of change in respect of φ that things may undergo, which is “insufficient ever to affect the justice with which F is applied to a particular case” (Wright 1975, 334). So, for example, ‘is tall’ will be tolerant with respect to height just in case for any persons a and b, if a and b are very similar in height, then there is no difference in the applicability of the predicate ‘is tall’ to a and b—that is, the sentences ‘a is tall’ and ‘b is tall’ are exactly the same in respect of truth. The great problem with
the claim that any predicate F is tolerant is that, when conjoined with the claim that we can construct a Sorites series for the predicate F, it leads to contradiction—in particular, to the claim that each object in the Sorites series both is and is not F. My definition of vagueness is a weakening of the claim that vague predicates are tolerant: if a and b are very similar in height, then there need not be no difference in the applicability of the predicate ‘is tall’ to a and b—that is, the sentences ‘a is tall’ and ‘b is tall’ need not be exactly the same in respect of truth—but there cannot be much difference in the applicability of the predicate ‘is tall’ to a and b, and so the sentences ‘a is tall’ and ‘b is tall’ must be very similar in respect of truth. In the remainder of Chapter 3 I discuss the question of extending this definition to cover vagueness of many-place predicates, of properties and relations, and of objects, and I then explore the advantages of this definition, some of the most important of which are that it captures the intuitions which motivate the thought that vague predicates are tolerant, without leading to contradiction, and that it yields a clear understanding of the relationships between Sorites-susceptibility, blurred boundaries, and borderline cases.
In Chapter 4, I turn to the task of determining what type of theory of vagueness we need. Having clearly presented the different types of theory in Chapter 2, and having presented a crisp definition of vagueness in Chapter 3, my strategy is to ask which types of theory can accommodate vague predicates—that is, predicates which satisfy the definition given in Chapter 3. It is here, then, that the link is made between the task of finding an adequate definition of vagueness and the task of finding the correct theory of vagueness.
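The difference between tolerance and the weaker requirement of similarity in respect of truth can be checked numerically along a Sorites series. This is a minimal sketch, assuming that truth values are numbers in [0, 1], that 'is tall' has a hypothetical linear truth function, and that "very similar" means "differs by at most a small epsilon"; all of these are illustrative assumptions rather than anything stated in the text.

```python
def tall_degree(height_cm, low=150.0, high=190.0):
    """Hypothetical degree of truth of 'x is tall': linear ramp clamped to [0, 1]."""
    return min(1.0, max(0.0, (height_cm - low) / (high - low)))

# A Sorites series: heights from 200 cm down to 140 cm in 1 mm steps
# (built from integer millimetres to avoid floating-point drift).
series_mm = list(range(2000, 1399, -1))
degrees = [tall_degree(mm / 10.0) for mm in series_mm]

# A two-valued assignment with a sharp cutoff at 175 cm, for comparison.
classical = [1.0 if mm >= 1750 else 0.0 for mm in series_mm]

def is_tolerant(values):
    """Tolerance: neighbouring sentences are exactly the same in respect of truth."""
    return all(a == b for a, b in zip(values, values[1:]))

def satisfies_closeness(values, epsilon=0.01):
    """Weakened condition: neighbouring sentences differ in truth by at most epsilon."""
    return all(abs(a - b) <= epsilon for a, b in zip(values, values[1:]))

# Any tolerant assignment gives the first and last member the same value,
# which clashes with one being clearly tall and the other clearly not tall.
```

Under these assumptions the graded assignment is not tolerant, yet neighbours never differ by more than epsilon while the endpoints still differ completely; the sharp-cutoff assignment fails both conditions at the cutoff.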
When vagueness is characterized informally in terms of borderline cases, blurred boundaries, and Sorites-susceptibility, all the main existing types of theory of vagueness can be seen as accommodating vagueness—simply because the informal characterizations are so loose. This leads to the ‘too many theories’ problem: the existing theories cannot all be right, as they conflict with one another; yet how to choose between them, if they all accommodate vagueness perfectly well? The situation changes, however, once we have to hand a sharp definition of the core property underlying the various surface phenomena standardly used to characterize vagueness. When we now ask whether each type of theory allows for the existence of predicates possessing the feature isolated in Chapter 3 as being of the essence of vagueness, it turns out that the answer
is No. Only one type of theory does: the type which countenances degrees of truth. The basic idea of degrees of truth is that while some sentences are true and some are false, others possess intermediate truth values: they are truer than the false sentences, but not as true as the true ones. So, for example, if we line up a series of persons, ranging from one who is 7 feet in height to one who is 4 feet in height, in increments of a fraction of an inch, and then move along the series saying of each person in turn, ‘This person is tall’, the idea is that our statements start out quite true, then gradually get less and less true, until they end up quite false. Thus, from the project of defining vagueness, an answer emerges to the question of what the general form of the correct theory of vagueness must be: it must be one which countenances degrees of truth.⁷
In order to reach the overall conclusion of Chapter 4—that we need a theory of vagueness that countenances degrees of truth—I need to do several things in the course of the chapter. First, I show that of the types of theory of vagueness distinguished in Chapter 2, only those which countenance degrees of truth can accommodate predicates which satisfy the definition of vagueness proposed and defended in Chapter 3. Second, I consider and reject two different strategies which non-degree theorists might employ to avoid my conclusion. The first strategy is to propose an error theory of vagueness. Non-degree theorists might agree with the definition of Chapter 3, but still maintain the correctness of their semantic theory. The resulting position would be as follows: “For a predicate to be vague, it has to satisfy the definition of Chapter 3; my semantic theory does not allow for the existence of predicates that satisfy the definition of Chapter 3; but my semantic theory is correct; therefore there are no vague predicates.” I offer reasons for rejecting such an error theory of vagueness.
The second strategy is to reject the definition of Chapter 3, to propose in its place an alternative definition which is compatible with a given non-degree theory, and to argue that this alternative definition has all the advantages which I claim for my definition in Chapter 3. I consider, and give reasons for rejecting, several proposals along these lines, including the proposal that a predicate F is vague just in case for any objects a and b, if a and b are very similar in respects relevant to the application of F, then the sentences Fa and Fb are very similar in respect of assertibility (rather than truth).
⁷ Clearly, in order for the definition of vagueness to play the role of deciding between theories of vagueness, we must reject the widespread idea that a definition should be theory-neutral—that is, that the definition of P must be something on which all the candidate theories of P can agree. I discuss this issue in Ch. 3.
Part III: Degrees of Truth
The upshot at this point is that we need a theory which countenances degrees of truth. This tells us the type of theory we need, but does not lead us to a particular theory. The best-known degree theory is the one based on fuzzy logic and set theory. It replaces the two classical truth values True and False with infinitely many degrees of truth, which may be thought of as percentage values: thus a sentence might be only 45 per cent true, or 98 per cent true, and so on. In Chapters 5 and 6, I consider objections to the fuzzy view of vagueness in particular, and to degree-theoretic treatments of vagueness in general. In some cases, I argue that the objections do not carry weight. In other cases, I propose modifications and/or additions to the standard fuzzy view, in order to overcome the objections.
The main objections covered in Chapter 5 are as follows.
The very idea of truth coming in degrees is in some way a mistake! I show that it is not.
The fuzzy theory involves an objectionable violation of classical logic! In response to this objection, I propose a new account of logical consequence in the fuzzy setting that allows us to derive a classical consequence relation from fuzzy semantics. That is, while the semantics for vagueness that I propose is non-classical, it validates classical logic.
Degrees of truth cannot be integrated with key developments elsewhere in philosophy of language, outside the study of vagueness! I show that this is not correct. As part of my response to this objection, I propose a detailed new account of the relationship between degrees of truth and degrees of belief.
Degree theories, such as the fuzzy theory, which treat the logical connectives as truth functions, cannot account for ordinary usage of sentences about borderline cases! I show that this objection is mistaken.
Finally, denying bivalence—the view that every sentence is true or false, with no middle way (or ways, in the case of degrees of truth)—leads to contradiction! I show that it does not.
In Chapter 6, I turn to what is arguably the major objection to the fuzzy view: namely, that it “imposes artificial precision . . . though one is not obliged to require that a predicate either definitely applies or definitely does not apply, one is obliged to require that a predicate definitely applies to such-and-such, rather than to such-and-such other, degree (e.g. that a man 5 ft 10 in. tall belongs to tall to degree 0.6 rather than 0.5)” (Haack 1979, 443). In response to this objection, I propose a new view—which I call fuzzy plurivaluationism—which departs from the standard fuzzy view by denying that each discourse has a unique intended (fuzzy) interpretation. This view thus incorporates both worldly vagueness (in the fuzzy models) and semantic indeterminacy (in the lack of a unique intended interpretation). The overall conclusion of the book is that fuzzy plurivaluationism is the correct theory of vagueness.
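The shape of the proposal can be illustrated with Haack's own example of a man 5 ft 10 in. tall. The sketch below is illustrative only and rests on stated assumptions: each fuzzy interpretation of 'tall' is modelled as a hypothetical linear ramp over height in inches, and the three parameter pairs are invented; the text specifies none of this.

```python
def make_fuzzy_tall(low_in, high_in):
    """Return a hypothetical fuzzy extension for 'tall': height in inches -> degree in [0, 1]."""
    def degree(height_in):
        return min(1.0, max(0.0, (height_in - low_in) / (high_in - low_in)))
    return degree

# Assumed family of equally acceptable fuzzy interpretations: each is a fuzzy
# model (worldly vagueness), and none is uniquely intended (semantic indeterminacy).
acceptable = [
    make_fuzzy_tall(64, 74),
    make_fuzzy_tall(65, 75),
    make_fuzzy_tall(64, 76),
]

def degrees_of_truth(height_in):
    """Degree of 'this person is tall' on each acceptable interpretation."""
    return [round(interpretation(height_in), 2) for interpretation in acceptable]

# A man 5 ft 10 in. (70 inches) tall:
print(degrees_of_truth(70))  # [0.6, 0.5, 0.5]
```

On this toy picture no single interpretation is privileged, so nothing fixes 0.6 rather than 0.5 as the degree for the borderline case, while clear cases (say, 80 inches or 60 inches) receive the same degree on every interpretation.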
PART I
Foundations
1 Beginnings The purpose of Part I of this book is to survey the field of theories of vagueness. Good surveys already exist—most notably Williamson (1994) and Keefe (2000)—so why do we need a new one? Partly because some theories, such as contextualism, have come to prominence in the years since Williamson’s and Keefe’s books appeared. However, the chief thing that distinguishes my survey and gives it its point is not what theories I cover, but the way I cover them. Existing surveys spend a relatively small amount of time on the formal details of each theory, and a relatively large amount of time looking at objections and replies; furthermore, the formal details that are given are generally presented in the manner in which the theory in question was first presented—whatever that happens to be—i.e. no attempt is made to place the various theories within a common overarching theoretical framework. In contrast, my survey focuses on the formal details of the various theories, and it presents these details in a thematically unified way. More specifically, I focus on model theory, and on how each theory of vagueness differs from the semantic picture given by classical model theory. I thus aim to provide not a catalogue, but a conceptual map of theories of vagueness. At the centre of the map is the classical view of vagueness, according to which the semantics of vague language is exactly the same as the semantics of precise language. This view applies the standard model theory for precise mathematical language to vague language. The key elements of this classical model theory then give the axes of the space of possible theories of vagueness. With the classical view at the origin, other theories are located according to which of the key elements they modify, and then, at a finer level of analysis, according to how they modify them. There are two reasons why a survey of this sort is needed. First, I think that this unified approach is genuinely illuminating: it enables us to
understand clearly how the different theories of vagueness differ from one another, not just over particular matters of detail, but at a fundamental level; and conceptual clarification and illumination of this sort is an end in itself in philosophy. Second, just as Williamson’s and Keefe’s dialectics require them to survey objections and replies comprehensively (for they each argue for their favoured theory by way of some form of cost/benefit analysis of all the available theories; see pp. 129–30 below), my dialectic requires me to explore carefully the inner workings of each theory. As discussed in the Introduction, my strategy for selecting amongst theories of vagueness is to ask of each theory whether it can allow for the existence of vague predicates: that is, first to present a fundamental definition of vagueness, and then to ask of each theory whether it can allow that there are any predicates which satisfy this definition. In order to pursue this strategy, we need not only a very clear definition of vagueness, but a very clear understanding of the workings of each theory. Hence the need to focus on formal details, rather than surveying all known objections and replies to each theory. That said, I do note what I regard as the major objections and replies to each theory, in order to give a feel for these views to newcomers to the vagueness literature; and of course I explore objections and replies to degree-theoretic treatments of vagueness in great detail, in Part III.

My emphasis on formal details in the foregoing should not trigger alarm bells in readers with little or no background in logic. The kind of reader that I had in mind when writing the book was one with no formal background other than a first introductory logic course of one of the standard sorts.
Indeed, the book is self-contained, and someone without even this much logic will—with a little more effort—be able to follow the discussion.¹ (Readers with more formal background might, accordingly, find that I am labouring certain points, and should of course skim forwards at such places.) Chapter 2 maps the space of possible theories of vagueness. The present chapter deals with preliminaries. §1.2 presents the classical semantic picture which will be located at the origin of the map. Prior to that, §1.1 introduces some basic formal tools. ¹ I have taught several honours/postgraduate seminars based on drafts of this book: most of the students had taken no more than a first course in logic, and some had no formal background at all.
1.1 Toolkit

This section can be read through as a first introduction to the notions discussed, suitable for readers with little or no formal background. Alternatively, it can be treated as a glossary, to be skipped over entirely, and then consulted only in the event that one encounters a notion later on with which one is unfamiliar.

1.1.1 Sets

A set is a bunch of objects; these objects are said to be members or elements of the set. We denote the set containing a, b, and c as members as follows: {a, b, c}. We denote the set of all things such that some condition C holds as follows: {x : C}. For example, the set of all red things is denoted {x : x is red}. We use the symbol ∈ (epsilon) to denote membership, as in a ∈ {a, b, c}. To say that something is not a member of a set we use the symbol ∉, as in d ∉ {a, b, c}.

Sets are individuated by their members. That is, if ‘two’ sets have exactly the same members, they are identical; i.e. they are in fact one and the same set. So, for example, {a, b} = {b, a} (where = means is identical to, i.e. is one and the same object as).

Note that sets are objects. The set containing my pen, my retractable pencil, my packet of polo mints, and my gonk is another object in its own right, distinct from the four objects in it. That is why sets can be members of other sets: the set just mentioned is, for example, a member of the set of all four-membered sets. The set just mentioned is not a visible, kickable object, unlike my retractable pencil or the other objects in the set. For this reason, sets are often referred to as ‘abstract objects’. But this does not make them any less objects: they are just objects which are not visible or kickable or located in space or time. This is the guiding idea of set theory: to treat a bunch of objects as an object in its own right, which we can then do things with: for example, make it the argument or value of a function (see below), put it in another set, and so on.
1.1.2 Subsets

We say a set S is a subset of a set T (in symbols, S ⊂ T or S ⊆ T) iff every member of S is a member of T. Note that this leaves open whether or not S = T: this depends upon whether there is anything in T which is not in S. If there is nothing in T which is not in S, i.e. if T ⊂ S as well as S ⊂ T, then S = T. If there is something in T which is not in S, i.e. S ⊂ T but not T ⊂ S, then we say S is a proper subset of T, sometimes symbolized S ⊊ T. Note that every set is a subset of itself, but no set is a proper subset of itself.

There is a set called the empty set or null set, symbolized ∅, which has no elements. We say that the null set is a subset of every set. A set containing just one element, for example {1}, is called a singleton, or unit set. Note that 1 is an element of the set {1, 2, 3} but is not a subset of it, whereas {1} is a subset of the set {1, 2, 3} but is not an element of it.

Sometimes we are given a set S and we want to consider a set of subsets of S. Suppose we have the set S = {1, 2, 3, 4}, and we want to consider the set S₂ of all two-membered subsets of this set:

S₂ = { {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4} }

(Note that {2, 3} = {3, 2}, so we do not list {3, 2} separately. Similarly for {2, 1}, etc.) Note:

{1, 2} ⊂ S
{1, 2} ∈ S₂

That is, an element of S₂ is a subset of S.

One very important set of subsets of any set X is the power set of X (the set of all subsets of X), symbolized PX. For example, where X = {1, 2, 3},

PX = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3} }

1.1.3 Set Operations and Relations

When we are talking about some sets, we will generally be talking about some subsets of a given ‘background’ set; that is, we will be talking about some (maybe all) of the elements of the power set of this background set. We might not mention the background set explicitly, but it should be obvious from the context what it is. (If this ever isn’t obvious, take the background set to be some set that is sufficiently large to include all the things we are talking about in the context.)

The complement of a set A, denoted A′, is the set of all things which are not in A. (Here it is important that we are restricting ourselves to the content of some background set: A′ contains everything in the background set that is not in A, not everything at all that is not in A.) In symbols:

A′ = {x : x ∉ A}

The union of two sets A and B, denoted A ∪ B, contains everything which is in either A or B (or both):

A ∪ B = {x : x ∈ A or x ∈ B}

The intersection of two sets A and B, denoted A ∩ B, contains everything which is in both A and B:

A ∩ B = {x : x ∈ A and x ∈ B}

Two sets A and B are said to be disjoint if they have no members in common, i.e. if A ∩ B = ∅. The set A\B is the set of things which are in A but not in B:

A\B = {x : x ∈ A and x ∉ B} = A ∩ B′

If B ⊂ A, it is common to write A − B instead of A\B.

1.1.4 Ordered Pairs

An ordered pair consists of two objects, given in a particular order: one first, the other second. The ordered pair consisting of a first and b second is represented as (a, b). (It is also common to use angle brackets to denote ordered pairs, i.e. ⟨a, b⟩.) Order matters here, thus (a, b) ≠ (b, a). In general, an ordered n-tuple (triple, quadruple, etc.) consists of n objects, given in a particular order: 1st, 2nd, . . . , nth; for example (a, b, c) and (a, c, d, e, b). (Again, it is also common to use angle brackets to denote ordered n-tuples, i.e. ⟨a, b, c⟩, ⟨a, c, d, e, b⟩.)

For any sets X and Y, their Cartesian product X × Y is the set of all ordered pairs whose first member is an element of X and whose second member is an element of Y. For example if X = {1, 2} and Y = {3, 4} then

X × Y = {(1, 3), (1, 4), (2, 3), (2, 4)}

Where X and Y are the same set, the Cartesian product X × X is denoted X². The set of all ordered triples of elements of X is denoted X³, and in general the set of all ordered n-tuples of elements of X is denoted Xⁿ.
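As a rough illustration of §§1.1.1–1.1.4, the set notions introduced so far can be mirrored with Python's built-in set types. This is only a sketch: the names BACKGROUND, power_set, and complement are hypothetical choices made for the example, and the background set {1, 2, 3, 4} is simply assumed.

```python
from itertools import chain, combinations, product

# Assumed background set for this example; complements are relative to it.
BACKGROUND = frozenset({1, 2, 3, 4})


def power_set(xs):
    """Return the set of all subsets of xs (each subset as a frozenset)."""
    items = list(xs)
    tuples = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(t) for t in tuples}


def complement(a, background=BACKGROUND):
    """Everything in the background set that is not in a."""
    return background - a


A = frozenset({1, 2})
B = frozenset({2, 3})

union = A | B                              # A ∪ B
intersection = A & B                       # A ∩ B
difference = A - B                         # A\B
cartesian = set(product({1, 2}, {3, 4}))   # X × Y for X = {1, 2}, Y = {3, 4}

# The two-membered subsets of the background set (the set S₂ of §1.1.2).
two_membered = {s for s in power_set(BACKGROUND) if len(s) == 2}
```

Subsets are stored as frozensets because, like the sets of the text, they then count as objects in their own right and can themselves be members of other sets.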
1.1.5 Relations

A relation from a set X to a set Y is a subset of X × Y, that is, a set of ordered pairs whose first elements are in X and whose second elements are in Y. For example, if X is the set of females {Alice, Ethel} and Y is the set of males {Bill, Charles, Dennis}, then if Alice, Bill, and Charles are siblings, and Dennis and Ethel are siblings, the ‘brother of’ relation from X to Y is {(Alice, Bill), (Alice, Charles), (Ethel, Dennis)} and the ‘sister of’ relation from Y to X is {(Bill, Alice), (Charles, Alice), (Dennis, Ethel)}.

Where X and Y are the same set, what I have called a relation from X to Y is usually called a binary relation on X, i.e. a subset of X². If R is a binary relation on X, and x and y are elements of X, then it is common to express the fact that x stands in the relation R to y in any of the following ways: (x, y) ∈ R, R(x, y), Rxy, xRy. A ternary relation on X is a subset of X³ (i.e. a set of ordered triples of elements of X), and in general an n-place relation on X is a subset of Xⁿ.

A binary relation R on X is said to be reflexive if it satisfies the condition that ∀x ∈ X(xRx); transitive if ∀x, y, z ∈ X(xRy ∧ yRz → xRz); symmetric if ∀x, y ∈ X(xRy → yRx); antisymmetric if ∀x, y ∈ X(xRy ∧ yRx → x = y); and connected if ∀x, y ∈ X(xRy ∨ yRx). If R is reflexive, symmetric, and transitive, it is said to be an equivalence relation.

1.1.6 Functions

A function (aka map, mapping, operation) f from a set A to a set B, written

f : A → B

assigns particular objects in B to objects in A. A is called the domain of the function, and B the codomain. The essential feature of a function is that it never assigns more than one object in B to any given object in A. If x is a member of A, f(x) is the object in B which the function f assigns to x. We say that f(x) is the value of the function f for the argument x, or is the value at x; we also say x is sent to f(x) and that f(x) is hit by x. Note that A and B may be the same set.
For example, the ‘mother of’ function assigns to each person the person who is his or her mother; the ‘successor’ function assigns to each natural number the number which comes directly after it in the sequence of natural numbers.

A function f : A → B is commonly identified with the set of all ordered pairs (x, f(x)), where x is an object in A which is sent to some object in B by f, and f(x) is the object in B to which x is sent. For example the successor function is the set of ordered pairs {(0, 1), (1, 2), (2, 3), . . .}. The crucial feature of a function (that is, that it never assigns more than one object in B to any given object in A) emerges here as the requirement that no element of A appears more than once as the first element of an ordered pair in the set.

Another useful way to picture a function f : A → B is as a bunch of arrows, pointing from objects x in A to objects f(x) in B (Fig. 1.1). Binary relations can also usefully be pictured as bunches of arrows. In this depiction, functions are distinguished from relations in general by the requirement on functions that no object has more than one arrow departing from it.

Figure 1.1. Picturing a function as a bunch of arrows. [Diagram labels: A, f, B.]

A function f : A → B is said to be total if it satisfies the condition that every member of A gets sent to some member of B. A function which is not total is called partial. Such a function assigns nothing at all to some member(s) of A. In the representation of a function as a set of ordered pairs, to say that a ∈ A is assigned no value by the partial function f : A → B is to say that a does not appear as the first element of any ordered pair in the set; in the representation of a function as a bunch of arrows, it is to say that a has no arrow leading from it (it does not have an arrow leading to some special object, The Nothing!).

A function f : A → B is said to be onto (aka surjective, a surjection) if it satisfies the condition that every member of B gets hit at least once; one–one (aka one-to-one, into, injective, an injection) if no member of B gets hit more than once; and a correspondence (aka bijective, a bijection) if it is total, onto, and one–one.

Given a function f : A → B and a function g : B → C, the composite function g ◦ f (read this as “g after f”) from A to C is defined thus: for
every x in A, (g ◦ f)(x) = g(f(x)). Visually, you get the composite function by taking each f arrow from an object x in A to an object y in B and extending it so that it hits whatever object z in C the g arrow from y hits. (If there is no g arrow from y, then in the composite function there is no arrow from x.)

Given a function f : A → B, that is, a set of ordered pairs/a bunch of arrows, we can invert the function (i.e. switch the first and second members of each ordered pair/make each arrow point the opposite way). If the result of this process is a function (from B to A), this resulting function is called the inverse function of f and is denoted f⁻¹.

So far we have considered functions which take a single object as argument and assign to it an object as value. What about functions such as addition or multiplication, which take two objects as arguments, and assign to them an object as value? In general, what about functions which take n ≠ 1 objects as arguments? A standard move is to treat an apparently n-place function from A to B as a function from the set of n-tuples of members of A to B. So, for example, the addition function, commonly thought of as a two-place function from the set of integers to the set of integers, can be thought of as a 1-place function from the set of ordered pairs of integers to the set of integers.

1.1.7 Structures and Algebras

A structure is a set of objects together with some relations on that set. The relations are required to satisfy some conditions: the particular conditions imposed determine the type of structure in question. An algebra is a structure where the relations are functions.² Examples follow.

A poset (aka partial order, or partially ordered set) (X, ≤) is a set X together with an ordering ≤ (i.e. a binary relation on X) which is reflexive, transitive, and antisymmetric. A linearly ordered set (aka linear order) is a partially ordered set where, in addition, ≤ is connected.
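The notions of §§1.1.5–1.1.6, together with the order conditions just stated, can be sketched in Python by representing a binary relation, and likewise a function, as a set of ordered pairs. All names below are hypothetical, and the example relations (divisibility and ≤ on the assumed set {1, 2, 3, 6}) are illustrative choices rather than examples from the text.

```python
from itertools import product


def is_reflexive(r, xs):
    return all((x, x) in r for x in xs)


def is_symmetric(r):
    return all((y, x) in r for (x, y) in r)


def is_transitive(r):
    return all((x, z) in r for (x, y) in r for (w, z) in r if y == w)


def is_antisymmetric(r):
    return all(x == y for (x, y) in r if (y, x) in r)


def is_connected(r, xs):
    return all((x, y) in r or (y, x) in r for x, y in product(xs, repeat=2))


def is_function(pairs):
    """No object appears more than once as the first element of a pair."""
    firsts = [x for (x, _) in pairs]
    return len(firsts) == len(set(firsts))


def compose(g, f):
    """g after f: there is an arrow from x only if the g arrow from f(x) exists."""
    g_map = dict(g)
    return {(x, g_map[y]) for (x, y) in f if y in g_map}


def invert(f):
    """Switch the first and second members of each ordered pair."""
    return {(y, x) for (x, y) in f}


X = {1, 2, 3, 6}
divides = {(x, y) for x, y in product(X, repeat=2) if y % x == 0}
leq = {(x, y) for x, y in product(X, repeat=2) if x <= y}

f = {(1, "a"), (2, "b"), (3, "a")}   # a total function from {1, 2, 3} to {"a", "b"}
g = {("a", 10)}                      # a partial function: nothing is assigned to "b"
```

On these assumptions, divisibility on X is a poset but not a linear order (neither 2 divides 3 nor 3 divides 2), whereas ≤ is a linear order. Inverting f does not yield a function, since "a" would then be sent to both 1 and 3.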
A lattice may be defined in two equivalent ways. According to the first, order-theoretic definition, a lattice (X, ≤) is a poset in which every pair of elements of X has a supremum and an infimum in X, where:

• y ∈ X is the supremum of (a, b) iff a, b ≤ y, and y ≤ z for any z with a, b ≤ z
• y ∈ X is the infimum of (a, b) iff y ≤ a, b, and z ≤ y for any z with z ≤ a, b

According to the second, algebraic definition, a lattice (X, ∨, ∧) is a set X together with two binary operations ∨ and ∧ on X which satisfy the following conditions:

1. idempotence: x ∨ x = x and x ∧ x = x
2. commutativity: x ∨ y = y ∨ x and x ∧ y = y ∧ x
3. associativity: x ∨ (y ∨ z) = (x ∨ y) ∨ z and x ∧ (y ∧ z) = (x ∧ y) ∧ z
4. absorption: x ∨ (x ∧ y) = x and x ∧ (x ∨ y) = x

² This usage of the terms ‘structure’ and ‘algebra’ is neither non-standard nor universal.
It is easily verified that a lattice (X, ≤) yields a lattice (X, ∨, ∧) if we set x ∨ y equal to the supremum of x and y with respect to ≤, and set x ∧ y equal to the infimum of x and y with respect to ≤; and that a lattice (X, ∨, ∧) yields a lattice (X, ≤) if we set x ≤ y iff x ∧ y = x iff x ∨ y = y.

The following are some additional properties which a lattice may or may not have:

1. There is an identity for ∨, i.e. an element 0 ∈ X such that 0 ∨ x = x for all x ∈ X; and there is an identity for ∧, that is an element 1 ∈ X such that 1 ∧ x = x for all x ∈ X. Equivalently: the lattice has a maximum element 1, with x ≤ 1 for all x ∈ X, and a minimum element 0, with 0 ≤ x for all x ∈ X. A lattice satisfying this condition is said to be bounded.
2. The lattice has identities, and each element x ∈ X has a complement, i.e. an element x′ ∈ X with x ∨ x′ = 1 and x ∧ x′ = 0. A lattice satisfying this condition is said to be complemented.
3. The operations ∨ and ∧ satisfy the distributive laws: x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z) and x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z). A lattice satisfying this condition is said to be distributive.
4. Each nonempty subset Y of X has a supremum (written ⋁Y) and an infimum (written ⋀Y) in X, where:
   • y ∈ X is the supremum of Y iff (i) ∀x ∈ Y, x ≤ y and (ii) for any z that satisfies (i), i.e. ∀x ∈ Y, x ≤ z, we have y ≤ z
   • y ∈ X is the infimum of Y iff (i) ∀x ∈ Y, y ≤ x and (ii) for any z that satisfies (i), i.e. ∀x ∈ Y, z ≤ x, we have z ≤ y
   A lattice satisfying this condition is said to be complete.

A Boolean algebra is a bounded, distributive, complemented lattice. An equivalent definition is that a Boolean algebra is a nonempty set together with a unary operation ′ and two binary operations ∨ and ∧, satisfying the following identities:

1. x ∨ y = y ∨ x and x ∧ y = y ∧ x
2. x ∨ (y ∨ z) = (x ∨ y) ∨ z and x ∧ (y ∧ z) = (x ∧ y) ∧ z
3. x ∨ (x ∧ y) = x and x ∧ (x ∨ y) = x
4. x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z) and x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)
5. x ∨ (y ∧ y′) = x and x ∧ (y ∨ y′) = x

A unary operation ′ on a bounded lattice which satisfies the following conditions is an involution (or duality):

• (x′)′ = x
• if x ≤ y then y′ ≤ x′

If ′ is an involution, then the following identities, the De Morgan laws, hold:

• (x ∨ y)′ = x′ ∧ y′
• (x ∧ y)′ = x′ ∨ y′

A bounded distributive lattice with an involution that satisfies the De Morgan laws is a De Morgan algebra; if the underlying lattice is complete, it is a complete De Morgan algebra. If a De Morgan algebra satisfies the following condition:

• x ∧ x′ ≤ y ∨ y′

then it is a Kleene algebra; if the underlying lattice is complete, it is a complete Kleene algebra.
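The algebraic conditions above can be checked mechanically on small finite examples. The following Python sketch does so under stated assumptions: the carrier is a finite chain of numbers in [0, 1] ordered by ≤, with join taken as max, meet as min, and x′ as 1 − x. The function names and the three-element chain {0, ½, 1} are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from itertools import product


# Assumed operations: on a chain of numbers, sup is max and inf is min.
def join(x, y):
    return max(x, y)


def meet(x, y):
    return min(x, y)


def prime(x):
    return 1 - x


def is_distributive(xs):
    return all(
        join(x, meet(y, z)) == meet(join(x, y), join(x, z))
        and meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
        for x, y, z in product(xs, repeat=3)
    )


def is_involution(xs):
    double = all(prime(prime(x)) == x for x in xs)
    reverses = all(
        prime(y) <= prime(x) for x, y in product(xs, repeat=2) if x <= y
    )
    return double and reverses


def satisfies_de_morgan(xs):
    return all(
        prime(join(x, y)) == meet(prime(x), prime(y))
        and prime(meet(x, y)) == join(prime(x), prime(y))
        for x, y in product(xs, repeat=2)
    )


def is_complemented(xs):
    """Every element x has some y with x ∨ y = 1 and x ∧ y = 0."""
    return all(
        any(join(x, y) == 1 and meet(x, y) == 0 for y in xs) for x in xs
    )


def satisfies_kleene(xs):
    return all(
        meet(x, prime(x)) <= join(y, prime(y)) for x, y in product(xs, repeat=2)
    )


TWO = [0, 1]
THREE = [0, Fraction(1, 2), 1]
```

Under these assumptions both chains come out as distributive with an involution satisfying the De Morgan laws and the Kleene condition, but only the two-element chain is complemented: in the three-element chain, ½ has no complement, so that chain is a Kleene algebra without being a Boolean algebra.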
1.2 The Classical Semantic Picture Classical logic, as taught in introductory logic classes, can seem to be a melting-pot of miscellaneous tools, from a host of connectives each with its own truth table, through to quantifiers with strange introduction and elimination rules, whose semantics are given in terms of models rather than truth tables. There is, however, a unifying perspective which grounds all this diversity in a single simple structure. This structure—the fundamental structure underlying classical logic and set theory—is the set {0, 1}, or
{False, True}, of classical truth values, together with certain operations defined on this set. The clearest way to think of these operations is in terms of their relations to the fundamental ordering of the two truth values. This ordering is arrived at by first considering under what conditions one sentence is at least as true as another sentence. First, any sentence, whether it is true or false, is at least as true as itself. Second, suppose sentence S is false and sentence T is true; then S is strictly less true than T (i.e. T is at least as true as S, and S is not at least as true as T). Now in all the cases just considered, what the sentences involved mean is not important: the same relationships would hold between any sentences which were respectively both true, both false, or true and false. So we may transfer our ordering of sentences in respect of truth onto the truth values that sentences possess. The result is 0 ≤ 0, 0 ≤ 1, 1 ≤ 1.³

³ This is exactly the ordering we get on {0, 1} if we restrict the standard ordering of the real numbers to the subset {0, 1}; and this is in fact one reason why we use 0 to represent falsity and 1 to represent truth.

Now with this ordering in place, we want to define some operations on {0, 1}. The basic set of operations is ⊔, ⊓, and ′: the idea is that these will be used to define, respectively, logical disjunction (‘or’) and set-theoretic union, logical conjunction (‘and’) and set-theoretic intersection, and logical negation (‘not’) and set-theoretic complementation. This is the ultimate point of defining the operations, but at this stage we are defining operations on {0, 1}, the set of truth values. Now, if S and T have the same truth value, then we want ‘S or T’ to have that truth value also, and if S and T have different truth values, then we want ‘S or T’ to have the greater of these truth values. Thus we want to say that for any truth values x and y in {0, 1}, x ⊔ y is the supremum of x and y with respect to our ordering of {0, 1}. For conjunction, if S and T have the same truth value, then we want ‘S and T’ to have that truth value also, and if S and T have different truth values, then we want ‘S and T’ to have the lesser of these truth values. Thus we want to say that for any truth values x and y in {0, 1}, x ⊓ y is the infimum of x and y with respect to our ordering of {0, 1}. For negation, if S is less true than T, then we want ‘not T’ to be less true than ‘not S’, and we also want the negation of the negation of S to have the same truth value as S; thus we set 0′ = 1 and 1′ = 0.

We now have an algebra ({0, 1}, ⊔, ⊓, ′), consisting of a set together with some operations on that set. It is a straightforward matter to verify that this is a Boolean algebra. This Boolean algebra is the fundamental structure underlying classical logic and set theory.

Let us look first at classical propositional logic. We have a language whose vocabulary consists in a set P of propositional constants {p, q, r, . . .} and three logical connectives ∨ (or), ∧ (and), and ¬ (not), as well as the parentheses ( and ). The set F of well-formed formulae (wfs) is defined as follows:

• Propositional constants are wfs.
• If A and B are wfs, then so are (A ∨ B), (A ∧ B), and (¬A).⁴
• Nothing else is a wf.
Thus, we consider a sentence (wf) to be a sequence of symbols. Such a sequence gets a meaning through being given an interpretation. An interpretation of the language consists in an assignment of a truth value [p] to every propositional constant p ∈ P. Given an interpretation, every wf A ∈ F is assigned a truth value [A] by the following recursive definition:

• [A ∨ B] = [A] ⊔ [B]
• [A ∧ B] = [A] ⊓ [B]
• [¬A] = [A]′
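The recursive clauses above can be sketched in Python. This is a minimal illustration under assumptions of its own: wfs are encoded as nested tuples (a hypothetical encoding chosen for the example), an interpretation is a dictionary from propositional constants to 0 or 1, and the three operations on truth values are implemented as max, min, and 1 − x, which agree with supremum, infimum, and the swap of 0 and 1 on the set {0, 1}.

```python
from itertools import product


# Operations on the truth values {0, 1}, defined via the ordering 0 <= 1.
def join(x, y):      # supremum of x and y
    return max(x, y)


def meet(x, y):      # infimum of x and y
    return min(x, y)


def prime(x):        # sends 0 to 1 and 1 to 0
    return 1 - x


def value(wf, interp):
    """Truth value [wf] on an interpretation mapping constants to 0 or 1."""
    if isinstance(wf, str):                  # a propositional constant
        return interp[wf]
    op = wf[0]
    if op == "or":
        return join(value(wf[1], interp), value(wf[2], interp))
    if op == "and":
        return meet(value(wf[1], interp), value(wf[2], interp))
    if op == "not":
        return prime(value(wf[1], interp))
    raise ValueError(f"not a wf: {wf!r}")


interp = {"p": 1, "q": 0}
wf = ("or", ("not", "p"), ("and", "p", "q"))      # (¬p ∨ (p ∧ q))
excluded_middle = ("or", "p", ("not", "p"))       # (p ∨ ¬p)

# Values of p ∨ q, p ∧ q, and ¬p on all four interpretations of p and q.
rows = [
    (
        a,
        b,
        value(("or", "p", "q"), {"p": a, "q": b}),
        value(("and", "p", "q"), {"p": a, "q": b}),
        value(("not", "p"), {"p": a, "q": b}),
    )
    for a, b in product((1, 0), repeat=2)
]
```

Note that the evaluator keeps the two layers apart: the strings "or", "and", and "not" stand for symbols of the language, while join, meet, and prime are operations on truth values.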
Thus we are saying that the truth value of the wf A ∨ B (the object assigned to this sequence of symbols on the interpretation in question) is found by locating the object assigned to A and the object assigned to B, and then performing the operation ⊔ on these two objects. The result of this operation is an object in our algebra of truth values, and this object is the truth value of A ∨ B.

Note that the foregoing information is often presented by means of truth tables (see Table 1.1). These tables combine our earlier definitions of the operations ⊔, ⊓, and ′ on the set {0, 1} of truth values (these now occur in the body of the table; for example, the definition of ′ is in the final column) with our recursive definition, in the previous paragraph, of the truth values of compound propositions. I think it is more perspicuous to keep the two definitions apart. First, it makes us less likely to confuse the symbols of our formal language (∨, ∧, and ¬) with the operations on our set of truth values (⊔, ⊓, and ′).⁵ Second, it will allow us to see more clearly the differences between certain theories of vagueness which we shall examine below, for example between three-valued views (§2.2) and the supervaluationist view (§2.4).

⁴ Parentheses will be omitted when this will cause no confusion.

Table 1.1. Truth tables for classical propositional logic

A    B    (A ∨ B)    (A ∧ B)    ¬A
1    1       1          1        0
1    0       1          0        0
0    1       1          0        1
0    0       0          0        1

Two wfs A and B are logically equivalent (written A ≡ B) if they have the same truth value on every interpretation. ≡ is an equivalence relation on F, and F/≡ is the set of equivalence classes of F under ≡. Letting |A| be the equivalence class containing the wf A, we may set:

• |A| ∨ |B| = |A ∨ B|
• |A| ∧ |B| = |A ∧ B|
• ¬|A| = |¬A|
Note that the occurrences of ∨, ∧, and ¬ on the right of these identities represent the sentential connectives, while the occurrences on the left denote newly defined algebraic operations on F/≡. It is a straightforward matter to check that these operations are well defined, and that F/≡ together with these operations is a Boolean algebra.⁶ It is not uncommon to refer to these equivalence classes of wfs as propositions, and to this Boolean algebra of equivalence classes as the classical propositional calculus.

⁵ Such confusion is easier if (as is standard practice) we use the same symbols (∨, ∧, and ¬) to represent both symbols of the language and operations on the truth values. To avoid this confusion, I have used different symbols: ∨, ∧, and ¬ for the symbols of the language, and ⊔, ⊓, and ′ for the operations on the truth values. Later on (once we have the distinction between symbols of the formal language and operations on the set of truth values clearly in mind) I will revert to the standard practice of using the same symbols for both and letting context disambiguate.

⁶ For more details see Halmos and Givant 1998. This algebra of equivalence classes is often called the Lindenbaum algebra.

What about the conditional? That is, what are we to say of sentences of the form A → B? There are two strategies: both yield the result that A → B is equivalent to ¬A ∨ B and to ¬(A ∧ ¬B), but via different routes. One strategy is to regard → as a defined symbol, with A → B abbreviating either ¬A ∨ B or ¬(A ∧ ¬B), rather than being a genuine wf in its own right. The other strategy is to regard → as a primitive connective, and then (i) add a new clause to our definition of a wf, saying that if A and B are wfs, then so is (A → B), (ii) add a new quotient operation / to our algebra of truth values, subject to the identities f/g = f′ ⊔ g = (f ⊓ g′)′, and (iii) add a clause to our truth definition which says that [A → B] = [A]/[B]. I think it will be clearer in what follows if we keep our algebra of truth values as uncluttered as possible, so I shall adopt the former strategy.

Let us look next at classical set theory. It is a well-known fact that for any set X, the power set PX of X is a Boolean algebra under the operations of union, intersection, and complementation, or equivalently, with the set-inclusion or subset relation ⊆ as the ordering. The connection with the Boolean algebra of classical truth values becomes transparent if we identify each subset of X with its characteristic function.⁷ For any subsets (i.e. characteristic functions) f and g, and any x ∈ X, we may then define:

• (f ∪ g)(x) = f(x) ⊔ g(x)
• (f ∩ g)(x) = f(x) ⊓ g(x)
• f′(x) = f(x)′
• f ⊆ g ⇔ ∀x ∈ X, f(x) ≤ g(x)
Thus we are saying that to find out whether a given x is in the union of f and g (that is, to find out whether this union set, considered as a characteristic function, assigns x 1 or 0) we find the object (1 or 0) which f assigns to x, and the object (1 or 0) which g assigns to x, and then perform the operation ⊔ on this pair of objects. The result of this operation is an object in our algebra of truth values (1 or 0), and this object is the value assigned to x by the union set f ∪ g. If it is 1, then x is in the union; if it is 0, then x is not in the union. By asking this question of every x ∈ X, we build up a complete picture of the union set f ∪ g.

⁷ The characteristic function f_S of a subset S of X (also known as the indicator function) is a function from X to the set {0, 1} of truth values, which assigns 1 to every member of X which is in S, and 0 to every member of X which is not in S.

It is a straightforward matter to verify that these definitions yield a Boolean algebra of subsets. It is also clear that the operations we have just defined are indeed the ordinary set-theoretic operations of union, intersection, and complementation. With ⊔, ⊓, and ′ interpreting ‘or’, ‘and’, and ‘not’, we have:

• x is in the union of f and g iff x is in f or x is in g
• x is in the intersection of f and g iff x is in f and x is in g
• x is in the complement of f iff x is not in f.
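The identification of subsets with characteristic functions can be sketched in Python as follows. This is only an illustration: the background set X = {a, b, c, d}, the dictionary representation of characteristic functions, and all function names are assumptions made for the example, with max, min, and 1 − x standing in for the operations on the truth values {0, 1}.

```python
# Assumed background set X for this example.
X = ("a", "b", "c", "d")


def characteristic(subset):
    """Characteristic function of a subset of X, as a dict from X to {0, 1}."""
    return {x: 1 if x in subset else 0 for x in X}


def union(f, g):
    return {x: max(f[x], g[x]) for x in X}      # pointwise supremum


def intersection(f, g):
    return {x: min(f[x], g[x]) for x in X}      # pointwise infimum


def complement(f):
    return {x: 1 - f[x] for x in X}             # pointwise swap of 0 and 1


def is_subset(f, g):
    return all(f[x] <= g[x] for x in X)         # pointwise ordering


def members(f):
    """Recover the ordinary set from its characteristic function."""
    return {x for x in X if f[x] == 1}


S = {"a", "b"}
T = {"b", "c"}
f, g = characteristic(S), characteristic(T)
```

Comparing members(union(f, g)) with Python's own S | T, and similarly for intersection and complement, gives a quick check that the pointwise definitions agree with the ordinary set-theoretic operations on this small example.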
As well as unions and intersections of pairs of sets f and g, we can also define unions and intersections of arbitrary families of sets {fᵢ} (the above definitions for pairs of sets are just special cases of these more general definitions):

• ⋃{fᵢ}(x) = ⋁{fᵢ(x)}
• ⋂{fᵢ}(x) = ⋀{fᵢ(x)}

Finally, let us consider classical predicate logic. In addition to the propositional constants, the connectives, and the punctuation marks, we now have individual variables x, y, z, . . . , individual constants a, b, c, . . . , n-ary predicate letters Pⁿ, Qⁿ, Rⁿ, . . . for each n ≥ 1,⁸ and quantifiers ∃ and ∀. Terms and well-formed formulae are defined as follows:

• Variables and individual constants are terms; nothing else is a term.
• Propositional constants are wfs.
• If t₁, . . . , tₙ are terms and P is an n-ary predicate letter, then P(t₁, . . . , tₙ) is a wf.
• If A and B are wfs and x is a variable, then (¬A), (A ∨ B), (A ∧ B), ((∃x)A), and ((∀x)A) are wfs.
• Nothing else is a wf.

Thus, as before, a wf is a sequence of symbols. Such a sequence gets a meaning through being given an interpretation. An interpretation ℳ = (M, I) of the language consists in a nonempty set M (the domain), together with a function I which assigns:

• a truth value [p] to every propositional constant p;
• an object I(a) in M to every individual constant a;
• an n-ary relation I(P) on M to every n-ary predicate P.

⁸ The superscript indicating the arity (number of places) of the predicate will be omitted when it is obvious from the context.
In relation to the final clause, note that an n-ary relation on M is a set of n-tuples of members of M. Identifying this set of n-tuples with its characteristic function, we regard an n-ary relation on M as a function from the set Mⁿ of all n-tuples of members of M to our set of truth values {0, 1}.

Given an interpretation ℳ, every closed wf A is assigned a truth value [A]_ℳ on that interpretation by a recursive definition.⁹ (Where the interpretation in question is obvious from the context, or is irrelevant, the subscript ℳ is omitted.) The clauses for non-quantified compound wfs are just as before. For an atomic wf P(a₁, . . . , aₙ), which consists of an n-ary predicate P followed by n individual constants a₁, . . . , aₙ, the truth definition is as follows:

[P(a₁, . . . , aₙ)] = I(P)(I(a₁), . . . , I(aₙ))

The idea here is this. As discussed, each individual constant a is assigned an object I(a) in the domain, and P is assigned a function I(P) from the set of all n-tuples of members of M to our set of truth values {0, 1}. Whatever value this function assigns to the n-tuple (I(a₁), . . . , I(aₙ)), this value is the truth value of the sentence P(a₁, . . . , aₙ). Thus ‘Bob is bald’ is true just in case Bob is in the extension of ‘is bald’; ‘Bill loves Ben’ is true just in case Bill stands in the loving relation to Ben; and so on.

For quantified sentences, saying that everything is P is treated as saying that this thing is P, and this thing is P, and this thing is P, . . . for everything in the domain; and saying that something is P is treated as saying that this thing is P, or this thing is P, or this thing is P, . . . for everything in the domain. This basic idea of universal quantification as generalized conjunction, and existential quantification as generalized disjunction, is made precise in the following way. Let A_x a be the sentence obtained by writing a in place of all free occurrences of x in A, a being some constant that does not occur in A; and given an interpretation ℳ with domain M, let ℳ^a_o be the interpretation which is just like ℳ except that in it the constant a is assigned the denotation o. Then:

• [∃xA]_ℳ = ⋁{[A_x a]_{ℳ^a_o} : o ∈ M}
• [∀xA]_ℳ = ⋀{[A_x a]_{ℳ^a_o} : o ∈ M}

⁹ Looking at the definition of a wf, we see that the only way a quantifier can get into a wf is by being stuck on the front of a wf A, as in ((∃x)A) or ((∀x)A). This wf A is called the scope of the quantifier. A variable x occurring within the scope of a quantifier (∀x) or (∃x) is said to be bound. (The variable x in (∀x) or (∃x) itself is also said to be bound.) A variable x which is not within the scope of any quantifier (∀x) or (∃x) is said to be free. A closed wf is one containing no free variables.
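The quantifier clauses above can be sketched in Python for a finite domain. This is a hedged illustration, not a definitive implementation: the tuple encoding of wfs, the helper names fresh_constant, substitute, and value, and the example domain and predicates Small and Less are all hypothetical. The sketch follows the substitution recipe of the text (replace free occurrences of x by a new constant a, then vary the denotation of a over the domain), and it assumes that every constant occurring in the wf is already interpreted by I, so that an uninterpreted name is guaranteed to be new.

```python
def fresh_constant(I):
    """Pick a constant name that I does not interpret (assumed not to occur in the wf)."""
    n = 0
    while f"c{n}" in I:
        n += 1
    return f"c{n}"


def substitute(wf, x, a):
    """Write constant a in place of all free occurrences of variable x in wf."""
    op = wf[0]
    if op == "pred":
        return ("pred", wf[1]) + tuple(a if t == x else t for t in wf[2:])
    if op == "not":
        return ("not", substitute(wf[1], x, a))
    if op in ("or", "and"):
        return (op, substitute(wf[1], x, a), substitute(wf[2], x, a))
    if op in ("exists", "forall"):
        if wf[1] == x:          # x is bound here, so it has no free occurrences inside
            return wf
        return (op, wf[1], substitute(wf[2], x, a))
    raise ValueError(f"not a wf: {wf!r}")


def value(wf, domain, I):
    """Truth value of a closed wf on the interpretation (domain, I); domain is nonempty."""
    op = wf[0]
    if op == "pred":
        args = tuple(I[t] for t in wf[2:])
        return 1 if args in I[wf[1]] else 0     # characteristic function of I(P)
    if op == "not":
        return 1 - value(wf[1], domain, I)
    if op == "or":
        return max(value(wf[1], domain, I), value(wf[2], domain, I))
    if op == "and":
        return min(value(wf[1], domain, I), value(wf[2], domain, I))
    if op in ("exists", "forall"):
        x, body = wf[1], wf[2]
        a = fresh_constant(I)
        instance = substitute(body, x, a)
        # One truth value per object o, with a reinterpreted as denoting o.
        values = {value(instance, domain, {**I, a: o}) for o in domain}
        return max(values) if op == "exists" else min(values)
    raise ValueError(f"not a wf: {wf!r}")


D = {1, 2, 3}
I = {"b": 1, "Small": {(1,), (2,)}, "Less": {(1, 2), (1, 3), (2, 3)}}

all_small = ("forall", "x", ("pred", "Small", "x"))
some_small = ("exists", "x", ("pred", "Small", "x"))
nested = ("forall", "x", ("exists", "y",
          ("or", ("pred", "Less", "x", "y"), ("not", ("pred", "Small", "x")))))
```

Taking max for the existential clause and min for the universal clause is the finite-domain counterpart of applying supremum and infimum to the set of noted truth values.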
The idea here is that to find out the truth value of, say, $\forall x Px$, we consider the sentence $Pa$, and we ask what truth value it would have if everything about our interpretation were the same, except that $a$ denoted this object in the domain; we note this truth value. Then we ask what truth value $Pa$ would have if everything about our interpretation were the same, except that $a$ denoted this other object in the domain; we note this truth value. And so on, for all objects in the domain. We now have a set of truth values—the ones we noted down along the way. To get the truth value of $\forall x Px$, we apply our infimum operation to this set of truth values, yielding another truth value. The analogy with conjunction enters in the fact that the infimum operation $\bigwedge$ is the generalization to arbitrary sets of truth values of our operation $\wedge$ on pairs of truth values.

We have now finished our tour through classical logic (propositional and predicate) and set theory. We have seen that the Boolean algebra of classical truth values is at the heart of the story. The set-theoretic relations between, and operations on, sets, and the logical relations between, and operations on, sentences, are just the images projected into the worlds of sets and of sentences respectively by the algebraic operations on our set of truth values. If we unplugged the classical algebra of truth values, and plugged in a different one, and then told the rest of the ensuing story of set theory, propositional logic, and predicate logic in the same way as above, we would get different images appearing in the worlds of sets and sentences—i.e. different set-theoretic relations between, and operations on, sets, and different logical relations between, and operations on, sentences—but once we have the recipe for telling the story of logic and set theory given an initial algebra of truth values, how the story ends is completely determined by our initial choice of algebra.
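The recipe just described can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original presentation: the tuple encoding of wfs, the names `Algebra`, `substitute`, and `evaluate`, and the tiny domain are all assumptions made for the example. The algebra of truth values is passed in as a parameter, so that the classical two-element algebra is just one thing that can be plugged in.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Algebra:
    """A truth-value algebra: negation, plus inf and sup over a list of values."""
    neg: Callable
    inf: Callable  # generalized conjunction
    sup: Callable  # generalized disjunction


# The classical two-element algebra on {0, 1}.
CLASSICAL = Algebra(neg=lambda v: 1 - v, inf=min, sup=max)


def substitute(wf, x, a):
    """Write constant a in place of all free occurrences of variable x in wf."""
    op = wf[0]
    if op == "atom":
        return ("atom", wf[1], tuple(a if t == x else t for t in wf[2]))
    if op == "not":
        return ("not", substitute(wf[1], x, a))
    if op in ("and", "or"):
        return (op, substitute(wf[1], x, a), substitute(wf[2], x, a))
    # Quantifier: occurrences of x under a quantifier on x are bound, not free.
    return wf if wf[1] == x else (op, wf[1], substitute(wf[2], x, a))


def evaluate(wf, domain, consts, preds, alg=CLASSICAL):
    """Truth value of a closed wf; consts and preds together play the role of I."""
    op = wf[0]
    if op == "atom":
        return preds[wf[1]](tuple(consts[t] for t in wf[2]))
    if op == "not":
        return alg.neg(evaluate(wf[1], domain, consts, preds, alg))
    if op in ("and", "or"):
        vals = [evaluate(part, domain, consts, preds, alg) for part in wf[1:]]
        return alg.inf(vals) if op == "and" else alg.sup(vals)
    # Quantifier: take a fresh constant (assumed not to clash with any
    # user-chosen name) and let it denote each object of the domain in turn.
    fresh = f"_c{len(consts)}"
    instance = substitute(wf[2], wf[1], fresh)
    vals = [evaluate(instance, domain, {**consts, fresh: o}, preds, alg)
            for o in domain]
    return alg.inf(vals) if op == "forall" else alg.sup(vals)


# A hypothetical interpretation: three objects, one name, one predicate.
domain = ["bob", "bill", "ben"]
consts = {"b": "bob"}
preds = {"bald": lambda objs: 1 if objs[0] == "bob" else 0}

bob_is_bald = evaluate(("atom", "bald", ("b",)), domain, consts, preds)
everyone_bald = evaluate(("forall", "x", ("atom", "bald", ("x",))), domain, consts, preds)
someone_bald = evaluate(("exists", "x", ("atom", "bald", ("x",))), domain, consts, preds)
```

On this interpretation the atomic sentence and the existential claim come out as 1 and the universal claim as 0. Passing a different `Algebra` as `alg` leaves the recursive recipe untouched, which is the sense in which the rest of the story is fixed by the initial choice of algebra.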
Later on we shall indeed vary the choice of algebra, and look at the results for set theory and logic. The point for the moment is simply that once we have the underlying algebra, the rest of the story comes in a natural progression.

One thing has been left out of our story so far: something that is essential if we wish to apply the classical semantic picture to ordinary language. Typically we want to know whether a sentence is true (simpliciter), not just that it is true on such-and-such interpretations and false on others. What is going on here is that when we utter a sentence, we mean something—some particular thing—by the sentence we utter. That is, we utter it relative to a particular interpretation. The particular interpretation that is relevant in a
given case has been called many things, including intended (Putnam 1983 [1977]; Merrill 1980; Lewis 1999 [1984], 1993; Weiner 2004), correct (Lepore 1983; Davidson 2005 [1986]; van Cleve 1992; Islam 1996; Field 2000), actual (Lepore 1983), real (Hájek 1999), standard (Abbott 1997), proper (Przełęcki 1976), and the interpretation which accords with the semantics of the language being spoken (Field 1974, 210–11). Whatever we call the interpretation in question,¹⁰ the final part of the classical semantic picture is the supposition that every time you utter a sentence, you invoke a particular interpretation of the language, and utter the sentence relative to that interpretation; that is, it is this special or designated interpretation that gives the actual meaning of what you say on that occasion. An utterance of a sentence is then true simpliciter if the sentence uttered is true on the intended interpretation (as invoked by that utterance of it), or in other words, if it is an utterance of a sentence relative to an interpretation on which that sentence is true. In a slogan: truth simpliciter is truth on the intended interpretation.¹¹

¹⁰ In this book, I shall regard all these terms—and some others besides, such as ‘designated interpretation’ and ‘special interpretation’—as interchangeable. The term ‘intended interpretation’ might be felt to carry the connotation that it is the speaker’s intentions that make a certain interpretation the one relative to which she is speaking. I do not intend any such connotation to go with my use of this term. The question of what does determine the semantic facts will be discussed below: see in particular §2.1.1 and §6.1.1.

¹¹ Cf. e.g. Field 1974, 212; Przełęcki 1976, 376; Lepore 1983, 182; Menzel 1990, 358; Islam 1996; and Weiner 2004, 165. Thanks to Amitavo Islam for helpful discussions here.
2 The Space of Possible Theories of Vagueness

In this chapter, I present the main existing theories of vagueness, as well as some possible theories which have not yet found advocates. As discussed at the beginning of Chapter 1, I present each theory not necessarily as it was first presented in the literature, but rather as it looks from the point of view of the classical semantic picture. That is, we shall examine whether, and if so how exactly, different theories of vagueness depart from that picture. The classical picture was presented in §1.2. We will be aided in what follows if we have before us a checklist of its main components:

1. Interpretations of the language have the following features:
   (a) They employ a two-element Boolean algebra of truth values (which serve both as the truth values of sentences and as the values of the characteristic functions of sets).
   (b) The interpretation function (which assigns truth values to simple sentences, elements of the domain to names, and n-ary relations on the domain to n-ary predicates) is total, and the characteristic functions of sets are total.
   (c) The truth values of compound wfs built up from simpler wfs by means of the connectives ∨, ∧, and ¬ are determined in a recursive fashion from the truth values of their components.
2. Each discourse has a unique intended interpretation.

We shall structure our examination of theories of vagueness according to which item(s) on this checklist each theory denies. Each theory will thereby be located in a space of possible theories which differ from the classical view in one or more of these four possible ways.¹
2.1 Epistemicism (Deny Nothing)

The epistemicist about vagueness denies nothing on our checklist: she buys the classical semantic picture holus-bolus.² Thus, according to the epistemicist, when I say something vague—for example, ‘Bob is bald’—I speak relative to a particular intended classical interpretation of my language. Each name in the language is assigned some object in the domain of this interpretation (its referent), and each one-place predicate is assigned a subset of this domain (its extension). The sentence ‘Bob is bald’ is true simpliciter just in case the referent of the name ‘Bob’ is in the extension of the predicate ‘is bald’ on this intended interpretation. But a subset of the domain divides it into two parts: those in and those out. There is no remainder. Thus, my sentence ‘Bob is bald’—and indeed every sentence of the language—is either true simpliciter or false simpliciter. There is no other possibility.

Rather than focusing on a particular sentence (‘Bob is bald’), focus on a particular predicate, say ‘is a heap’. On the epistemicist story, this predicate has a particular extension on the intended interpretation of the language. This extension is a classical subset of the domain. Thus, for some number n, piles of n or more grains are in the set and piles of n − 1 or fewer grains are out.³ We need not know what number it is; but there is such a number nevertheless. One key challenge for the epistemicist will be to explain how it is that this line between the heaps and the non-heaps gets to be where it is—especially given that we might have no knowledge of where it is. I discuss this issue in §2.1.1.

¹ One thing we assume throughout is that our language never changes: it is always our standard first-order language. In other words our syntax never changes—we only consider changes to our semantics.

² Advocates of epistemic theories of vagueness include Cargile 1997 [1969], Campbell 1974, Sorensen 1988, ch. 6; 2001, Williamson 1997 [1992]; 1994, chs. 7–8, and Horwich 1998.

³ Note that the classical semantic picture, all by itself, does not require that the set of heaps be so ‘nicely’ organized; i.e. as far as classical semantics alone is concerned, it would be fine if the set of heaps consisted of piles with an even number of grains—thus as we remove grains from our heap, we would go ‘heap’, ‘non-heap’, ‘heap’, ‘non-heap’, etc. All that classical semantics, by itself, requires, is that ‘is a heap’ have a crisp set of objects as its extension—i.e. each object is either in the set, or is out, with no fuzziness or blurriness at the edges of the set.
Assuming that the classical picture is the correct picture of precise discourse, this means that from a semantic point of view, there is, according to the epistemicist, no difference between vague discourse and precise discourse. But of course there is some difference between the two sorts of discourse. After all, we are tempted by Sorites arguments involving ‘tall’, but not by Sorites arguments involving ‘is greater than or equal to 1800 mm in height’. According to the epistemicist, the difference is an epistemological one: vagueness is a matter of (necessary) ignorance.⁴ When F is vague and a is a borderline case of F, the epistemicist thinks that Fa is either true, or false—but we cannot know which. The idea is that, from the logical and semantic points of view, vague predicates are just like precise predicates, but from the epistemological point of view, they are significantly different. Thus, where other theorists of vagueness will need to spend time motivating and explaining their non-classical semantic frameworks, the epistemicist needs to spend time motivating and explaining an epistemological theory which explains why, even though there is a sharp boundary between the things which are (say) bald and the things which are not, we cannot know where this boundary is. I discuss this issue—the one which gives epistemicism its name—in §2.1.2. How does epistemicism solve the Sorites paradox? The idea is that there is a precise cut-off between the heaps and the non-heaps, the bald persons and the non-bald persons, the tall persons and the non-tall persons, and so on. Thus the Sorites paradox is solved by denying the inductive premiss: the premiss which says that if anything in the Sorites series is a heap, then so is the next thing in the series. This premiss is false: there is a pair of adjacent things in the series such that one is a heap and one is not. 
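The point that a classical extension forces a failure of the inductive premiss somewhere in the series can be checked mechanically. The following is a minimal sketch under stated assumptions: the function names are invented for the illustration, and the cut-off of 23 grains is an arbitrary value, not something the epistemicist claims to know.

```python
def is_heap(grains, cutoff=23):
    """A crisp (two-valued) extension for 'is a heap'; the cut-off is hypothetical."""
    return grains >= cutoff


def inductive_premiss_failures(series, predicate):
    """Adjacent pairs (a, b) in the series where a satisfies the predicate but b does not."""
    return [(a, b) for a, b in zip(series, series[1:])
            if predicate(a) and not predicate(b)]


# A Sorites series: remove one grain at a time, from 100 grains down to 1.
series = list(range(100, 0, -1))

failures = inductive_premiss_failures(series, is_heap)

# Classical semantics alone does not require a single tidy cut-off: any crisp
# predicate true of the first item and false of the last must fail somewhere.
even_failures = inductive_premiss_failures(series, lambda n: n % 2 == 0)
```

With the threshold predicate there is exactly one failing pair, (23, 22). With the deliberately odd "even number of grains" extension there are many, but in neither case is the inductive premiss true.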
⁴ Cf. Williamson 1994, 202: ‘‘ignorance is the real essence of the phenomenon ostensively identified as vagueness’’, and Williamson 1996b, 327: ‘‘This is not to deny that vagueness exists; it is to assert that its underlying nature is epistemic.’’

But this is only part of the solution to the Sorites paradox: we need to say not only where the mistake is, but also why we were taken in, and did not see the mistake instantly. The paradox involves premisses which seem obviously true and reasoning which seems obviously valid. So to solve it, we need to show that actually the reasoning is not valid, or that actually one or more of the premisses is false. But we need to do something else as well. We have a paradox because very plausible starting points (i.e. premisses and reasoning) lead us to an unacceptable conclusion. To solve the paradox, we cannot just show that the starting points are not acceptable: we already know they cannot all be acceptable, because they lead to a conclusion which we cannot accept. We have to explain why we thought they were acceptable: we have to explain why we were taken in. Only then do we have a satisfying solution to the paradox.⁵

The second part of the epistemicist’s story is that we cannot know where the cut-off is. This is turned into an explanation of why we were taken in by the Sorites, as follows. We cannot know where the cut-off is, so we mistakenly think it isn’t anywhere. Our ignorance of where the cut-off is makes us think there is no cut-off at all. That is why we are inclined to accept the inductive premiss, even though, according to the epistemicist, the inductive premiss is actually false.

2.1.1 The Location Problem

The epistemicist thinks that there is a particular number of hairs dividing the last bald man from the first non-bald man, a particular number of grains of sand dividing the last heap from the first non-heap, and so on. What determines these numbers? What fixes the locations of the boundaries drawn by vague terms? Suppose that the relevant number of grains of sand is 23. Why not 22? What makes 23 the correct number here? How did 23, rather than 22 or 24, get singled out as correct? How did the boundary between heaps and non-heaps come to be located just here?⁶

It seems to most people who have thought about these matters that language is essentially a human artefact, and so there should be some sort of connection between meaning and use.⁷ The basic thought is that the sounds we make mean what they do because of the kinds of situations in which we, and earlier speakers, have made those sounds. So, for example, had we always used the word ‘dog’ where we in fact used ‘cat’, and vice versa, then ‘dog’ would have meant what ‘cat’ in fact means, and vice versa.
⁵ Thanks to Delia Graff Fara for first helping me to appreciate this point. Cf. Stalnaker 1999, 74; Fara 2000, 50; and Schiffer 2000, 233 ff.

⁶ Cf. e.g. Wright 1995, 2003 and Schiffer 1999, to name just two of the critics of epistemicism who have focused on this issue.

⁷ Cf. e.g. the opening sentence of Lewis 1992: ‘‘Surely it is our use of language that somehow determines meaning’’, or Williamson 1994, 205: ‘‘Words mean what they do because we use them as we do’’, or Shapiro 2006, 5: ‘‘I take it to be a truism that the competent users of a language determine the meaning of its words and phrases.’’

Here is a prima facie plausible principle which connects meaning and use (for the case of predicates):

(MU) The claim Pa is true if and only if most competent speakers would confidently assent if presented with a in normal conditions and asked whether it was P, and is false if and only if most competent speakers would confidently dissent if presented with a in normal conditions and asked whether it was P.

The biconditionals in (MU) connect meaning (i.e. extension on the intended interpretation, which determines truth and falsity) with use (actual and counterfactual) only in the weak sense of saying that they co-vary. (MU) is compatible with, but does not require, the view that usage directly determines meaning—that is, the view that a certain object is in the extension of P because we do or would apply P to this thing. (MU) is also compatible with the view that we apply words in certain ways because doing so allows us to speak truthfully; in conjunction with this view, (MU) merely requires that competent speakers in general have access to the meanings of their terms—in keeping with the guiding idea that language is a (useful) human tool.

The epistemicist must reject (MU), for the following reason. Take a clear case of ‘is bald’ (say Yul Brynner). ‘Yul Brynner is bald’ is true, and also, everyone would assent to it, so here meaning and use match up. Take a clear non-case of ‘is bald’ (say Fabio). ‘Fabio is bald’ is false, and also, everyone would dissent from it, so here again meaning and use match up. So far so good. But what about the borderline cases? According to the epistemicist, for any borderline case x of baldness, ‘x is bald’ is either true, or false. Yet most competent speakers would neither assent to nor dissent from ‘x is bald’: we shrug our shoulders, we say ‘he is and he isn’t’ or ‘he’s sort of bald’, we say nothing either way, etc. So within the borderline cases, there is—given the epistemicist view—a failure of match-up between meaning and use. We can picture the situation—say for the predicate ‘is tall’—as in Figure 2.1.
[Figure 2.1. Use and meaning. Persons are arranged by height from 2′ to 8′; the use column is divided into deny, hedge, and assert regions, and the meaning column into false and true regions.]

What we see is that epistemicism entails a mismatch between use (on the left) and meaning (on the right).

There are standard counterexamples to (MU). For example, consider the predicate ‘is a lump of gold’. Most ordinary speakers would classify a lump of fools’ gold as a lump of gold, and yet we do not think that ‘is a lump of gold’ truly applies to a lump of fools’ gold. In other words, fools’ gold is not gold, even though most of us would classify it as such: most of us are just wrong. The mechanism which makes ‘is a lump of gold’ fail to have any lumps of fools’ gold in its extension, even though we would assert of many lumps of fools’ gold that they are lumps of gold, is as follows. We originally applied our term ‘gold’ to some instances of a certain natural kind. From there, the natural kind did the work of determining the extension of our term. The term truly applies to all instances of the natural kind, even if we would not co-classify them with the original samples, and it fails to apply to any non-instance of the natural kind, even if we would co-classify it with the original samples. By latching our term onto a natural kind, we take away the rest of the job of determining its extension (i.e. beyond the initial instances to which we applied the term) from usage, and hand it over to the world.⁸

⁸ See Kripke 1980 and Putnam 1975.

This sort of counterexample to (MU) will not help the epistemicist, however—precisely because of the essential role it gives to natural kinds. For with vague predicates, there are no natural kinds in the vicinity to bridge the gap between use and meaning. Consider the predicate ‘is tall’.
We have applied this term to people who are 6 ft 4 in., 7 ft, 6 ft 2 in., . . . How do we get from this to a particular height such that the predicate truly applies to everyone greater than (or equal to) this height, and no one else? That is the epistemicist’s problem. If there was a particular height x such that being of that height or greater was a natural kind, our problem would be solved. But there is no such height. There is nothing more natural about being 6 ft or more than about being 5 ft 11 in. or more or being 6 ft 1 in. or more, or indeed being x foot or more for any x at all. Thus the epistemicist violates (MU) without being able to posit anything else, apart from usage—such as a natural kind—to determine meaning. It may seem at this point as though the epistemicist is forced to say that meaning facts are primitive, that it is just a brute fact that the extension of ‘is tall’ is some particular precise set—an option which may seem simply unacceptable.⁹ However, there is in fact another option: for despite the fact that she has to deny (MU), the epistemicist does not have to deny that use determines meaning. Recall Figure 2.1. (MU) tells us that the left-hand side and the right-hand side must match. The claim that use determines meaning is the claim that the left-hand side determines the right-hand side. Now to claim that one set of facts—meaning facts—are determined by another set of facts—facts about usage—is to claim that meaning supervenes on use, that there can be no difference in meaning without a difference in use; that if the meaning were different, the use would be too. Thus the claim that use determines meaning becomes the claim that if the right-hand side of the picture were different, the left-hand side would be different too. Now there is nothing inconsistent about claiming both that the left-hand side and the right-hand side do not match, and that the right-hand side could not be different without the left-hand side being different. 
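The consistency of rejecting (MU) while holding that meaning supervenes on use can be made concrete with a toy model. Everything in this sketch is an assumption made for illustration: usage is reduced to two boundary heights, and the midpoint rule in `cutoff` is a placeholder standing in for whatever the real function from use to meaning might be.

```python
from typing import NamedTuple


class Usage(NamedTuple):
    deny_below_mm: int    # speakers confidently dissent below this height
    assert_from_mm: int   # speakers confidently assent at or above this height


def use_verdict(usage, height_mm):
    """The left-hand side of the picture: assert, hedge, or deny."""
    if height_mm >= usage.assert_from_mm:
        return "assert"
    return "deny" if height_mm < usage.deny_below_mm else "hedge"


def cutoff(usage):
    """Placeholder for the use-to-meaning function (midpoint rule, purely illustrative)."""
    return (usage.deny_below_mm + usage.assert_from_mm) // 2


def is_tall(usage, height_mm):
    """The right-hand side of the picture: true or false, with nothing in between."""
    return height_mm >= cutoff(usage)


def mu_holds(usage, height_mm):
    """Both biconditionals of (MU): true iff assert, and false iff deny."""
    verdict, truth = use_verdict(usage, height_mm), is_tall(usage, height_mm)
    return (truth == (verdict == "assert")) and ((not truth) == (verdict == "deny"))


usage = Usage(deny_below_mm=1700, assert_from_mm=1900)

clear_cases_match = mu_holds(usage, 1950) and mu_holds(usage, 1600)
borderline_matches = mu_holds(usage, 1820)

# Supervenience: the cut-off is computed from usage alone, so it cannot differ
# unless usage differs. The converse need not hold.
same_meaning_different_use = cutoff(Usage(1600, 2000)) == cutoff(usage)
```

In this model the two sides match in the clear cases and fail to match in the hedging region, yet the right-hand side is fixed entirely by the left-hand side. The last line shows that supervenience allows a difference in use without a difference in meaning, but not the reverse.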
This is the sort of line taken by Williamson (1997 [1992], §4; 1994, §7.5).¹⁰ His claim is that use does determine meaning—just not in a straightforward way. There is a relationship between use and truth conditions such that the former determines the latter—but it is complex, and we do not know what it is. Certainly use does not determine meaning in the simple way mentioned above (immediately after the introduction of (MU)), according to which an object is in the extension of P because we do or would apply P to this thing. We have no idea of the mechanisms intervening between use and meaning: we do not know how (if at all) the right-hand side of our picture would alter if we twiddled this or that bit of usage on the left-hand side. Yet we can still maintain that the right-hand side could not be different unless the left-hand side was too.

⁹ For further discussion see pp. 282–3 below.

¹⁰ This line bears strong affinities to Soames’s 1997 response to Kripkenstein’s sceptical puzzle. Soames distinguishes a second sense of ‘determines’, in which the A facts determine the B facts if we can, in principle, deduce (a priori) the B facts from the A facts. His point is that facts about usage (plus facts about the environment and our mental states) do determine the meaning facts in the sense discussed in the text above, but not in the ‘a priori derivability’ sense.

This is a consistent response to the problem of how precise extensions for vague predicates are determined, but it is by nature unsatisfying: we are told merely that something is the case (meaning is determined by use), when we want to know how it could possibly be the case (assuming, with the epistemicist, that meaning is cleanly bipartite, while usage is fuzzily tripartite).¹¹ Indeed, a residual dissatisfaction with the idea that our usage does determine sharp lines between the bald and the non-bald, the heaps and the non-heaps, and so on (combined with a dislike of primitive meaning facts) is part of the motivation for some of the other approaches to vagueness that we shall examine below.

Williamson does at one point suggest a more informative account of the relationship between use and meaning. He considers the following objection to his view:

Suppose . . . that in normal perceptual conditions any competent speaker of English refuses to classify me as thin and refuses to classify me as not thin. How could the truth or falsity of ‘[Williamson] is thin’ possibly supervene on that pattern of use? (1994, 207)
and responds as follows:

The . . . worry concerns the apparent symmetry of the situation. However, the concepts of truth and falsity are not symmetrical. . . . The epistemic theorist can see things this way: if everything is symmetrical at the level of use, then the utterance fails to be true, and is false in virtue of that failure . . . In that sense, truth is primary. (1994, 208)

¹¹ On a related note, Burgess 2001, 513 argues that ‘‘the critic [of epistemicism] is misrepresented if he or she is taken to be looking primarily for a supervenience thesis. The critic . . . wants to know, primarily, of particular borderline examples, what makes them heaps, bald men, red objects, or whatever, when other objects, perhaps indiscriminable from them by the standard methods of testing performed under suitable conditions, are not correctly so describable.’’ See also Keefe 2000, 80–3.
Burgess (2001, 519) sees here a ‘parasitic’ strategy for saying how use determines meaning; as Weatherson (2003a, 277) puts it: ‘‘wait for the indeterminist to offer a theory of when sentences are true, accept that part of the indeterminist theory, and say all other sentences that express propositions are false’’.¹² In our terms, we can say that the idea is to accept the first biconditional in (MU) (the one connecting assent and truth), while providing a principled reason (viz. the asymmetry of truth and falsity) for rejecting the second biconditional (the one connecting dissent and falsity).¹³ We can see, however, that the parasitic strategy will not work as part of the overall epistemicist view. The epistemicist wants to explain our hedging over Fa in borderline cases as a manifestation of our ignorance as to whether a is or is not F, rather than in terms of its being neither true nor false that a is F.¹⁴ As we shall see in the next section, the best account of why we would be ignorant in borderline cases—when in fact, in such cases, either a is F, or a is not F, according to the epistemicist—employs the idea of knowledge requiring a margin for error, which ensures that we cannot know of objects in the vicinity of the non-F/F boundary whether or not they are F. As Williamson (1997 [1992], 278) puts it: ‘‘an utterance of ‘[Williamson] is thin’ is an expression of knowledge only if I am some way from the boundary of ‘thin’ ’’. Thus, this part of the epistemicist picture requires that the hedging region cover the false/true boundary. But now consider the parasitic strategy. In accordance with (MU), the indeterminist locates false/neither-true-nor-false and neither-true-nor-false/true boundaries in the same places as the deny/hedge and hedge/assert boundaries provided by usage. 
The epistemicist who follows the parasitic strategy will then end up with a false/true boundary which coincides with the hedge/assert boundary—thus conflicting with the idea that the hedging region must cover the false/true boundary.¹⁵

¹² The ‘indeterminist’ here is the theorist who thinks, in opposition to the classical semantic picture which the epistemicist accepts, that statements about borderline cases are neither true nor false.

¹³ Thanks to Robbie Williams for helpful comments here.

¹⁴ See e.g. Williamson 1994, 3.

¹⁵ Apart from this problem with the parasitic strategy, I also have deep misgivings about the underlying idea that there is a fundamental asymmetry between truth and falsity—but discussing this issue would take us far afield. In addition to these worries, Burgess 2001 argues that the parasitic strategy yields the wrong truth values in some cases; Weatherson 2003a correctly points out that Burgess is wrong about these cases, but substitutes other cases in their place.
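The clash between the parasitic strategy and the margin-for-error story can be seen in a small model. This is a sketch only: the usage boundaries of 1700 mm and 1900 mm and all function names are hypothetical, chosen for the predicate ‘is tall’.

```python
def use_verdict(height_mm, deny_below_mm, assert_from_mm):
    """How speakers respond to 'x is tall' for a person of the given height."""
    if height_mm >= assert_from_mm:
        return "assert"
    return "deny" if height_mm < deny_below_mm else "hedge"


def parasitic_truth(height_mm, deny_below_mm, assert_from_mm):
    """Parasitic strategy: true where speakers assert, false everywhere else."""
    return use_verdict(height_mm, deny_below_mm, assert_from_mm) == "assert"


def hedging_covers(cutoff_mm, deny_below_mm, assert_from_mm):
    """Does the hedging region extend on both sides of the false/true cut-off?"""
    return deny_below_mm < cutoff_mm < assert_from_mm


DENY_BELOW, ASSERT_FROM = 1700, 1900  # hypothetical usage boundaries in mm

# The parasitic cut-off: the smallest height at which 'x is tall' comes out true.
parasitic_cutoff = next(h for h in range(1500, 2100)
                        if parasitic_truth(h, DENY_BELOW, ASSERT_FROM))

parasitic_is_covered = hedging_covers(parasitic_cutoff, DENY_BELOW, ASSERT_FROM)
interior_is_covered = hedging_covers(1800, DENY_BELOW, ASSERT_FROM)
```

The parasitic cut-off lands exactly on the hedge/assert boundary, so there is no hedging on its upper side, whereas a cut-off in the interior of the hedging region (1800 mm here, chosen arbitrarily) has hedging on both sides, as the margin-for-error account requires.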
2.1.2 Why Can’t We Know Where the Lines Are?

Supposing that there is a precise dividing line between the bald and the non-bald (etc.), why can’t we know where it is? (Recall that the claim that we cannot know where the cut-offs are is a key plank in the epistemicist’s resolution of the Sorites paradox: it provides the explanation of why we are taken in by the paradox in the first place.) The epistemicist who has made the most progress towards answering this epistemological question is Williamson (1997 [1992], §6; 1992; 1994, ch. 8). Williamson argues that where our capacities of discrimination are limited, knowledge requires a margin for error. For example, suppose that you walk into a football stadium which in fact contains 30,000 people, and say ‘There are 30,000 people here’. Your claim is true—but did you know there were 30,000 people there, or were you just right by luck? That depends upon your powers of discrimination. If you can detect very fine differences in crowd size—if, had there been one less, or one more, person there, you would have noticed a difference, and would not have claimed there were 30,000 people there (in that counterfactual situation)—then it seems right to say you do know there are 30,000 people there (in the actual situation). But suppose your powers of discrimination are coarser: that you would only begin to notice a difference in the crowd if 100 or more people were removed, or added. Then, if there had been one less, or one more, person there, you would not have noticed any difference, and would still have claimed there were 30,000 people there (in that counterfactual situation)—and so it seems right to say you do not know there are 30,000 people there (in the actual situation). Given your powers of discrimination, you could (in the actual situation) know a claim such as ‘there are at least 29,900 people present’.
For if we were to alter the situation in a way you cannot detect, by adding or subtracting up to 100 people, then you would not notice any difference, and would still claim there are at least 29,900 people present; but you would be right (in these counterfactual situations). The general moral is that for you to know something in the actual situation, it must be true not only in the actual situation, but also in counterfactual situations which are similar enough to the actual one that you would not notice the difference (so, how similar is similar enough depends upon your powers of discrimination):
(ME) ‘S knows that P’ is true in a situation T only if ‘P’ is true in situations that are sufficiently similar to T. Williamson calls this a margin for error principle for P.¹⁶ It might seem obvious how to apply this idea to the case of vagueness. Let’s assume the epistemicist is right about the semantics of vague predicates, and let’s suppose, for the sake of argument, that as a matter of fact the cut-off between tall and not-tall comes at 1.8 m (i.e. those whose height is greater than or equal to 1.8 m are tall). If Bob is 1.6 m in height, then I can know he is not tall, because in similar situations—where his height is slightly different—it’s still true that he is not tall. However if he is 1.8 m in height, then I cannot know that he is tall, because while this claim is true, had his height been ever so slightly less, the claim would have been false, and yet I would not have detected any difference. This clearly does not get to the heart of the matter, however. For the Sorites paradox for ‘is tall’ does not turn on our being faced with an actual line of men, whose height we cannot determine with absolute precision. It is simply stipulated that we have a line of men, ranging in height from four feet to seven feet, in nanometre increments. So the height of each man is given to us—and yet we feel there is a paradox. Thus if ignorance is supposed to explain the feeling that there is a paradox, then this ignorance must be of something other than the exact heights of the persons in the Sorites series. Indeed it is, according to Williamson. There must be a margin for error around the actual situation (in which Bob is 1.8 m in height, and the cut-off for tall is at 1.8 m), if we are to know, in that situation, that Bob is tall. We initially supposed this margin to consist of cases in which the cut-off for tall is still at 1.8 m, but Bob’s height is slightly different. This was, as we saw, a mistake. 
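The margin for error principle, together with the stadium example and the first attempt to apply it to height, can be sketched as a simple check. Since (ME) states only a necessary condition for knowledge, the function below tests whether that condition is met; the function names and the 5 mm slack on height judgements are assumptions made for the illustration.

```python
def passes_margin_for_error(claim, actual, indiscriminable):
    """Necessary condition from (ME): the claim holds in the actual situation
    and in every situation the subject could not tell apart from it."""
    return claim(actual) and all(claim(s) for s in indiscriminable(actual))


def within(slack):
    """Situations (here just numbers) differing from the actual one by at most slack."""
    return lambda actual: range(actual - slack, actual + slack + 1)


def exactly_30000(n):
    return n == 30_000


def at_least_29900(n):
    return n >= 29_900


# Stadium example: fine versus coarse powers of discrimination.
sharp_eyed_exact = passes_margin_for_error(exactly_30000, 30_000, within(0))
coarse_exact = passes_margin_for_error(exactly_30000, 30_000, within(100))
coarse_lower_bound = passes_margin_for_error(at_least_29900, 30_000, within(100))

# First application to height (in mm): cut-off held fixed at 1.8 m, height varied.
CUTOFF_MM = 1800


def tall(height_mm):
    return height_mm >= CUTOFF_MM


def not_tall(height_mm):
    return height_mm < CUTOFF_MM


can_know_160_not_tall = passes_margin_for_error(not_tall, 1600, within(5))
can_know_180_tall = passes_margin_for_error(tall, 1800, within(5))
```

The exact claim about the crowd passes only for the sharp-eyed observer, while the lower-bound claim passes even with coarse discrimination. The same check can be run over situations that differ in where the cut-off falls rather than in the height of the person, by changing what `indiscriminable` ranges over.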
Instead, the margin consists of cases in which Bob is still 1.8 m in height, but the cut-off for tall—i.e. the extension of ‘is tall’ on the intended interpretation—is slightly different. We know from the previous section that Williamson’s view is that for the extension of ‘is tall’ to be different, community usage of ‘is tall’ must be different. So, we are to imagine counterfactual situations in which usage is slightly different, but Bob’s height is the same. Consider first the case in which in the actual situation, Bob is 1.6 m in height, and we claim he is not tall. Community usage would have to be very different for the cut-off for tall to be shifted below 1.6 m: that is a big change from the actual cut-off of 1.8 m. Assuming I would recognize such a difference in usage and not say in that situation that Bob is not tall, I do know in the actual situation that Bob is not tall: I am safely within my margin for error. But consider the case in which in the actual situation Bob is 1.8 m tall. We might suppose that for the cut-off to be shifted just above 1.8 m, usage would not have to be very different: just a few people here or there would have to have different dispositions regarding the use of ‘is tall’ to make such a tiny shift in its extension.¹⁷ Plausibly, were such a change in usage to take place, I would be none the wiser: so I do not know in the actual situation that Bob is tall, for I have no margin for error.

¹⁶ The view that knowledge requires a margin for error has obvious affinities with both reliabilist and tracking theories of knowledge. For more on the surrounding epistemological issues, see Williamson 2000.

If this is correct, then we can see why we cannot know where the cut-off for tall is. In order to know that the cut-off between tall and not-tall is at 1.8 m, I would have to know both that a person 1.8 m in height is tall, and that a person arbitrarily less than 1.8 m in height is not tall. But we have just seen that I cannot know such things, because when we are so close to the cut-off that the amount of change in usage required to push the cut-off over the point we are considering is small enough for us not to be able to notice it, there is no margin for error. Thus I cannot know where the cut-off between tall and not-tall is—and similarly for other vague predicates.

2.1.3 Worldly Vagueness?

A question which I shall be asking about each of the theories we examine is ‘Where does the theory locate vagueness: in the relationship between language and the world, or in the world itself?’ In other words, does the theory see vagueness as a semantic phenomenon, or as a metaphysical phenomenon?
The epistemicist—unlike proponents of any of the other theories we shall examine—locates vagueness in neither of these places. She sees vagueness as an epistemic phenomenon. Vagueness and indeterminacy are ruled out of the classical semantic picture—which the epistemicist buys holus-bolus—in two places. First, there is no vagueness in the world. ¹⁷ In fact, it is hard to see how Williamson could be entitled to claims of this sort, given his view (discussed in the previous section) that ‘‘Meaning may supervene on use in an unsurveyably chaotic way’’ (1994, 209).
Classical models are entirely precise: they represent the world as a crisp set of objects and properties such that for each property and each object, the object either definitely possesses the property, or definitely does not possess it. Second, there is no semantic indeterminacy—no vagueness in the relationship between language and the world. Each discourse has a unique intended model: so there is no vagueness or indeterminacy concerning what any name, predicate, or sentence means. For the epistemicist, then, vagueness is neither a metaphysical nor a semantic phenomenon. It is an epistemic phenomenon. Although the world itself is completely precise, and all items of our language have a unique, precise meaning, some of these meanings are unknowable. For example, ‘is tall’ picks out a particular crisp set of objects in the world, but we cannot know the full details of the membership of this set—and it is in this necessary ignorance that the vagueness of ‘is tall’ is located. 2.1.3.1 Semantic Realism I just asked whether epistemicism locates vagueness in the relationship between language and the world, or in the world itself—and in the other sections of this chapter entitled ‘Worldly Vagueness?’ I shall be asking the same question about the other theories of vagueness to be examined. It is worth making explicit a presupposition of the question, and the methodology I employ in answering it. Taking the second point first: my method is to read off the answer directly from the system of model theory employed by the theory of vagueness in question. Underlying this method is a literal attitude towards model theories for vague language. 
That is, I regard a model theory for (part of) a language as giving a literal (although not necessarily complete) description of the relationship between that language and the world: a system of model theory as a whole tells us about the kinds of relationships that a language may have to a world; what is going on in the intended interpretation(s) of a particular discourse tells us the actual relationship between that discourse and the world. Of course this is not always the appropriate attitude to take towards model theory. A logician is perfectly entitled, for example, to employ some system of model theory in order to establish results about provability in some formal system: for this, she needs only relevant soundness or completeness results; it does not matter whether the model theory in any sense gives the real meaning of the sentences in question. Likewise, a philosopher might, for example, employ a model theory for a system of modal logic involving
possible worlds and merely possible objects, but not take this model theory literally in my sense: when his model theory tells him that ‘Possibly P’ is true iff there is a possible world in which P is true, he does not regard this as a literal description of what goes on when I say ‘Possibly P’; that is, he does not think that I stand to a bunch of possible worlds, which between them do or do not make my statement true, in the same way that the teacher who says ‘Someone threw that apple core’ stands to the collection of students in the room, who between them do or do not make that statement true. Rather, he might treat his model theory in an instrumental fashion, regarding it as a calculus for keeping track of our modal commitments.¹⁸ This is all quite right, but in the present context—i.e. in the various sections of this chapter entitled ‘Worldly Vagueness?’—a literal attitude towards model theory is appropriate. This is because a presupposition of the question being asked in these sections is semantic realism, and where one is a semantic realist about an area of discourse, one should be a model-theoretic literalist. Let me explain.
Semantic realism, in the sense I mean it here, is the view that there are genuine semantic properties and relations: in particular, relations of reference holding between sub-sentential or sub-propositional expressions (names, predicates, relation symbols) and parts of the world (objects, properties or sets of objects, relations or sets of n-tuples of objects), and properties of truth possessed by (some) sentences or propositions.¹⁹ Note that while semantic realism implies that there are real relations of reference, it does ¹⁸ This non-literal attitude towards modal model theory is to be contrasted with the ersatzist perspective, which involves taking some form of Kripke-style modal model theory employing possible worlds literally, but denying that the possible worlds and mere possibilia which figure in the model theory are to be regarded as concrete things on a par with the actual world and its inhabitants. In relation to the distinction between literal and non-literal attitudes towards model theory, cf. the discussions in the literature on modal logic of the distinction between pure and applied (aka depraved) semantics. The original distinction is due to Plantinga 1974, 126–8. For some recent discussion see Gregory 2005; Zimmerman 2005, 414 ff.; and Divers 2006. My first introduction to the idea of seeing (classical) model theory as providing an account of the relationship between natural language and the world was in helpful discussions with Amitavo Islam. ¹⁹ Within the class of semantic realists—characterized by adherence to the view that both reference and truth are real—we might furthermore distinguish those who hold that reference is primary, and those who hold that truth is primary. On the former view (which I call ‘reference-first semantic realism’), the truth status (e.g. true, false, no truth value, an intermediate truth value) of sentences is explicable in terms of the referents of their parts. 
For example, in the sentence ‘Maisy is hungry’, the name ‘Maisy’ refers to a certain dog, and the predicate ‘is hungry’ picks out a certain property (or set of things), and it is because Maisy in fact has this property (is a member of this set of things) that the sentence is true. On the latter view (which I call ‘truth-first semantic realism’), what fixes or determines reference is the truth conditions of certain sentences. For example, one might hold that the reference of some term is whatever it has to be in order to make all the sentences in some overarching theory true.
not imply that every sub-sentential expression has a unique or determinate reference.²⁰ It also does not imply that all sentences are true or false: the properties of truth countenanced by the semantic realist might include truth statuses such as neither true nor false, true to such-and-such degree, and so on. Semantic realism contrasts with semantic antirealist views which deny the existence of real reference relations, or real truth properties, or both. Advocating the reality of reference while denying the reality of truth seems bizarre, and to my knowledge this position remains unoccupied; there are those, however, who deny the reality of truth and reference, and those who deny the reality of reference but not of truth. In the former camp we find, for example, the antirepresentationalists. According to their ‘push-and-pull’ or ‘he said–she said’ view of language, our use of language is to be understood as a game or practice, and sentences and other utterances are to be assessed in terms of their appropriateness, like moves in a game—that is, in terms of their ‘horizontal’ connections with other moves, rather than in terms of their ‘vertical’ (i.e. referential or representational) relations with extra-linguistic reality.²¹ In the latter camp we find, for example, Davidson (1984 [1977]), who combines an analogy between semantics and physics—in physics, ‘‘we explain macroscopic phenomena by postulating an unobserved fine structure. But the theory is tested at the macroscopic level’’ (p. 222), and in semantics, on Davidson’s analogy, truth of sentences plays the role of the macroscopic phenomena, and reference of sub-sentential expressions plays the role of the underlying fine structure—with an instrumentalist rather than realist attitude towards the postulated underlying machinery. Thus on Davidson’s approach it ‘‘makes no sense . . . 
to complain that a theory comes up with the right truth conditions time after time’’, but gets the references of sub-sentential expressions wrong (p. 223).²² ²⁰ Indeed, this follows from the fact (n. 19) that semantic realism is compatible with the view that the truth conditions of sentences determine all other semantic facts, together with the now well-known fact (emphasized by Quine, Wallace, and Putnam, amongst others—see e.g. Wallace 1977 and Putnam 1981, 33), that in general the truth conditions of whole sentences do not determine unique references for their parts. ²¹ Abbott 1997, 128–30 has a useful categorization of views of reference, which distinguishes two other views besides semantic realism and antirepresentationalism (which Abbott calls ‘bold antirepresentationalism’). ²² Referential antirealism of this sort must be distinguished from truth-first semantic realism. The latter view agrees with Davidson on the primacy in semantics of the truth conditions of whole sentences, but disagrees that there are no real relations of reference between sub-sentential expressions and (parts of ) the world.
The question asked about each of our theories of vagueness in the sections of this chapter entitled ‘Worldly Vagueness?’ presupposes semantic realism. The idea is that when we speak vaguely, there is our language on the one hand, the world on the other hand, and semantic relations between the two. The question is then: where is the vagueness? Does the theory in question locate vagueness in the world itself, or in the relationships between language and the world? Clearly, from a semantic antirealist perspective, the question is misguided. That leaves two points to discuss. First, by what right do I presuppose semantic realism in these sections? Second, given this presupposition, why should we be model-theoretic literalists (in the context of answering the question posed in these sections)? On the first question: Semantic realism is the obvious, default view of language. For instance, it certainly seems that the thing that distinguishes language from a game such as chess or draughts is precisely that words refer to objects in the world, so that combining words in different ways on paper or in speech allows us to say things about the world, to represent it as being thus or so, whereas chess or draughts pieces do not refer to objects in the external world, and moving them and combining them in different ways on a board does not represent anything about goings-on external to the board. Even Davidson (an opponent of semantic realism) refers to the ‘Building-Block theory’—a version of reference-first semantic realism—as “an old [view] and a natural one” (1984 [1977], 220). We are therefore entitled to assume a broadly semantic realist perspective unless this is shown to be untenable. But that has never been done: none of the arguments against semantic realism, put forward by antirepresentationalists, referential antirealists, and others, is convincing.
I shall not defend this claim, however: to do so properly would require a book-length treatment, and it seems more appropriate to save that for another occasion than to load it onto the present work, for two related reasons. First, most of the vagueness literature assumes semantic realism as a guiding viewpoint (although see §2.7 for one exception); second, the debate between semantic realisms and antirealisms is of very broad significance, far beyond the study of vagueness. Semantic realism will, then, be an unargued presupposition. On the second question: Given semantic realism, model-theoretic literalism follows almost immediately. For if we approach vague language from a semantic realist standpoint—if we believe in referential relations between
linguistic items and items of extra-linguistic reality, in truth properties of sentences, and that truth and reference are related in certain fundamental ways (e.g. we might believe that ‘Maisy is tired’ is true just in case the referent of ‘Maisy’ is in the extension of ‘is tired’)—then we would be foolish not to adopt a model-theoretic framework for exploring different particular views on the reference and truth of vague expressions. For model theory is purpose-built for discussing such matters: it enables the discussion to proceed with a far greater level of clarity and precision than would otherwise be possible, and provides a unifying perspective from which the differences between rival views of vagueness can be perceived with great accuracy. But then once we have adopted a model-theoretic framework for these reasons, there is simply no room left for us to turn around and take a non-literal attitude towards it. For the whole idea was that we want to know what is going on when we use vague language—what the world we talk about is like, and how our language is related to it—and we adopted a model-theoretic framework precisely because of the clarity and precision of the answers to these questions that such a framework can be seen as providing.²³ Summing up: In the various sections of this chapter entitled ‘Worldly Vagueness?’, I presuppose a broadly semantic realist perspective, according to which when we use vague language, we refer to and/or represent goings-on in extra-linguistic reality. I am not assuming that we determinately refer to unique parts of the world: I am assuming only reality—not determinacy—of reference. More generally, I am not, at the outset, assuming any particular account of the relationship between vague language and the world—I am not assuming of any particular model theory for vague language that it is the correct one.
The task is to determine exactly what the relationship is between vague language and reality—and part of this task involves determining what the reality that vague language talks about is like in itself. Among other things, this will give us an answer to the question whether the world can be regarded as precise, with vagueness being a matter of the ²³ Of course, as I have already noted, there may well be other contexts in which we wish to employ some system of model theory for different reasons and in which a literal attitude to that model theory would not be appropriate. This also means that when I say, for example, that many-valued theories of vagueness locate vagueness in the world (§2.2.2), this cannot be read as saying that previous advocates of many-valued semantics think that vagueness is a worldly, not a semantic, phenomenon: for while an author may have advocated a many-valued model theory for vague language, she may not have taken her model theory literally in my sense.
referential and representational relationships between vague language and the world, or whether we must ultimately locate some or all vagueness in the world itself. Because of the advantages that model theory provides in terms of clarity and precision, these questions will be pursued within a model-theoretic framework—and because this is the reason for employing such a framework, the model theory will be taken literally: that is, I will regard what is going on in the intended model(s) of vague discourse, according to a particular model theory for vague language, as a (proposed, but of course not necessarily correct) literal account of the relationship between vague talk and reality. Granted this perspective of model-theoretic literalism, the claims that I make about worldly vagueness and semantic indeterminacy in the various sections of this chapter entitled ‘Worldly Vagueness?’—which might seem contentious given other possible perspectives on the use of model theory—are obvious and uncontroversial.
2.2 Additional Truth Values (Deny 1a)

Our story in §1.2 began with the Boolean algebra ({0, 1}, ∨, ∧, ′) of classical truth values.²⁴ We saw that with this algebra in hand, classical propositional logic, set theory, and predicate logic come in a natural progression. We then added the idea of there being an intended interpretation of any given discourse, to complete our presentation of the classical picture. If we take this story and keep everything the same, except that we unplug the algebra of classical truth values, and instead plug in a different algebra of truth values, then we get a family of different semantic pictures—one member of the family for each underlying algebra. In the cases where the algebra of truth values that we plug in has more than two elements, what we get is a many-valued logic. The simplest case is where we have three truth values, leading to versions of three-valued logic.

²⁴ In §1.2, I used different symbols for the algebraic operations on the set of truth values and for the connectives in our language, in order to make this distinction very clear. As foreshadowed in ch. 1 n. 5, I now revert to the standard practice of using the same symbols for both, and letting context disambiguate.

This idea has a natural application in the case of vagueness—for it is a fairly natural thought about vague sentences such as ‘Bob is bald’, where Bob is a borderline case of baldness, that they
are neither true nor false, but have a third truth value.²⁵ One particularly nice aspect of this approach is that it allows us to maintain an attractive parallelism between meaning and use, of the sort that the epistemicist was forced to reject. We can now say that a vague sentence Pa is true if most competent speakers would confidently classify a as being P, is false if most competent speakers would confidently deny that a is P, and is neither true nor false if most competent speakers would hedge over whether a is P. The epistemicist had only two semantic statuses for sentences—true and false—whereas ordinary speakers have at least three pragmatic stances which they can adopt towards a sentence—assertion, denial, or hedging. The three-valued approach posits a new semantic status, thereby allowing a parallelism between meaning and use to be maintained.²⁶ In developing any three-valued view, our first step is to specify an algebra ({0, 1, ∗}, ∨, ∧, ′) which has three truth values, 1, 0, and the new truth value ∗, and three operations, ∨, ∧, and ′. Consider Table 2.1. Every different way of filling in the blank cells with a 1, 0, or ∗ yields a different algebra of truth values. (A cell marked with the ‘ditto’ symbol must be filled with whatever is in the cell above it.) There are twenty-one empty cells, and three possible fillings for each; thus we have 3^21 = 10,460,353,203 possibilities. Thus, our family of three-valued logics is large! Some members of the family are more interesting than others. For a start, it is natural to be more interested in those operations ∨, ∧, and ′ which, when fed only 1’s and 0’s, output what the classical operations would output. This fixes every row which does not have a ∗ in it, reducing the number of empty cells to eleven (and thus leaving 3^11 = 177,147 possible algebras remaining).
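The two counts above can be checked mechanically. What follows is a minimal sketch and not part of the original text: the encoding of ∗ as the string "*" and the names VALUES, CLASSICAL, and count_cells are illustrative assumptions. It lists every cell of the framework (nine for ∨, nine for ∧, three for ′) and counts those with at least one ∗ among their inputs, since only those cells stay open once the operations are required to agree with the classical ones on 1's and 0's.

```python
from itertools import product

# Hypothetical encoding: the string "*" stands for the third truth value ∗.
VALUES = ("1", "*", "0")
CLASSICAL = ("1", "0")

def count_cells():
    """Return (total number of cells, cells still open after fixing the classical rows)."""
    binary_inputs = list(product(VALUES, repeat=2))  # the 9 input pairs (x, y)
    unary_inputs = [(x,) for x in VALUES]            # the 3 inputs x
    # One cell per input pair for ∨, one per input pair for ∧, one per input for ′.
    cells = binary_inputs + binary_inputs + unary_inputs
    # A cell stays open if at least one of its inputs is the third value.
    open_cells = [c for c in cells if any(v not in CLASSICAL for v in c)]
    return len(cells), len(open_cells)

total, remaining = count_cells()
print(total, 3 ** total)          # 21 10460353203
print(remaining, 3 ** remaining)  # 11 177147
```

Each open cell can be filled in three ways independently of the others, which is why the number of algebras is a power of three in both cases.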
It is also natural to want the truth values of A ∨ B and A ∧ B to be the same as those of B ∨ A and B ∧ A respectively—thus we want our binary operations ∨ and ∧ to be symmetric. Introducing ²⁵ Other common motivations for introducing three-valued logics include: Future contingents. Consider a statement about the future, such as ‘Bob will eat pizza for dinner next Tuesday’. You might think that as of right now, this statement is neither true nor false: it’s up to Bob, and he has not decided yet. Semantic paradoxes. Consider the sentence: ‘This sentence is false.’ Suppose it is true; then what it says is indeed the case; but what it says is that it is false. So if it’s true, it’s false, so it’s not true. So what if it is false? Then what it says is not the case. What it says is that it is false; so it is not false, i.e. it is true. So if it is false, it is true, so it is not false. So it seems that the sentence can be neither true nor false. It might reasonably be thought that these motivations lead most naturally not to the idea of a third truth value, but to the idea that some sentences lack a truth value altogether. We shall come to this idea in §2.3. ²⁶ For further discussion of this issue see p. 58 below.
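Table 2.1. Framework for algebras of three truth values

 x   y  | x ∨ y | x ∧ y | x′
--------+-------+-------+----
 1   1  |       |       |
 1   ∗  |       |       | ″
 1   0  |       |       | ″
 ∗   1  |       |       |
 ∗   ∗  |       |       | ″
 ∗   0  |       |       | ″
 0   1  |       |       |
 0   ∗  |       |       | ″
 0   0  |       |       | ″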
the convention that a cell marked with the symbol 2 must be filled with whatever is in the cell above the cell above it, this leaves us with seven empty cells (see Table 2.2). Finally, we are naturally more interested in those algebras where ∨ and ∧ are duals (with respect to ′).²⁷

Table 2.2. Partially constrained framework for algebras of three truth values

 x   y  | x ∨ y | x ∧ y | x′
--------+-------+-------+----
 1   1  |   1   |   1   | 0
 1   ∗  |       |       | ″
 1   0  |   1   |   0   | ″
 ∗   1  |   2   |   2   |
 ∗   ∗  |       |       | ″
 ∗   0  |       |       | ″
 0   1  |   1   |   0   | 1
 0   ∗  |   2   |   2   | ″
 0   0  |   0   |   0   | ″

²⁷ See p. 69 for an explanation of this notion of duality.

Thus, once we
have fixed one of these operations, the other is fixed too. That removes three more cells. The number of algebras left is still 3^4 = 81. Of these, I shall mention only two! The first is arrived at via the idea that any compound proposition with a component that has the value ∗ should itself have the value ∗. Given that, once we have fixed the algebra of truth values, we are simply going to run the rest of the classical story without further changes—thus, for example, the truth value of A ∨ B will be that truth value to which the operation ∨ in our algebra of truth values maps the pair (the truth value of A, the truth value of B)—this means that our operations must be as in Table 2.3 (where every blank cell in Table 2.2 has been filled with a ∗).²⁸

Table 2.3. Algebra of three truth values: first rationale

 x   y  | x ∨ y | x ∧ y | x′
--------+-------+-------+----
 1   1  |   1   |   1   | 0
 1   ∗  |   ∗   |   ∗   | ″
 1   0  |   1   |   0   | ″
 ∗   1  |   ∗   |   ∗   | ∗
 ∗   ∗  |   ∗   |   ∗   | ″
 ∗   0  |   ∗   |   ∗   | ″
 0   1  |   1   |   0   | 1
 0   ∗  |   ∗   |   ∗   | ″
 0   0  |   0   |   0   | ″

²⁸ This gives us something analogous to Kleene’s 1952 weak truth tables. It does not give us Kleene’s weak truth tables themselves, because what I have just given is not a truth table (recall the discussion on p. 26).

A second rationale for filling in the blank cells in Table 2.2 is as follows. Suppose the ∗ were a 1, and calculate the value of the corresponding classical operation, and suppose the ∗ were a 0, and calculate the value of the corresponding classical operation; if you get 1 both times, the value of the new operation that you are defining is 1; if you get 0 both times, the value of the new operation is 0; if you get 1 once and 0 once, then the value of the new operation is ∗. (Where you are trying to determine what the
new operation should assign to a pair of ∗’s, you need to calculate all four possibilities where each ∗ is replaced by a 1 or a 0.) Following this rationale through, we get Table 2.4.²⁹

Table 2.4. Algebra of three truth values: second rationale

 x   y  | x ∨ y | x ∧ y | x′
--------+-------+-------+----
 1   1  |   1   |   1   | 0
 1   ∗  |   1   |   ∗   | ″
 1   0  |   1   |   0   | ″
 ∗   1  |   1   |   ∗   | ∗
 ∗   ∗  |   ∗   |   ∗   | ″
 ∗   0  |   ∗   |   0   | ″
 0   1  |   1   |   0   | 1
 0   ∗  |   ∗   |   0   | ″
 0   0  |   0   |   0   | ″

²⁹ This is Łukasiewicz’s three-valued logic (minus his treatment of the conditional, which is the three-valued analogue of the fuzzy Łukasiewicz conditional, to be introduced in §2.2.1.1). It is also analogous to Kleene’s 1952 strong truth tables (cf. n. 28).

The rationale used to define the latest set of operations may have sounded like supervaluationism, but the outcome is in fact very different. The three-valued view just presented—along with every other view constructed from the classical view by replacing the algebra of classical truth values with an alternative algebra—is truth-functional, which is to say that the truth value of a compound sentence is determined solely by the truth values of its components. Truth-functionality is ensured by the part of the picture which says that [A ∨ B] = [A] ∨ [B], [A ∧ B] = [A] ∧ [B] and [¬A] = [A]′ (see p. 26 above). On the other hand, as we shall see when we discuss it in §2.4, the supervaluationist view is not truth-functional. A third rationale for filling in the blank cells in Table 2.2 is as follows. Recall that at the beginning of our discussion of the classical picture, we considered the appropriate ordering of our two classical truth values 1 and 0. Let us do this again, with our three truth values 1, 0, and ∗. We may ask, under what assignments of truth values to sentences S and T is S at least as true as T? Obviously, if S and T have the same truth value, then S ≥ T
(letting ≥ mean ‘at least as true as’), and if S is true and T is false, then S is truer than T (i.e. S ≥ T and T ≱ S). But what if S has the value ∗ and T does not? One natural thought is that if T is false, then S is truer than T, and that if T is true, then T is truer than S. This yields the following lattice of truth values:

1
|
∗
|
0

Setting x ∨ y equal to the supremum of x and y relative to this ordering, x ∧ y equal to the infimum of x and y relative to this ordering, and setting 0′ = 1, ∗′ = ∗, and 1′ = 0, yields an algebra of truth values. As is easily verified, it is in fact the same as our previous algebra. We wanted to replace the algebra of classical truth values with a different structure. We have seen that we have many choices here. Making this choice is our only substantive step. After making it, we then run the rest of the classical story just as before. Thus, when it comes to propositional logic, the truth value [A ∧ B] of the proposition A ∧ B will be [A] ∧ [B], where [A] is the truth value of A, [B] is the truth value of B, and ∧ is the operation in our new algebra of truth values that replaces the classical operation, symbolized by the same symbol, in our original Boolean algebra of truth values.³⁰

³⁰ Of course, not all the details of the resulting story of propositional logic will be the same as in the classical story: what we are holding fixed is the method of generating the story, given an algebra of truth values as input; when we vary that algebra, the details will, in general, change. The changes are, however—and this is the point—perfectly predictable. Thus, e.g., the classical algebra of truth values is a Boolean algebra, and so all the laws of Boolean algebras reappear, in different guise, as laws of classical propositional logic. For example, mirroring law 5 of Boolean algebras on p. 24, two formulas A and A ∨ (B ∧ ¬B) of classical propositional logic always have the same truth value. Our second algebra of three truth values is not a Boolean algebra—for example, in relation to law 5 just considered, 0 ∨ (∗ ∧ ∗′) = ∗ ≠ 0—and so, correspondingly, two formulas A and A ∨ (B ∧ ¬B) will have different truth values when A has the value 0 and B has the value ∗. Our first algebra of truth values (unlike our second) is not even a lattice—for example, in relation to law 4 of lattices on p. 23, 1 ∨ (1 ∧ ∗) = ∗ ≠ 1—and so, correspondingly, two formulas A and A ∨ (A ∧ B) will have different truth values when A has the value 1 and B has the value ∗—unlike in classical propositional logic (and the logic resulting from our second algebra of truth values), where such formulae always have the same truth value. However, the point is that the laws of each logic do correspond perfectly to the laws of its own underlying algebra of truth values—thanks to the uniform method of generating the logic from the algebra.

When it comes to set theory, we have a ready-made
notion of a three-valued set: a set whose characteristic function maps objects to our three-membered set of truth values {0, 1, ∗}. Just as some sentence might be neither true nor false, but instead have truth value ∗, so too some object might be neither in a set S (i.e. mapped to 1 by S’s characteristic function), nor not in S (i.e. mapped to 0 by S’s characteristic function)—this object will be mapped to ∗ by S’s characteristic function. Then, supposing, for example, that x is in S, but neither in nor out of T, what is x’s status with respect to S ∪ T? The relevant part of the classical story tells us that (S ∪ T)(x) = S(x) ∨ T(x), where ∨ is now the operation in our new algebra of truth values that replaces the classical operation, symbolized by the same symbol, in our original Boolean algebra of truth values. So we need to see to what value this operation maps the pair (1, ∗). In the first algebra we looked at, the answer is ∗—i.e. x is neither in nor out of S ∪ T. In the second algebra, the answer is 1—i.e. x is in S ∪ T. In order to handle unions and intersections of arbitrary numbers of sets in the way we handled them in the classical story, we need operations ⋁ and ⋀. But as long as our operations ∨ and ∧ are symmetric, associative, and idempotent—as they are on the proposals we have examined—these are easily obtained. Where x, y, z are any elements of our algebra of truth values, we just set ⋁{x, y, z} = (x ∨ y) ∨ z = x ∨ (y ∨ z) = (x ∨ z) ∨ y = x ∨ (z ∨ y) = (z ∨ x) ∨ y = z ∨ (x ∨ y) = . . . , and likewise for ⋀ and ∧. The link between logic and set theory in the classical picture is also maintained in the new three-valued picture. If, on a certain interpretation, ‘Bob’ is assigned x as its referent, and ‘is bald’ is assigned the set S as its extension, then the truth value of ‘Bob is bald’ will be the value (1, 0, or ∗) to which S maps x.
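The construction just described can be sketched in code. This is an illustrative sketch under stated assumptions rather than anything given in the text: truth values are encoded as the strings "1", "*", and "0"; the function names (classical_or, weak_or, strong_or, union, big_join, truth_value) and the sample sets S and T are hypothetical. The ∨ of the first algebra follows the first rationale (any ∗ among the inputs yields ∗), the ∨ of the second follows the second rationale (try both classical replacements for each ∗), and a three-valued set is represented by its characteristic function, here a dictionary. The operations ∧ and ′ would be handled analogously and are omitted for brevity.

```python
from functools import reduce
from itertools import product

ONE, STAR, ZERO = "1", "*", "0"  # hypothetical encoding of the values 1, ∗, 0

def classical_or(x, y):
    """Classical ∨ on the two classical values."""
    return ONE if ONE in (x, y) else ZERO

def weak_or(x, y):
    """First rationale: any ∗ among the inputs makes the output ∗."""
    return STAR if STAR in (x, y) else classical_or(x, y)

def strong_or(x, y):
    """Second rationale: replace each ∗ by 1 and by 0, then compare the classical results."""
    options = [(ONE, ZERO) if v == STAR else (v,) for v in (x, y)]
    results = {classical_or(a, b) for a, b in product(*options)}
    return results.pop() if len(results) == 1 else STAR

def union(s, t, join):
    """(S ∪ T)(x) = S(x) ∨ T(x); s and t are assumed to share the same domain."""
    return {obj: join(s[obj], t[obj]) for obj in s}

def big_join(values, join):
    """⋁ of a non-empty collection, obtained by folding the binary ∨."""
    return reduce(join, values)

def truth_value(name, predicate, referent, extension):
    """Value of an atomic sentence: the value to which the extension maps the referent."""
    return extension[predicate][referent[name]]

# Hypothetical three-valued sets over a two-object domain.
S = {"x": ONE, "y": ZERO}
T = {"x": STAR, "y": STAR}

print(union(S, T, weak_or)["x"])    # *
print(union(S, T, strong_or)["x"])  # 1
print(big_join([ZERO, STAR, ONE], strong_or))  # 1
print(truth_value("Bob", "is bald", {"Bob": "x"}, {"is bald": T}))  # *
```

The first two printed values reproduce the verdicts stated above for the pair (1, ∗): ∗ in the first algebra and 1 in the second. Folding the binary operation is adequate only because ∨ is symmetric and associative on both proposals, so the order in which the values are listed makes no difference.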
Quantified wfs are handled via the operations ⋁ and ⋀ just mentioned, in a way that is perfectly analogous to their classical treatment. Finally, ‘Bob is bald’ will be true simpliciter if it gets the value 1 on the intended interpretation. How will three-valued views handle the Sorites paradox? Consider a Sorites series for, say, ‘is tall’. To make things simpler, let’s number the people in the series from 0 to 2000. 0 is 1 m tall, 1 is 1 mm taller, 2 is 1 mm taller again, and so on, until we get to 2000, who is 3 m tall. We want to say that ‘0 is tall’ is false, that ‘2000 is tall’ is true, and that for each person n, if n is tall, then so is (n − 1)—that is, that the following Sorites conditionals are all true: ‘if 2000 is tall, then 1999 is tall’; ‘if 1999 is tall, then 1998 is tall’; . . . ; ‘if 2 is tall, then 1 is tall’; ‘if 1 is tall, then 0 is tall.’ The
paradox is that these things that we want to say cannot all be true (at least in the classical picture). One obvious strategy for a three-valued response to this situation—illustrated with respect to the second of the two particular three-valued views introduced above—is as follows. Where Bob is a clear case of tallness, ‘Bob is tall’ has the value 1; where Bob is a borderline case of tallness, ‘Bob is tall’ has the value ∗; and where Bob is a clear countercase of tallness, ‘Bob is tall’ has the value 0. Going from the 2000 end to the 0 end, our Sorites series begins with people who are clear cases of tallness and ends with people who are clear countercases of tallness, and in between contains people who are borderline cases of tallness. So now consider our Sorites conditionals, after recalling the part of the picture which says that A → B is an abbreviation of ¬A ∨ B , and then noting that 1 ∨ 1 = 1, 1 ∨ ∗ = ∗, ∗ ∨ ∗ = ∗, ∗ ∨ 0 = ∗, and 0 ∨ 0 = 1.³¹ The upshot is that some of the Sorites conditionals are not true. So that’s the error in the paradoxical reasoning. Why, then, do we get taken in by the paradox? Because none of the conditionals is false, either. The only way for ‘if P then Q’ to have the value 0 is for P to have the value 1 and Q the value 0. This does not happen in our series: between the persons who are tall (i.e. are mapped to 1 by the set of tall things) and those who are not tall (i.e. are mapped to 0 by the set of tall things) are the borderline cases—that is, the persons who are neither tall nor not tall (i.e. are mapped to ∗ by the set of tall things).³² There are two main objections to three-valued approaches to vagueness. One of them focuses on the fact that the three-valued approach is truthfunctional. I shall postpone discussion of it until p. 
85.³³ The other objection is that the three-valued approach falls foul of ‘the problem of higher-order vagueness’, by imposing sharp cut-offs between the objects to which a predicate does not apply and its borderline cases, and between the borderline cases and the objects to which the predicate applies. In fact we need to distinguish two potential problems here—problems which are, in my view, conflated in the literature under the one heading ‘the problem of higher-order vagueness’. One problem is that the view does not allow for a gradual transition between the clear cases and the clear countercases. I shall postpone discussion of this jolt problem (as I call it) until Chapter 4 (see also §3.5.5). The other problem is the location problem discussed in connection with epistemicism.
³¹ Alternatively, we could regard A → B as an abbreviation of ¬(A ∧ ¬B). This makes no difference, because (1 ∧ 1′)′ = 1, (1 ∧ ∗′)′ = ∗, etc.
³² In general in this chapter, when I say how a given view of vagueness will handle the Sorites paradox, I do so for the purpose of giving a feel for the view to newcomers to the literature, not for the purpose of assessing the view (recall the discussion at the beginning of Ch. 1). Thus in the present case, for example, I do not mean to suggest that the approach to the Sorites just laid out would withstand thorough scrutiny, nor that no other response to the Sorites could be available to proponents of three-valued views of vagueness.
³³ The discussion there is framed in terms of truth gaps, rather than additional truth values, but as far as this objection is concerned, that difference is irrelevant.
Recall our claim, at the beginning of this section, that one particularly nice aspect of the three-valued approach is that it allows us to maintain an attractive parallelism between meaning and use, of the sort that the epistemicist was forced to reject. We can now say that a vague sentence Pa is true if most competent speakers would confidently classify a as being P, is false if most competent speakers would confidently deny that a is P, and is neither true nor false (i.e. has the value ∗) if most competent speakers would hedge over whether a is P. Let’s reconsider this claim. Certainly, the three-valued approach does better than epistemicism in maintaining a broad link between use and meaning—at least we now have a tripartite distinction on both sides. However, at a finer level of detail it seems that the three-valued picture is not so nice after all. Suppose that we are faced with a series of men ranging in height from 4 feet to 8 feet, in very small increments. We are to go down the series, classifying each man as tall, or not. At the beginning, all competent speakers will confidently deny tallness; by the end, all will confidently assert it. But different speakers will become less confident, and start to hedge, at different points, and then later will cease to hedge, and become more confident, at different points.
Thus, on the use side we seem to have blurred boundaries between denial and hedging, and hedging and assertion, whereas on the semantic side, we have—on the three-valued account—perfectly sharp divisions between the men mapped to 0 by the characteristic function of the extension (on the intended interpretation) of ‘is tall’ and the men mapped to ∗, and between the men mapped to ∗ and the men mapped to 1. I do not find this objection to the three-valued account anywhere near as potent as the corresponding objection to the epistemicist. In the latter case, we could see no plausible route whatsoever from usage to the epistemicist’s semantic picture. In the present case, however, I think we can. We can say that the set of men mapped to 1 by the characteristic function of the extension (on the intended interpretation) of ‘is tall’ is the set of men which every competent speaker would confidently classify as tall; the set
of men mapped to 0 is the set of men of which every competent speaker would confidently deny tallness; and the set of men mapped to ∗ is all the rest. Given this story, the divergences between individuals concerning where they start and stop hedging do not threaten the parallelism between meaning and use—where ‘use’ here means what it obviously should mean in this context: namely, use of the speech community as a whole. That’s a brief statement of a response to the location problem as it faces the three-valued view, and there is more to say about this issue. I shall not say it here, however, for I present a more detailed response along the same lines to a related problem in §6.2.2.³⁴
2.2.1 The Fuzzy Picture
Continuing our examination of the many-valued approach, we ask now: what if we have more than three truth values? The situation will be the same in all cases: we replace the Boolean algebra of classical truth values with a different structure, and then proceed as we did in the classical picture. One natural family of finitely many-valued logics comprises those built on an underlying algebra of n truth values whose elements are $\{0, \frac{1}{n-1}, \ldots, \frac{n-2}{n-1}, 1\}$ for some natural number n ≥ 2.³⁵ Ordering these truth values in accordance with the standard ordering of the rational numbers yields a bounded lattice, and (echoing the third rationale discussed above in connection with three-valued logic), for all truth values x and y in our chosen set of values, we may then set x ∨ y equal to the supremum of x and y, x ∧ y equal to the infimum of x and y, and we may set x′ equal to 1 − x. However, while four-valued logics have been studied in some detail, they have not figured prominently in discussions of vagueness. Nor have many-valued logics with a larger finite number of truth values. Moving to infinitely many truth values, the first natural stopping point (for reasons to be discussed in n.
40 below) is to have uncountably many truth values—or more concretely, to take as our set of truth values all real numbers between 0 and 1 inclusive. (You can think of these as percentage values—so a sentence can be 0.3 true, i.e. 30 per cent true, or 0.56 true, i.e. 56 per cent true, etc.) This takes us to the fuzzy picture.³⁶
³⁴ Note that I do not think that there is no problem for the three-valued view stemming from its positing of perfectly sharp divisions between the men mapped to 0 by the characteristic function of the extension of ‘is tall’ and the men mapped to ∗, and between the men mapped to ∗ and the men mapped to 1. In fact I think there is a devastating problem here: the jolt problem, to be discussed in Ch. 4. My point at present is that there is not a big problem concerning how our usage—as a speech community—could determine the position of these sharp divisions.
³⁵ The finitely many-valued logics of Łukasiewicz take this form; see Łukasiewicz and Tarski 1970 [1930], 141 or Malinowski 1993, 36.
In this section I shall present the fuzzy view in some detail. All that I will be doing is spelling out, step by step, that the fuzzy picture arises from the classical picture simply by replacing the Boolean algebra of classical truth values with a different algebra of truth values, which has as its elements the real numbers between 0 and 1 inclusive. After making this change, we simply follow through the classical story, step by step. I shall nevertheless spell this out in detail for two reasons. First, there is some suspicion of the fuzzy view in the literature. There is a perception that it is built on sand—that there is something suspect, from the logico-mathematical point of view, about its foundations. This impression is mistaken—and I hope that this section will make this clear. Second, discussions in the vagueness literature of the fuzzy approach typically focus on the fuzzy logical calculus. We are introduced to various sentential connectives, and told how to calculate the truth values of compound sentences given the truth values of their components. A definition of validity or consequence is introduced, and it is explained how a typical Sorites argument is invalid or unsound (depending upon the particular version of fuzzy logic in question). This is all very well, as far as it goes, but simply being taught how to calculate with fuzzy truth values in this way is not very illuminating. We can gain a much clearer understanding of the fuzzy position if we also look at fuzzy sets, and at fuzzy model theory, which is where fuzzy logic meets fuzzy sets.
³⁶ The study of infinite-valued logics begins with Łukasiewicz; see e.g. Łukasiewicz and Tarski 1970 [1930]. Fuzzy set theory was born in Zadeh 1965; a related earlier idea is Post’s notion of an n-valued set (Malinowski 1993, 47, 98). The history of both branches of the fuzzy view (i.e. logic and set theory) is rich; see Hájek 1998, ch. 10, and Novák et al. 1999, ch. 8, for overviews. For technical introductions to fuzzy logic and set theory, see e.g. Klir and Yuan 1995; Hájek 1998; Novák et al. 1999; and Nguyen and Walker 2000. The chief sources for the fuzzy view of vagueness are Goguen 1968–9; Lakoff 1973; and Machina 1976; Black 1997 [1937] is a relevant earlier work.
In this section I want to convey not just a facility with certain techniques, but a whole semantic picture—as rich and detailed as the classical picture, and (as I shall argue later) a worthy rival to it. The classical truth values 1 and 0 can be thought of as Yes and No. In the classical picture, when we ask whether a sentence is true, or whether a given object is in a given set, the answer can only be Yes or No. The basic thought behind the fuzzy view is that while the classical picture is fine in the realm for which it was developed—mathematics—it is inadequate when
it comes to modelling vague language and the properties expressed therein. For example, neither Yes nor No will do as an answer to the question as to whether Balding Bob is bald. We need a richer set of options: we need degrees of truth and of set-membership. Standard fuzzy practice is to take [0, 1], the set of all real numbers between 0 and 1 inclusive, as the set of truth values, in place of the classical set {0, 1}. 0 is thought of as No and 1 as Yes, but now we have many other options as well: a number between 0 and 1 represents a tentative Yes or No; the higher the number, the less tentative a Yes (and the more tentative a No) it is. Just as there is a natural ordering of the classical truth values, so there is a natural ordering of the fuzzy truth values, arrived at in an analogous manner. Suppose, for example, that S is 0.7 true and T is 0.6 true; then T is strictly less true than S. Now, what S and T mean is not important here: the same relationship would hold between any sentences which were, respectively, 0.7 true and 0.6 true. So we may transfer the truth-ordering of sentences onto the truth values that sentences possess—and overall, the result turns out to be exactly the ordering we get when we restrict the standard ordering of the real numbers to the interval [0, 1]. With this ordering in place, we want to define some operations on [0, 1]. The basic set of operations is ∨, ∧, and ′: as before, the idea is that these will be used to define, respectively, logical disjunction and set-theoretic union, logical conjunction and set-theoretic intersection, and logical negation and set-theoretic complementation. Now, just as before, if S and T have the same truth value, then we want ‘S or T’ to have that truth value also, and if S and T have different truth values, then we want ‘S or T’ to have the greater of these truth values.
Thus we want to say that for any truth values x and y in [0, 1], x ∨ y is the supremum of x and y with respect to our ordering of [0, 1]. For conjunction, if S and T have the same truth value, then we want ‘S and T’ to have that truth value also, and if S and T have different truth values, then we want ‘S and T’ to have the lesser of these truth values. Thus we want to say that for any truth values x and y in [0, 1], x ∧ y is the infimum of x and y with respect to our ordering of [0, 1]. For negation, if S is less true than T, then we want ‘not T’ to be less true than ‘not S’, and we also want the negation of the negation of S to have the same truth value as S. It is customary to set x′ = 1 − x, for each x ∈ [0, 1], although in fact (as we shall see in §2.2.1.1) this is not the only definition that satisfies our two requirements.
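As a minimal sketch of the three basic operations just described, the following Python fragment computes disjunction, conjunction, and the customary negation for the example values 0.7 and 0.6, and checks the two requirements on negation over a finite grid of sample values. The function names `f_or`, `f_and`, and `f_not` are illustrative choices, not notation from the text, and a check on finitely many sample points is of course only a spot check rather than a proof.

```python
from fractions import Fraction as F  # exact arithmetic: 1 - 7/10 is exactly 3/10


def f_or(x, y):
    """Disjunction: the supremum (here simply the larger) of two truth values."""
    return max(x, y)


def f_and(x, y):
    """Conjunction: the infimum (the smaller) of two truth values."""
    return min(x, y)


def f_not(x):
    """The customary negation, 1 - x."""
    return 1 - x


s, t = F(7, 10), F(6, 10)  # S is 0.7 true, T is 0.6 true
print(f_or(s, t), f_and(s, t), f_not(s))

# Spot-check the two requirements on negation over a grid of sample values.
grid = [F(k, 20) for k in range(21)]
# Requirement 1: if x is no truer than y, then not-y is no truer than not-x.
order_reversing = all(f_not(y) <= f_not(x) for x in grid for y in grid if x <= y)
# Requirement 2: double negation returns the original value.
involutive = all(f_not(f_not(x)) == x for x in grid)
print(order_reversing, involutive)
```

Exact fractions are used instead of floats only so that equalities such as the double-negation check hold exactly.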
We now have an algebra ([0, 1], ∨, ∧, ′), consisting of a set of truth values together with some operations on that set. Unlike in the classical case, this is not a Boolean algebra, for it is not complemented.³⁷ It is, however, a Kleene algebra. Just as the Boolean algebra of classical truth values is the fundamental structure underlying classical logic and set theory, this Kleene algebra of fuzzy truth values is the fundamental structure underlying fuzzy logic and fuzzy set theory. Let us begin with fuzzy propositional logic. We have the same language as in the classical case: what changes is its interpretation. As before, an interpretation of the language consists in an assignment of a truth value [p] to every propositional constant p—only this time, the truth values are elements of [0, 1], not {0, 1}. Given an interpretation, every wf A is assigned a truth value [A] by the following recursive definition:
• [A ∨ B] = [A] ∨ [B]
• [A ∧ B] = [A] ∧ [B]
• [¬A] = [A]′
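The recursive definition above can be sketched as a small evaluator. This is an illustration under assumptions of my own: wfs are encoded as nested tuples such as `("or", "p", ("not", "p"))`, propositional constants are strings, and an interpretation is a dictionary from constants to values in [0, 1]. None of these encodings comes from the text.

```python
from fractions import Fraction as F


def value(wf, interp):
    """Truth value of `wf` on `interp` (a dict: propositional constant -> value in [0, 1])."""
    if isinstance(wf, str):              # propositional constant: look up its assigned value
        return interp[wf]
    op = wf[0]
    if op == "or":                       # disjunction: supremum of the two component values
        return max(value(wf[1], interp), value(wf[2], interp))
    if op == "and":                      # conjunction: infimum of the two component values
        return min(value(wf[1], interp), value(wf[2], interp))
    if op == "not":                      # negation: 1 minus the component value
        return 1 - value(wf[1], interp)
    raise ValueError(f"unknown connective: {op!r}")


interp = {"p": F(3, 10), "q": F(4, 5)}   # hypothetical interpretation
excluded_middle = ("or", "p", ("not", "p"))
contradiction = ("and", "p", ("not", "p"))
print(value(excluded_middle, interp), value(contradiction, interp))
```

On this interpretation neither wf takes a classical value, which illustrates why the algebra of fuzzy truth values is not complemented: joining a value with its negation need not give 1, and meeting them need not give 0.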
Note that the occurrences of ∨ and ∧ on the left of these identities represent the sentential connectives, while the occurrences on the right denote the operations on our Kleene algebra of truth values. (The situation in the third case is quite clear, as we have two separate symbols, ¬ and ′.) Thus we are saying that the truth value of the wf A ∨ B—the object assigned to this sequence of symbols on the interpretation in question—is found by locating the object assigned to A and the object assigned to B, and then performing the ∨ operation on these two objects. The result of this operation is an object in our algebra of truth values—and this object is the truth value of A ∨ B. Two wfs A and B are logically equivalent (written A ≡ B) if they have the same truth value on every interpretation. ≡ is an equivalence relation on F, and F/≡ is the set of equivalence classes of F under ≡. Letting |A| be the equivalence class containing the wf A, we may set:
• |A| ∨ |B| = |A ∨ B|
• |A| ∧ |B| = |A ∧ B|
• ¬|A| = |¬A|
³⁷ For details see Novák et al. 1999, 23.
Note that the occurrences of ∨, ∧, and ¬ on the right of these identities represent the sentential connectives, while the occurrences on the left denote newly defined operations on F/≡. It is a straightforward matter to check that these operations are well defined, and that F/≡ together with these operations is a Kleene algebra.³⁸ As in the classical case (p. 27 above), it is not uncommon to refer to these equivalence classes of wfs as propositions, and to this Kleene algebra of equivalence classes as the fuzzy propositional calculus. An interesting fact here is that this Kleene algebra of fuzzy propositions is identical to the algebra of propositions yielded by Łukasiewicz’s three-valued logic.³⁹ Let us look next at fuzzy set theory. Given any ordinary (classical, crisp) set X, we may identify each fuzzy subset of X with its characteristic function, which is a function from X to [0, 1]. The number assigned by this function to x ∈ X represents x’s degree of membership of the subset in question. For any subsets (i.e. characteristic functions) f and g, and any x ∈ X, we define:
• (f ∪ g)(x) = f(x) ∨ g(x)
• (f ∩ g)(x) = f(x) ∧ g(x)
• f′(x) = f(x)′
• f ⊆ g ⇔ ∀x ∈ X, f(x) ≤ g(x)
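The four definitions above can be sketched pointwise in Python, identifying each fuzzy subset with its characteristic function stored as a dictionary. The base set `X` and the membership degrees assigned by `tall` and `bald` are made-up illustrative data, and the function names are my own.

```python
from fractions import Fraction as F

X = ["al", "bo", "cy"]                                # hypothetical crisp base set
tall = {"al": F(1), "bo": F(1, 2), "cy": F(1, 5)}     # hypothetical fuzzy subsets of X,
bald = {"al": F(0), "bo": F(7, 10), "cy": F(1, 5)}    # given as characteristic functions


def union(f, g):
    """Degree of membership in the union: the larger of the two degrees."""
    return {x: max(f[x], g[x]) for x in X}


def intersection(f, g):
    """Degree of membership in the intersection: the smaller of the two degrees."""
    return {x: min(f[x], g[x]) for x in X}


def complement(f):
    """Degree of membership in the complement: 1 minus the original degree."""
    return {x: 1 - f[x] for x in X}


def is_subset(f, g):
    """f is a subset of g iff no element belongs to f more than it belongs to g."""
    return all(f[x] <= g[x] for x in X)


print(union(tall, bald))
print(intersection(tall, complement(tall)))
```

The second printed set shows that a fuzzy set and its complement can overlap to a non-zero degree, which is the set-theoretic face of the failure of complementation in the algebra of truth values.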
Note that the occurrences of ∨, ∧, and ′ on the right of these identities denote the operations on our Kleene algebra of truth values, and the occurrence of ≤ denotes our ordering of the truth values. Thus we are saying that to find out to what degree a given x is in the union of f and g, we find the object (in our algebra of truth values) which f assigns to x, and the object which g assigns to x, and then perform the ∨ operation on this pair of objects. The result of this operation is an object in our algebra of truth values, and this object is the value assigned to x by the union set (f ∪ g); that is, the degree of membership of x in the union set (f ∪ g). By doing this for every x ∈ X, we build up a complete picture of the union set (f ∪ g). It is a straightforward matter to verify that these definitions yield a Kleene algebra of subsets. It is also clear that the operations we have just defined are the obvious fuzzy analogues of the classical set-theoretic operations of union, intersection, and complementation. As in the classical case, as well as unions and intersections of pairs of sets f and g, we also define unions and intersections of arbitrary families of sets {fᵢ} (and again, the above definitions for pairs of sets are just special cases of these more general definitions):
• (⋃{fᵢ})(x) = ⋁{fᵢ(x)}
• (⋂{fᵢ})(x) = ⋀{fᵢ(x)}
The crucial thing to note here is that for these definitions to work, ⋁S and ⋀S need to be well-defined for an arbitrary set S of truth values—in other words, the lattice of truth values must be complete. The lattice of classical truth values contains only two elements, so it is automatically complete—but when an ordered set has more than two elements, it may be a lattice without being a complete lattice. However, ([0, 1], ≤) is indeed a complete lattice.⁴⁰
³⁸ See Nguyen and Walker 2000, 68.
³⁹ For details see Nguyen and Walker 2000, 66–70. Recall that (the non-conditional fragment of) Łukasiewicz’s three-valued logic was introduced in §2.2.
Finally, let us consider fuzzy predicate logic. Again, we have the same language as in the classical case: what changes is its interpretation. An interpretation $\mathcal{M} = (M, I)$ of the language consists in a nonempty set M (the domain), together with a function I which assigns:
• a truth value [p] to every propositional constant p;
• an object I(a) in M to every individual constant a;
• an n-ary fuzzy relation I(P) on M to every n-ary predicate P.
⁴⁰ So are the lattices of finitely many truth values introduced at the beginning of §2.2.1. However, the set of all rational numbers between 0 and 1 inclusive, under the standard ordering, yields a lattice which is not complete (e.g., the set of all such rationals which are less/greater than 1/√2 has no supremum/infimum). That is why we do not consider taking the set of all rational numbers between 0 and 1 inclusive as our set of truth values, but instead move straight from finitely many truth values to continuum-many.
In relation to the final clause, note that an n-ary fuzzy relation on M is a fuzzy set of n-tuples of members of M. Identifying this fuzzy set of n-tuples with its characteristic function, we regard an n-ary fuzzy relation on M as a function from the set of all n-tuples of members of M to our set of truth values [0, 1]. Given an interpretation $\mathcal{M}$, every closed wf A is assigned a truth value $[A]_{\mathcal{M}}$ on that interpretation by a recursive definition. (Again, where the interpretation in question is obvious from the context, or is irrelevant, the subscript $\mathcal{M}$ is omitted.) The clauses for non-quantified compound wfs are just as before. For an atomic wf P(a₁, …, aₙ), which consists of an n-ary predicate P followed by n individual constants a₁, …, aₙ, the truth definition is as follows:
[P(a₁, …, aₙ)] = I(P)(I(a₁), …, I(aₙ))
The idea here is this. As discussed, each individual constant a is assigned an object I(a) in the domain, and P is assigned a function I(P) from the set of all n-tuples of members of M to our set of truth values [0, 1]. Whatever value this function assigns to the n-tuple (I(a₁), …, I(aₙ)), this value is the truth value of the sentence P(a₁, …, aₙ). Thus ‘Bob is bald’ is true to the degree to which Bob is in the extension of ‘is bald’; ‘Bill loves Ben’ is true to the degree to which Bill stands in the loving relation to Ben; etc. For quantified sentences, saying that everything is P is treated as saying that this thing is P, and this thing is P, and this thing is P, . . . for everything in the domain; and saying that something is P is treated as saying that this thing is P, or this thing is P, or this thing is P, . . . for everything in the domain. This basic idea of universal quantification as generalized conjunction, and existential quantification as generalized disjunction, is made precise in the following way. Let $A_x a$ be the sentence obtained by writing a in place of all free occurrences of x in A, a being some constant that does not occur in A; and given an interpretation $\mathcal{M}$ with domain M, let $\mathcal{M}^a_o$ be the interpretation which is just like $\mathcal{M}$ except that in it the constant a is assigned the denotation o. Then:
• $[\exists x A]_{\mathcal{M}} = \bigvee \{[A_x a]_{\mathcal{M}^a_o} : o \in M\}$
• $[\forall x A]_{\mathcal{M}} = \bigwedge \{[A_x a]_{\mathcal{M}^a_o} : o \in M\}$
The idea here is that to find out the truth value of, say, ∀xPx, we consider the sentence Pa, and we ask what truth value it would have if everything about our interpretation were the same, except that a denoted this object in the domain; we note this truth value.
Then we ask what truth value Pa would have if everything about our interpretation were the same, except that a denoted this other object in the domain; we note this truth value. And so on, for all objects in the domain. We now have a set of truth values—the ones we noted down along the way. To get the truth value of ∀xPx, we apply our infimum operation to this set of truth values, yielding another truth value. The analogy with conjunction enters in the
fact that ⋀ is the generalization to arbitrary sets of truth values of our ∧ operation on pairs of truth values.⁴¹ In sum: starting from the algebra of fuzzy truth values, we gain a useful perspective from which to see the links between fuzzy logic, set theory, and model theory. As in the classical case, we see that with this algebra in hand, everything else comes in a natural progression. Furthermore, it is now clear that although we start from different algebras of truth values, the way in which we proceed from our starting point is the same in both cases. Finally, both pictures also agree in their addition at the end—in order to yield a notion of truth simpliciter, given only a notion of truth on an interpretation—of the idea of there being a unique intended interpretation of any discourse. Recall our checklist of the main elements of the classical view. All our many-valued views differ from the classical view—and from each other—only over point 1a: they replace the classical algebra of truth values with an alternative algebra—with different alternatives in the case of different many-valued theories. How does the fuzzy view solve the Sorites paradox? What are the main objections to the fuzzy picture? I shall discuss these questions, and many more, in Part III. But before moving to our next approach to vagueness, I shall fill out my presentation of the fuzzy picture with a discussion of one further matter.
⁴¹ One might think that, as in the case of the definitions of union and intersection for arbitrary families of fuzzy sets, in order for these truth definitions for quantified sentences to work, we need ⋁S and ⋀S to be well-defined for an arbitrary set S of truth values—in other words, the lattice of truth values must be complete. This time, however, completeness is a sufficient but not necessary condition for our definitions to work. In order for the definitions to work, ⋁S and ⋀S need not be defined for every set S of truth values: they need only be defined for every set S of truth values which we might arrive at by taking a closed wf ∀xA or ∃xA, stripping off the quantifier, replacing any free occurrences of x by a new constant a, and then putting one truth value, viz. $[A_x a]_{\mathcal{M}^a_o}$, into S for each object o in the domain. Thus the number of sets S of truth values for which we need ⋁S and ⋀S to be defined will not be greater than the number of wfs of our language. (NB: We are talking about the number of sets of truth values, not the number of truth values in any of these sets.) If (as is standardly the case, and is the case in my presentation here) there are countably many wfs, then (given that we have uncountably many fuzzy truth values) the number of sets S of truth values for which we need ⋁S and ⋀S to be defined will be less than the cardinality of the power set of the set of truth values, and thus we do not need ⋁S and ⋀S to be well-defined for arbitrary sets S of truth values. Thus, lattice completeness gives us more than we actually need. Getting by on just what we do need, without the completeness requirement, requires some effort (see Restall 1994, §4.1, who draws on Brady 1988). I shall not go into the details, because we need our truth values not just for logic, but also for set theory—and there, as we have seen, we do need the lattice of truth values to be complete.
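The truth definitions for atomic and quantified wfs given earlier in this section can be sketched for the special case of a finite domain, where the supremum and infimum of the noted truth values are simply their maximum and minimum. Everything concrete here is an assumption made for illustration: the domain, the constants `b` and `n`, the predicate `Bald`, its membership degrees, and the restriction to one-place predicates in the quantifier clauses.

```python
from fractions import Fraction as F

DOMAIN = ["bob", "bill", "ben"]          # hypothetical finite domain M
NAMES = {"b": "bob", "n": "ben"}         # denotation I(a) of each individual constant
PREDS = {                                # I(P): a function from n-tuples to [0, 1]
    "Bald": {("bob",): F(7, 10), ("bill",): F(1, 10), ("ben",): F(1)},
}


def atomic(pred, consts, names):
    """Value of an atomic wf: apply I(P) to the tuple of denotations of its constants."""
    return PREDS[pred][tuple(names[c] for c in consts)]


def variants(fresh):
    """All interpretations just like the given one except for what `fresh` denotes."""
    return ({**NAMES, fresh: o} for o in DOMAIN)


def exists(pred, fresh="a"):
    """Existential quantification: supremum of the values of Pa over all variants."""
    return max(atomic(pred, [fresh], v) for v in variants(fresh))


def forall(pred, fresh="a"):
    """Universal quantification: infimum of the values of Pa over all variants."""
    return min(atomic(pred, [fresh], v) for v in variants(fresh))


print(atomic("Bald", ["b"], NAMES), exists("Bald"), forall("Bald"))
```

With an infinite domain the maximum and minimum would have to be replaced by genuine suprema and infima, which is where the completeness of the lattice of truth values does its work.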
2.2.1.1 Operations on Fuzzy Truth Values
I have so far mentioned the basic operations ∨, ∧, and ′ on the fuzzy truth values, where x ∨ y is the supremum of x and y, x ∧ y is the infimum of x and y, and x′ = 1 − x. Let us now consider some other operations. First, negations. A fuzzy negation c is a function from [0, 1] to [0, 1]. There are four noteworthy properties that such a function might possess:
1. c(0) = 1 and c(1) = 0
2. If x ≤ y, then c(x) ≥ c(y)
3. c is continuous
4. c(c(x)) = x
Earlier (p. 61) I mentioned that properties 2 and 4 are desirable; as a matter of fact, if a function c : [0, 1] → [0, 1] possesses these properties, then it possesses the other two as well (Klir and Yuan 1995, 52). In the fuzzy literature, conditions 1 and 2 are taken as minimal, and negations which satisfy only these two conditions are studied; notable examples are negations of the threshold type, of which there is one for each value of the parameter t ∈ [0, 1):
\[ c(x) = \begin{cases} 1 & \text{for } x \le t \\ 0 & \text{for } x > t \end{cases} \]
As for negations which satisfy all four conditions, the standard fuzzy negation is not the only example. Other notable examples are Sugeno negations, of which there is one for each value of the parameter λ ∈ (−1, ∞):
\[ c_\lambda(x) = \frac{1 - x}{1 + \lambda x} \]
and Yager negations, of which there is one for each value of the parameter w ∈ (0, ∞):
\[ c_w(x) = (1 - x^w)^{1/w} \]
Another notable negation is the Gödel negation:⁴²
\[ c(x) = \begin{cases} 1 & \text{if } x = 0 \\ 0 & \text{otherwise} \end{cases} \]
⁴² See e.g. Malinowski 1993, 89, and Hájek 1998, 31. For some more details on fuzzy negations, see Klir and Yuan 1995, 51–61, and Nguyen and Walker 2000, 100–8.
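To make the contrast between these negations concrete, here is a rough numerical sketch that implements the threshold, Sugeno, Yager, and Gödel negations and spot-checks properties 1, 2, and 4 on a grid of sample points. The parameter values (t = 0.5, λ = 2, w = 2), the grid, and the tolerance `TOL` are arbitrary choices of mine, floating-point arithmetic is used, and continuity (property 3) is not tested.

```python
def threshold(t):
    """Threshold negation with parameter t in [0, 1)."""
    return lambda x: 1.0 if x <= t else 0.0


def sugeno(lam):
    """Sugeno negation with parameter lam in (-1, infinity)."""
    return lambda x: (1 - x) / (1 + lam * x)


def yager(w):
    """Yager negation with parameter w in (0, infinity)."""
    return lambda x: (1 - x ** w) ** (1 / w)


def godel(x):
    """Goedel negation: 1 at 0, and 0 everywhere else."""
    return 1.0 if x == 0 else 0.0


GRID = [k / 20 for k in range(21)]   # sample points in [0, 1]
TOL = 1e-9                           # tolerance for floating-point comparisons


def boundary(c):                     # property 1
    return c(0) == 1 and c(1) == 0


def order_reversing(c):              # property 2, checked on the grid only
    return all(c(x) >= c(y) - TOL for x in GRID for y in GRID if x <= y)


def involutive(c):                   # property 4, checked on the grid only
    return all(abs(c(c(x)) - x) <= TOL for x in GRID)


for name, c in [("threshold", threshold(0.5)), ("Sugeno", sugeno(2.0)),
                ("Yager", yager(2.0)), ("Goedel", godel)]:
    print(name, boundary(c), order_reversing(c), involutive(c))
```

Setting λ = 0 in the Sugeno family, or w = 1 in the Yager family, gives back the standard negation 1 − x.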
Fuzzy conjunctions and disjunctions are functions i and u from [0, 1] × [0, 1] to [0, 1]. Some basic properties which we would like them to have are:
1. i(x, 1) = x and u(x, 0) = x
2. i(x, y) = i(y, x) and u(x, y) = u(y, x)
3. i(x, i(y, z)) = i(i(x, y), z) and u(x, u(y, z)) = u(u(x, y), z)
4. If y ≤ z, then i(x, y) ≤ i(x, z) and u(x, y) ≤ u(x, z)
It turns out that functions with these properties had already been studied in connection with the theory of statistical metric spaces, where they were referred to as t-norms (or triangular norms) and t-conorms respectively.⁴³ There are further conditions which t-norms and t-conorms may or may not satisfy. One condition which was mentioned earlier is idempotence: i(x, x) = x and u(x, x) = x. It turns out that the standard fuzzy conjunction is the only idempotent t-norm, and the standard fuzzy disjunction is the only idempotent t-conorm (Klir and Yuan 1995, 63, 77). Nevertheless, other t-norms and t-conorms are studied in the literature; notable examples are Yager t-norms, of which there is one for each value of the parameter w ∈ (0, ∞):
\[ i_w(x, y) = 1 - \min\bigl(1, [(1 - x)^w + (1 - y)^w]^{1/w}\bigr) \]
and Yager t-conorms, of which there is one for each value of the parameter w ∈ (0, ∞):
\[ u_w(x, y) = \min\bigl(1, (x^w + y^w)^{1/w}\bigr) \]
Other notable conjunctions are the Łukasiewicz conjunction:⁴⁴
\[ i(x, y) = \max(0, x + y - 1) \]
and the product conjunction:⁴⁵
\[ i(x, y) = xy \]
Fuzzy conditionals are functions m from [0, 1] × [0, 1] to [0, 1]. Notable examples include the Łukasiewicz conditional:
\[ m(x, y) = \begin{cases} 1 & \text{if } x \le y \\ 1 - x + y & \text{otherwise} \end{cases} \]
the Gödel conditional:⁴⁶
\[ m(x, y) = \begin{cases} 1 & \text{if } x \le y \\ y & \text{otherwise} \end{cases} \]
and the Goguen conditional:
\[ m(x, y) = \begin{cases} y/x & \text{if } x \ge y \text{ and } x > 0 \\ 1 & \text{otherwise} \end{cases} \]
⁴³ This literature begins with Menger 1942.
⁴⁴ The terminology is from Hájek (1998, 28 and ch. 3); as Hájek (1998, 277) notes, ‘‘what we call ‘Łukasiewicz conjunction’ seems never to appear explicitly in [Łukasiewicz’s] writings’’.
⁴⁵ See e.g. Goguen 1968–9, 346–7, and Hájek 1998, 28 and ch. 4. For some more details on fuzzy conjunctions and disjunctions, see Klir and Yuan 1995, 61–83, and Nguyen and Walker 2000, 83–110.
I shall discuss the conditional further in §5.5.1. We have now considered separately a number of operations on the fuzzy truth values which might be regarded as candidate interpretations of the logical connectives (negation, conjunction, disjunction, and the conditional). Some groupings of these operations hang together more naturally than others. There are two main types of coherence that a group of operations might have. First, a trio of a negation c, a conjunction i, and a disjunction u hang together in one sense if i and u are duals with respect to c, i.e. the De Morgan laws hold:
c(i(a, b)) = u(c(a), c(b))
c(u(a, b)) = i(c(a), c(b))
Our basic operations ∨, ∧, and ′ hang together in this sense.⁴⁷ Second, given a conjunction i, the conditional m is said to result from i by residuation (or to be a residuum of i) if:
m(x, y) = ⋁{z : i(x, z) ≤ y}
and where m is a residuum of i, the negation c is said to be a precomplement of i if:
c(x) = m(x, 0)
Notable trios of operations where the conditional and the negation are, respectively, the residuum and the precomplement of the conjunction are:⁴⁸
(Łukasiewicz conjunction, Łukasiewicz conditional, ′)
(∧, Gödel conditional, Gödel negation)
(product conjunction, Goguen conditional, Gödel negation)
⁴⁶ See e.g. Rescher 1969, 44, and Malinowski 1993, 89, 100.
⁴⁷ This trio of operations is not unique in this respect; see e.g. Klir and Yuan 1995, 83–7.
⁴⁸ See Hájek 1998, 27–32.
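The relation between the three conjunctions and their conditionals can be checked numerically. The sketch below approximates the residuum of a conjunction by taking the supremum over a finite grid of candidate values z only, so it is an approximation and not the definition itself. It compares the result with the closed-form Łukasiewicz, Gödel, and Goguen conditionals. The grid size `N` and all function names are assumptions introduced for this illustration.

```python
from fractions import Fraction as F

N = 100
GRID = [F(k, N) for k in range(N + 1)]    # candidate values z = 0, 1/100, ..., 1


def residuum(i):
    """Grid approximation of the residuum of i: the largest grid z with i(x, z) <= y."""
    return lambda x, y: max(z for z in GRID if i(x, z) <= y)


def luk_and(x, y):                        # Lukasiewicz conjunction
    return max(F(0), x + y - 1)


def godel_and(x, y):                      # standard conjunction (infimum)
    return min(x, y)


def prod_and(x, y):                       # product conjunction
    return x * y


def luk_if(x, y):                         # Lukasiewicz conditional
    return F(1) if x <= y else 1 - x + y


def godel_if(x, y):                       # Goedel conditional
    return F(1) if x <= y else y


def goguen_if(x, y):                      # Goguen conditional
    return y / x if (x >= y and x > 0) else F(1)


x, y = F(7, 10), F(3, 10)
print(residuum(luk_and)(x, y), luk_if(x, y))
print(residuum(godel_and)(x, y), godel_if(x, y))
print(float(residuum(prod_and)(x, y)), float(goguen_if(x, y)))
```

For the Łukasiewicz and Gödel pairs the grid value coincides with the closed form whenever x and y are themselves grid points. For the product conjunction the true supremum y/x need not lie on the grid, so the approximation can fall short of it by less than one grid step.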
Given a set of two truth values, there are only $2^2 = 4$ unary operations on that set, and only $2^{(2^2)} = 16$ binary operations. On the set of fuzzy truth values [0, 1], by contrast, there are uncountably many n-place operations for each n ≥ 0.⁴⁹ This structural richness makes fuzzy logic an interesting object of study from a technical point of view. From the point of view of studying vagueness, on the other hand, while I would not want to rule out the possibility that further down the track, distinctions amongst different connectives might shed light on important vagueness-related issues, nevertheless I cannot see, at the present stage of enquiry, that there is anything to be gained by moving beyond the basic fuzzy operations ∨, ∧, and ′. Therefore, with the exception of the discussion of the conditional in §5.5.1, I shall not discuss other operations further in this book.
2.2.2 Worldly Vagueness?
Many-valued approaches locate vagueness not in the relationship between language and the world, but in the world itself. The idea is that when I speak vaguely, I invoke a unique intended interpretation, and on this interpretation, each name in my language has a unique referent, each predicate has a unique extension, and so on. The vagueness is all within this intended model. For example, the particular property that my predicate ‘is red’ picks out is inherently indeterminate: some objects possess it, some do not possess it, and others neither possess it nor fail to possess it (i.e. its characteristic function maps these objects to an intermediate truth value—to ∗ on the three-valued approach, or to a value in (0, 1), on the fuzzy approach). On this approach, vagueness is found in the world itself, not in the relationship between language and the world. Well, I really need to be more subtle here. In fact, I do not think that the three-valued approach accommodates vagueness at all, whereas I do think that the fuzzy approach does. I shall argue for this claim in §4.2.
⁴⁹ Cf. Novák 1998, 80.

What I should say at this point is that all the many-valued approaches keep the referential links between language and the world as determinate as they are on the classical view. Where they differ from the classical view is in their picture of the world itself, which now contains indefinite properties. Whether this indefiniteness is enough to warrant calling these properties
vague is a matter to which we shall come. The point for now is that the divergence from the classical picture comes on the world side, not in the relationship between language and the world.

Forbes (1983, 245) writes: “Note that on [the fuzzy] approach, vagueness resides entirely in concepts. The objects in [the domain of the fuzzy membership function] are perfectly determinate and the fuzzy sets themselves also have exact identity conditions: two such sets are the same iff the same things are members of each to the same degree.” Forbes is clearly assuming that worldly vagueness requires vague objects and/or vague identity. I disagree. If the fuzzy view is correct, there exist vague properties and relations alongside the precise ones (indeed, the precise ones are those special cases of the vague ones that map all n-tuples to 0 or 1)—and this is one perfectly genuine sort of worldly or metaphysical vagueness. In order to describe the world, it is not enough just to list the objects it contains: one must also describe their properties and the relations they bear to one another. Metaphysical vagueness is often taken to mean vagueness in objects—but if there is vagueness in properties or relations, this will also be a kind of metaphysical or worldly vagueness. On the fuzzy semantic picture, there are properties and relations in the world that are vague. Contra Forbes, this is not vagueness in concepts.
2.3 Truth Gaps (Deny 1b)

We have been examining one way of departing from the classical picture: unplug the algebra of classical truth values, plug in a different algebra, and then proceed as before. We now turn to a second way of departing from the classical view. We leave the classical algebra of truth values untouched, and depart from the classical picture at two other points. First, we allow the propositional part of the interpretation function—the part which assigns a truth value to each simple sentence—to be partial (rather than total). That is, some simple sentences get assigned no truth value at all. (We still say that the interpretation function is total when restricted to names—i.e. every name gets a referent—and when restricted to predicates—i.e. every predicate is assigned an extension.) Second, we allow the characteristic functions of sets to be partial. Recall that if the characteristic function of the set S assigns the object x the value 1, this means x is in S, and if it assigns x the value 0, this means x is not in S. We now allow that the characteristic function of S might assign x no value at all—which we interpret as meaning that x is neither in S nor not in S. Thus, we now have truth value gaps for simple sentences, and we have gappy sets.⁵⁰

Given these changes, we now want to run the rest of the classical story with no further changes—or, where this is not possible, with only the minimal changes required to accommodate the two basic changes which we have just made. One immediate upshot is that we get truth gaps for atomic sentences, as well as for simple sentences. For recall the classical story, in which the truth definition for an atomic wf P(a₁, …, aₙ), which consists of an n-ary predicate P followed by n individual constants a₁, …, aₙ, is as follows:

[P(a₁, …, aₙ)] = I(P)(I(a₁), …, I(aₙ))

Recall the idea here: each individual constant a is assigned an object I(a) in the domain, and P is assigned an n-ary relation on the domain, that is a function I(P) from the set of all n-tuples of members of M to our set of truth values {0, 1}. Whatever value this function assigns to the n-tuple (I(a₁), …, I(aₙ)), this value is the truth value of the sentence P(a₁, …, aₙ). This function may now be partial. So suppose it assigns nothing to the n-tuple (I(a₁), …, I(aₙ)); then the atomic sentence P(a₁, …, aₙ) will have no truth value. More concretely, ‘Bob is bald’ is true if Bob is in the extension of ‘is bald’ (i.e. the characteristic function of the extension of ‘is bald’ assigns 1 to the referent of ‘Bob’), is false if Bob is not in the extension of ‘is bald’ (i.e. the characteristic function of the extension of ‘is bald’ assigns 0 to the referent of ‘Bob’), and is neither true nor false if Bob is neither in nor out of the extension of ‘is bald’ (i.e. the characteristic function of the extension of ‘is bald’ assigns nothing to the referent of ‘Bob’).
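The truth definition for atomic sentences with a partial characteristic function can be illustrated with a small sketch. It is an illustration under stated assumptions, not part of the original text: the names ‘Yul’ and ‘Fabio’, the toy domain, and the choice to model a partial function as a Python dict with missing entries are all hypothetical.

```python
from typing import Dict, Hashable, Optional, Tuple

# Hypothetical toy interpretation. Names are total: every name has a referent.
NAMES = {"Bob": "bob", "Yul": "yul", "Fabio": "fabio"}

# The extension of 'is bald' as a partial characteristic function: a dict from
# 1-tuples of objects to 0 or 1. There is no entry for ("bob",), which is a gap.
IS_BALD: Dict[Tuple[Hashable, ...], int] = {("yul",): 1, ("fabio",): 0}

def atomic_value(pred: Dict[Tuple[Hashable, ...], int], *constants: str) -> Optional[int]:
    """[P(a1, ..., an)] = I(P)(I(a1), ..., I(an)), or None if I(P) assigns nothing."""
    referents = tuple(NAMES[c] for c in constants)   # look up each name's referent
    return pred.get(referents)                       # None models 'no truth value'

for name in ("Yul", "Fabio", "Bob"):
    print(name, "is bald:", atomic_value(IS_BALD, name))
```

Here `None` is used only as a programming device for the absence of a value. It is not meant as a third truth value.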
⁵⁰ In principle, we could separate these aspects: we could, for example, consider a view according to which all simple sentences have a truth value, but some sets are partial. In practice, however, the two aspects go together very naturally, and we shall not explore the option of separating them.

This idea has a natural application in the case of vagueness—for it is a fairly natural thought about vague sentences such as ‘Bob is bald’, where Bob is a borderline case for baldness, that they are neither true nor false. As in the case of the three-valued logicians discussed in §2.2, this allows us to maintain an attractive parallelism between meaning and use, of the sort that the epistemicist was forced to reject. The difference this
time is that instead of supposing—with the three-valued logicians—that sentences about borderline cases have a third truth value, we now suppose that they simply have no truth value. Instead of positing extra truth values which some sentences might have, we stick with our two truth values 1 and 0, but we suppose that some sentences might possess neither of them.⁵¹

There is an open question remaining. The classical story about propositional logic said that
• [A ∨ B] = [A] ∨ [B]
• [A ∧ B] = [A] ∧ [B]
• [¬A] = [A]′
And the story about set theory said that
• (f ∪ g)(x) = f(x) ∨ g(x)
• (f ∩ g)(x) = f(x) ∧ g(x)
• f′(x) = f(x)′
Now, what are we to say about the truth value of a compound wf, say A ∨ B, if one or both of A or B lacks a truth value? And similarly, what are we to say concerning what the characteristic function of, say, S ∪ T assigns to x, if one or both of the characteristic functions of S and T assigns x nothing? We shall examine two different ways of approaching this issue—one in §2.3.1, the other in §2.4. We shall focus on the problem of assigning truth values to compound sentences, rather than on the analogous problem for set theory, because this has been the focus in the literature on vagueness. It should be noted, however, that for each of the approaches we shall discuss, perfectly analogous things could be said with regard to the set-theoretic problem.

⁵¹ In the literature, many writers who take this sort of approach implement it by saying that a predicate, instead of being assigned an extension (as on the classical view), is assigned a pair of an extension and an antiextension, which need not exhaust the domain (Kripke 1975; Soames 1999). I say instead that a predicate is assigned a set as its extension, both on the new view and on the classical view, but the difference is that sets may now be partial, that is, their characteristic functions may be partial, rather than total (as on the classical view). Given the way that I presented the classical picture, my formulation is smoother. The two formulations are, however, equivalent. The set of things assigned 1 by the characteristic function of my extension is their extension; the set of things assigned 0 by the characteristic function of my extension is their antiextension; and the set of things assigned nothing by the characteristic function of my extension is the set of things which are in neither their extension nor their antiextension.
2.3.1 The Recursive Approach

One core part of the classical view—item 1c on our checklist—was the idea that the truth values of wfs of the forms A ∨ B, A ∧ B, and ¬A are determined solely by the truth values of A and B. The first way of handling our problem sticks as closely as possible to this idea, given that, now, A and B might not have truth values at all. What we need to do is give a recursive definition of the truth values (or lack thereof) of such compounds, given the truth values (or lack thereof) of their components. More specifically, the way this is usually done is to give truth tables which tell us the values that compound sentences are to have, given the values of their components.

Now, all the same considerations arise here as arose in connection with the search for three-element algebras of truth values in §2.2. Thus, one idea is that any compound proposition with a component that has no truth value should itself have no truth value. This leads to Kleene’s (1952) weak truth tables (see Table 2.5). Note that Table 2.5 is not incomplete. I have used blanks, rather than a symbol such as ∗, to indicate truth value gaps, in order to mark the difference between what we are doing here and what we did in §2.2. In §2.2 we were proposing a new algebra of truth values, with a third object ∗ in it, as well as 1 and 0. Here, the classical algebra of truth values is unchanged. What changes is the assignment of truth values to compound wfs. For we can no longer say that [A ∨ B] = [A] ∨ [B] etc.—because while the algebraic operation on the right is unchanged, some sentences do not get assigned a value in the algebra at all, and where this happens, the clause just stated tells us nothing about the value which should be assigned to compounds made up from such sentences.

Table 2.5. Kleene’s weak truth tables

A | B | A ∨ B | A ∧ B | ¬A
1 | 1 |   1   |   1   | 0
1 |   |       |       | 0
1 | 0 |   1   |   0   | 0
  | 1 |       |       |
  |   |       |       |
  | 0 |       |       |
0 | 1 |   1   |   0   | 1
0 |   |       |       | 1
0 | 0 |   0   |   0   | 1

The second idea is as follows. If a compound sentence has components which lack truth values, one should look at all possible ways of assigning truth values to those components, and in each case calculate the truth value of the compound sentence (using the classical rules: remember, we have not altered the classical algebra of truth values, we have simply supposed that some sentences might have no truth value; but where every sentence does have a truth value, we proceed classically). If the answer is 1 every time, the compound sentence gets the value 1; if the answer is 0 every time, the compound sentence gets the value 0; if the answer is 1 sometimes and 0 at other times, the compound sentence gets no truth value. This leads to Kleene’s (1952) strong truth tables (see Table 2.6). Once again, the rationale just used may sound like supervaluationism, but the outcome is in fact very different. As was desired at the outset, the view just presented has the property that the truth values of wfs of the forms A ∨ B, A ∧ B, and ¬A are determined recursively, given the truth values of A and B. The supervaluationist assignment of truth values to compounds does not have this recursive structure, as we shall see in §2.4.

Table 2.6. Kleene’s strong truth tables

A | B | A ∨ B | A ∧ B | ¬A
1 | 1 |   1   |   1   | 0
1 |   |   1   |       | 0
1 | 0 |   1   |   0   | 0
  | 1 |   1   |       |
  |   |       |       |
  | 0 |       |   0   |
0 | 1 |   1   |   0   | 1
0 |   |       |   0   | 1
0 | 0 |   0   |   0   | 1
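Both rationales can be turned into a short program that regenerates the two tables. The sketch below is an illustration rather than the author's own presentation. It models a gap as Python's `None` (a device for "no value", not a third truth value), and for the strong tables it follows the second idea literally: the components A and B are filled in independently with every classical value, and the classical result is kept only if it is the same each time.

```python
from itertools import product

GAP = None                       # stands for 'no truth value'
VALUES = (1, GAP, 0)

# The classical operations on {0, 1}; these are left untouched.
CLASSICAL = {"or": max, "and": min}

def weak(op, a, b):
    """Weak Kleene: a gap in either component yields a gap."""
    if a is GAP or b is GAP:
        return GAP
    return CLASSICAL[op](a, b)

def strong(op, a, b):
    """Strong Kleene: try every classical way of filling the gaps in a and b."""
    fill_a = (0, 1) if a is GAP else (a,)
    fill_b = (0, 1) if b is GAP else (b,)
    results = {CLASSICAL[op](x, y) for x, y in product(fill_a, fill_b)}
    return results.pop() if len(results) == 1 else GAP

def neg(a):
    """Negation is the same on both schemes: a gap stays a gap."""
    return GAP if a is GAP else 1 - a

def show(v):
    return " " if v is GAP else str(v)

for label, table in (("weak", weak), ("strong", strong)):
    print(f"Kleene's {label} truth tables")
    print("A | B | A∨B | A∧B | ¬A")
    for a, b in product(VALUES, repeat=2):
        cells = (show(a), show(b), show(table("or", a, b)), show(table("and", a, b)), show(neg(a)))
        print(" | ".join(cells))
```

Because `strong` looks only at the values (or gaps) of the two components, the resulting assignment is recursive in the sense of the text, whatever the internal structure of A and B.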
The third idea is to consider the ordering of sentences with respect to truth, and to say that a sentence with no truth value is truer than a false sentence and less true than a true sentence. As we would now expect, this rationale also leads to Kleene’s strong truth tables.

There are many more views of this sort. For there are as many different gappy truth tables for compound wfs based on the one underlying classical algebra of truth values as there are three-membered algebras of truth values which agree with the classical algebra where only 1 and 0 are involved—and as we saw, there are 3¹¹ = 177,147 of the latter. We shall not consider any more of them in detail.

The remaining parts of the gappy story—for example, the treatment of quantifiers—are perfectly parallel to the corresponding parts of the three-valued story. The response to the Sorites paradox on the part of the recursive truth gap view is perfectly parallel to the three-valued story. Likewise, the major objections to the recursive truth gap view are perfectly parallel to the major objections to the three-valued view (and similarly for the responses thereto). So too when it comes to the question of locating vagueness in the relationship between language and the world, or in the world itself: here again, the comments made about the three-valued view apply also to the truth gap view (see §2.4.3 for further discussion).
2.4 Supervaluationism (Deny 1c)

The view that the truth values of compound sentences are not determined in a recursive way from the truth values of simpler sentences (i.e. the denial of 1c) can be combined naturally either with the supposition of additional truth values (i.e. the denial of 1a) or with the supposition of truth gaps (i.e. the denial of 1b). In this section we look at one very prominent strategy for denying 1c: supervaluationism. We start by presenting this strategy in combination with truth gaps.

Let us return to the point in the development of the gappy account at which we allow the interpretation function to assign nothing to simple sentences, and allow the characteristic functions of sets to be partial. That is, we have a partial interpretation Mp = (M, Ip) of our standard first-order language, which consists in:
• a nonempty set M (the domain)
• a partial function Ip which assigns:
  – a truth value (in our classical algebra of truth values) to zero or more of the propositional constants in our language
  – an object in M to every individual constant in our language
  – an n-ary relation on M to every n-ary predicate in our language—but the n-ary relation assigned may have a characteristic function which is partial.
We now face the issue of what to say about the truth values of compound sentences. We have examined one type of approach to this question—the recursive approach. We now examine an alternative type of approach: the supervaluational approach.

We shall say that a classical interpretation extends a partial interpretation if it agrees with all the assignments that the partial interpretation makes, and also makes an assignment everywhere that the partial interpretation leaves a gap. More precisely, the classical interpretation (M′, I) extends the partial interpretation Mp = (M, Ip) just in case all of the following conditions hold:
• M′ = M (i.e. the classical and the partial interpretation have the same domain).
• For every propositional constant q in the language, if Ip assigns q a truth value, then I assigns q that same truth value.
• For every individual constant a in the language, I(a) = Ip(a).
• For every n-ary predicate R in the language and every n-tuple x in Mⁿ, if the characteristic function of Ip(R) assigns x a value, then the characteristic function of I(R) assigns x that same value.
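The four conditions can be read as a checking procedure. The following sketch is one possible encoding, offered under assumptions that are not in the original text: an interpretation is represented as a dict with the hypothetical keys `domain`, `props`, `names`, and `preds`, and a partial characteristic function is a dict from tuples to 0 or 1 with missing entries for gaps. The example data are invented.

```python
from itertools import product

def is_classical(interp, prop_constants, arities):
    """True iff every propositional constant and every n-tuple is given a value."""
    dom = sorted(interp["domain"])
    props_total = all(q in interp["props"] for q in prop_constants)
    preds_total = all(
        x in interp["preds"].get(R, {})
        for R, n in arities.items()
        for x in product(dom, repeat=n)
    )
    return props_total and preds_total

def extends(classical, partial):
    """Check the four conditions for `classical` to extend `partial`."""
    same_domain = classical["domain"] == partial["domain"]
    props_agree = all(classical["props"].get(q) == v for q, v in partial["props"].items())
    names_agree = classical["names"] == partial["names"]
    preds_agree = all(
        classical["preds"].get(R, {}).get(x) == v
        for R, chi in partial["preds"].items()
        for x, v in chi.items()
    )
    return same_domain and props_agree and names_agree and preds_agree

# Hypothetical example: 'bald' is settled for yul but gappy for bob; q is gappy.
ARITIES = {"bald": 1}
PROPS = ["q"]
partial = {"domain": {"bob", "yul"}, "props": {},
           "names": {"Bob": "bob", "Yul": "yul"},
           "preds": {"bald": {("yul",): 1}}}
closes_gaps = {**partial, "props": {"q": 0},
               "preds": {"bald": {("yul",): 1, ("bob",): 0}}}
overrides = {**partial, "props": {"q": 0},
             "preds": {"bald": {("yul",): 0, ("bob",): 0}}}

print(extends(closes_gaps, partial), extends(overrides, partial))
```

The second candidate is a classical interpretation, but it changes an assignment that the partial interpretation already makes, so it does not count as an extension.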
We can now give a statement of our basic strategy for assigning truth values to compound sentences in our partial interpretation Mp (one important subtlety will be added to the basic strategy below). A compound sentence gets assigned the value 1 on Mp if it comes out as having the value 1 on every classical interpretation which extends Mp ; it gets assigned the value 0 on Mp if it comes out as having the value 0 on every classical interpretation which extends Mp ; and it gets assigned no truth value on Mp if it comes out as having the value 1 on some classical interpretations which extend Mp and 0 on others. The function just described—which assigns truth values
to sentences on our partial interpretation, according to the truth values they receive on classical extensions of our partial interpretation—is called a supervaluation.

So far, I have mentioned only simple sentences and compound sentences—but what about atomic sentences, i.e. sentences formed from an n-place predicate followed by n names? There are two options here. First, we may regard atomic sentences as getting their truth values directly from the partial interpretation, without any role being played by the classical extensions: the truth value of P(a₁, …, aₙ) on the partial interpretation Mp is the value assigned by the function Ip(P) to the n-tuple (Ip(a₁), …, Ip(aₙ)); thus, in particular, P(a₁, …, aₙ) will have no truth value where Ip(P) assigns nothing to this n-tuple. Second, we may regard atomic sentences as getting their truth values from the supervaluation, just as compound sentences do: the truth value of P(a₁, …, aₙ) on the partial interpretation Mp is 1 (0) if the truth value of this wf is 1 (0) on every classical interpretation which extends Mp; P(a₁, …, aₙ) has no truth value on the partial interpretation Mp if it has the value 1 on some classical extensions and the value 0 on others. It is not hard to see that these two methods of assigning truth values to atomic sentences yield exactly the same truth value assignments. It may therefore seem as though we can pick whichever option takes our fancy, and that nothing hangs on the fact that two different options are available. In fact I think consideration of the two options leads to interesting insights into supervaluationism—but we will be in a position to see this more clearly once some other material has been covered, and so I shall postpone further discussion of this issue until the end of §2.4.1.

The reader may notice that the supervaluational procedure just outlined is superficially similar to the rationale behind the strong Kleene truth tables.
But as I mentioned earlier, the outcome is in fact very different—for the supervaluational method of assigning truth values to compound sentences is not recursive. For example, suppose that the simple sentence q has no truth value on our partial interpretation. There is at least one classical interpretation which extends our partial interpretation on which q is true, and at least one on which q is false. On the former, ¬q is false, and on the latter, ¬q is true. Thus, by the foregoing account, ¬q is neither true nor false on the partial interpretation. Consider q ∨ ¬q. This is a classical tautology, i.e. true on every classical interpretation. Thus, in particular it is true on every classical interpretation which extends our partial interpretation. Thus, by the
foregoing account, q ∨ ¬q is true on the partial interpretation. Suppose that, like q, the simple sentence r has no truth value on our partial interpretation. There is, then, at least one classical extension of our partial interpretation on which q and r are both true, one on which they are both false, one on which q is true and r is false, and one on which q is false and r is true. Now consider q ∨ r. It will be true on the first, third, and fourth of these four types of classical extension, and false on the second. Thus by the foregoing account, q ∨ r is neither true nor false on the partial interpretation. But now compare q ∨ ¬q and q ∨ r. Both have the logical form of a disjunction, with each disjunct neither true nor false on our partial interpretation. Yet the former is true on our partial interpretation, and the latter is neither true nor false. Thus, on the supervaluationist account, the truth values of compound sentences are not determined solely by the truth values of their components. The formal technique just described for assigning truth values to compound sentences on a partial interpretation is due to van Fraassen (1966). Given an extra twist, it has a natural application in the context of vagueness, as Fine (1997 [1975]) showed.⁵² The extra twist is the notion of an admissible extension of a given partial interpretation. We approach this notion via the idea of a precisification of vague language. We often need to precisify some part of vague language. For example, we decide that an adult shall be someone over 18 years old, for the purposes of the law; we decide that a reasonable catch of fish shall be ten or less fish (per person per day) for purposes of setting recreational fishing quotas; we decide that an individual (with no dependents) shall be in poverty if he has an annual income of less than $10,488, for purposes of compiling statistics and setting social security benefits; and so on. 
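The contrast drawn above between q ∨ ¬q and q ∨ r can be checked mechanically. The sketch below is an illustration, not part of the original text. It covers only propositional constants, it quantifies over every classical extension (with no admissibility filter, as in the basic strategy), and it represents a sentence as a hypothetical Python function from a classical valuation to 0 or 1.

```python
from itertools import product

def extensions(partial, constants):
    """Yield every classical valuation that agrees with `partial` where it is defined."""
    unsettled = [c for c in constants if c not in partial]
    for bits in product((0, 1), repeat=len(unsettled)):
        yield {**partial, **dict(zip(unsettled, bits))}

def supervalue(sentence, partial, constants):
    """Return 1 or 0 if `sentence` has that value on every extension, else None."""
    values = {sentence(v) for v in extensions(partial, constants)}
    return values.pop() if len(values) == 1 else None

CONSTANTS = ("q", "r")
PARTIAL = {}                      # neither q nor r has a truth value

def q(v): return v["q"]
def not_q(v): return 1 - v["q"]
def q_or_not_q(v): return max(v["q"], 1 - v["q"])
def q_or_r(v): return max(v["q"], v["r"])

for name, s in (("q", q), ("¬q", not_q), ("q ∨ ¬q", q_or_not_q), ("q ∨ r", q_or_r)):
    print(name, "->", supervalue(s, PARTIAL, CONSTANTS))
```

Both disjunctions have two gappy disjuncts, yet `supervalue` returns 1 for the first and `None` for the second, which is the failure of recursive determination described in the text.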
In general, different precisifications of the same vague term may seem equally legitimate: the point is that we need some precisification, not any particular one. For example, some jurisdictions set the cut-off for adulthood at 18 years of age, some at 21. However, none set the borders of the set of adults in anything like the following ways:
• A person is an adult if and only if his or her age is greater than or equal to 18 and divisible by 2 (i.e. the adults are those of ages 18, 20, 22, …).
• A person is an adult if and only if his or her age is greater than or equal to 18 and his or her favourite colour is green.
• A person is an adult if and only if either his or her age is greater than or equal to 21 and it is a weekday, or his or her age is greater than or equal to 18 and it is a weekend.

⁵² Apart from Fine, presentations of supervaluationist approaches to vagueness include Mehlberg 1997 [1958]; Dummett 1997 [1975]; Kamp 1975; Pinkal 1995; Bennett 1998; and Keefe 2000, chs. 7–8.
Yet note that if we have an initial partial interpretation which assigns ‘adult’ a gappy extension corresponding to our ordinary vague use of that term, then for each of the definitions just given, there is a classical extension of this partial interpretation which closes the gaps in the application of ‘adult’ in accordance with that definition. For all that we have required of a classical extension is that it must close the gaps in the original partial interpretation: we have imposed no further constraints on how the gaps must be closed.⁵³ Yet the foregoing example shows clearly that not any way of closing the gaps gives us something that we would ordinarily regard as a legitimate precisification of vague language.

An admissible extension of a given partial interpretation, then, will be one which corresponds to a legitimate precisification of the vague language whose intended interpretation was given by that partial interpretation. There are various constraints on legitimate precisification. One obvious one is that we have leeway over how to decide the borderline cases, but it is not admissible to stipulate, for example, that a man with no hair is not bald, or that a man with lots of hair is bald. This constraint is already built into the idea of a classical interpretation being an extension of a partial interpretation: it makes assignments where the partial interpretation makes none; it does not disrupt any of the assignments which the partial interpretation does make. But there are other constraints which are not built into the very idea of a classical extension; we stipulate that an extension is admissible if it satisfies these further constraints. For example:
• Suppose that x is a borderline case of both ‘red’ and ‘orange’. If we precisify ‘red’ in such a way that x comes out as red, then we must not precisify ‘orange’ in such a way that x comes out as orange.
• Suppose that x and y are borderline cases of ‘tall’, and y is taller than x. If we precisify ‘tall’ in such a way that x comes out as tall, then y must also come out as tall.
• Suppose that Bill is 16, Ben is 18, and Bob is 20. If we precisify ‘juvenile’ and ‘adult’, it must turn out that each of Bill, Ben, and Bob falls in exactly one of these categories, and it must not be the case that Bill and Bob fall in one category, while Ben falls in the other.

⁵³ Cf. n. 3 above.
The first constraint concerns how one object should be classified relative to several predicates; the second constraint concerns how several objects should be classified relative to one predicate; and the third constraint concerns how several objects should be classified relative to several predicates. This is just a tiny sample of the constraints on admissible precisification, not a full list. But it should be enough to give the idea. Now suppose we have a partial interpretation which corresponds to our actual use of a vague language. A classical extension of this partial interpretation will (to repeat) be admissible just in case it corresponds to a legitimate precisification of that vague language. Note that we have defined the notion of an admissible classical extension of a given partial interpretation. Do not call the admissible extension an admissible interpretation of the given vague language—this will lead to confusion later. Our delineation of a certain class of classical interpretations as admissible extensions is always relative to a given partial interpretation, which is assumed to correspond to a body of usage of vague language. That is, there is a unique intended partial interpretation of some vague language, and then it has many admissible classical extensions. These correspond to legitimate precisifications of the vague language whose intended interpretation was given by the original partial interpretation—they are not themselves intended interpretations of the vague language. (They could not be, because they remove all its vagueness! But more on this below—see especially §4.4.) Returning to supervaluationism as applied to vagueness, the strategy for assigning truth values to compound sentences in our partial interpretation Mp is now as follows. 
A compound sentence gets assigned the value 1 on Mp if it comes out as having the value 1 on every admissible classical extension of Mp ; it gets assigned the value 0 on Mp if it comes out as having the value 0 on every admissible classical extension of Mp ; and it gets assigned no truth value on Mp if it comes out as having the value 1 on some admissible classical extensions of Mp and 0 on others. The intuitive gloss on this is that a vague sentence is true simpliciter if it would be true no matter how its vagueness were removed; it is false simpliciter if it would be false no matter how its vagueness were removed; and it is
neither true simpliciter nor false simpliciter if there are legitimate ways of removing its vagueness that would render it true, and others that would render it false.

One point of interest about the supervaluationist approach is that, although its semantics is non-classical, it yields classical logic. That is, any classical tautology is a supervaluationist tautology, and vice versa, and any inference which is valid in classical logic is valid according to the supervaluationist, and vice versa. Let’s be more precise about this. I am assuming here—as throughout this chapter—that we are dealing with our standard first-order language; if we enrich the language in certain ways, then the following result does not hold.⁵⁴ Where W is the set of wfs of our language, a consequence relation Cx is a subset of 𝒫W × W. Where Γ is a set of wfs and A is a wf, we say Γ ⊨x A iff (Γ, A) ∈ Cx. Any A for which (∅, A) ∈ Cx is called an x tautology. Now, we define the classical consequence relation Cclass thus: (Γ, A) ∈ Cclass iff there is no classical interpretation on which all the members of Γ are true and A is false. We define the supervaluationist consequence relation Csval thus: (Γ, A) ∈ Csval iff there is no partial interpretation on which all the members of Γ are true and A is not true (i.e. is false, or is neither true nor false).⁵⁵ It then turns out that Cclass = Csval.⁵⁶

⁵⁴ As Hyde 1997, 652 emphasizes, the result also does not hold—even for our standard first-order language—if we consider the multiple-conclusion consequence relation, in place of the usual multiple-premiss-single-conclusion consequence relation considered in what follows. See §2.4.2 below.

⁵⁵ It has been said that supervaluationists have a choice as to how to define validity (cf. Dummett 1997 [1975], 108, and Fine 1997 [1975], 137, and for discussion see e.g. Williamson 1994, 147–8): they can say what we have just said (‘global validity’) or they can say that an argument is valid just in case there is no classical extension of any partial interpretation on which all the members of Γ are true and A is not true (‘local validity’). It should be obvious that the latter definition in fact has no plausibility at all—i.e. our definition is obviously the correct one. However, we shall see in §2.5 that there is a view quite different from supervaluationism, which I call plurivaluationism—and for this view, the obvious definition of validity corresponds to the latter idea. Plurivaluationism and supervaluationism have been conflated in the literature, leading to the illusion that there is one view of vagueness which has two natural options concerning how to define validity.

⁵⁶ Proof. (i) Suppose (Γ, A) ∈ Cclass. Then (A) there is no classical interpretation on which all the members of Γ are true and A is false. Now suppose we have an arbitrary partial interpretation on which every member of Γ is true. This means that on every extension of our partial interpretation, every member of Γ is true. Then by (A), on every extension of our partial interpretation, A is true. Hence A is true on our partial interpretation. Thus on any partial interpretation on which all the members of Γ are true, A is true; i.e. (Γ, A) ∈ Csval. (ii) Suppose (Γ, A) ∉ Cclass. Then there is a classical interpretation on which all the members of Γ are true and A is false. But note that a classical interpretation is a partial interpretation: it is a special case of the latter, where the only classical interpretation which extends it is itself. So, there is a partial interpretation on which all the members of Γ are true and A is false; hence (Γ, A) ∉ Csval.
How will the supervaluationist view handle the Sorites paradox? Suppose we remove one grain at a time from a 10,000-grain pile of sand, until we have one grain left. Call the pile with n grains ‘pile n’. Our ordinary, vague usage of the term ‘heap’ corresponds to a partial interpretation on which, let us say, piles 100 through 10,000 are assigned the value 1 by the characteristic function of the extension of ‘is a heap’, piles 1 through 10 are assigned 0, and piles 11 through 99 are assigned no value. (I am talking here about the usage of the whole community of speakers—recall the discussion on pp. 58–9.) Now, for each 11 ≤ n ≤ 100, there is an admissible extension—which we shall call extension N —of this interpretation on which piles 1 through n − 1 are assigned 0, and piles n through 10,000 are assigned 1. (There are also extensions on which, for example, piles 1 through 10, 35, 39, and 42 are assigned 0, and the rest are assigned 1, but these are not admissible.) Let the sentence ‘the cut-off is at n’ mean that piles 1 through n − 1 are non-heaps, and piles n through 10,000 are heaps. Consider the inductive premiss of the Sorites argument associated with this setup: ‘For every 2 ≤ n ≤ 10,000, if n is a heap, then so is n − 1.’ This sentence is false on each of our admissible extensions: on interpretation N, where the cut-off is at n, the corresponding instance of this universal claim is false, and hence the universal claim itself is false. Thus, according to the supervaluationist approach, the inductive premiss gets the value 0 in our partial interpretation. This, then, is the mistake in the paradoxical reasoning: the inductive premiss is false. So why is the paradox compelling? Consider the claim ‘The cut-off is at n’. For each 11 ≤ n ≤ 100, this claim is true on exactly one of our admissible extensions. Thus, in our partial interpretation, it is neither true nor false. There is, then, no point n such that we can truly say ‘The cut-off is at n’. 
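The supervaluationist verdicts just described can be reproduced by brute force over the ninety admissible extensions. The following is a sketch under the setup given in the text (10,000 piles, cut-offs from 11 to 100). The list-based representation and the function names are assumptions made for illustration.

```python
TOTAL = 10_000
CUTOFFS = range(11, 101)          # one admissible extension per cut-off n

def extension(n):
    """Values of 'is a heap' for piles 1..TOTAL when the cut-off is at n."""
    return [int(k >= n) for k in range(1, TOTAL + 1)]

EXTENSIONS = [extension(n) for n in CUTOFFS]

def supervalue(claim):
    """Return 1 or 0 if `claim` has that value on every admissible extension, else None."""
    values = {claim(h) for h in EXTENSIONS}
    return values.pop() if len(values) == 1 else None

def inductive_premiss(h):
    """'For every 2 <= k <= TOTAL, if pile k is a heap, then so is pile k - 1.'"""
    # h[k - 1] is the value for pile k; the conditional fails only for (1, 0).
    return int(all(h[k - 1] <= h[k - 2] for k in range(2, TOTAL + 1)))

def cutoff_is_at(m):
    """'The cut-off is at m': piles below m are non-heaps, the rest are heaps."""
    target = extension(m)
    return lambda h: int(h == target)

print("inductive premiss:", supervalue(inductive_premiss))
print("'the cut-off is at n', for all n:", {supervalue(cutoff_is_at(m)) for m in CUTOFFS})
```

The inductive premiss receives the value 0, while each claim of the form ‘The cut-off is at n’ comes out with no value, since it holds on exactly one extension.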
So far, so good. But here, says the supervaluationist, is where we make a natural mistake, and so get drawn into the paradox. We conclude from the foregoing that ‘There is no cut-off’ is true—or in other words, we conclude that the inductive premiss is true. But here we are mistaken. ‘There is no cut-off’ is false on every admissible extension (for each extension puts the cut-off somewhere—just at a different point on different extensions), and hence—according to the supervaluationist—false simpliciter. Thus, the paradox is compelling because we tend to assume that because the cut-off is not here, or here, or here, . . . through all the possible positions where the cut-off might be, it follows that ‘There is no cut-off’ is true—that is, that the inductive premiss is true. On the supervaluationist
account, this does not follow. There is no point n such that we can truly say ‘The cut-off is at n’, and yet ‘There is a cut-off’ is true simpliciter—and so the inductive premiss, which denies this, is false simpliciter.

The major objection unique to supervaluationism arises directly from the solution to the Sorites just presented. I call it the problem of missing witnesses and counterexamples, or just the ‘problem of missing witnesses’ for short. Intuitively, if a universal claim is false, then there must be a counterexample which makes it false, and if an existential claim is true, then there must be a witness which makes it true. For example, if ‘Everyone is female’ is false, there must be some particular person who is not female, and if ‘Someone is female’ is true, there must be some particular person who is female. But on the supervaluationist view, these relationships between quantified claims and instances do not hold. Consider the inductive premiss of a typical Sorites argument:

For every x in the series, if Fx then Fx′.

As we have seen, the supervaluationist account has this come out false. This would naturally lead us to suppose that one of the following sentences is true, to give us a counterexample to the universal claim:

Fx₁ but not Fx₂
Fx₂ but not Fx₃
...
Fxₙ₋₁ but not Fxₙ

However, according to the supervaluationist, none of these is true. Consider the sentence ‘Fx but not Fx′’. There are several possible cases:

• If x and x′ are both clear cases of F, or if x and x′ are both clear countercases of F, then the sentence is false on every admissible extension and hence false in the partial interpretation.
• If x is a clear case of F and x′ is a borderline case, or if x is a borderline case of F and x′ is a clear countercase, or if x and x′ are both borderline cases of F, then the sentence is true on some admissible extensions and false on others, and hence neither true nor false in the partial interpretation.
Thus the sentence will be either false, or neither true nor false, but will never be true. So we have a false universal claim without a counterexample
to make it false. Likewise, if we assert ‘There is an x such that Fx but not Fx′’, this is true in the partial interpretation, because it is true on every admissible extension; yet, as we have seen, there is no witness to make it true. Thus the supervaluationist view does not accord with our ordinary use of quantifiers: in particular, it violates expected relationships between quantified claims and their instances.⁵⁷

The major objection unique to the recursive version of the truth gap view is the truth-functionality objection. The objection has been pushed most strongly by Fine, and has also been endorsed by Williamson (1994, 135–8), Urquhart (1986, 113), and others. Fine took as the chief motivation for his supervaluationist theory, as against the recursive approach, the idea that while (for example) it may be unclear whether a particular point on the rainbow is red or orange—so that neither ‘Point x is red’ nor ‘Point x is orange’ is clearly true—it is certainly the case that ‘Point x is red or point x is orange’ is true, and it is certainly the case that ‘Point x is red and point x is orange’ is false. This is one example of a general phenomenon which he called ‘penumbral connection’: “the possibility that logical relations hold among indefinite sentences” (Fine, 1997 [1975], 124).

Fine argued that the phenomenon of penumbral connection poses a problem for the recursive approach. Consider three sentences: ‘Bob is bald’, ‘Point x is red’, and ‘Point x is orange’. Suppose that each of them lacks a truth value (or possesses a third truth value: the argument applies equally to the recursive version of the three-valued approach). Then, according to a recursive account, the following two sentences have the same truth value (or lack thereof): ‘Point x is red and Bob is bald’ and ‘Point x is red and point x is orange’. The problem, Fine thinks, is that this is just obviously wrong: the first sentence should lack a truth value, but the second should be false.
Similarly, where Bob is a borderline case of baldness—i.e. ‘Bob is bald’ lacks a truth value—the sentence ‘Bob is bald or Bob is not bald’ will lack a truth value on the recursive approaches we have examined, whereas on the supervaluationist approach it will be true. Again, Fine thinks that this supervaluationist assignment is just obviously, intuitively correct, while the recursive assignment is clearly wrong.

⁵⁷ Cf. e.g. Sanford 1976; Rolf 1984, 232; and Tappenden 1993, 564. An analogous problem arises at the level of sentential connectives (which is not surprising, given the analogies between disjunction and existential quantification, and conjunction and universal quantification): a disjunction can be true without either disjunct being true, and a conjunction can be false without either conjunct being false.
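The contrast Fine draws can be made concrete with a small computation. This is a sketch only: it assumes the strong Kleene tables as the recursive scheme, and it assumes that the admissible extensions are exactly those classical assignments on which point x is red or orange but not both. The names `k_and`, `k_or`, `k_not`, `admissible_extensions`, and `supertruth` are illustrative.

```python
from itertools import product
from typing import Callable, Dict, Iterator, Optional

Value = Optional[bool]  # None marks a truth value gap
ATOMS = ("red", "orange", "bald")


def k_not(a: Value) -> Value:
    return None if a is None else not a


def k_and(a: Value, b: Value) -> Value:
    # Strong Kleene conjunction: a false conjunct settles the matter.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def k_or(a: Value, b: Value) -> Value:
    return k_not(k_and(k_not(a), k_not(b)))


def admissible_extensions() -> Iterator[Dict[str, bool]]:
    for values in product([True, False], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        if v["red"] != v["orange"]:  # assumed penumbral constraint
            yield v


def supertruth(sentence: Callable[[Dict[str, bool]], bool]) -> Value:
    """True/False if the sentence has that value on every admissible extension, else None."""
    results = {sentence(v) for v in admissible_extensions()}
    return results.pop() if len(results) == 1 else None
```

With all three atomic sentences gappy, `k_and(None, None)` is `None` for both conjunctions, whereas `supertruth(lambda v: v["red"] and v["orange"])` is `False` and `supertruth(lambda v: v["red"] and v["bald"])` is `None`. Likewise `k_or(None, k_not(None))` is `None`, while `supertruth(lambda v: v["bald"] or not v["bald"])` is `True`.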
I have always found this puzzling. Consider ‘red’. If one indicates a point on a rainbow midway between clear red and clear orange and asks an ordinary speaker the following questions, then in my experience the responses are along the lines indicated:

• “Is the point red?” Umm, well, sort of.
• “Is the point orange?” Umm, well, sort of.
• “But it’s certainly not red and orange, right?” Well, no, it sort of is red and orange.
• “OK, well it’s definitely red or orange, right?” No, that’s what I’ve been saying, it’s a bit of both, the colours blend into one another.
These reactions fit with the recursive assignments of truth values, not the supervaluationist assignments. Ordinary speakers hedge over ‘x is red’ and ‘x is not red’ when x is a borderline case of redness, and they hedge in just the same way over ‘x is or isn’t red’ and ‘x is and isn’t red’. I have found these reactions to be robust, over a range of examples. Where Bob is a borderline case of baldness, ordinary speakers do not think that ‘Bob is bald or he is not bald’ is clearly true, and they do not think that ‘Bob is bald and he is not bald’ is clearly false: they react to these two sentences with just the sort of hesitancy with which they react to the sentence ‘Bob is bald’; they regard all three sentences as equally dubious. Similarly for ‘heap’, ‘tall’, and so on.⁵⁸

⁵⁸ Machina 1976 has similar intuitions to mine.

We thus have two sets of intuitions on the table, one of which fits naturally with the supervaluationist assignments of truth values, and one of which fits naturally with the recursive assignments. But at least one other set of intuitions has also been reported in the literature.⁵⁹ According to this third set, when a is a borderline case of F, ‘a is F or a is not F’ is not assertible (which is, prima facie, in conflict with the supervaluationist view), but ‘It is not the case that a is F and that a is not F’ is assertible (which is, prima facie, in conflict with the recursive view).

⁵⁹ See Burgess and Humberstone 1987, 199–200; cf. also Tappenden 1993.

What are we to make of all this? There is much to say, but I will save this discussion for §5.5. For now, I think that any fair-minded person who steps back from her own favoured set of intuitions and even-handedly surveys the literature would have to agree—given first that there are conflicting (reports of the) data, and second that some of the data do not fit (easily) with the supervaluationist view, some do not fit (easily) with the recursive view, and some do not fit (easily) with either view—that there is no simple, knockdown intuitive argument here either way (i.e. in favour of supervaluationism over the recursive approach, or vice versa).

This is still news: for in some quarters it is thought that the supervaluationist’s penumbral connection argument against recursive approaches is a simple, knockdown argument from intuition. In light of the foregoing, I consider that position untenable—and even more so once we notice that the feature of the supervaluationist semantics which leads to the assignment of True to ‘Bob is or isn’t bald’ even when ‘Bob is bald’ and ‘Bob isn’t bald’ are assigned no truth value is the very same feature which leads to the missing witness problem. It is hard to claim—in the context of discussing truth-functionality—that this feature of supervaluationism gives it a clear intuitive advantage over recursive approaches, when it is the very feature that leads to the missing witness problem—the latter being a great intuitive disadvantage of supervaluationism, and one which the recursive approach does not share. The upshot, then, is that the debate between supervaluationist and recursive approaches will need to be settled either by a much more detailed argument from data about the assertibility of sentences about borderline cases, based on proper empirical foundations (see §5.5), or else by other means altogether.⁶⁰

2.4.1 Supervaluationism and Additional Truth Values

As mentioned at the outset, one can combine a supervaluationist approach to compound sentences with a many-valued—rather than two-valued but partial—approach to the interpretation of simple sentences and to the characteristic functions of sets.
The three-valued case is straightforward: just as the recursive three-valued approach is perfectly analogous to the recursive truth gap approach, so too the supervaluationist three-valued approach is perfectly analogous to the supervaluationist truth gap approach.⁶¹ If we have more than three truth values, however, then things can get more interesting on supervaluationist versions of the many-valued approach than they can on supervaluationist versions of the partial two-valued approach. In particular, suppose we have a fuzzy interpretation—i.e. a many-valued interpretation where the underlying algebra of truth values is the Kleene algebra [0, 1] discussed in §2.2.1. Instead of extending it to a fuzzy model of the entire language in the recursive way examined in §2.2.1, how might we extend it to a model which assigns an element of [0, 1] to each sentence as its degree of truth, but where the assignments to compound sentences are not a function of the assignments to simple sentences? One way would be to say that a compound sentence gets assigned the value 1 if it comes out as having the value 1 on every classical extension; it gets assigned the value 0 if it comes out as having the value 0 on every classical extension; and it gets assigned the value 0.5 in all other cases. This is not very attractive, however: the assignment of 0.5 in the third case is arbitrary, and the proposal allows simple and atomic sentences to have any degree of truth, while restricting other types of sentence to the values 1, 0, and 0.5, which seems odd.⁶²

⁶⁰ As far as the dialectic of this book is concerned, I present an argument against all non-degree theories of vagueness (Part II) and a number of arguments against supervaluationist degree theories (§2.4.1)—thereby leading us to recursive degree theories. I then return to a detailed discussion of the truth-functionality objection to recursive degree theories in §5.5.

⁶¹ We start with a three-valued interpretation, which consists in a nonempty domain M together with a (total) interpretation function I which assigns a truth value (1, 0, or ∗) to each propositional constant, an object in M to each individual constant, and an n-ary relation on M (the characteristic function of which is a total function from Mⁿ to our set of three truth values) to each n-ary predicate. Instead of taking the recursive approach from this point on, we say that a classical interpretation extends a many-valued interpretation if it agrees with all the assignments of classical truth values (i.e. 0 and 1) that the many-valued interpretation makes (this can be made precise by making the obvious changes to our definition of a classical interpretation extending a partial interpretation); we introduce the notion of an admissible extension of a many-valued interpretation (in a way perfectly analogous to our introduction of the notion of an admissible extension of a partial interpretation); and we then follow the supervaluationist route: a compound sentence gets assigned the value 1 on our many-valued interpretation if it comes out as having the value 1 on every admissible extension, etc.

⁶² I am assuming here that atomic sentences get their truth values directly from the base interpretation, rather than from the supervaluation—recall the two options regarding the assignment of truth values to atomic sentences discussed on p. 78 above, and see below for further discussion of this point.

How, then, might we allow compound sentences to have the full range of intermediate degrees of truth, but in a non-recursive way? One idea is to introduce a further piece of machinery: a measure defined over the set of classical extensions of our fuzzy interpretation.⁶³ We stipulate that the measure is normalized, and we say that the degree of truth of a compound sentence is equal to the measure of the set of classical extensions on which it is true. A classical contradiction is true on no classical interpretation whatsoever; so the set of classical extensions on which it comes out true is the empty set; hence on this proposal, it comes out as having degree of truth 0. A classical tautology is true on every classical interpretation whatsoever; so the set of classical extensions on which it comes out true is the entire set of classical extensions; hence on this proposal, it comes out as having degree of truth 1. If we restrict ourselves to admissible extensions, the constraints on precisification will generate other sentences with degree of truth 0 and 1—for example, ‘x is red and x is yellow’ will get degree of truth 0, because there will be no admissible extension on which both ‘x is red’ and ‘x is yellow’ come out true.⁶⁴

⁶³ A measure over a set S is a function μ from the power set of S to the non-negative real numbers, such that μ(∅) = 0, and for any disjoint subsets A and B of S, μ(A ∪ B) = μ(A) + μ(B). Intuitively, the measure μ assigns to each subset of S a size (a real number). The measure is said to be normalized if μ(S) = 1.

⁶⁴ For views in the ballpark of degree-theoretic supervaluationism as just presented, see Kamp 1975; 1981, 234–5; and Lewis 1983 [1970], 228–9; 1983 [1976], 69–70. (Sanford 1993, 225 presents a different sort of view, in which the admissible valuations are fuzzy rather than classical, and the supervaluation assigns a sentence not a single degree of truth, but a range of values.) Note that, instead of our fuzzy base interpretation, we could take the degree supervaluationist’s treatment of compound wfs and apply it over a three-valued, or partial two-valued, base interpretation. But then we would face the opposite problem to that faced by the idea of applying the original supervaluationist’s treatment of compound wfs over a fuzzy base interpretation: we would allow compound sentences to have any degree of truth, while simple and atomic sentences could only have the values 1, 0, or ∗.

This may sound like a neat idea, but it faces a number of problems. First is the problem of selecting a measure. Where we have only finitely many admissible extensions, there is a natural choice. Where 𝒜 is the set of admissible extensions on which A is true, and Ω is the set of all admissible extensions, the measure of 𝒜, and hence the degree of truth of A, is simply |𝒜| / |Ω|.⁶⁵ So, if there are 100 admissible extensions, and A is true on 54 of them, then the degree of truth of A is 0.54. That’s fine—but the finite case is a very special one.⁶⁶ In realistic cases, there will be infinitely many—indeed, uncountably many—admissible extensions. For example, we can set the cut-off for tallness anywhere between, say, 1.8 m and 1.9 m. And in the infinite case, there is no non-arbitrary, natural choice of measure.⁶⁷ So the degree form of supervaluationism is incomplete: we either need to be told how to find a unique, non-arbitrary, natural measure on a set of admissible extensions—which seems utterly hopeless—or we need to be told how to handle a multiplicity of acceptable measures. We already have a multiplicity of admissible extensions of a given many-valued interpretation, bringing with them a multiplicity of truth values for each sentence (one on each extension). Now we need something different: a way of processing a multiplicity of measures over the set of admissible extensions, bringing with them a multiplicity of degrees of truth of sentences in the one intended base interpretation. No such proposals currently exist.

⁶⁵ |X| is the cardinality of the set X.

⁶⁶ Thus Akiba 2004, 422 moves too fast when he says “suppose that we assign real numbers from 0 to 1 as degrees of Bruce’s baldness. If you accept . . . supervaluationism, such an assignment is very easy to obtain: you may just assign the ratio that the number of . . . interpretations in which ‘Bruce is bald’ is true bears to the number of all [interpretations].” This will work only where there are finitely many interpretations to consider.

⁶⁷ Our set of admissible extensions is not like a set of points in physical space, or in a space such as ℝⁿ, where we have a natural notion of the size of a subset.

Second, once we acknowledge uncountable sets of admissible extensions, we have no guarantee that there will be any measure function which assigns a size to every set of admissible extensions. This means we could end up with compound sentences with no degree of truth. For if A is true on the set 𝒜 of admissible extensions, and 𝒜 has no measure, then A has no degree of truth. To remedy this situation, we would need to do one of three things. We might show that we will never encounter non-measurable sets of admissible extensions. Failing that, we might show that even though there might be such sets, none of them will ever be the set of extensions on which some sentence of our language is true—so they will not lead to sentences which have no degree of truth. Failing that, we might propose methods for handling both degrees of truth and degree gaps (together). Again, however, no proposals along any of these lines exist.

So the degree form of supervaluationism needs a lot of work. But there are even deeper problems which suggest that it is not worth the effort—that the whole idea of degree supervaluationism is misguided. We have seen that degree supervaluationism gives us a view which assigns the full range of degrees of truth to all sentences, but which—unlike the fuzzy view—is not recursive. However, on closer inspection, it turns out that something rather odd is going on.
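The finite case mentioned above, where the degree of truth of a sentence is the proportion of admissible extensions on which it is true, can be sketched as follows. The grid of cut-offs for ‘tall’ (180 cm to 190 cm in 1 cm steps) and the names `CUTOFFS_CM`, `degree`, and `tall` are purely hypothetical choices made for illustration; as the text stresses, realistic cases involve uncountably many extensions and no such natural counting measure.

```python
from fractions import Fraction
from typing import Callable

# Hypothetical finite set of admissible extensions: one per cut-off for 'tall'.
CUTOFFS_CM = range(180, 191)


def degree(sentence: Callable[[int], bool]) -> Fraction:
    """Counting measure: share of admissible extensions on which the sentence is true."""
    true_on = sum(1 for cutoff in CUTOFFS_CM if sentence(cutoff))
    return Fraction(true_on, len(CUTOFFS_CM))


def tall(height_cm: int) -> Callable[[int], bool]:
    # 'This person is tall', evaluated on the extension with the given cut-off.
    return lambda cutoff: height_cm >= cutoff


john = tall(185)
d = degree(john)                                        # proportion of extensions on which John is tall
contradiction = degree(lambda c: john(c) and not john(c))
tautology = degree(lambda c: john(c) or not john(c))
fuzzy_min = min(d, 1 - d)                               # what a recursive min rule would give instead
```

Here `d` is 6/11, the contradiction gets degree 0 and the tautology degree 1, while the recursive minimum rule applied to the same two conjuncts would give 5/11. That difference is one way of seeing that the assignments to compound sentences on this proposal are not a function of the assignments to their components.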
True, all sentences can have the full range of degrees of truth—but nevertheless, simple and atomic sentences, on the one hand, and compound sentences, on the other, get their degrees of truth in fundamentally different ways. Thus, saying that ‘Bob is bald’ has degree of truth 0.3, or that ‘John is tall’ has degree of truth 0.5, really means something quite different from saying that ‘Bob is bald and John is tall’ has degree of truth 0.25. The first sentence has an intermediate degree of truth because as a matter of fact, out there in the world, Bob neither completely possesses the property of baldness, nor completely fails to possess it—he possesses it to degree 0.3. Similarly for the second sentence. But the third sentence has an intermediate degree of truth for a completely
different reason: because one quarter of the classical extensions of our fuzzy interpretation assign it the value 1. That is, on one-quarter of the admissible ways of precisifying our language, this sentence comes out (completely, utterly) true. I think it is misleading to have these two completely different routes to truth disguised under the same description (i.e. an attribution of a degree of truth between 0 and 1 inclusive). But this is not the real problem—for we could gain facility with this framework, and never be confused as to the origin or meaning of a particular sentence’s degree of truth. The real problem is that the rules for compound sentences are different from those for atomic and simple sentences, in a way that seems arbitrary, unmotivated, and downright odd. If you say ‘Bob is bald’ and I say ‘John is tall’, each of our statements is judged by looking at the world, finding the extensions of our predicates and the referents of our names, and seeing how they lie with respect to one another. However if, rather than each of us saying one of these things, you (or I) say both of these things, in the form ‘Bob is bald and John is tall’, then this claim is assessed in a completely different way. Now we look at all the possible precisifications of our language, and determine which of them make this sentence true; its degree of truth is then the proportion of precisifications on which it comes out true. But why should the assessment of two individual sentences be so totally different from the assessment of their conjunction? It’s as if we were to assess individual runners by their time around the track, and assess relay teams not by their cumulative time around the track, but by how colour-coordinated their clothes are. I mentioned earlier (p. 78) that there are two options for the supervaluationist as regards the assignment of truth values to atomic sentences. 
First, he may regard atomic sentences as getting their truth values directly from the base interpretation, without any role being played by the classical extensions: the truth value of P(a1 , . . . , an ) on the base interpretation M = (M, I) is the value assigned by I(P) to the n-tuple (I(a1 ), . . . , I(an )). Second, he may regard atomic sentences as getting their truth values from the supervaluation, just as compound sentences do. In the present discussion of supervaluationism as applied to many-valued base models, I have so far been assuming the first option (cf. n. 62 above). We can avoid the problem just encountered by moving to the second option, according to which atomic sentences—like compound sentences—are judged according to the proportion of admissible extensions which make them true. Thus the
base interpretation assigns an atomic sentence the truth value 0.3 just in case 30 per cent of the admissible extensions of the base interpretation assign this sentence the value 1. So now we avoid the strange situation in which atomic sentences and compound sentences get their degrees of truth by completely different means.

Nevertheless, this second option faces severe problems of its own. The view under consideration is that even though the intended base interpretation assigns ‘is bald’ a function f as its extension, and ‘Bob’ an object x as its referent, and f (x) = 0.3, still it does not follow from this that ‘Bob is bald’ is 0.3 true. Rather, to determine the truth of ‘Bob is bald’ we have to see what proportion of admissible classical extensions of the base interpretation make this sentence true. But when we think about it, this is a bizarre view. It flies in the face of the most basic intuitions about truth: in particular, about the way truth is determined by the meanings of our words together with the way the world is. Let’s think it through: the view is that ‘Bob’ means this man and ‘is bald’ means that property—and this man possesses that property to degree 0.3—and yet this does not make it the case that ‘Bob is bald’ is 0.3 true. Rather, what makes the sentence 0.3 true is the fact that 30 per cent of the admissible classical extensions of the base interpretation make this sentence true. One wants to say at this point that what this view is calling the ‘truth values’ of atomic sentences have very little to do with truth as we know it—i.e. with truth as something that is jointly determined by meanings and the way the world is. We have thus arrived at a situation whereby atomic and compound sentences get their ‘truth values’ in the same way only by calling something the ‘truth value’ of the atomic sentence which in fact has very little to do with truth. Here is a way to bring out the worry.
I noted in the original case of supervaluationism built over a base of partial two-valued models that our two methods of assigning truth values to atomic sentences always yield the same results. In the case currently under discussion—i.e. supervaluationism built over a base of fuzzy models—we have no such guarantee: there is no reason to expect that I(P)(I(a1 ), . . . , I(an )) = x in the base model just in case the measure over classical extensions assigns x to the set of models in which P(a1 , . . . , an ) is true. So let us suppose for a moment that in the base model, the extension of ‘is bald’ assigns the referent of ‘Bob’ the value 0.3, while ‘Bob is bald’ is true in 40 per cent of the classical extensions of the base model. Now let’s ask: how true do we want to say the sentence ‘Bob is bald’ is? Is it 0.3 true, or 0.4 true? It seems clear that the value assigned
by the base model is the one that captures the sentence’s degree of truth, and that the other value—the one assigned by the measure over classical extensions of the base model—while it may (or may not) be interesting in some other way, simply doesn’t have much to do with truth as we know it. That much seems very clear when the values assigned by the base model and by the measure over extensions diverge—but of course, the moral carries over to the case where the values are the same (whether by luck or because we impose it as a constraint on acceptable measures that I(P)(I(a1 ), . . . , I(an )) = x in the base model just in case the measure assigns x to the set of models in which P(a1 , . . . , an ) is true). Thinking otherwise would be like thinking that the tachometer in my car measures the frequency to which my car radio is tuned—as long as (whether by accident or by rigid disciplining of my driving and/or listening habits) the read-out on the tachometer is the same as the read-out on the radio tuner. Thus, while degree supervaluationism seems like a neat idea at first sight, it is ultimately a deeply unhappy one. Furthermore, the moral carries over to the original form of supervaluationism (in which both the base model and the supervaluation are partial or three-valued). The supervaluationist faces a choice regarding the assignment of truth values to atomic sentences: this can be done by the base model, or by the supervaluation. When we originally noted this fact, we said that it might seem to be of little import, because the two methods are guaranteed (in the case of the original form of supervaluationism) to give the same results. But we can now see that neither choice is satisfactory. We cannot have the truth values of atomic sentences being assigned by the supervaluation, for this violates our most basic intuitions about truth—in particular, about the way truth is determined by the meanings of our words together with the way the world is. 
But then if the truth values of atomic sentences are assigned by the base model, while those of compound sentences are assigned by the supervaluation, we have the bizarre situation in which (for example) the individual truth assessments of two atomic sentences operate in a completely different way from the truth assessment of their conjunction. Thus, the core idea of supervaluationism—of any kind—is an unhappy one.

2.4.2 Subvaluationism

Given a base interpretation—many-valued, or partial two-valued—we have seen two non-recursive ways of assigning truth values to compound wfs on that interpretation: the original supervaluationist way and the degree supervaluationist way (the degree of truth of a sentence on the base interpretation is the size of the set of admissible extensions on which it is true). There are other ways of proceeding. One notable one is the dual of the original supervaluationist way: we say that a sentence is true/false on the base interpretation if it is true/false on some admissible extension (as opposed to all admissible extensions, as in the original supervaluationist view). This is called subvaluationism.⁶⁸ Clearly, some sentences may now end up true and false in the base interpretation (viz. those which are true in some admissible extensions and false in others). How are we to handle this? There are two obvious ways: one corresponding to (the basic form of) supervaluationism built over three-valued base models, and one corresponding to (the basic form of) supervaluationism built over partial two-valued base models. In the three-valued case, we retain all the machinery as is, and simply reinterpret the third value ∗ as ‘both true and false’ rather than ‘neither true nor false’ (when considered as a truth value) and as ‘both in and out of the set’ rather than ‘neither in nor out of the set’ (when considered as the value of a characteristic function). In the two-valued case, instead of allowing the interpretation function and the characteristic functions of sets to be partial, we allow them to be non-functional (but total) relations—i.e. they may relate some arguments to more than one value (but they do relate every argument to at least one value).

Either way, a subvaluationist semantics leads most naturally to a paraconsistent logic.⁶⁹ Hyde (1997) presents a subvaluationist view where the logic that results is Jaśkowski’s discursive logic.⁷⁰ Those who object to the very idea of a sentence being true and false will reject subvaluationism immediately.
Aside from such general worries, is there any reason to favour supervaluationism over subvaluationism as a view about vagueness? A notable feature of discursive logic is the failure of adjunction: A, B ⊭ A ∧ B. This and other deviations from the classical consequence relation have led commentators such as Keefe (2000, 198) to compare subvaluationism unfavourably with supervaluationism. However, as Hyde points out, if we widen our view from the customary focus on a (multiple-premiss) single-conclusion consequence relation, to consider a (multiple-premiss) multiple-conclusion consequence relation, the symmetry between subvaluationism and supervaluationism is restored.⁷¹ In particular, corresponding to the failure of adjunction in subvaluationism, we have the failure in supervaluationism of (the classically valid principle) subjunction: A ∨ B ⊭ {A, B}. We see, then, that the failure of adjunction in subvaluationism—and the corresponding objection that subvaluationism simply gets the meaning of conjunction wrong—is the dual of (one manifestation of) the missing witness problem for supervaluationism.

⁶⁸ A subvaluationist approach to vagueness is presented in Hyde 1997; for some further discussion see Akiba 1999; Hyde 1999; Beall and Colyvan 2001; and Hyde 2001.

⁶⁹ A logical consequence relation ⊨ is explosive just in case for all A and B, {A, ¬A} ⊨ B; it is paraconsistent just in case it is not explosive (sometimes this is called weak paraconsistency, with strong paraconsistency consisting in the fact that A ∧ ¬A ⊭ B). A logic is paraconsistent just in case its logical consequence relation is paraconsistent. A theory T is inconsistent just in case for some A, {A, ¬A} ⊆ T; it is trivial just in case for all B, B ∈ T. Thus, paraconsistent logics provide the basis for inconsistent but non-trivial theories. See e.g. Priest and Routley 1989.

⁷⁰ This was the first formal system of paraconsistent logic (originally presented in 1948). See Hyde 1997, 648, and Jaśkowski 1969.

Consider the sentences A, ¬A, A ∨ ¬A, and A ∧ ¬A. Suppose that A and ¬A have the value ∗ in the base model: in supervaluationism, this means that they are neither true nor false; in subvaluationism, it means that they are both true and false.⁷² A ∨ ¬A is true in every classical extension and A ∧ ¬A is false in every classical extension, and so in both supervaluationism and subvaluationism, A ∨ ¬A has the value 1 in the base model, and A ∧ ¬A has the value 0. Thus, the supervaluationist allows a disjunction to be true when neither disjunct is, and allows a conjunction to be false when neither conjunct is, while the subvaluationist allows a disjunction to be true (and not false) when both disjuncts are false (and true), and allows a conjunction to be false (and not true) when both conjuncts are true (and false).
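The duality just described can be tabulated in a few lines. The sketch below assumes a single borderline atomic sentence A with exactly two classical extensions (one making A true, one making it false); `supervaluation` and `subvaluation` are illustrative names, and each returns a pair (is true, is false) in the base model.

```python
from typing import Callable, Dict, Tuple

# Assumed toy model: A is borderline, so it is true on one extension and false on the other.
EXTENSIONS = [{"A": True}, {"A": False}]

Sentence = Callable[[Dict[str, bool]], bool]


def supervaluation(sentence: Sentence) -> Tuple[bool, bool]:
    """(true, false): true iff true on all extensions, false iff false on all."""
    values = [sentence(v) for v in EXTENSIONS]
    return (all(values), not any(values))


def subvaluation(sentence: Sentence) -> Tuple[bool, bool]:
    """(true, false): true iff true on some extension, false iff false on some."""
    values = [sentence(v) for v in EXTENSIONS]
    return (any(values), not all(values))


a: Sentence = lambda v: v["A"]
not_a: Sentence = lambda v: not v["A"]
a_or_not_a: Sentence = lambda v: v["A"] or not v["A"]
a_and_not_a: Sentence = lambda v: v["A"] and not v["A"]
```

On this toy model `supervaluation(a)` is `(False, False)` (a gap) and `subvaluation(a)` is `(True, True)` (a glut), while both schemes give `(True, False)` for `a_or_not_a` and `(False, True)` for `a_and_not_a`. The subvaluationist case also exhibits the failure of adjunction: `a` and `not_a` each count as true, yet their conjunction does not.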
Arguably, all of these claims run counter to ordinary intuitions about the truth conditions of disjunctions and conjunctions—but it is hard to see how the subvaluationist’s claims could be thought worse than the supervaluationist’s, especially once we appreciate the perfect symmetry of the semantic pictures that underlie these claims. Thus, even if we agree with Keefe (2000, 198 n. 24) that single-conclusion consequence is more important than multiple-conclusion consequence (at least in the context of discussing vague natural language), in the final analysis the supervaluationist’s preservation of classical single-conclusion consequence merely provides a coat of paint to hide the underlying rust (i.e. the missing witness problem). Supervaluationism may thus appear more attractive than subvaluationism at first sight, but on closer inspection, the supervaluationist’s corrosion problem seems no less worrying than the (unpainted, fully exposed) rust in subvaluationism (e.g. the failure of the truth of the conjuncts to ensure the truth of the conjunction).
⁷¹ Multiple-conclusion consequence may be thought of as follows. Where Γ and Δ are sets of wfs, Γ ⊨ Δ just in case in every model in which every member of Γ is true, some member of Δ is true. Where Δ contains only one wf, this reduces to the usual notion of single-conclusion consequence. (In the case of singleton sets of wfs, we typically just write the wf, without set-brackets around it; for example we write {A, ¬A} ⊨ B rather than {A, ¬A} ⊨ {B}.)
⁷² In the alternative way of proceeding, where the base model is two-valued, suppose that A and ¬A are assigned neither value in the supervaluationist case, and are assigned both values in the subvaluationist case.
2.4.3 Worldly Vagueness?
Both the recursive and supervaluationist forms of the truth gap view have a common base: the idea that each vague discourse has a unique intended partial interpretation. There is no vagueness here in the relationship between language and the world. We are not saying that ‘is bald’ has several possible extensions; we are saying it has one unique extension—there is one property out there which it uniquely picks out—but this property is inherently gappy, in that some things in the domain neither possess it nor fail to possess it. Now from here, our two versions diverge over how they assign truth values to compound sentences on a partial interpretation—but none of these subsequent developments either removes the vagueness inherent in the partial interpretation or introduces any further semantic indeterminacy. In the recursive case this is obvious. It is just as true of the supervaluationist approach. This may be surprising, because supervaluationism is widely touted as a purely linguistic or semantic theory of vagueness—as a theory which, precisely, locates vagueness solely in the relationship between language and the world.
But this is a mistake, born of conflating supervaluationism—as presented above—with an entirely different theory of vagueness, which does locate vagueness solely in the relationship between language and the world. This is the theory I call plurivaluationism; we shall examine it in the next section. Plurivaluationism countenances only classical interpretations: it has no truck with partial interpretations, or non-recursive rules for assigning truth values to sentences. Instead, it says that each discourse has many admissible interpretations, not one correct one. As we shall see, this view locates vagueness solely in the relationship between language and the world. But this is a different view entirely. In the supervaluationist picture, the admissible
extensions of our unique intended partial interpretation do not have the semantic status they have on the plurivaluationist view—they are not candidate meanings for vague expressions. The supervaluationist’s semantic picture consists of a partial interpretation—i.e. one partial interpretation, the intended/correct/etc. one—which assigns certain truth values to compound sentences. The classical interpretations which extend this partial interpretation are not part of the supervaluationist’s semantic picture as such. They are auxiliary calculating devices for determining the truth values of compound sentences on the partial interpretation. They are, as it were, written on scratch paper—not part of the final answer; they are a ladder which we kick away once we have completed our determination of the truth values of compound wfs on our partial interpretation. Once we have completed this determination, the partial interpretation alone stands as our account of the meaning of vague language. In sum, then, we can say of the supervaluationist version of the truth gap approach exactly what we said of the recursive version of the truth gap approach, which is exactly what we said of the three-valued approach. The idea is that when I speak vaguely, I invoke a unique intended interpretation, and on this interpretation, each name in my language has a unique referent, each predicate has a unique extension, and so on. The vagueness is all within this model. For example, the particular property that my predicate ‘is red’ picks out is inherently indeterminate: some objects possess it, some do not possess it, and others neither possess it nor fail to possess it (i.e. its characteristic function does not send these objects anywhere). On this approach, vagueness (or at least indeterminacy—recall the discussion on pp. 70–1) is found in the world itself, not in the relationship between language and the world. The point generalizes. 
We have considered base interpretations which are partial two-valued, relational two-valued, three-valued, and fuzzy, and we have considered, as non-recursive strategies for assigning values to compound wfs on these base interpretations, the original supervaluationist proposal, the degree supervaluationist proposal, and the subvaluationist proposal. In all cases, when it comes to the question of worldly vagueness, the base interpretation is what matters, and the strategy for assigning truth values to compound wfs is irrelevant. Thus, a recursive approach over a base interpretation of type X and one of our non-recursive approaches over the same base will not differ in respect of locating vagueness (or indeterminacy) in the world, as opposed to in the relationship between
language and the world. All these views deny any indeterminacy in the relationship between language and the world; all the indeterminacy is in the world itself. This is because they posit a unique intended base interpretation which has some form of inherent indeterminacy (different forms in the different kinds of base interpretation we have considered—but always some form). True, they also posit a multiplicity of classical models which extend this base interpretation—but these classical models are not admissible interpretations of the language. Rather, they are admissible extensions of the uniquely correct base interpretation of the language. They play an auxiliary role in calculating the semantic value of compound wfs on the base interpretation—they have no direct semantic significance. In short, the multiplicity of admissible extensions does not introduce semantic indeterminacy. This is in sharp contrast to the plurivaluationist picture, to which we turn now.
2.5 Plurivaluationism (Deny 2)
The final component of the classical picture which we isolated in §1.2 was claim 2: each discourse has a unique intended interpretation. We look now at a view which denies this: plurivaluationism. We are to accept all the other parts of the classical picture, apart from claim 2: so our interpretations are perfectly classical, and we retain almost all our classical machinery and notions. What we need to revisit is the idea of a sentence being true simpliciter, as opposed to true on this or that interpretation. For we said that a sentence is true simpliciter if it is true on the intended interpretation. If there is no unique intended interpretation, then when will a sentence count as, simply, true? Well, if we jettisoned the very notion of a correct interpretation entirely, then all we would be able to say about a sentence was whether it was true on any particular interpretation. Given this, we could draw a distinction between sentences which are true on every interpretation (i.e. logically true) and those which are not, but we would not be able to distinguish amongst the latter between those sentences which are actually (in fact, as a matter of how things are, etc.) true and those which are actually false. In turn, we would not be able to distinguish the sound arguments as a subset of the valid arguments (i.e. the ones whose premisses
are as a matter of fact true). But plurivaluationists do not go this far. They do not jettison the idea of correct or intended interpretations altogether: they just say that in general there is not a unique correct or intended interpretation. The way that a given speech community uses a language places constraints on what is an acceptable interpretation of that language. For example, we apply the name ‘Helen Clark’ to a certain person, so on acceptable interpretations, this person is the referent of that name; we apply the term ‘apple’ to certain objects, so on acceptable interpretations, these objects are in the extension of that predicate; and so on. Now the classical view holds that the set of acceptable interpretations has one member, which we called the intended/correct/etc. interpretation. The plurivaluationist denies this—but retains the notion of an acceptable interpretation. Thus, we have a privileged class of interpretations, but not a single interpretation privileged above all others. One natural motivation for plurivaluationism would be a fondness for classical interpretations, coupled with a conviction that epistemicism cannot adequately answer the location problem. The plurivaluationist can regard a unique intended classical interpretation as a goal which we never actually reach: our usage rules out certain interpretations as incorrect, but never manages to narrow down the remaining set of acceptable interpretations to just one member. Now to return to our question about truth simpliciter. We have a class of acceptable interpretations. Suppose that a given sentence is true on all of them. Then in an obvious sense it does not matter that we have many acceptable interpretations and not just one: we can still say that our sentence is true. 
Similarly, if our sentence is false on all the acceptable interpretations, then again it does not make any difference that we have many acceptable interpretations and not just one, and we can say that our sentence is false. But if our sentence is true on some acceptable interpretations and false on others, then it does matter that we have many acceptable interpretations and not just one: we can say neither that our sentence is true simpliciter, nor that it is false simpliciter. This may sound like supervaluationism: we have a range of acceptable classical models, and our sentence comes out true/false simpliciter if it is true/false on all of them. Plurivaluationism is, however, crucially different from supervaluationism. This has not been appreciated in the literature: the
two positions have never been distinguished.⁷³ Contrast the following two quotations:
Broadly speaking, supervaluationism tells us two things. The first is that the semantics of our language is not fully determinate, and that statements in this language are open to a variety of interpretations each of which is compatible with our ordinary linguistic practices. The second thing is that when the multiplicity of interpretations turns out to be irrelevant, we should ignore it. If what we say is true under all the admissible interpretations of our words, then there is no need to bother being more precise. (Varzi 2003b, 14)
In van Fraassen’s original terminology, for each acceptable U-model U, the function that assigns to each sentence its truth value in U is called a classical valuation, while the function that assigns the value true to those sentences true in every acceptable U-model is called a supervaluation. (McGee 1997, 154 n. 12)
My usage of the term ‘supervaluationism’ agrees with that of McGee and, before him, van Fraassen. Varzi, by contrast, is using the term ‘supervaluationism’ to describe a very different view: what I am calling plurivaluationism.⁷⁴ In order to avoid confusion, we need to have separate terms for separate theories. As the inventor of the term ‘supervaluation’, van Fraassen may be given priority here—i.e. his term will be applied to his view. That means that the other view will need a different name—hence my introduction of the term ‘plurivaluationism’.⁷⁵
⁷³ Even Varzi 2007, §1, which seeks to lay out all the options, does not make this distinction.
⁷⁴ This is not intended as a criticism of Varzi: his usage is not out of line with the vagueness literature. I have singled out Varzi only because the passage quoted gives such a clear description of one of the two views that I want to distinguish.
⁷⁵ Because supervaluationism and plurivaluationism have not been distinguished in the literature, it is impossible to say of previous authors who espouse views in the ballpark of either of these positions whether they should be seen as advocating supervaluationism, plurivaluationism, or some (possibly confused) mixture of the two. Hence I shall make no attempt to classify existing ‘supervaluationist’ views with respect to this distinction. Three views should be mentioned in connection with plurivaluationism, however. One is the partial denotation view of Field 1973, 1974. Field clearly espouses the idea (central to plurivaluationism) that the various classical models in his view play the role of equally acceptable interpretations of the language. (This contrasts with the supervaluationist view of classical models as admissible extensions of a unique intended non-classical interpretation.) However, Field does not draw the distinction between regarding the truth condition ‘true if true on all models’ as a mere manner of speaking (as in the plurivaluationist view), as opposed to a further piece of semantic machinery (as in the supervaluationist view), and so ultimately it is not clear where to classify Field’s view. The second view which should be mentioned is that of Przełęcki 1976. Przełęcki is explicit that (as in plurivaluationism, not supervaluationism) his view involves no non-classical semantic machinery, only a multiplicity of classical models, and he makes the point (crucial to plurivaluationism: see §2.5.1—which was written before I read Przełęcki’s paper) that on his view “reality is sharp—composed of exact structures. It is the connection between language and reality that is fuzzy—linking language not to a single exact structure, but to a whole class of such structures” (pp. 379–80). The classification of Przełęcki as a plurivaluationist is rendered problematic, however, by his claim (p. 380 n. 3) that Fine’s (supervaluationist) view is similar to his: whether this foils the classification depends upon just how similar he thinks Fine’s view is to his own. The third view is the account of abstract mathematical discourse presented by Islam 1996. Like Field, Islam makes it clear that the various classical models in his view play the role of equally correct interpretations of the language, and, like Przełęcki, he makes it clear that his view involves no non-classical semantic machinery. Unlike the plurivaluationist view of vagueness, however, it is no part of Islam’s view that a multiplicity of correct interpretations goes hand in hand with speakers failing to specify exactly what their terms are to mean: on his view, talking relative to multiple interpretations is often exactly what we want to do.
But how exactly are the views different? Many views countenance a range of classical models: this is common to supervaluationism, plurivaluationism, and contextualism (see §2.6), among others. What distinguishes each of these views is the role it gives to these classical models, in conjunction with any further semantic machinery that the view introduces. Consider supervaluationism. As I was at pains to point out, the classical models are admissible extensions of a unique intended partial interpretation of the language. They do not give the content of the language in its actual, unprecisified state: they interpret the language as it would be were it precisified. They do not stand in a direct interpretational relationship to the language. Rather, they serve as calculating devices which yield an assignment of truth values to (actual, vague) sentences of the language. This assignment of truth values, together with the other assignments made in the base interpretation—not in its classical extensions—play the interpretational role: they are what constitute a description of the actual semantic state of the language. In the supervaluationist view, each sentence is in a unique semantic state, arrived at via a supervaluation over many classical models. The truth value of a sentence—i.e. the one assigned to it, on the base interpretation, by the supervaluation (as opposed to any of its truth values on the various classical extensions of this base interpretation)—represents the one actual semantic state of the vague sentence. This truth value merely happens to be arrived at via a description of idealized precisifications which misrepresent the actual, vague state of the language. Plurivaluationism, by contrast, has no further semantic machinery, besides the classical models. In the final story about the actual semantic state of the language, there is no other, non-classical model: there are only the classical ones. Together, they give the content of the language in its actual, vague state. Each of them is an interpretation of the language, not a mere extension of the one uniquely correct interpretation. (This is why I have used different terms for the classical models in the context of the different views: admissible extensions versus acceptable interpretations.) In the plurivaluationist view, there is no further semantic story, beyond that told by the many acceptable interpretations. Here we have indeterminacy of meaning: there are many acceptable interpretations, and that’s all there is to say. Thus each sentence is not in a unique semantic state: semantic states are individuated by interpretations, and each sentence has many of them. On this view, the actual semantic state of the sentence is indeterminacy of truth value: there are many acceptable assignments of truth value to the sentence, and nothing to decide between them. That is a complete and accurate description of the actual semantic state of the sentence. Rather than indeterminacy, we might just as well describe this as plurality or multiplicity of meaning. The essential point remains: each sentence is not in a unique semantic state, as semantic states are individuated by interpretations, and each sentence has many of them, with nothing to decide between them. Figure 2.2 is an attempt to give visual form to the distinction between supervaluationism and plurivaluationism. Suppose the request is made to describe the semantic state of a given sentence. The plurivaluationist can say only: it is true on this interpretation, false on this one, and so on, through all the acceptable interpretations. This is because these interpretations are the only pieces of semantic machinery she has. Her picture is of a language which simultaneously has many classical interpretations, and whose semantic state is thereby indeterminate (or equivalently, which is thereby in many semantic states simultaneously).
The supervaluationist, by contrast, will describe the sentence as being in a unique, determinate semantic state. For in her picture, she has further semantic machinery, over and above the classical models: partial interpretations, which assign truth values to compound sentences via supervaluations over classical models. For her, the sentence is not in an indeterminate first-order (classical) semantic state—it is in a perfectly determinate higher-order (non-classical) semantic state.
[Figure 2.2. Plurivaluationism and supervaluationism. Panels: supervaluationism; plurivaluationism. Key: classical model; non-classical model; interprets/gives the meaning; extends.]
Note that while there are important analogies between plurivaluationism, supervaluationism, and possible-worlds semantics for modal languages—analogies which have, for example, enabled formal results from modal logic to be applied in discussions of supervaluationism in the literature—the possible-worlds semantic picture is not the same as either the plurivaluationist or the supervaluationist picture. Figure 2.3 shows the possible-worlds semantic picture. In this picture, the modal language has a unique intended interpretation, which (in a standard version of this kind of semantics—details vary in other versions, but not in any way that matters to the discussion here) consists in a set of possible worlds, a binary accessibility relation on this set, a domain (i.e. set of objects) for each possible world, and a valuation function which assigns to each name a referent at each world, and to each predicate an extension at each world. We can, then, regard the overall model as an arrangement of worlds, and we can regard each world as a classical model (for each world comes with a domain of objects and a referent/extension for each name/predicate). Now we can look at things this way if we wish: but we must remember that the role which these classical models (i.e. worlds) play in the possible-worlds semantic picture is quite different from the role that classical models play in either the plurivaluationist picture or the supervaluationist picture. In the modal picture, unlike in the other two pictures, each classical model is not the kind of thing that could, all by itself, provide an interpretation for the entire language. (Note that the language in this case is a modal language, rather than the first-order language considered in relation to supervaluationism and plurivaluationism.) In the plurivaluationist picture, each classical model interprets the entire language; as there are many classical models in play, we end up with many interpretations of the language. In the modal case, by contrast, all the classical models put together go to make up one interpretation of the entire (modal) language. In the modal case, therefore, the classical models are not in any sense rival interpretations (not of the language as it is, as in the plurivaluationist case, or of the language as it would be were it precisified, as in the supervaluationist case); rather, they are components of a single (non-classical) interpretation.
[Figure 2.3. Possible-worlds semantics. Key: possible world/classical model; modal model; interprets/gives the meaning; relative possibility/accessibility.]
There is a strong intellectual urge to move from indeterminacy or multiplicity with respect to first-order states to determinacy and uniqueness with respect to higher-order states. Thus, for example, it is very natural to think of plurivaluationism as follows. The semantic state of a sentence (for example) is a sequence of truth values, indexed by the acceptable interpretations—say (T, F, F, T, T, F, ...). We might call this sequence the sentence’s truth profile. But note that we have now moved from plurivaluationism to a different view. On this view, the sentence is in a unique semantic state, represented by the assignment to it of a unique sequence of truth values. The sentence’s semantic state is given by its truth profile, and it has a unique one of these. On the plurivaluationist view, by contrast, the sentence is in an indeterminate state (or a plurality of states), represented by many assignments of truth values to it simultaneously. Again, to take another example, it is very natural to think of plurivaluationism as follows. In modal semantics, we assign predicates intensions instead of extensions. But as we have just discussed, possible worlds are like classical models, so we might now proceed to assign a predicate an ‘intension’ in our plurivaluationist approach to vagueness. This ‘intension’ is just the obvious analogue for predicates of the truth profile of a sentence. But again, we have now moved from plurivaluationism to a different view. On this view, the predicate is in a unique semantic state, represented by its ‘intension’. On the plurivaluationist view, by contrast, the predicate is in an indeterminate state (or a plurality of states), represented by many assignments of extensions to it simultaneously.
2.5.1 Worldly Vagueness?
We thus have a distinction between the semantic state of a sentence (or other linguistic item) being indeterminate—because the only semantic states there are, are classical ones, and no unique one of these is privileged—and the semantic state being determinate—because as well as classical semantic states, we have non-classical semantic states, and a unique one of these is privileged for each sentence. This distinction is very important, because only the plurivaluationist view warrants a number of descriptions which have often been attached to ‘supervaluationism’ in the literature. I am thinking of all those claims which talk of a purely linguistic theory of vagueness, of
vagueness as semantic indecision, of vagueness as being confined solely to the relationship between language and the world, and so on. The classic quote of this sort is from Lewis:
I regard vagueness as semantic indecision: where we speak vaguely, we have not troubled to settle which of some range of precise meanings our words are meant to express. (1986a, 244 n. 32)⁷⁶
⁷⁶ Cf. Lewis 1986a, 212.
The crucial point to note is that this view of vagueness fits with plurivaluationism, not supervaluationism. We saw in §2.4.3 that on the supervaluationist approach, vagueness—or at least indeterminacy—is found in the world itself, not in the relationship between language and the world. As far as the issue of worldly vagueness is concerned, the supervaluationist approach over a certain sort of base interpretation is in exactly the same boat as the recursive approach over that sort of base interpretation: the base interpretation brings with it worldly indeterminacy (e.g. partial properties), and the superstructure cannot then take it away. The supervaluational approach differs from the recursive approach in positing worldly indeterminacy and a ‘non-local’ (non-truth-functional) account of how compound sentences get their truth values. The plurivaluationist, by contrast, locates vagueness entirely in the relationship between language and the world. The plurivaluationist model theory gives us a picture on which each n-place predicate is associated with many n-place relations (one in each acceptable interpretation), each of which is inherently precise. The idea is that when we use a vague predicate, we do not refer to a unique relation, but all the relations we simultaneously refer to are inherently precise. Suppose ‘Bob is tall’ comes out true on some acceptable interpretations of ‘tall’ and false on others. Then we can say neither that ‘Bob is tall’ is true nor that it is false. But, unlike on the supervaluationist approach, this does not mean that the property of tallness is inherently indeterminate. The point is rather that there is no unique property of tallness. There are many properties, each corresponding to an acceptable extension of ‘tall’, and all precise, in the sense that each object either definitely possesses or definitely fails to possess each property. When we speak vaguely, we fail to single out a unique such property. So vagueness is solely a matter of the relation between language and the world. The world itself is precise.
To cover this ground one more time: the supervaluationist’s classical extensions represent ways in which vague predicates (and other vague parts of language) could be precisified. What the predicate ‘tall’ actually means is this one, unique partial property which it is assigned in the intended base interpretation. There is no indeterminacy as to what the predicate means: it means this property and no other. However, there are many possibilities as to how this property could be sharpened—one way corresponding to each extension which the predicate might get assigned on an extension of the base interpretation. So we have complete determinacy of meaning. But the thing that is uniquely meant—the property tallness—is inherently indeterminate (because partial, or three-valued, etc.). Contrast the plurivaluationist view, where there are no indeterminate properties (no partial models, no three-valued models, etc.). There are only the perfectly precise classical extensions. But there are many of them, and nothing to decide between them. The indeterminacy here concerns which—out of all these candidate precise properties—is meant by ‘tall’. This indeterminacy is purely semantic.
2.5.2 More on Plurivaluationism
How will the plurivaluationist view handle the Sorites paradox? Suppose we remove one grain at a time from a 10,000-grain pile of sand, until we have 1 grain left. Call the pile with n grains ‘pile n’. Our ordinary, vague usage of the term ‘heap’ corresponds to a family of acceptable classical interpretations, one for each 11 ≤ n ≤ 100, on which piles 1 through n − 1 are assigned 0, and piles n through 10,000 are assigned 1. (I am talking here about the usage of the whole community of speakers—recall the discussion on pp. 58–9.)
Let the sentence ‘The cut-off is at n’ mean that piles 1 through n − 1 are non-heaps, and piles n through 10,000 are heaps. Consider the inductive premiss of the Sorites argument associated with this setup: ‘For every 2 ≤ n ≤ 10,000, if n is a heap, then so is n − 1.’ This sentence is false on each of our acceptable interpretations: on an interpretation where the cut-off is at n, the corresponding instance of this universal claim is false, and hence the universal claim itself is false. Thus, according to the plurivaluationist approach, the inductive premiss can be said to be, simply, false: it is false on every acceptable interpretation, so the fact that we have more than one acceptable interpretation becomes irrelevant here. This, then, is the mistake in the paradoxical reasoning: the inductive premiss is false. So why is the paradox compelling? Consider the claim ‘The cut-off is at n’. For each 11 ≤ n ≤ 100, this claim is true on exactly one of our acceptable interpretations. Thus we can say neither that it is true nor that it is false: the fact that we have more than one acceptable interpretation is relevant here. There is, then, no point n such that we can say ‘The cut-off is at n’. So far so good. But here, says the plurivaluationist, is where we make a natural mistake, and so get drawn into the paradox. We conclude from the foregoing that ‘There is no cut-off’ is simply true—or in other words, we conclude that the inductive premiss is true. But here we are mistaken. ‘There is no cut-off’ is false on every acceptable interpretation (for each puts the cut-off somewhere—just at a different point on different interpretations), and hence—according to the plurivaluationist—we can simply say that it is false. Thus, the paradox is compelling because we tend to assume that because we cannot say that the cut-off is here, or here, or here, ... through all the possible positions where the cut-off might be, it follows that we can say ‘There is no cut-off’—that is, that the inductive premiss is true. On the plurivaluationist account, this does not follow. There is no point n such that we can truly say ‘The cut-off is at n’, and yet we can simply say that ‘There is a cut-off’ is true—and so we can simply say that the inductive premiss, which denies this, is false. This is, of course, analogous to what the supervaluationist said about the paradox—but it is not exactly the same as what the supervaluationist said. The crucial difference is as follows.
The supervaluationist has an extra piece of non-classical semantic machinery (a supervaluation) which assigns the sentence ‘There is a cut-off’ the value True, and which assigns the sentences ‘The cut-off is at n’ no truth value (or the value ∗). The plurivaluationist, on the other hand, has only a way of talking. On both views, we can say that ‘There is a cut-off’ is true, while we cannot say that ‘The cut-off is at n’ is true or that it is false—but what this really amounts to differs on the two views. On the supervaluationist view, it corresponds directly to the semantic facts. On the plurivaluationist view, it is a mere way of talking. The genuine semantic facts are that ‘There is a cut-off’ is true on this interpretation, and this one, and so on. We can sum this up by simply saying it is true, but we are not thereby describing some other level of
semantic reality, beyond what is going on in each classical interpretation. We are merely summing up the actual, long description of the semantic facts (i.e. of what is going on in each classical interpretation) in a short phrase. (Analogy: If each person on the street believes that the government will lose the election, we can say ‘The man on the street thinks the government will be voted out’. Here we talk as if there is a unique person on the street, who believes that the government will be voted out, but this is just a manner of speech: really we are not talking about a particular person—we believe neither that one amongst all the actual persons on the street is privileged, nor that there is an additional special person, over and above the multiplicity of real persons on the street; we are saying that each of the many persons on the street believes that the government will be voted out. When it is not the case that each of the many persons on the street believes the same thing, we cannot use this manner of speaking.)⁷⁷ This difference means that whereas the supervaluationist response to the Sorites led to the missing witness problem, the plurivaluationist—who offers something analogous to the supervaluationist response to the Sorites—faces only an analogue of the missing witness problem. The problem, recall, was that ‘There is a P’ can come out true, even though ‘This is P’ is true of nothing. On the plurivaluationist view, we have the outcome that ‘There is a cut-off’ may be said to be true, even though no sentence ‘The cut-off is at n’ may be said to be true. Note the ‘may be said to be’s. The plurivaluationist does not say that ‘There is a cut-off’ actually is true even while each sentence ‘The cut-off is at n’ actually is not true. On the plurivaluationist picture—as opposed to the supervaluationist one—there is no level of semantic fact at which an existential claim is assigned the value true while none of its instances is. 
For the only semantic facts concern what happens on each acceptable interpretation—and these are entirely classical. There is an overall level of talk at which we say that some existential sentence is true, while not saying that any of its instances is true; but this talk is, from the semantic point of view, epiphenomenal. It is mere talk. So we do not have the missing witness problem as such: a sentence being made true while none of its instances is. The problem we do have is that we are given a way of talking—‘say that a sentence is simply true if it is true on every acceptable interpretation’—that clashes with our ordinary use of quantifiers. In particular, it violates expected relationships between quantified claims and their instances.

⁷⁷ In fact, we employ this way of speaking even if most persons believe the government will be voted out: it does not matter if a few don’t. To get the analogy with plurivaluationism right, we need to suppose for a moment that we say ‘‘The man on the street Φ’s’’ only when all persons on the street Φ. Cf. n. 78 below.

Similar comments apply to the penumbral connection/truth-functionality issue. The plurivaluationist will tell us that ‘This leaf is red’ and ‘This leaf is not red’ can be said neither to be simply true nor simply false, while ‘This leaf is red or not red’ can be said to be simply true, and ‘This leaf is red and not red’ can be said to be simply false. Yet, of course—for reasons that should now be familiar—we have no violation of truth-functionality here. There is no level of semantic fact at which a conjunction is assigned the value False, while neither of its conjuncts is, or at which a disjunction is assigned the value True, while neither of its disjuncts is. For the only semantic facts are the facts about what is happening in each acceptable interpretation—and these are entirely classical (hence truth-functional). What we have is just a level of talk laid on top of these semantic facts. The talk sounds non-truth-functional, but it is in fact epiphenomenal. Unlike on the supervaluationist view, it does not literally describe a non-truth-functional semantic reality. The question remains, of course, whether this way of talking fits or clashes with ordinary usage.

2.5.3 Other Forms of Plurivaluationism?

In the discussion of supervaluationism we considered some variants of the basic view: for example, the degree form of supervaluationism and subvaluationism. Is there a parallel range of variants of plurivaluationism? Yes and no. Yes, in that we can distinguish plurivaluationists of the following kinds:

1. those who want to say that a sentence is neither true nor false when it is true on some acceptable interpretations and false on others (cf. the basic form of supervaluationism)
2. those who want to say that a sentence is half true when it is true on half of the acceptable interpretations and false on the other half (cf. the degree form of supervaluationism)
3. those who want to say that a sentence is both true and false when it is true on some acceptable interpretations and false on others (cf. subvaluationism).
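The three ways of talking just listed can be sketched in Python as three summaries of one and the same truth profile. This is only an illustration under stated assumptions: the 20-pile series, the range of acceptable cut-offs (8 to 12), and every function name are hypothetical, and the degree description is generalized from 'half true' to the proportion of acceptable interpretations on which a sentence is true.

```python
from fractions import Fraction

# Hypothetical toy setup: piles are numbered 1..N, and each acceptable
# classical interpretation is identified with its cut-off c, meaning that
# piles 1..c-1 are non-heaps and piles c..N are heaps.
N = 20
ACCEPTABLE_CUTOFFS = range(8, 13)  # assumed range, for illustration only


def is_heap(pile, cutoff):
    """Classical extension of 'is a heap' on the interpretation with this cut-off."""
    return pile >= cutoff


def cutoff_is_at(n):
    """The sentence 'The cut-off is at n', as a map from interpretation to truth value."""
    return lambda c: (not is_heap(n - 1, c)) and is_heap(n, c)


def there_is_a_cutoff(c):
    """The sentence 'There is a cut-off': some n in 2..N is the first heap."""
    return any(cutoff_is_at(n)(c) for n in range(2, N + 1))


def inductive_premiss(c):
    """The sentence 'For every 2 <= n <= N, if n is a heap, then so is n - 1'."""
    return all((not is_heap(n, c)) or is_heap(n - 1, c) for n in range(2, N + 1))


def truth_profile(sentence):
    """The semantic facts: the classical value on each acceptable interpretation."""
    return [sentence(c) for c in ACCEPTABLE_CUTOFFS]


# Three descriptions laid over the same profile; none of them adds semantic facts.
def basic_talk(profile):
    if all(profile):
        return "true"
    if not any(profile):
        return "false"
    return "neither true nor false"


def degree_talk(profile):
    # Proportion of acceptable interpretations on which the sentence is true.
    return Fraction(sum(profile), len(profile))


def sub_talk(profile):
    if all(profile):
        return "true"
    if not any(profile):
        return "false"
    return "both true and false"
```

On this toy setup 'There is a cut-off' comes out simply true and the inductive premiss simply false, while 'The cut-off is at 10' is described as neither true nor false, as true to degree 1/5, or as both true and false, depending only on which summary is applied to its profile.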
No, in that there is no semantic difference between any of these views—they merely employ different descriptions of, or ways of talking about, the very same semantic facts. We have seen that for the plurivaluationist, the semantic facts are completely encapsulated in and exhausted by the multiplicity of acceptable classical interpretations—i.e. in the facts about which out of all the possible classical interpretations of a language are acceptable, and the facts about the nature of these acceptable interpretations (i.e. what referent each assigns to each name, what extension each assigns to each predicate, which sentences each makes true, and so on). Now we can describe the way in which the truth of a sentence varies over the acceptable interpretations however we like—for example, in any of the three ways outlined above.⁷⁸ No specific descriptive mechanism is part of the plurivaluationist view (i.e. is part of the core semantic view which I am calling ‘plurivaluationism’). Any, or all, of them is OK (as long as we are all clear on how we are using words in a given context). The key point is that beyond the core semantic facts, anything further that the plurivaluationist says is mere talk, which does not add to or alter the semantic facts in any way. This is in contrast to supervaluationism, where there is a genuine difference in the semantic machinery posited by, for example, the basic supervaluationist, the degree supervaluationist, and the subvaluationist.

2.5.4 Pragmatism

There are several approaches to vagueness to be found in the writings of David Lewis. One of them is distinctive (or at least apparently so—but see on) in that it locates vagueness not in semantics, but in pragmatics.
Lewis sets up a framework in which a language (in one sense) is a function which assigns meanings to sentences and their components: the meaning of a sentence is a function from possible worlds to truth values; the meaning of a predicate is a function from possible worlds to sets of possible objects; and so on. Lewis then has a story to tell about how a particular population P comes to use a particular one of these languages £. Briefly, the connection between P and £ holds by virtue of a convention of truthfulness and trust in £ which prevails in P: members of P only make utterances which they believe to be true in £, and they proceed as if other members of P are doing this too. Now, Lewis’s languages are precise: a sentence is true in a given language at a given world, or it is false in that language at that world; there is no vagueness in the semantics of any language. Rather, vagueness enters into the relationship between the population P and the set of possible languages: it appears in the form of indeterminacy concerning which language a population uses. Lewis writes:

our convention of language is not exactly a convention of truthfulness and trust in a single language, as I have said so far. Rather it is a convention of truthfulness and trust in whichever we please of some cluster of similar languages . . . The convention confines us to the cluster, but leaves us with indeterminacies whenever the languages of the cluster disagree. We are free to settle these indeterminacies however we like. Thus an ordinary, open-textured, imprecise language is a sort of blur of precise languages—a region, not a point, in the space of languages. (1983 [1975], 188)⁷⁹

⁷⁸ There are many other possibilities besides these three. For example, we can also imagine a plurivaluationist who wants to say that a sentence is true/false when it is true/false on most acceptable interpretations. Cf. n. 77 above, and Lewis 1983 [1979], 244. Another view worth mentioning is that of Braun and Sider 2007, according to which a sentence is true just in case it expresses a unique proposition which is true. On this approach, we would say that a sentence which has many admissible interpretations is untrue, regardless of its truth values on its various admissible interpretations.
A Lewis-type language is a function from syntactic items to meanings. If we hold the syntactic items fixed, then considering different Lewis-type languages is just the same as considering different interpretations of one uninterpreted, syntactically specified formal language.⁸⁰ The pragmatist tells us that our practice does not serve to associate us with a unique Lewis-type language, but only with a cluster of such languages. This is just another way of saying that our practice does not fix a unique correct interpretation of our (uninterpreted) language, but only a cluster of such interpretations. And that is plurivaluationism! Pragmatism, then, is simply a stylistic variant of plurivaluationism.⁸¹

⁷⁹ See also Lewis 1969, 200–2: ‘‘I think we should conclude that a convention of truthfulness in a single possible language is a limiting case—never reached—of something else: a convention of truthfulness in whichever language we choose of a tight cluster of very similar possible languages. The languages of the cluster have exactly the same sentences and give them corresponding sets of interpretations; but sometimes there are slight differences in corresponding truth conditions . . . . Our actual language is like a resonance hybrid of the possible languages that make it up.’’ Lewis does not appear to be very attached to this pragmatic view. Elsewhere he writes: ‘‘we have so far been ignoring the vagueness of natural language. Perhaps we are right to ignore it, or rather to deport it from semantics to the theory of language-use. We could say, as I do elsewhere, that languages themselves are free of vagueness but that the linguistic conventions of a population, or the linguistic habits of a person, select not a point but a fuzzy region in the space of precise languages. However, it might prove better to treat vagueness within semantics, and we could do so as follows’’ (Lewis 1983 [1970], 228). Lewis then proceeds to present a degree-theoretic form of supervaluationism (see §2.4 above). Burns 1991, 1995, on the other hand, defends the pragmatic view as the correct theory of vagueness.

⁸⁰ ‘‘By a language, I mean an interpreted language. So what I call a language is what many logicians would call a language plus an interpretation for it. For me there cannot be two interpretations of the same language; but there can be two languages with the same sentences’’ (Lewis 1969, 162).
⁸¹ In fact, the interpretations in Lewis 1983 [1970] are more complicated than the classical interpretations countenanced by the plurivaluationist, so we need to be a little more subtle—but my basic point remains. The point can be more accurately stated thus: relative to a given syntax and a given sort of interpretation, plurivaluationism and pragmatism are simply stylistic variants of one another. Keefe 1998a argues (independently) that pragmatism either fails or collapses into supervaluationism (see also Keefe 2000, ch. 6). I agree with many of her points, but wish to stress that in my terms, pragmatism is the same as plurivaluationism, not supervaluationism.

2.6 Contextualism

While they differ in many ways from one another, the views we have looked at so far can all be regarded as having one thing in common: they offer synchronic accounts of vagueness. They hold that in order to see what is distinctive of vague language, we need to look at two areas: the nature of its interpretations (classical, gappy, many-valued, etc.) and the number (one or many) of these interpretations which are singled out as correct or acceptable. No mention is made here of how (if at all) which interpretation(s) is intended changes over time or with context. No doubt the proponents of the views we have examined would be prepared to admit that such changes do, or at least might, occur: for example, it is plausible that ‘today’ uttered today has a different referent from the one it had when uttered yesterday. But the proponents of these views will not think that such changes have anything in particular to do with vagueness. This is where contextualism is distinctive. The basic idea behind contextualism is that vagueness is a diachronic phenomenon, which only emerges when we consider the semantic state of a language over time (or more generally, over multiple instances of interpretation). It is to be accounted for not (primarily) in terms of the nature and number of correct interpretations (at any time), but in terms of how the intended interpretation(s) changes over time. Hence the opening sentence of Tappenden (1993): ‘‘This paper develops one aspect of a program aimed at studying normative constraints on the evolution of language’’, and Soames (1999, 209): ‘‘We need a dynamic model of vagueness’’ (my emphases).
Consider a vague predicate: say, ‘is tall’. Here is one way in which the extension of this predicate might be thought to vary with context. Suppose I say ‘Bill Bradley is tall’. If we are engaged in a discussion of current and former basketball players, my claim would seem to be false, because Bill Bradley is below the average height of current and former basketball players. If we are engaged in a discussion of past presidential candidates, my claim would seem to be true, because Bill Bradley is among the tallest of past presidential candidates. Plausibly, what is going on here is that ‘is tall’ has a different extension in the different contexts: in the first context it excludes persons who are not of a height significantly greater than the average height of current and former basketball players; in the second context it excludes persons who are not of a height significantly greater than the average height of past presidential candidates. Many would agree with (something like) this story. Yet clearly this story does not get to the heart of the issue of the vagueness of ‘is tall’—if in fact it touches this issue at all. There are two ways of seeing this. First, the vagueness of ‘is tall’ emerges within a fixed context of the sort just considered: for example, we can easily generate a Sorites series for the predicate ‘is tall (for a basketball player)’, and likewise for the predicate ‘is tall (for a presidential candidate)’. Second, the phenomenon under consideration affects precise terms, such as ‘act of parliament’. Assuming that the above story about ‘is tall’ is plausible, it is also plausible that ‘is an act of parliament’ has a different extension when uttered in a discussion of English law from the one it has when uttered in a discussion of Australian law. Yet there is no vagueness about whether something is an act of the English parliament or whether something is an act of the Australian parliament.
It thus seems clear that ‘large-scale’ contextual variation of extension of the sort just discussed is different from vagueness, which arises even within a fixed (large-scale) context. Accounts of vagueness should therefore kick in and do their distinctive work after the (large-scale) context has been fixed. So contextualism about vagueness is not the view that ‘tall’ has a different extension when we are discussing basketball players from when we are discussing jockeys, that ‘bald’ has a different extension when we are looking for a male model for a shampoo commercial from when we are looking for a male model for a scalp oil commercial, and so on. That thesis—about what I am calling large-scale contextual variation—is not an alternative to any of the views we have discussed, but a near truism which proponents of
all of them should accept. So what is contextualism about vagueness? As we shall see, we can have a contextualist version of any of the views we have so far considered.⁸² However, it will make for the simplest presentation of the contextualist view if we begin with contextualism built on a foundation of the strong Kleene recursive truth gap view (see §2.3.1).⁸³ We are thus to suppose that the semantic state of the language is just the way the strong Kleene recursive truth gap view says it is. A vague predicate such as ‘is tall’ is assigned as its extension a partial function from the domain to the set of classical truth values. The predicate is true of those objects sent to 1 by this function, false of those objects sent to 0 by this function, and neither true nor false of objects sent nowhere by this function. The truth values of compound sentences are determined recursively, via the strong Kleene truth tables. For the contextualist, however, this is just the beginning of the story: it is just the way things are initially, at the beginning of our conversation or discourse. At every stage of the discourse, the semantic state of the language will fit this general description—but the specifics will change: which partial interpretation is intended will alter, and so, of which objects ‘is tall’ is true and false will change. How and why do these changes take place? The contextualist position is characterized by two theses, which I call Freedom and Power. Freedom tells us that if P is vague and a is a borderline case of P, then a competent speaker is free to assert Pa or to assert ¬Pa, without compromising her competence.⁸⁴ Power tells us that if a speaker does assert Pa in such a situation, then this assertion thereby becomes true. 
That is, if P is vague and a is a borderline case of P, and competent speakers judge that Pa, then a is in the extension of P just because of this judgement.⁸⁵ Note that Pa is neither true nor false, on the interpretation that was correct just before Pa was judged or uttered; but making the judgement or utterance changes which interpretation is correct, in such a way that the newly correct interpretation is one on which Pa is true.

⁸² Given (as discussed in the opening paragraph of this section) that contextualism differs from the views we have considered on a dimension orthogonal to any of the dimensions on which those views differ from one another, this is not surprising.

⁸³ Contextualist treatments of vagueness include Kamp 1981; Tappenden 1993; Raffman 1994, 1996; Soames 1999, ch. 7; Fara 2000; and Shapiro 2006. Tappenden, Soames, and Shapiro explore versions of the strong Kleene recursive truth gap approach, while Fara is sympathetic towards classical models as the basis for contextualism.

⁸⁴ This formulation is due to Shapiro 2006, who calls this thesis ‘open texture’.

⁸⁵ Shapiro 2006 calls this thesis ‘judgement-dependence’.

Now there is a further important aspect to the contextualist story. It is not only the object a that moves into the extension of P when a competent speaker judges that Pa; other objects move too, according to the rules of adjustment associated with P.⁸⁶ For example, if I decide for present purposes in the course of some conversation (for example, about what T-shirt sizes to order, where the options are S, M, L, and XL) that Bill is tall, then it thereby becomes true that Bill is tall, and it also becomes true of anyone who is negligibly different in height from Bill that he or she is tall (i.e. an interpretation which assigns an extension to ‘is tall’ which assigns 1 to all these people thereby becomes the intended interpretation of our language, at this point in the conversation or discourse).⁸⁷ Note that this further adjustment just happens: I do not have to explicitly consider these other people, and say that they are tall. I simply consider Bill, and say he is to count as tall; this affects the context, and we get a new correct interpretation in which not just Bill, but those negligibly different in height from him, are assigned 1 by the extension of ‘is tall’. It is as if Bill is bound to those who are negligibly different from him in height: when I move him into the positive extension of ‘is tall’ (i.e. into the set of things sent to 1 by the extension of ‘is tall’), those very similar to him move with him. The contextualist idea of deciding borderline cases in a partial interpretation one way or the other is related to, but importantly different from, the supervaluationist’s idea of extending a partial interpretation. For the supervaluationist, an extension decides all borderline cases of all predicates one way or the other, resulting in a classical interpretation. In the contextualist picture, we decide one borderline case of one predicate one way or the other.
The rules of adjustment ensure that other borderline cases of this predicate thereby get decided too—but in general, not all the borderline cases of this predicate get decided, and in general, other predicates are not affected.⁸⁸ Thus after the adjustment, we still have a partial interpretation, not a classical one.

⁸⁶ This term is due to Soames 1999.

⁸⁷ Similarly, if I decide that Bill is not tall, then an interpretation which assigns an extension to ‘is tall’ which assigns 0 to Bill and those who are negligibly different from him in height thereby becomes the intended interpretation of our language, at this point in the conversation or discourse.

⁸⁸ With the exception of predicates whose meanings are related to those of the predicate directly affected—e.g. ‘is short’, in the case where we decide that Bob is to be tall.

What is the advantage of adding a contextualist story on top of the strong Kleene recursive truth gap view? I shall argue in Chapter 4 that a major problem with all versions of the truth gap view (and of the three-valued view) is that they do not allow for a gradual or jolt-free transition between the clear cases and countercases: they posit two sharp semantic jumps in a Sorites series for a predicate F —one between the objects of which the predicate is true and those of which it is neither true nor false, and one between the objects of which the predicate is neither true nor false and those of which it is false.⁸⁹ One apparent advantage of contextualism over the (non-contextualist) strong Kleene recursive truth gap view is that it can explain why we do not think there are such sharp transitions (even though in fact there are, in any context).⁹⁰ Suppose that in the current context, Bob is the last tall man: in the interpretation which is correct at this stage of the conversation or discourse, Bob is sent to 1 by the extension of ‘is tall’, while everyone shorter than Bob is sent nowhere or to 0. If I wanted to identify the boundary between the tall and the borderline cases as lying between Bob and the next man, I would have to say ‘Bob is tall, and the next man is a borderline case’. But now we have a problem: when I say ‘Bob is tall’, the context shifts in such a way that everyone negligibly different from Bob is also sent to 1, and this includes the next man. So I generate a new context in which what I set out to say is not true: in the new context, Bob is tall, and so is the next man—he is no longer a borderline case. The rules of adjustment ensure that when I classify Bob as tall, the next man goes with him. Thus I can never state where the boundary between the positive and borderline cases (or negative and borderline cases) is without generating a new context in which the boundary is somewhere else. We can never nail down the borderline: just as we are about to get it in our sights, it shifts.
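The way the boundary shifts as soon as one tries to state it can be illustrated with a small Python sketch. It is a toy model only: the names, heights, the 1 cm threshold for 'negligibly different', and the function names are all assumptions made for illustration, and the rule of adjustment is simplified to 'everyone within the threshold moves too'.

```python
# Hypothetical toy data: heights in centimetres. Names and numbers are
# illustrative assumptions, not taken from the text.
HEIGHTS = {"Al": 200, "Bob": 190, "Cy": 189, "Dan": 188, "Ed": 180, "Flo": 165}
NEGLIGIBLE_CM = 1  # assumed: differences of at most 1 cm count as negligible

# A partial extension for 'is tall': 1 = true of, 0 = false of,
# absent = sent nowhere (a borderline case).
INITIAL = {"Al": 1, "Bob": 1, "Flo": 0}


def assert_tall(extension, person):
    """Return the partial extension intended after 'person is tall' is asserted.

    Power makes the assertion true; the assumed rule of adjustment carries
    along everyone negligibly different in height from that person.
    """
    adjusted = dict(extension)
    for other, height in HEIGHTS.items():
        if abs(height - HEIGHTS[person]) <= NEGLIGIBLE_CM:
            adjusted[other] = 1
    return adjusted


def last_tall(extension):
    """The shortest person sent to 1 by the extension."""
    tall = [p for p in HEIGHTS if extension.get(p) == 1]
    return min(tall, key=HEIGHTS.get)


def borderline(extension):
    """People sent nowhere by the extension, tallest first."""
    undecided = (p for p in HEIGHTS if p not in extension)
    return sorted(undecided, key=HEIGHTS.get, reverse=True)


# In the initial context the boundary lies between Bob and Cy.
before = INITIAL
# Reporting that boundary requires saying 'Bob is tall', which shifts the context.
after = assert_tall(before, "Bob")
```

In the initial context Bob is the last tall man and Cy is the first borderline case. After the assertion Cy is sent to 1 as well, so the report 'Bob is tall, and the next man is a borderline case' is not true in the context that the report itself creates, although the interpretation is still partial (Dan and Ed remain undecided).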
The contextualist can thus explain why we do not think there are sharp semantic jumps in a Sorites series for a predicate F (even though there are such jumps, in any context, on this view), in a way that the basic truth gap view cannot.

⁸⁹ As I have mentioned, this jolt problem is one of two problems which are commonly run together under the heading ‘higher-order vagueness’. The other one is the location problem: the problem of how the positions of these jumps could be determined by our usage of vague terms. This, as I have argued and will argue further, is much less of a problem for the truth gap and three-valued views.

⁹⁰ I say the advantage is apparent, because I shall argue in Ch. 4 that contextualism does not solve the jolt problem.

The response to the Sorites paradox on the part of the recursive truth gap view was as follows: the error in the paradoxical reasoning is that some of the Sorites conditionals are not true; the reason we are taken in by the paradox is that none of the conditionals is false, either. The contextualist inherits this solution,⁹¹ but can add something more as well. First, not only are none of the Sorites conditionals false (in any context), but furthermore, if we think about any of the conditionals, it is liable to become true. Recall that if we say that Bill is tall, it thereby becomes true that Ben is tall also (where Ben is just after Bill in the Sorites series). Now of course if we merely entertain the conditional ‘If Bill is tall, then Ben is tall’, we do not make it true; but if we say for the sake of argument that Bill is tall, and then consider whether, given that Bill is tall, Ben is tall, we do make the conditional true. This provides an added reason why we tend to think that the Sorites conditionals are true, and hence get taken in by the paradox. Second, the contextualist has a response to the dynamic version of the Sorites paradox (also known as the ‘forced march’ Sorites paradox (Horgan 1994)). Suppose we are walked along our Sorites series for F, and asked of each object in the series whether it is F, and then walked back the other way, and asked the same question of each object again. It is very likely that the point at which we stopped saying ‘Yes’ on the way out would be further along the series than the point at which we started saying ‘Yes’ on the way back. This behaviour might seem rather difficult to explain on the recursive truth gap view, but it is easily explained by the contextualist: as we classify an object as F, we thereby make it true that that object and all others very similar to it—including the next object in the series—are F. We thus push the boundary between the F’s and the borderline cases out before us as we go—and on the way back, we push the boundary back the other way.⁹²

Let us now consider other forms of contextualism. If we take a non-recursive truth gap view—say, supervaluationism—as our foundation, we get a very similar view to the one just presented.
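The forced-march behaviour described above, on the recursive view just presented, can be illustrated with a toy simulation in Python. Everything specific here is an assumption made for illustration: the 20-object series, which objects count as clear cases, the rule that each verdict carries the immediately adjacent objects with it, and the stipulation that clear cases are never reclassified (which simply sets aside the difficulty about running past the last borderline case).

```python
# Hypothetical Sorites series: objects 0..19, ordered from clearly F to clearly not F.
SERIES = range(20)
CLEAR_F = set(range(0, 6))        # assumed clear positive cases
CLEAR_NOT_F = set(range(14, 20))  # assumed clear negative cases


def initial_extension():
    """Partial extension: 1 for clear cases of F, 0 for clear non-cases, absent otherwise."""
    ext = {i: 1 for i in CLEAR_F}
    ext.update({i: 0 for i in CLEAR_NOT_F})
    return ext


def classify(ext, i, value):
    """Record a verdict on object i and let its adjustable neighbours move with it."""
    for j in (i - 1, i, i + 1):
        # Assumed rule of adjustment: only non-clear cases can be reclassified.
        if j in SERIES and j not in CLEAR_F and j not in CLEAR_NOT_F:
            ext[j] = value


def march(ext, order):
    """Walk the series in the given order and return the list of (object, verdict)."""
    answers = []
    previous = None
    for i in order:
        # Answer according to the currently intended interpretation; on an
        # undecided case, Freedom lets the speaker repeat the last verdict.
        value = ext.get(i, previous)
        classify(ext, i, value)
        answers.append((i, value))
        previous = value
    return answers


ext = initial_extension()
out = march(ext, SERIES)             # the way out: from clear F towards clear not-F
back = march(ext, reversed(SERIES))  # the way back, in the context left by the way out

last_yes_out = max(i for i, v in out if v == 1)
first_yes_back = max(i for i, v in back if v == 1)
```

Under these assumptions the speaker stops saying 'Yes' at object 13 on the way out but does not start saying 'Yes' until object 5 on the way back, because each verdict pushes the boundary ahead of the walker in the direction of travel.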
At any stage of the conversation or discourse, one partial interpretation is the intended one; speakers have the freedom to classify borderline cases of a predicate one way or the other, and so classifying them changes which interpretation is intended in such a way that the classification is true.

⁹¹ In fact, not all contextualists accept their inheritance. For example Soames 1999, 214 ff. offers a rather more complex solution to the regular Sorites paradox, based upon the solution to the dynamic Sorites paradox provided by his contextualism (see on).

⁹² There is clearly a problem looming here when we get all the way through the borderline cases: when we classify the last borderline case as P, we thereby push the next object in the series into the positive extension of P also. But the next object in the series was not a borderline case of P —it was a clear negative case—and Freedom and Power tell us only that we can make borderline cases into positive or negative cases. For an objection along these lines directed specifically at Soames’s contextualism, see Robertson 2000; for a reply see Soames 2002.

The only difference from the recursive view concerns what is said about compound sentences. In particular, then, the response to the Sorites paradox will be different—but the basic form of the response will be the same: the contextualist inherits a solution from the view which she takes as her starting point, and can then add to this a further explanation of why we find the Sorites compelling, as well as an account of the dynamic Sorites paradox.

What about many-valued contextualism? Recursive three-valued contextualism is perfectly analogous to recursive partial contextualism, and likewise for their non-recursive counterparts. With more than three truth values, the basic story is still the same: the only difference is that now, presumably, the rules of adjustment should say not only which other objects (apart from the one we are directly classifying) should get reclassified by the extension of the vague predicate in question, but how much they should be reclassified.⁹³

What if we were to build our contextualism on a foundation of classical interpretations? Recall that Freedom tells us that if P is vague and a is a borderline case of P, then a competent speaker is free to assert Pa or to assert ¬Pa, without compromising her competence. Thus far we have taken the borderline cases to be the objects which the extension of P sends nowhere. When we move to classical interpretations, there are no such objects, so we need to interpret ‘borderline case’ differently. An obvious move would be to borrow the epistemicist’s idea that a borderline case is one of which we cannot know whether it is P. Power then tells us that if a speaker does assert Pa or ¬Pa of such an object a, then which interpretation is intended changes in such a way that the assertion is true.⁹⁴ This is importantly different from the epistemicist view.
On the latter view, while you might not show incompetence with P by asserting Pa when a is a borderline case which in fact is not in the extension of P, you would still say something false. On the contextualist, but not the epistemicist view, saying makes it so. ⁹³ Goguen 1968–69, 351 considers a view according to which the algebra of truth values may vary from context to context. This is much more radical than fuzzy contextualism in my sense, according to which the intended interpretation may vary from context to context, but is always a fuzzy interpretation—i.e. is always based on the same underlying algebra of fuzzy truth values. ⁹⁴ This might not in fact involve a change of intended interpretation if a already was P (unbeknownst to the speaker) and the speaker asserted Pa, or if a already was not P (unbeknownst to the speaker) and the speaker asserted ¬Pa. But even in these cases, it might still involve a change of interpretation, if a was very close to the border: the rules of adjustment might then shift the border, even though the border does not cross a.
Is there any advantage for a contextualist in going classical? Simplicity and conservativeness are the obvious attractions. There is also a big disadvantage: we run headlong into the location problem, the avoidance of which was an advantage of moving from the classical picture to partial (or many-valued) interpretations. The classical contextualist tells us that in any context, there is a unique intended classical interpretation of our language. She thus posits bipartite semantic distinctions (between the bald and the non-bald, etc.) where we have tripartite distinctions of usage (those we agree are bald, those we deny are bald, and those over whom we hedge). The problem then is how the dividing line on the semantic side could have got to be located where it is. Telling us that it will be located somewhere else in a minute does not help at all with this problem. Note that while classical contextualism, non-contextualist supervaluationism, and non-contextualist plurivaluationism all have something in common—namely, they consider a range of classical models—there are crucial differences between these three views. Classical contextualism and non-contextualist plurivaluationism are alike—and unlike non-contextualist supervaluationism—in that they regard the classical models they consider as genuine interpretations of the language (not as mere auxiliary calculating devices). Classical contextualism and plurivaluationism differ, in that the contextualist considers these models to be correct interpretations of the language one at a time, whereas the plurivaluationist considers these models to be acceptable interpretations of the language all at once (i.e. plurivaluationism is a synchronic account of vagueness, while contextualism is a diachronic account). Finally, there is a version of contextualism built on plurivaluationist foundations.
Indeed, if our reasons for being a plurivaluationist are of the sort I mentioned earlier, then contextualism is a very happy addition to the view. The sort of plurivaluationist I am thinking of is one who regards a unique intended classical interpretation as a goal which we never actually reach: our usage rules out certain interpretations as incorrect, but does not narrow down the remaining set of acceptable interpretations to just one member. Contextualism sits easily with this view. As we newly classify borderline cases of P as positive/negative cases, we narrow down the set of acceptable interpretations in such a way that an interpretation on which Pa is false/true is no longer acceptable.
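The narrowing of acceptable interpretations just described can be sketched in a few lines of Python. This is only an illustrative toy, not part of the view itself. It assumes that each classical interpretation of 'bald' is fixed by a single cutoff (the largest number of hairs that still counts as bald); the function names and the numbers are hypothetical.

```python
# Toy model (assumption): a classical interpretation of 'bald' is identified
# with a cutoff c, meaning "a man with n hairs is bald iff n <= c".

def is_bald(cutoff: int, hairs: int) -> bool:
    """Truth value of 'a man with `hairs` hairs is bald' on one interpretation."""
    return hairs <= cutoff


def assert_bald(acceptable: set[int], hairs: int) -> set[int]:
    """Classify a case as positive: drop interpretations on which it is not bald."""
    return {c for c in acceptable if is_bald(c, hairs)}


def assert_not_bald(acceptable: set[int], hairs: int) -> set[int]:
    """Classify a case as negative: drop interpretations on which it is bald."""
    return {c for c in acceptable if not is_bald(c, hairs)}


def is_borderline(acceptable: set[int], hairs: int) -> bool:
    """Borderline: the acceptable interpretations disagree about the case."""
    verdicts = {is_bald(c, hairs) for c in acceptable}
    return len(verdicts) == 2


# Usage leaves many cutoffs open (the numbers are made up for illustration).
acceptable = set(range(100, 201))
after_positive = assert_bald(acceptable, 150)           # cutoffs 150..200 remain
after_negative = assert_not_bald(after_positive, 180)   # cutoffs 150..179 remain
```

On this toy picture, meaning is indeterminate at a time (several cutoffs remain acceptable) and variable over time (each new classification shrinks the set), which is the combination the text attributes to contextualism built on plurivaluationist foundations.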
I have not indicated in the title of this section how contextualism differs from the classical picture. Of course, contextualism built on a non-classical foundation differs from the classical picture in an obvious way—but what about contextualism built on a foundation of classical interpretations? One thing which is clear is that the classical picture does not demand that which interpretation of a given language is the intended one be fixed for all time. One way of seeing this is to note that the classical picture is meant to give an account of mathematical language—and it is a commonplace that mathematicians use the same symbol to mean different things in different contexts. But this fact does not establish that contextualism is already part of the classical picture. We already saw that everyone should admit large-scale contextual variation. What is distinctive of contextualism is the positing of a particular mechanism for changing the intended interpretation, which applies in particular to vague predicates. The mechanism is the combination of Freedom (if P is vague and a is a borderline case of P, then a competent speaker is free to assert Pa or to assert ¬Pa, without compromising her competence) and Power (if a speaker does assert Pa in such a situation, then this assertion thereby becomes true). Now clearly Freedom and Power are not part of the classical picture—but also, they do not conflict with any part of it. The classical picture allows variation of the intended interpretation with context; contextualism posits a particular mechanism—specifically involving vague predicates—which generates such variation. So, unlike proponents of any of the other views we have examined—with the exception of epistemicism—the (classical) contextualist does not reject any part of the classical picture; but unlike the epistemicist, the contextualist does not simply accept the classical semantic story as is: she adds something to it. 
This addition—Freedom plus Power as a mechanism for changing the intended interpretation—can also be added to other sorts of semantic view, apart from the classical one: to many-valued views, partial views, and so on. That is why we can have a contextualist version of any of the views discussed in previous sections.
2.6.1 Worldly Vagueness?
We turn finally to the question of whether the contextualist locates vagueness in the relationship between language and the world, or in the world itself. The answer is that this depends entirely upon what sorts of interpretations are chosen as the basis for the contextualist view in question. Worldly vagueness is a matter of the internal nature of our models; semantic indeterminacy is a matter of the relationship between a language and its models. A model is a picture of what the world is like. Classical models, for example, tell us that all the properties and relations in the world are precise: for each property and each object in the world, the object either possesses that property (outright) or does not possess it (at all). Fuzzy models, for example, tell us that the world contains inherently vague properties and relations: there are objects and properties such that—out there in the world, quite independently of human thought and language—that object possesses that property to an intermediate degree. So if a contextualist builds her view on classical foundations, she posits no worldly vagueness; if she builds her view on fuzzy foundations, she does posit worldly vagueness; and so on for the other sorts of models she might take as the basis for her view. If there is worldly vagueness in the base models, the contextualist superstructure cannot take it away; if there is no worldly vagueness in the base models, the contextualist superstructure cannot introduce it. Semantic indeterminacy enters when a view tells us that which object or property is picked out by a name or predicate is indeterminate: there are multiple, equally correct answers. In other words, semantic indeterminacy enters when a view tells us that instead of one correct interpretation, a language has (at some time) many acceptable interpretations. This is what plurivaluationism tells us; it is not what contextualism tells us (except for contextualism built on a plurivaluationist foundation).
The contextualist tells us that meaning is variable—it is one unique way (there is one intended interpretation) at any given time, but different ways at different times—not that it is indeterminate. (Contextualism built on plurivaluationist foundations tells us that meaning is both indeterminate—at a time—and variable—over time.)
2.7 Intuitionism (Assert Nothing)
Putnam (1983b, 284–6) suggests that the best logical system for schematizing inferences that involve vague terms is intuitionism: we should “treat vague predicates (e.g. ‘bald’) just as undecidable predicates are treated in intuitionist logic”. We accept as valid the argument:
1. ∀n (If a man with n hairs is bald, then a man with n + 1 hairs is bald).
2. A man with 0 hairs is bald.
3. ∴ ∀n (A man with n hairs is bald).
We also accept that 2 is true and 3 is false. Thus we conclude that 1 is false, and we accept its negation. However, reasoning intuitionistically, we are not then forced to conclude:
4. ∃n (A man with n hairs is bald, and a man with n + 1 hairs is not bald)
as we would be if we were reasoning classically. Thus we can defuse the Sorites paradox, without being committed to the highly unintuitive 4. Read and Wright (1985) argue that if intuitionist logic is combined with the standard intuitionist semantics for negation, then the Sorites paradox recurs in a modified form. Putnam (1985) replies that he “was proposing a logic for inferences involving vague predicates, not a ‘semantics’, so the attack on intuitionist semantics is another example of ignoratio elenchi”.⁹⁵ Putnam’s, then, is an example of a ‘no semantics’ approach to vagueness. Such an approach differs from the classical picture not over this or that matter of detail—whether concerning the internal nature of the models countenanced or the external relationship between a language and its intended model(s)—but by not proposing any semantic picture at all. In §2.1.3.1, I contrasted semantic realism and antirealism, and said that the majority of current approaches to vagueness assume a semantic realist viewpoint. Here, however, we have a view of vagueness from an antirepresentationalist perspective: a view according to which we approach vague language not in terms of its referential and representational relationships to the world, but in terms of what one is permitted to assert at a given point.
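The constructive half of the Sorites reasoning in this section can be checked mechanically. The Lean 4 sketch below is an illustration only, not Putnam's own formulation: `Bald n` is a hypothetical predicate standing for 'a man with n hairs is bald'. It derives the negation of premise 1 from premise 2 and the falsity of 3 without appealing to any classical axiom. The further step to 4 is deliberately not derived, since passing from the negation of a universal claim to an existential claim is in general a classical move.

```lean
-- `Bald n` (hypothetical name): "a man with n hairs is bald".
theorem premise_one_false (Bald : Nat → Prop)
    (h2 : Bald 0)                 -- premise 2
    (h3 : ¬ ∀ n, Bald n) :        -- conclusion 3 is false
    ¬ ∀ n, Bald n → Bald (n + 1) := by
  intro h1                        -- suppose premise 1
  apply h3                        -- it suffices to prove 3 and contradict h3
  intro n
  induction n with
  | zero => exact h2
  | succ k ih => exact h1 k ih
```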
In much the way that model theory is the natural framework within which to discuss and develop views of vagueness from a semantic realist perspective, proof theory is the natural framework within which to discuss and develop views of vagueness from an antirepresentationalist perspective. Putnam’s is such a proof-theoretic, as opposed to model-theoretic, proposal.⁹⁶ It is worth noting the possibility of a proof-theoretic/antirepresentationalist as opposed to model-theoretic/semantic realist approach to vagueness, but I will not discuss the former sort of approach in this book. That is because the interesting issues here are all concerned with which sort of approach to meaning is better in general—rather than having anything in particular to do with vagueness—and because (relatedly) there simply is not the weight of detailed theories of vagueness on the proof-theoretic side that there is on the model-theoretic side, and so there is less to talk about on the former side, and fewer interesting connections and patterns to discern between different theories. Perhaps if the general debate between semantic realism and antirealism moves in a certain direction, a proof-theoretic approach to vagueness will become a serious contender—but at this stage the race is between rival model-theoretic approaches.
⁹⁵ For further discussion of Putnam’s view see Schwartz 1987, 1990; Rea 1989; Schwartz and Throop 1991; Putnam 1991; Williamson 1996a; and Chambers 1998. For a different intuitionist approach to vagueness, see Wright 2001.
⁹⁶ Of course, proponents of model-theoretic approaches to vagueness may explore the question of the appropriate proof theory for reasoning involving vague language—but they will regard proof theory as semantically grounded. That is, they will regard a system of proof theory as correct if it always preserves truth, or some other important semantically defined property. What I am calling the proof-theoretic approach, by contrast, regards proof theory as able to stand alone, without (model-theoretic) semantics to give it life.
PART II
Vagueness
3 What is Vagueness?
In the Introduction, I presented three standard characterizations of vague predicates: they give rise to borderline cases; their extensions have blurry boundaries; and they generate Sorites paradoxes. As noted, these three characterizations seem to be quite closely related to one another—yet not necessarily so closely related that they are simply three ways of saying the same thing. So what is the relationship between them? Rather than three piecemeal characterizations of vagueness, it would be desirable to have a fundamental definition of vagueness: a statement of what is of the essence of vagueness, given which, we can see why vague predicates have borderline cases, generate Sorites paradoxes, and draw blurred boundaries. Such a definition can then serve as a guide to finding the correct theory of vagueness: we need a theory which accommodates the fact that vague predicates have the essential feature(s) pinpointed in our definition. This chapter is devoted to the task of providing such a definition of vagueness. In §3.1, I explain what I take the task of defining vagueness to involve, and why this task is important. In §3.2, I examine and criticize existing definitions of vagueness. In §3.3, I explain a key notion necessary for an understanding of my definition of vagueness. In §3.4, I present my definition, and in §3.5, I present its advantages.
3.1 What Should We Want from a Definition of Vagueness?
There are at least two things we could mean when we speak of a ‘definition’ of a property, object, or phenomenon P. First, we could mean a surface characterization: a set of manifest conditions, possession of which marks out the P’s from the non-P’s. Second, we could mean a fundamental definition which pinpoints the underlying nature or essence of the P’s. To illustrate this distinction, consider the following exchange:
A. What is water?
B. It’s the ‘watery stuff’: the clear liquid that falls from the clouds and fills rivers and lakes . . .
A. No, I mean, what is water?
B. H₂O.
B first offers a surface characterization of water, and then offers a fundamental definition. There seems to be room for this sort of distinction when it comes to vagueness:
A. What is a vague predicate?
B. One that admits of borderline cases, draws blurred boundaries, and generates Sorites paradoxes.
A. No, I mean, what is a vague predicate?
B. Good question!
We have an adequate surface characterization of vague predicates (the one B just gave, i.e. the conjunction of our three informal characterizations). But there seems to be room for a further question: what is the fundamental underlying nature or essence of vague predicates, which explains why they manifest in the ways featured in this surface characterization? That is the question of this chapter. (Henceforth, when I speak simply of a ‘definition’, I mean a fundamental definition, and when I speak simply of a ‘characterization’, I mean a surface characterization.) Here are some desiderata on a fundamental definition of a property, object, or phenomenon P: the definition should be a statement about what it is to be P that is true, useful, and fundamental. The first and third desiderata are straightforward. A fundamental definition of P must capture the fundamental fact(s) about P. It must get to the heart of the matter and capture not just any truths about P, but the fundamental truths, from which others follow. The second desideratum covers at least the following. A definition must not be circular: ‘human beings are human beings’ may capture a fundamental fact about human beings, but it is not an adequate definition of human being. A definition must be clear and rigorous: ‘vague predicates draw blurred boundaries’ captures a fundamental fact about vague predicates, but it is not sufficiently clear or perspicuous to be an adequate definition of vagueness (see §3.2). A definition should not (unless this is
shown to be unavoidable) define some problematic notion in terms of even more problematic notions. Finally, a definition of P must link usefully with the project of offering a substantive theory of P: the definition tells us what are the fundamental facts about P, and the theory of P then needs to account for these facts.¹ The latter point is crucial, and indeed it underlies the need to search for a fundamental definition of vagueness at all, rather than simply resting content with our surface characterization. As discussed at greater length in the Introduction (pp. 6–9), we are at a point in the study of vagueness at which we have a large number of competing theories, and need a way of deciding amongst them. My proposed strategy for making progress here—i.e. in the debate as to what is the correct theory of vagueness—is to proceed via the question of giving a fundamental definition of vagueness. When vagueness is characterized in terms of borderline cases, blurred boundaries, and Sorites-susceptibility, all the main existing types of theory of vagueness can be seen as accommodating vagueness. This leads to the ‘too many theories’ problem: the existing theories cannot all be right, as they conflict with one another; yet how to choose between them, if they all accommodate vagueness perfectly well? My idea for a way forward here is that the situation might well be very different if we had to hand a sharp definition of the core property underlying the various surface phenomena used to characterize vagueness. Given a fundamental definition of vagueness, if we then ask whether each type of theory allows for the existence of predicates possessing the feature isolated in this definition as being of the essence of vagueness, it may well turn out that the answer is ‘No’. Indeed, it is hard to see that there is any other way forward on the ‘too many theories’ problem. The only other strategy which seems to be available is some version of cost/benefit analysis. 
¹ Cf. Suppes 1957, 151: “Many textbooks . . . promulgate the four traditional ‘rules’ of definition: 1. A definition must give the essence of that which is to be defined. 2. A definition must not be circular. 3. A definition must not be in the negative when it can be in the positive. 4. A definition must not be expressed in figurative or obscure language.” 1 and 3 come under my heading ‘fundamental’; 2 and 4 under ‘useful’.
In its strong form, this consists in arguing that all the alternatives to one’s favoured theory are untenable (cf. Williamson 1994); in its moderate form, it consists in conceding that there is more than one workable theory, but arguing that nevertheless, when all the pros and cons of the various theories are weighed up, one theory comes out ahead of the others on balance (cf. Keefe 2000). The strong form of cost/benefit analysis offers no way forward at this point in the debate, because while some theories are generally regarded as untenable, there is a range of live alternatives in the literature which have been developed and defended to such a level of sophistication that it is simply not credible to say that all but one of them is untenable. In short, while the strong form of cost/benefit analysis would be great if it worked, in fact it simply fails to yield a winner in the case of theories of vagueness. The moderate form of cost/benefit analysis also offers no clear way forward. It suffers from two problems. First, it is inherently limited to yielding a provisional result at best: for new objections and defences are coming to light in the journals at a great rate. Second, two authors can weigh up the same theories—considering the same objections to, and replies on behalf of, each—and yet come to a different conclusion concerning which theory is best on balance. Thus the cost/benefit model desperately needs to be supplemented by theory-neutral criteria for weighing up theories: and we have no such criteria. Related to this second point, it seems that the search for criteria for weighing up the costs and benefits of rival theories of vagueness cannot succeed independently of the project of giving a fundamental definition of vagueness—and so moderate cost/benefit analysis is not, in the end, an alternative to my own strategy. Let me explain. In the debate on vagueness, one often hears arguments such as the following: “My theory is better than yours, because mine retains the validity of the law of excluded middle, while yours does not.” Now in fact the weight of this argument is entirely unknown, unless we have some idea of the defining features of vagueness.
For suppose it is the case that the essential feature of vague predicates is that they admit borderline cases (in some specific sense), and suppose that no theory which retains excluded middle can allow for the existence of predicates which have borderline cases (in this sense). If that were the case, then far from being a problem, abandoning excluded middle would be a necessary condition on any correct theory of vagueness. Now I am not saying that these claims about borderline cases and excluded middle are true. I am simply saying that we cannot assess the relevance of retaining excluded middle until we have investigated such questions. A fundamental definition of vagueness will answer these questions. If
admitting borderline cases (in some specific sense) is essential to vagueness, the definition will tell us this (and it will specify which sense). Thus, one reason why we need the correct definition of vagueness before we can find the correct theory of vagueness is that we need to know what the essential features of vagueness are in order to assess proposed theories of vagueness—and a correct definition will tell us what these essential features are. It might seem that there is a problem with my proposed strategy. For some feel that a ‘definition’ of P must be something on which all the major theories of P can agree. The rationale here is that if proponents of theories A and B disagree over the definition of P, then they do not have a genuine disagreement about the nature of P: rather, theories A and B are really theories about different things, and their proponents are talking past one another.² But there is no problem here for my strategy. For the requirement that a ‘definition’ of vagueness should be compatible with all theories of vagueness is plausible when we read ‘definition’ as ‘surface characterization’, but not when we read it as ‘fundamental definition’.³ Suppose two scientists are investigating the underlying nature of ‘water’. If one characterizes ‘water’ as the clear liquid that falls from clouds (etc.), and the other characterizes it as the white liquid that comes from cows’ udders (etc.), then when one says that the fundamental definition of water is that it is H₂O, and the other says that the fundamental definition of water is that it is a mixture of 90 per cent H₂O and 10 per cent C₁₂H₂₂O₁₁,⁴ then it looks as though they are indeed talking past one another, rather than having a genuine disagreement about the nature of water. But if they agree on the surface characterization of ‘water’ as the clear liquid that falls from clouds (etc.), and then one says that the fundamental definition of water is that it is H₂O, and the other says that the fundamental definition of water is that it is a mixture of 90 per cent H₂O and 10 per cent C₁₂H₂₂O₁₁, then there is no reason at all to think that they do not have a genuine disagreement about the nature of water. In this second case, we should not think that they are talking past one another; we should think that at least one of them is just wrong about what water really is. Now in the vagueness debate, all parties do indeed need to agree on the surface characterization of ‘vagueness’, in order to avoid talking past one another. But this requirement seems to sort itself out very smoothly in practice, to the extent indeed that it hardly seems necessary to point it out: those who are talking about something under the heading ‘vagueness’ that does not admit of borderline cases, draw blurred boundaries, and generate Sorites paradoxes, are in fact rapidly identified, and confusion is avoided.⁵ On the other hand, all parties do not need to agree on the fundamental definition of vagueness. In fact, it is hard to see how that definition could be theory-neutral, given that the different rival theories of vagueness cannot all be correct. Whatever the fundamental nature of vagueness, it surely cannot be compatible with what epistemicists, supervaluationists, and degree theorists (for example) say about vague predicates, for these theorists say incompatible things. Given that the fundamental definition of vagueness does not have to be compatible with all theories of vagueness, we cannot rule out a proposed definition on the grounds that it conflicts with some existing theory of vagueness. How, then, do we judge proposed definitions? What should make us think that one fundamental definition of vagueness is correct and another incorrect? The answer is simple. We should judge a definition by its clarity and rigour, and by how satisfying and how unified an account it yields of the various aspects of vagueness, including (especially) the three properties that feature in the surface characterization of vagueness.
² For example, Greenough 2003 asserts that by characterizing vagueness from a neutral standpoint, we “can at least ensure that we are all talking about the same thing from the outset in our inquiry into the nature and source of vagueness” (p. 235), and claims that if we failed to be neutral in our initial characterization of vagueness, then there would “be a very real sense in which there would be no disagreement about the character of vague language at all since each partisan would mean something different by the predicate ‘is vague’” (pp. 238–9). On a similar note, Bueno and Colyvan 2006 write: “A definition of vagueness should not prejudice the question of how best to deal with it” (p. 1); “What we’re after is a definition of ‘vagueness’ that does not beg any questions about how vagueness is best treated” (p. 4); and “without a definition of ‘vagueness’ it is not even clear that the various theories are theories of the same phenomenon” (p. 5). And Williamson 1994, 2 writes that “we can agree to define the term ‘vagueness’ by examples, in order not to talk past each other when disagreeing about the nature of vagueness”. See also Shapiro 2006, 1.
³ I do not mean here to be criticizing the authors cited in n. 2, for I am not claiming that they intended ‘definition’ in the sense of ‘fundamental definition’.
⁴ In order to get the point of these examples, this needs to be understood as the (actual) chemical composition of cows’ milk. In fact, milk contains less lactose (C₁₂H₂₂O₁₁) than this, and some other things besides water and lactose, but I have left out the other ingredients for the sake of simplicity.
⁵ For example, some writers in other disciplines—e.g. linguistics and engineering—sometimes talk about other phenomena—e.g. lack of specificity or contextual reference-shifting—under the heading ‘vagueness’; but philosophers easily recognize (see e.g. Greenough 2003 n. 27) that these authors are lumping distinct phenomena in with the one that we investigate under the heading ‘vagueness’.
3.2 Existing Definitions of Vagueness
It might seem that we do not have to look far for a definition of vagueness—that not only do the three characterizations offered at the outset between them delineate the class of vague predicates, but that one or other of these three characterizations can in fact serve as a fundamental definition of vagueness. It is not hard to see, however, that none of the three characterizations can play this role.⁶
First, the borderline case idea. This does not yield an adequate definition of vagueness, for the borderline case phenomenon is not sufficiently fundamental. The existence of borderline cases for a predicate F is quite consistent with there being perfectly clear and sharp divisions between the cases to which F applies, the cases to which it does not apply, and the borderline cases. However, this sort of clear demarcation of cases is something which the blurred boundaries idea—and hence the ordinary idea of vagueness, which is partially characterized by the blurred boundaries idea—rules out. Consider the predicate ‘is schort’, which I shall introduce into the language right now, as follows:⁷
1. If x is less than four feet in height, then ‘x is schort’ is true.
2. If x is more than six feet in height, then ‘x is schort’ is false.
(The end)
The predicate ‘is schort’ has borderline cases: all persons between four and six feet in height (inclusive). If asked whether such a person is schort, we would react with a hedging response. It is quite unclear whether ‘x is schort’ is true or false—or neither, or something else—when x is between four and six feet in height. Unlike the ordinary predicate ‘is short’, however, ‘is schort’ doesn’t seem to be genuinely vague. This is because while there are borderline cases for ‘is schort’, there is no unclarity about where these borderline cases begin and end.
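The stipulation of ‘is schort’ can be written out as a small Python sketch, which makes the sharpness of its borderline region easy to see. This is an illustration under stated assumptions: the function and constant names are hypothetical, and returning `None` for the borderline region is just one way of modelling the fact that the two clauses are silent there (the text leaves open whether such cases are true, false, neither, or something else).

```python
from typing import Optional

FEET_LOWER = 4.0  # below this height, 'x is schort' is stipulated to be true
FEET_UPPER = 6.0  # above this height, 'x is schort' is stipulated to be false


def is_schort(height_ft: float) -> Optional[bool]:
    """Return True/False where the stipulation speaks, None where it is silent."""
    if height_ft < FEET_LOWER:
        return True
    if height_ft > FEET_UPPER:
        return False
    return None  # borderline: neither clause says anything about this case


def classify(height_ft: float) -> str:
    """Sort a height into one of the three sharply bounded classes."""
    verdict = is_schort(height_ft)
    if verdict is None:
        return "borderline"
    return "schort" if verdict else "not schort"
```

For example, `classify(3.99)` gives `"schort"` while `classify(4.0)` gives `"borderline"`: the borderline cases exist, but they begin and end at exact points, which is why the predicate does not seem genuinely vague.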
The delineation between the things which clearly are schort, the things which clearly are not, and the borderline cases, is too sharp for ‘is schort’ to count as genuinely ⁶ Obviously the conjunction of the three—i.e. our surface characterization of vagueness—is too motley a thing to serve itself as a fundamental definition. ⁷ Cf. Fine’s ‘‘nice1 ’’ (1997 [1975], 120), Soames’s ‘‘smidget’’ (1999, 164), and Tappenden’s ‘‘tung’’ (1993, 556).
134 vague. Thus, merely possessing borderline cases is not enough for true vagueness.⁸ One possible response to this problem is to retain the definition according to which a predicate is vague if and only if it has borderline cases, but to posit an additional phenomenon called higher-order vagueness, over and above (first-order) vagueness, and to say that ordinary vague predicates are also higher-order vague, whereas ‘is schort’ is only first-order vague. A predicate is first-order vague iff it has borderline cases; a predicate is second-order vague if it has borderline cases of borderline cases (things of which it is unclear whether the predicate applies to them or they are borderline cases, and things of which it is unclear whether the predicate does not apply to them or they are borderline cases); a predicate is thirdorder vague if it has borderline cases of borderline cases of borderline cases; and so on. I shall explain why I think this approach is mistaken in §3.5.5. There is another problem for the borderline case definition. If it is to yield a clear and perspicuous account of vagueness, it must be supplemented by a definition of borderline case (Bueno and Colyvan 2006, 3). We have been working with a rough and ready characterization of borderline cases, according to which a borderline case of a predicate is a thing to which it is unclear whether or not the predicate applies⁹—and where the presence of such unclarity is indicated by our tendency to give a hedging response when asked whether the predicate applies to the thing in question.¹⁰ This is fine as part of a trio of informal characterizations which together delineate the class of vague predicates. It would not be fine as part of a fundamental definition of vague predicates as predicates which admit of borderline cases. 
For clearly there are many precise predicates which have borderline cases in our sense simply because we are ignorant about or uncertain as to whether these predicates apply to certain objects. For example, when asked whether the predicate ‘is travelling faster than any polar bear travelled on 11th January 1904’ applies to the car we are driving in, we would certainly react with a hedging response—and yet this predicate is just as precise as the predicate ‘is travelling faster than 31 kilometres ⁸ This point has been made independently by a number of authors, including Sainsbury 1991, 173, Keefe and Smith 1997a, 15, Fara 2000, 47–8, and Shapiro 2006, 1 n. 1; cf. also Eklund 2001, 376. ⁹ Cf. Williamson 1994, 2, and Keefe and Smith 1997a, 2. ¹⁰ Cf. Fara 1997, 81; 2000, 76.
135
per hour’. One way to remedy this situation would be to add to our definition of a borderline case that the uncertainty involved must not be a matter of ignorance (Peirce 1902; Black 1997 [1937], 71). Of course, our definition of vagueness would then rule out the epistemic theory of vagueness in advance—but I have already said that I do not think that a definition of vagueness need be neutral between all possible theories of vagueness. However the proposal does face a critical problem, which is that it clearly cannot be stating the fundamental fact about vague predicates. If we are told that vague predicates admit of cases where we are uncertain whether the predicate applies, and where this uncertainty is not a matter of ignorance, then we have been told only part of the story about what vagueness is, supplemented by a story about what it is not. We immediately want to ask, ‘‘Then of what is our uncertainty a matter? Why are we uncertain?’’ It is only an answer to this further question that would get to the heart of vagueness. Giving rise to borderline cases—in the sense under discussion—can be regarded only as a symptom of vagueness, rather than as constitutive of vagueness. We could remedy this problem by saying that an object x is a borderline case of the predicate P if there is no fact of the matter whether P applies to x (Sainsbury 1995), or if ‘x is P’ is neither true nor false (Fine 1997 [1975]; Tappenden 1993), or if neither ‘x is P’ nor ‘x is not P’ are true (Shapiro 2006, 2).¹¹ However, we would still be left with the original problem, illustrated by ‘is schort’. I could have explicitly said, when I introduced this predicate into the language, that it is neither true nor false of things between four and six feet in height,¹² or that there is no fact of the matter whether it applies to such things—rather than simply remaining silent about these cases. Either way, ‘is schort’ still would not be vague.¹³ ¹¹ Cf. also Schiffer 2000. 
Again, this would rule out certain theories of vagueness in advance (epistemicism and theories which say that vague predicates are both true and false of their borderline cases), but I do not regard this as a problem per se. ¹² Cf. Sainsbury’s ‘child∗ ’ 1991, 173. ¹³ The issue of how to define ‘borderline case’ brings out a problem with Hyde’s 1994 defence of the borderline case definition of vagueness against the ‘is schort’ problem. Hyde argues that the notion of borderline case is itself a vague notion, and hence has borderline cases: thus, when we say that a predicate is vague if it has borderline cases, we are not leaving open the possibility of a predicate which is vague, and yet for which there is a sharp jump between the definite cases (or definite non-cases) and the borderline cases. The problem with this approach becomes clear when we ask what ‘borderline case’ could mean, on Hyde’s approach. Under the operational definition of a borderline case as a case over which we hedge, clearly ‘borderline case of P’ will have borderline cases for many predicates P. However, we have already seen the problems with plugging this definition of ‘borderline case’ into a
Second, the blurred boundaries idea. The borderline case characterization says that we cannot have one sharp line around the things which are F, where F is a vague predicate; but it does not rule out two concentric sharp lines (as in the case of ‘is schort’). The blurred boundary idea does rule this out. Unlike the borderline case idea, the blurred boundary idea does seem to be truly fundamental—to capture what is essential to vagueness. However, this second idea also fails to yield an adequate definition, for it is not precise or perspicuous enough to be really useful. It is suggestive, but that is all: when it comes to getting a clear understanding of vagueness and its associated problems, the blurred boundaries metaphor is too slippery to be of real help.¹⁴

Third, the Sorites-susceptibility idea. This idea is not sufficiently fundamental to form the basis of an adequate definition of vagueness. For we want to say that vague predicates engender Sorites paradoxes because they are vague; but if we define vagueness as Sorites-susceptibility—if we say that their giving rise to such paradoxes is constitutive, rather than a symptom, of their vagueness—then we cannot do this: we miss out on this explanation.¹⁵ Thus, while giving rise to Sorites paradoxes is an interesting and important characteristic of vague predicates,¹⁶ still it seems that giving rise to such paradoxes cannot be the fundamental fact about such predicates.¹⁷

definition of vagueness. Under the definition of ‘borderline case of P’ as an object x such that ‘x is P’ is neither true nor false, on the other hand, we can accept that there are borderline cases of borderline cases only if we suppose that the language in which we describe the semantics of vague predicates is itself vague. I argue in §6.2.1 that we should not go down this route. 
(For other objections to Hyde, see Tye 1997 [1994], who argues that the notion of a borderline case is not as vague as Hyde claims, and Varzi 2003a, who argues that there is a circularity in Hyde’s arguments. For replies to these objections, see Hyde 2003.) ¹⁴ Frege was fully aware of the limitations of his metaphor: ‘‘this is admittedly a picture that may be used only with caution’’ (Beaney 1997, 259). ¹⁵ Compare Johnston’s 1989, 1993 missing explanation argument against dispositional theories of value and colour. Compare also my argument above against the Peirce/Black version of the borderline case definition of vagueness. ¹⁶ Although see the end of §3.5.4 below for discussion of the view that not all vague predicates give rise to Sorites paradoxes. Note that while this view is automatically ruled out by a strict wielding of our surface characterization of vague predicates as those which admit of borderline cases, draw blurred boundaries and generate Sorites paradoxes, it is not ruled out by a more subtle wielding of that characterization which allows a few (but not too many) exceptional vague predicates, i.e. which allows us to count a few predicates as vague if they do not generate Sorites paradoxes, but do admit of borderline cases and do draw blurred boundaries, and are relevantly similar to some other predicates which satisfy all three parts of the surface characterization. ¹⁷ There is another problem for this definition—paralleling the secondary problem for the borderline case definition noted above—which is that it needs to be supplemented by a definition of Sorites argument (Bueno and Colyvan 2006, 5).
If we cannot use any of our three initial characterizations as a fundamental definition of vagueness, we shall have to look elsewhere. One view in the literature is that vagueness is semantic indeterminacy of the sort involved in plurivaluationism. As Braun and Sider (2007, 134) put it: ‘‘Like many, we think that vagueness occurs when there exist multiple equally good candidates to be the meaning of a given linguistic expression . . . Vagueness is a type of semantic indeterminacy.’’¹⁸ But this cannot be the essence of vagueness, because we could clearly have semantic indeterminacy without vagueness. If Quine is right, ‘gavagai’ is semantically indeterminate (see §6.1.1)—but we feel no temptation to call ‘gavagai’ vague (in the sense in which we are interested—recall n. 5 above): after all, it does not (even if Quine is right) draw blurred boundaries, nor does it generate a Sorites paradox (and while, as we have seen, neither of these features gives the essence of vagueness, they are still marks of vagueness). Similarly, if Field (1973) is right, ‘mass’ as used before relativity theory is semantically indeterminate—but we feel no temptation to call it vague: again, it does not (even if Field is right) draw blurred boundaries, nor does it generate a Sorites paradox. Greenough (2003) has proposed a definition of vagueness as epistemic tolerance. The proposal is that a sentence S is vague just in case, for any two cases which differ by at most a small amount in the one respect that makes a difference to the truth of S, if a speaker knows that S is (say) true in one case, then she does not know that S is not true in the other case. 
(The full definition is that a sentence S is vague just in case it yields a truth when substituted into the schema

$$\forall\tau\,\forall\alpha\,\forall\beta\quad \text{if } |v(\beta) - v(\alpha)| < c \text{ and } K_s(S \text{ is } \tau) \text{ in } \alpha \text{ then } \neg K_s(S \text{ is not-}\tau) \text{ in } \beta$$

where τ is a variable which ranges over the set of truth states {true, determinately true, not true, not determinately true, determinately determinately true, . . . }, α and β are variables which range over actual and counterfactual cases, v is a function from actual and counterfactual cases to non-negative real numbers, such that the truth of S depends only on the value of v, c is some small positive real number, and $K_s$ abbreviates ‘It is known by speaker s that’.) Greenough recognizes that as stated, this definition will over-classify, and so adds three conditions which “ensure that a speaker’s ignorance does not result from the wrong source but solely from the vagueness of the sentence S”: speaker s knows the value of v in every case; s knows the meaning of S; and “we restrict the range of α and β to ‘normal’ cases of judgement conditions for the speaker s” (pp. 259–60).

Greenough’s definition is neither useful nor fundamental enough to serve as a definition of vagueness.¹⁹ On the first point: Greenough’s definition makes use of two notions—the notion of a speaker knowing that something is the case, and the notion of normal judgement conditions for a speaker—which are at least as hard to define as vagueness. His definition thus merely pushes back the problem of defining vagueness: we will not have a complete definition of vagueness until we have a definition of knowledge and a definition of normal conditions—and it does not look as though we will have these any time soon. On the second point: it is extremely natural to respond to Greenough’s definition as follows: “Ah, I see . . . suppose you cannot know that S in α and that not-S in β . . . and suppose that this ignorance is not due to the fact that you do not know what S means, nor due to the fact that you do not know the value of v in α or β, nor due to the fact that you are perceptually impaired in α or β . . . then this ignorance arises because S is vague.” But if we adopt Greenough’s definition, then we cannot have this reaction: for then the idea is not that your ignorance arises from the vagueness of S; it is that S’s vagueness consists in the fact that you are ignorant, and that this ignorance does not arise from certain sources. This seems wrong: it is not at all implausible that vagueness leads to the sort of ignorance Greenough describes; but it is not plausible at all that the existence of such ignorance is the fundamental fact about vagueness. Thus Greenough’s definition is subject to the same sort of objection as the definition in terms of Sorites-susceptibility: we want to say that vague predicates engender Sorites paradoxes and certain sorts of ignorance because they are vague; but if we define vagueness as Sorites-susceptibility or as epistemic tolerance, then we cannot do this—we miss out on this explanation.²⁰

¹⁸ Cf. Lewis (see the passages cited in §2.5.1) and Weatherson (see the passage quoted in §6.1.3); cf. also Fine 1997 [1975], 120.
¹⁹ Note that this is a criticism of Greenough’s definition qua fundamental definition of vagueness; I am not claiming that Greenough intended his account to amount to a fundamental definition, rather than a surface characterization (see in particular Greenough 2003, 237 n. 2).
²⁰ Eklund 2005, 32–3 independently makes a similar objection to Greenough, and to the characterization of vagueness in terms of quandaries in Wright 2001.
Another proposal for a definition of vagueness is that of Eklund (2005). Eklund defines vagueness in terms of tolerance, where a predicate F is tolerant with respect to φ if there is some positive degree of change in respect of φ that things may undergo, which is ‘‘insufficient ever to affect the justice with which F is applied to a particular case’’ (Wright 1975, 334). Thus, for example, ‘tall’ would be tolerant with respect to height if there were some increment of height (say, one nanometre) which never made any difference to the applicability of the predicate: i.e. if someone is (not) tall, then someone who differs from her in height by just one nanometre is also (not) tall. As we shall see in §3.5.1, given some natural, minimal assumptions, no predicate can actually be tolerant, on pain of contradiction. This is a strong reason against defining a predicate as vague just in case it is tolerant: that would mean we must either accept true contradictions, or else deny that there are any vague predicates. But Eklund’s approach is more subtle. He argues that what defines vague predicates is not that they are tolerant, but that ‘‘semantic competence with a vague predicate partly consists in a disposition to accept that this predicate is tolerant (a disposition that can be overridden, for example when it is learned that tolerance principles can never be satisfied)’’ (p. 41). This leads to the view that vague predicates are inconsistent, not in the sense that there are some things to which they actually both apply and fail to apply, but in the sense that ‘‘semantic competence with these expressions involves a disposition to accept some things that in fact lead to inconsistency’’ (p. 15).²¹ While not as unattractive as the view that vague predicates are inconsistent in the first sense, Eklund’s view is still unattractive. 
The view is that competence with a vague term—knowing what the term means and how to use it—requires thinking that it has some property which it does not have, and could not have on pain of contradiction. This is not an impossible situation, but it is a thoroughly odd one, and certainly, the burden of proof lies with Eklund to show that competence with vague terms really does require (a disposition towards) believing that they are tolerant. I shall argue in §3.5.1 that the arguments which are generally taken to show that speakers believe that vague predicates are tolerant in fact provide just as much support for the thesis that vague predicates satisfy a condition which I call Closeness. Closeness—unlike ²¹ Sorensen 2001 presents a closely related view.
tolerance—generates no contradictions, and thus there is no need to place a barrier—as Eklund does—between the properties that competence requires us to believe that vague predicates have and the properties that they really do have. Thus Eklund cannot discharge the burden of proof mentioned above: the considerations generally taken to support the association of vagueness with tolerance really support its association with Closeness.²²

Having failed to find an adequate definition of vagueness amongst existing proposals in the literature, I shall propose a new definition. Before doing so, however, I shall in the next section explain a key background notion necessary for understanding this definition.
3.3 Closeness For any set S of objects, and any predicate F —vague or precise—a competent user of F can discern relationships of closeness or nearness or similarity amongst the members of S: closeness or nearness or similarity in the respects that are relevant to—or determine—whether something is F (for short, ‘F-relevant respects’). For example, consider the term ‘red’, and the set of all visible objects in the room. As a competent user of the term ‘red’, you will automatically discern relationships of closeness or similarity amongst these objects in red-relevant respects. For example, the orange things in the room are closer in red-relevant respects to the red things than are the green things. This is not to say that orange things are more red than are green things: neither orange things nor green things are red at all.²³ Rather, it is to say that in the respects that determine whether something is red, orange things are closer—or more similar—to red things than are green things. Think of a colour wheel—or better, a colour solid—and imagine locating each visible object in the room on the point of the colour wheel that has the same colour as that object. The closeness relationships amongst ²² Another definition of vagueness which has been discussed—but not defended—in the literature holds that a vague predicate is a predicate for which mathematical induction fails. This suggestion is made, and convincingly criticized, by Bueno and Colyvan 2006, 4–5. Cf. Dummett 1997 [1975], 102–3. ²³ I mean here that ‘a is red’ and ‘a is green’ are both clearly false, when a denotes (say) a ripe orange. It is also the case, of course, that (some) green things are not at all reddish, while (all) orange things are reddish (and yellowish): for there are unique green hues, whereas all orange hues are binary (see e.g. Hardin 1988, 39).
objects in the room in red-relevant respects correspond to the relationships of spatial closeness amongst these objects when they are located on the colour wheel in this way. Now consider the term ‘green’ and the same set of objects. The closeness relationships amongst these objects in green-relevant respects are just the same as the closeness relationships amongst them in red-relevant respects: if two objects are close in respects relevant to whether something is red, then they are likewise close in respects relevant to whether something is green. Of course, the two objects may be red, and not green, but this does not mean that they are dissimilar in respects relevant to whether or not they are green. On the contrary, being of the same colour, they are very close in these respects—and quite far from objects which are green. For example, two ripe tomatoes are very similar in respects relevant to whether something is green, even though neither is green at all, while—conversely—a granny smith apple and a pair of khaki jungle greens are not very similar in respects relevant to whether something is green, even though both are clearly green. In general, all colour predicates are associated with the same closeness relationships on a given set of objects. Think of a large set of coloured pencils, with several shades of each basic colour (e.g. there is not just one blue pencil, but Cobalt Blue, Delft Blue, Kingfisher Blue, and Midnight Blue).²⁴ We can easily distinguish the task of ordering the pencils in their tin (in colour order—i.e. like a rainbow) from the task of identifying the blue (or green, red, etc.) pencils. The ordering is a one-off task: if we are told to order the pencils, and then select a blue one, we will order them in the same way as if we are told to order the pencils, and then select a green one (or a red one, etc.). 
Given any colour predicate F, what we are doing when we order the pencils is putting pencils which are closer together in F-relevant respects closer together in the tin, and this is quite distinct from what we do when we select an F coloured pencil. To repeat the point made about the tomatoes, apples, and army pants in terms of our present example: the Anthraquinone Scarlet and Azo Red pencils are very close together in blue-relevant respects—they are side by side in the tin (once the pencils have been ordered)—even though neither is a blue pencil by any stretch of the imagination, while the Iron Blue and Light ²⁴ Throughout this paragraph I use names of Derwent brand pencils.
142 Blue pencils are not very close in blue-relevant respects—there are quite a few pencils in between them in the tin—even though both are clearly blue pencils. Now consider a non-colour predicate, and indeed a non-vague predicate: for example, ‘weighs over one kilogram’.²⁵ This term is likewise associated with closeness relationships on the set of visible objects in the room—but not the same closeness relationships as just considered. The black grand piano in the corner and the black fragment of ash on the window sill are very close in red-relevant respects, but not so close in respects relevant to whether a thing weighs over one kilogram, while the blue and red pen caps on the table are not very close in red-relevant respects, although they are very close in respects relevant to whether a thing weighs over one kilogram. Imagine locating every object in the room on a long line, with an object weighing x kilograms being placed x metres from the beginning of the line. The closeness relationships amongst these objects in respects relevant to whether a thing weighs over one kilogram correspond to the relationships of spatial closeness amongst them when they are located on the line in this way. The similarity relationships corresponding to other words are often not as easy to visualize as the ones associated with colour terms and the ones associated with the predicate ‘weighs over one kilogram’, but they are always just as apparent to users of these terms. Consider, for example, the predicate ‘heap’. A twenty-grain pile of sand is close to a twenty-one-grain pile of sand in respects relevant to whether something is a heap; a pile of ten olives is further away; but not as far away as an armchair, which itself is reasonably close (in heap-relevant respects) to a dishwasher. (In the respects relevant to whether something is a heap—e.g. 
size, the extent to which the thing is composed of separate smaller things held together only by gravity, shape²⁶—the armchair and the dishwasher are not very different.) These things are apparent to any competent user of the predicate ‘heap’, even though there is no familiar object—such as a colour wheel or a line—onto which we can map things in order to represent their ²⁵ I assume for the sake of example that weight terms such as ‘kilogram’, length terms such as ‘foot’ and ‘metre’, and other such terms, are precise—as opposed to predicates such as ‘heavy’ and ‘tall’, which are vague. If you think that really, the former too are somewhat vague (see Russell 1997 [1923], 63), then please just treat them as precise for the purposes of the examples, or else substitute other examples for mine. ²⁶ Cf. Hart 1992, 3.
relationships of similarity in heap-relevant respects as relationships of spatial closeness.

Thus far I have simply been stating obvious facts (obvious, at least, on reflection). The question now arises as to how to give a more precise account of them. We need to distinguish two sorts of similarity or closeness relationship that are apparent to competent speakers: relationships of relative closeness and relationships of absolute closeness.

First, relative closeness. In respects relevant to whether something is red, the orange things are closer to the red things than are the green things; in respects relevant to whether something is a heap, the twenty-grain pile of sand is closer to the twenty-one-grain pile of sand than is the ten-grain pile of sand, and so on. Given a set of objects and a predicate, we wish to represent the relative closeness relationships on that set associated with that predicate. The simplest and most general way to do this is directly in terms of a three-place relation $x \le^{F}_{z} y$: ‘x is at least as close to z as y is, in F-relevant respects’. However, it should be noted that in a given case—that is, for a given predicate F—it may be that what speakers directly discern is not a ternary similarity or closeness relation, but some other sort of structure which yields such a relation—for example, a metric structure. Consider, for example, the predicate ‘nearly home’. Bill, Ben, and Bob share a house. All three are heading home: Bill is 10 kilometres away, Ben is 9.5 kilometres away, and Bob is a few streets away. In ‘nearly home’-relevant respects, Ben is closer to Bill than Bob is (even though Bob is nearly home and the other two are not).²⁷ However, what competent users of ‘nearly home’ discern most directly is not such facts as this, but facts about the spatial distances between objects. They then calculate that Ben is closer to Bill than Bob is in ‘nearly home’-relevant respects if the difference between Ben’s and Bill’s distances from home is less than the difference between Bob’s and Bill’s distances from home. In general, I do not suppose that there is always a metric on a given set of objects associated with a given predicate (cf. §3.4.3); I assume only that there is a structure of relative closeness relationships, represented by a three-place relation. However, in many actual examples there is an associated metric—and often it is what is most salient to speakers.

The question arises as to what properties the three-place relation has. Extending terminology for binary relations in an obvious way, we may assume that it is transitive: $\forall x, y, z, w\,(x \le_w y \land y \le_w z \to x \le_w z)$.²⁸ We may also assume that it is reflexive: $\forall x, y\,(x \le_y x)$. But what about antisymmetry: $\forall x, y, z\,(x \le_z y \land y \le_z x \to x = y)$? What about connectedness: $\forall x, y, z\,(x \le_z y \lor y \le_z x)$? I do not suppose that there are any general answers to these questions—that is, answers that apply to all predicates F. Whilst it is quite clear that for any predicate F, competent users of F can discern on a given set of objects a structure of relative closeness relationships associated with F (whether they discern this structure directly or via some other structure), specifying this structure in detail for a given predicate—and hence answering questions of the sort just posed—involves a great deal of work. Furthermore, this work will in general be not only laborious, but conceptually difficult. In some cases this work has been done: for example, the closeness relationships amongst objects in colour-relevant respects are codified in the colour solids. In other cases it has not been done, and very likely never will be: for example, we have no general theory that codifies our thoughts about closeness of objects in respects relevant to whether something is a chair.²⁹

Second, absolute closeness. Bill is 5 ft in height, Ben is one thousandth of an inch taller, and Bob is 6 ft 4 in. In respects relevant to whether something is tall, Ben is closer to Bill than Bob is (even though Bob is tall, and the other two are not). This is not all that is obvious to competent users of the word ‘tall’, however: it is also clear that in an absolute sense, Bill and Ben are very close in respects relevant to whether something is tall. Similarly, a 10,000-grain pile of sand, and a 10,001-grain pile of sand of a very similar shape, are in an absolute sense very close in respects relevant to whether something is a heap, and so on for other predicates. This notion of absolute closeness is exploited in Sorites paradoxes: one of the essential ingredients of a compelling instance of the paradox is that adjacent items in the series of objects under consideration be very similar in respects relevant to the application of the predicate in question. Given a set of objects and a predicate, and an associated structure of relative closeness relationships (represented, in general, by a reflexive, transitive three-place relation), we represent this additional structure of absolute closeness relationships by a two-place relation $x \approx^{F} y$: ‘x is very close to y, in F-relevant respects’. I will in all cases regard this relation as precise³⁰—that is, as an ordinary set of ordered pairs, such that for any pair of objects in the domain, either it is in this set (i.e. the two objects are very close in F-relevant respects) or it is not, with no middle way.³¹

²⁷ Of course, whether or not someone is nearly home depends upon how far away he was when he set out: if Bill is travelling home from across the world, then when he is 10 kilometres away, he is nearly home. Thus, when I said “Consider, for example, the predicate ‘nearly home’”, I should have said “Consider, for example, the predicate ‘nearly home’, as used on a particular occasion”. Recall the point about ‘large-scale’ contextual variation in §2.6.
²⁸ Here and in what follows I omit the superscript F on the relation symbol in order to indicate generality of the schematic sort; i.e. I am here asserting transitivity for all relations $\le^{F}$.
²⁹ Such a general theory would undoubtedly yield particular judgements about relative closeness concerning which competent speakers have no intuitions. This does not cast doubt upon the idea that there is a correct general theory to be given. Compare the case of the colour solids: not every instance of the relation ‘a is closer to c than b is, in respect of colour’ that they yield is one which we can perceive to hold. Nevertheless, we can perceive enough to ground a meaningful distinction between correct and incorrect general theories.
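As a rough illustration of the two closeness relations described in this section, here is a minimal Python sketch of the ‘nearly home’ example, in which both relations are read off from a one-dimensional distance (the simplest case of the metric construction suggested in n. 31). The function names, the figure of 0.3 km for Bob (standing in for ‘a few streets away’), and the threshold d = 1.0 km are all assumptions made for the sake of the example; nothing in the text fixes them.

```python
from itertools import product

# Distances from home in kilometres. Bill and Ben are from the text;
# Bob's 0.3 km is an assumed stand-in for "a few streets away".
distance_from_home_km = {"Bill": 10.0, "Ben": 9.5, "Bob": 0.3}


def dist(x, y):
    """Distance between x and y in 'nearly home'-relevant respects."""
    return abs(distance_from_home_km[x] - distance_from_home_km[y])


def at_least_as_close(x, y, z):
    """Relative closeness: x is at least as close to z as y is."""
    return dist(x, z) <= dist(y, z)


def very_close(x, y, d=1.0):
    """Absolute closeness: a precise two-place relation.

    The threshold d (in km) is an assumption; any fixed d gives an
    ordinary set of pairs with no middle way.
    """
    return dist(x, y) < d


people = list(distance_from_home_km)

# Reflexivity: for all x, y, x is at least as close to y as x is.
is_reflexive = all(
    at_least_as_close(x, x, y) for x, y in product(people, repeat=2)
)

# Transitivity: if x <=_w y and y <=_w z, then x <=_w z.
is_transitive = all(
    at_least_as_close(x, z, w)
    for x, y, z, w in product(people, repeat=4)
    if at_least_as_close(x, y, w) and at_least_as_close(y, z, w)
)

print(at_least_as_close("Ben", "Bob", "Bill"))  # Ben is closer to Bill than Bob is
print(is_reflexive, is_transitive)
print(very_close("Bill", "Ben"), very_close("Bill", "Bob"))
```

Because the relation here is derived from a distance, connectedness also holds automatically; the text does not assume this for predicates in general, which is why the sketch checks only reflexivity and transitivity.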
3.4 Vagueness as Closeness The basic idea behind the definition of vagueness to be proposed here is as follows. If Bill and Ben differ in height by less than one millimetre, then the two claims that Bill is tall and that Ben is tall must be very much alike in respect of truth (‘is tall’ is vague), while the two claims that Bill is exactly ³⁰ The ordinary notion of absolute similarity in F-relevant respects may well be vague, for many predicates F. However, this would simply mean that we must regard the absolute closeness relations which will figure in my account of vagueness as precisifications of their ordinary counterparts. ³¹ As already stated, I do not think that the task of giving a general theory which codifies our ideas about closeness of objects in respects relevant to whether something is F —for a given predicate F —is easy; nor do I think it is hard simply because it involves filling in a lot of tedious details, the general framework in which the details are to be placed being clearly understood. But lest the reader suspect that the task is in fact impossible, I suggest the following as a possible starting point in giving a general theory of closeness of objects in respects relevant to whether a thing is, say, a table. First we determine the relevant respects. Solidity, flatness of upper surface, and possession of legs, for example, are relevant respects, while colour, monetary value, and location, for example, are not. Then we associate each respect with a numerical scale. (Where this cannot yet be done—e.g. if one of the respects is colour—we must analyse further, in order to reach respects which can indeed be associated with a linear scale.) This gives us a vector space, each object corresponding to a vector whose coordinates are the numbers with which the object is associated on each numerical scale. If we then define an appropriate norm on this vector space, we get a metric space. 
Relative closeness relationships may then be extracted via the idea that x is at least as close to z as y is, just in case the distance between x and z is less than or equal to the distance between y and z. Absolute closeness relationships may be extracted via the selection of a particular number d, the idea being that x and y are very close just in case the distance between them is less than d. I stress that this is not intended as a glib dismissal of the difficulties associated with constructing a general theory of closeness in, say, table-relevant respects: it is merely intended as an antidote to the view that the very idea of constructing such a theory is absurd. Note also that nothing in what follows depends upon our possessing such a general theory for any predicate F.
146 6 ft in height and that Ben is exactly 6 ft in height need not be—one might be true and the other false (‘is exactly 6 ft in height’ is precise). In general, if objects x and y are very similar in all respects relevant to whether or not something is F, then if F is a vague predicate, the two claims that x is F and that y is F must be very similar in respect of truth, while if F is a precise predicate, they need not be. I call this the closeness picture of vague predicates: closeness of x and y in F-relevant respects makes for closeness of ‘Fx’ and ‘Fy’ in respect of truth.³² One useful way to think of this is as a weakening of the idea that vague predicates are tolerant. To say that F is tolerant is to say that very small differences between objects in F-relevant respects never make any difference to the application of the predicate F. Closeness says that very small differences between objects in F-relevant respects never make a big difference to the application of the predicate F —but (in contrast to tolerance) they may make a very small difference.³³ I explain in §3.5 how my definition satisfies our desiderata for a correct definition of vagueness. In the present section, I formulate and explain the definition in detail. The final version of my proposed definition of vagueness will be presented in §3.4.4. Here is a first version. A predicate F is vague just in case it satisfies the following condition, for any objects a and b: Closeness If a and b are very close in F-relevant respects, then ‘Fa’ and ‘Fb’ are very close in respect of truth.³⁴ 3.4.1 Closeness in Respect of Truth The Closeness condition features two notions of closeness: closeness in F-relevant respects and closeness in respect of truth. We have said a lot ³² Here, and in some other places in this book, I use ‘x’ (‘y’, ‘a’, ‘b’, etc.) 
both inside and outside quotation marks, and sometimes in addition I quantify such expressions, in which case the quantifier(s) bind occurrences of ‘x’ (‘y’, ‘a’, etc.) both inside and outside quotation marks. I regard such usage as shorthand: it is sloppy, but I think it increases readability. Here is a key to interpreting the shorthand. Shorthand: Closeness of x and y in F-relevant respects makes for closeness of ‘Fx’ and ‘Fy’ in respect of truth. Longhand: Closeness of x and y in F-relevant respects makes for closeness of ‘F x’ and ‘F y’ in respect of truth, where ‘x’ is a singular term which refers to x and ‘y’ is a singular term which refers to y. NB: Furthermore, I often omit quotation marks, if it is obvious that an expression is being mentioned rather than used. ³³ For more on the relationship between Closeness and Tolerance, see §3.5.1. ³⁴ Recall n. 32. This is shorthand for: (for any objects a and b, and any singular terms ‘a’ and ‘b’) if a and b are very close in F-relevant respects, and ‘a’ refers to a and ‘b’ refers to b, then ‘F a’ and ‘F b’ are very close in respect of truth.
about the former, but nothing so far about the latter. The first thing to say is that it is a particular instance of the more general notion of closeness in respect of a property: the instance in which the property is truth. Note that there is a crucial difference between similarity in respects relevant to the possession of a property P —that is, in P-relevant respects—and similarity in respect of P itself. Consider, for example, Bill, who is 6 ft 4 in., and Ben, who is 7 ft 4 in. Bill and Ben are not very close in respects relevant to possession of the property tallness. After all, they differ in height by a full foot—and this is a big height difference between persons. Nevertheless, they are very close in respect of tallness itself: they both quite definitely possess this property. We need an understanding of closeness in respect of truth such that two sentences may be close in respect of truth while not being identical in respect of truth—or else Closeness will simply reduce to Tolerance. Fuzzy semantic theory provides one such understanding: assuming that sentences are assigned truth values in the interval [0, 1], we can say that two sentences S and T are very close in respect of truth if |[S] − [T]| ≤ 0.01. Note that this is not a definition of ‘close in respect of truth’, but a sufficient condition for two sentences to be close in respect of truth. For a start, it might be that sentences whose (fuzzy) truth values are further apart than this are still close in respect of truth. Second—and more fundamentally—I take it that the notion of two sentences being close in respect of truth is a (relatively) primitive one which pre-exists—and indeed provides one of the central motivations for—the formulation of the fuzzy semantic framework. Why are degree theorists unsatisfied with the classical semantic picture? 
One fundamental thought which they want to capture is that there are sentences whose truth status lies between true and false, and furthermore (for the previous thought does not get us further than the idea of three truth values, or two values and a gap) that we can make comparisons within the class of such intermediate sentences, potentially without limit (i.e. we can say that this one is truer than that one, which is less true than this other one, which itself is less true than the original one, which is nevertheless not as true as this other one, and so on). But even that does not exhaust the motivation of many degree theorists: for it gets us only to a set of degrees of truth with a certain order structure, whereas it is quite natural to think that there is, in addition, a meaningful notion of distance between truth
[Figure 3.1. Ways of assigning truth values to sentences about baldness. Vertical axis: truth value of ‘Man x is bald’ (0 to 1); horizontal axis: men x (1 to 100,000).]
values.³⁵ One way to bring this out is to consider the assignment of degrees of truth to sentences about the objects in a Sorites series—say, sentences of the form ‘Man 1 is bald’, ‘Man 2 is bald’, . . . , ‘Man 100,000 is bald’, where the numbering of the men in the series goes by the numbers of hairs on their heads. Consider the systems of assignments of truth values to these sentences shown in Figure 3.1 (each of the eleven curves represents one system of assignments). If we think that the only thing that matters about our set of degrees of truth is its order structure, then we will have to think that all these systems represent exactly the same hypothesis about the meaning of ‘is bald’—for the ordering of sentences with respect to how true they are is uniform across the systems. It is quite natural to think, however, that these systems of assignments represent different hypotheses about the meaning of ‘is bald’. For example, one might naturally say something along these lines: ‘‘If the system of assignments represented by the straight line in the middle is correct, then the truth value of ‘This man is bald’ changes uniformly between man 1 and man 100,000. If the system of assignments represented by the curve on the right is correct, then at first each hair removed hardly changes the truth value of ‘This man is bald’ at all—although it does change it a little bit—until we get to about man 95,000, whereafter each hair removed makes a big difference to the truth value of ‘This man is bald’. If the system of assignments represented by the curve on the left is correct, then at first each hair removed makes a big difference to the truth value of ‘This man is bald’, until about man 5,000, ³⁵ Thanks to Robbie Williams for helpful comments here.
whereafter each hair removed hardly changes the truth value of ‘This man is bald’ at all, although it does change it a little bit.’’ This second way of looking at the matter makes use of a notion of distance between truth values—for example, in the idea of a uniform change of truth value—and furthermore, it makes use of the idea that there is a distance between truth values which is very small. This comes out in the parts about single hairs making a ‘big difference’ (here the distance between the truth values involved is not very small) or ‘hardly changing’ the truth value (here the distance between the truth values involved is very small). This notion of a very small difference between truth values hangs together with the notion of closeness in respect of truth: where there is a big difference in the truth values of two sentences, the sentences are not very close in respect of truth; where there is hardly any change in truth value from one sentence to another, the sentences are very close in respect of truth. I am not saying that all degree theorists are motivated by the idea that not only does the ordering of truth values matter, but also, there is a meaningful notion of distance between truth values—and furthermore, a meaningful notion of a distance between truth values which is very small (this notion going hand in hand with the idea of two sentences being very close in respect of truth). 
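The contrast between ordering and distance can be made concrete with a small sketch. Everything specific in it is an assumption made for illustration: the three functions only roughly imitate the left, middle, and right curves of Figure 3.1, the exponent 20 is arbitrary, and the choice to let the truth value fall from 1 to 0 along the series is mine. Exact fractions are used so that very small differences are not lost to rounding.

```python
from fractions import Fraction
from itertools import combinations

N = 100_000  # number of men in the Sorites series

def middle(n):
    """Straight line: the truth value changes uniformly from man 1 to man N."""
    return 1 - Fraction(n - 1, N - 1)

def left(n):
    """Assumed shape: large changes early in the series, tiny changes later."""
    return middle(n) ** 20

def right(n):
    """Assumed shape: tiny changes early in the series, large changes near the end."""
    return 1 - (1 - middle(n)) ** 20

SYSTEMS = {"left": left, "middle": middle, "right": right}

def same_ordering(f, g, sample):
    """True if f and g rank every pair of sampled men in the same way."""
    return all(
        (f(i) > f(j)) == (g(i) > g(j)) and (f(i) == f(j)) == (g(i) == g(j))
        for i, j in combinations(sample, 2)
    )

def step(f, n):
    """Distance between the truth values assigned to man n and man n + 1."""
    return abs(f(n) - f(n + 1))

sample = range(1, N + 1, 5_000)
order_agrees = all(same_ordering(middle, f, sample) for f in SYSTEMS.values())

for name, f in SYSTEMS.items():
    print(name, "first step:", float(step(f, 1)), "last step:", float(step(f, N - 1)))
print("same ordering on the sample:", order_agrees)
```

The three systems agree on which sentence is truer than which, so someone who cares only about order structure cannot tell them apart. They disagree about how much a single hair changes the truth value at a given point in the series, and that disagreement is only expressible if distances between truth values are meaningful.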
Some degree theorists are explicit that only ordering matters to them.³⁶ On the other hand, the majority of degree theorists have taken the real interval [0, 1] as their set of truth values, without any caveat to the effect that they are only concerned with its order structure, and hence that they consider any mapping of sentences to truth values to be completely interchangeable with any other mapping resulting from it by composing it with an order-preserving transformation of the interval [0, 1]—and this indicates that they had in mind some metric intuitions, as well as merely ordering intuitions, when they thought that we should countenance degrees of truth. In any case, what I am saying is that the second motivation for introducing degrees of truth makes sense. That is, it makes sense to think that there could be predicates which differ precisely in that an assignment something like the curve on the left in Figure 3.1 is correct for one of them, while an assignment something like ³⁶ See in particular the many-valued systems of Post 1920, 1921 (Malinowski 1993, ch. 6, gives an overview). Cf. also the passage from Goguen 1968–9 quoted on p. 297, and Goguen 1979, 59; Hájek 1999, 162–3; and Weatherson 2005, but note that he is not a degree theorist in my sense: ‘‘I do not have the concept of an intermediate truth value in my theory’’ (p. 53).
the curve on the right is correct for the other.³⁷ I am also saying that considering this second way of looking at things is one way to get a feel for the notion of closeness in respect of truth that figures in the Closeness definition. At this point the suspicion may arise that there is going to be something fishy about my proposed method of arguing that the correct theory of vagueness must countenance degrees of truth. Is it not somehow question-begging to argue for this conclusion via a definition of vagueness which employs a notion (closeness in respect of truth) which, while admittedly not defined in terms of degrees of truth, nevertheless figures in the motivation of many degree theorists? No: there will be no question-begging. First, I shall not assume that non-degree theorists cannot accept the Closeness definition. Rather, I shall argue in Chapter 4 that, given the existence of a Sorites series for the predicate F, one cannot accept the claim that F conforms to Closeness unless one countenances degrees of truth. Second, my argument for the correctness of the Closeness definition takes the following form. In §3.5, I show that predicates which conformed to this definition would appear to have just the features that the predicates we call ‘vague’ actually do have. This provides a prima facie reason for believing that the predicates we call ‘vague’ do in fact have the same underlying nature as predicates which conform to the Closeness definition—that is, it provides a prima facie reason for thinking that Closeness is the correct definition of vagueness. But only a prima facie reason: for there might be another proposed definition which also has the feature that predicates which conformed to it would appear to have just the features that the
³⁷ Weatherson (2006, 12) gives the following example: ‘‘Consider the predicate is very late for the meeting. At least where I come from, a person who is roughly ten minutes late is a borderline case of this predicate.
But which side of ten minutes late they are matters. (In what follows I make some wild guesses about how numerical degrees of truth, which aren’t part of my preferred theory, should operate. But I think the guesses are defensible given the empirical data.) If Alice is nine and three-quarters minutes late, and Bob is ten and a quarter minutes late, then the degree of truth of ‘Alice is very late’ will be much smaller than the degree of truth of ‘Bob is very late’. The later you are the truer ‘you are very late’ gets, but crossing conventionally salient barriers like the ten minutes barrier matters much more to the degree of truth than crossing other barriers like the nine minutes thirty-three seconds barrier.’’ Now just as Americans and Australians, for example, have great fun discovering that they mean different things by some terms (e.g. ‘sloppy joe’), we can imagine someone from another country going to the place Weatherson comes from and saying ‘‘Wow, that’s interesting: where I come from, it’s the twenty minute mark that really matters’’, and someone else saying ‘‘Really? Fascinating. In my country, ‘you are very late’ just gets truer at a uniform rate as time passes—there are no sudden increases.’’ My claim is simply that it makes sense to imagine these sorts of differences between predicates.
predicates we call ‘vague’ actually do have. So in Chapter 4, I consider alternative definitions which might be proposed by proponents of non-degree-theoretic treatments of vagueness, and I show that predicates which conformed to them would not in general look like vague predicates. Only at that point will the case for the correctness of the Closeness definition have been fully made—and it will have been made in a way that does not beg any questions against non-degree-theoretic treatments of vagueness. But before moving on to these issues, in the remainder of the present section (§3.4) I continue the task of formulating and explaining the Closeness definition in greater detail.
3.4.2 Other Formulations of Closeness
The Closeness condition can be stated more precisely if we assume the following kind of setup. Suppose that we have a domain of discourse D, and a set T of truth values. (I make no assumptions here about the truth values: how many there are, how they are ordered, what—if any—operations on them correspond to the logical connectives, and so on.) Consider a function from D to T which assigns to each object x in the domain the truth value of the sentence Fx. Call this the characteristic function of the predicate F. Let [Fx] be the value of the characteristic function for F at the object x. Let $\overset{F}{\approx}$ be the relation on D of being very close in F-relevant respects. Let $\approx_T$ be the relation on T of being very close in respect of truth (where two sentences will be close in respect of truth—as discussed in the previous section—if their truth values stand in the relation $\approx_T$). Then the Closeness condition may be stated thus:
$$x \overset{F}{\approx} y \;\Rightarrow\; [Fx] \approx_T [Fy]$$
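For a finite domain, the condition just stated can be checked mechanically. The sketch below is illustrative only and rests on several assumptions of mine: heights in millimetres as the domain, "differ by at most 1 mm" as the relation of being very close in F-relevant respects, and the fuzzy sufficient condition from §3.4.1 (truth values within 0.01 of each other) standing in for the relation of being very close in respect of truth. The two predicates and their numerical thresholds are hypothetical.

```python
def satisfies_closeness(domain, truth_value, close_in_respects, close_in_truth):
    """Check that whenever x and y are very close in F-relevant respects,
    [Fx] and [Fy] are very close in respect of truth (finite domain)."""
    return all(
        close_in_truth(truth_value(x), truth_value(y))
        for x in domain
        for y in domain
        if close_in_respects(x, y)
    )

heights_mm = range(1500, 2001)  # assumed domain: heights from 1.5 m to 2.0 m

def close_in_height(a, b):
    """Assumed stand-in for closeness in F-relevant respects."""
    return abs(a - b) <= 1

def close_in_truth(s, t):
    """Assumed stand-in for closeness in respect of truth (sufficient condition only)."""
    return abs(s - t) <= 0.01

def tall_by_degrees(h):
    """Hypothetical degree assignment for 'is tall': a ramp from 1700 mm to 1900 mm."""
    return min(1.0, max(0.0, (h - 1700) / 200))

def at_least_1800(h):
    """Hypothetical precise predicate: 'is at least 1800 mm in height'."""
    return 1.0 if h >= 1800 else 0.0

degrees_ok = satisfies_closeness(heights_mm, tall_by_degrees, close_in_height, close_in_truth)
cutoff_ok = satisfies_closeness(heights_mm, at_least_1800, close_in_height, close_in_truth)
print("ramp satisfies Closeness:", degrees_ok)
print("sharp cutoff satisfies Closeness:", cutoff_ok)
```

The ramp changes by only a small amount per millimetre, yet it runs from 0 at one end of the domain to 1 at the other, so neighbouring sentences stay close in respect of truth without all sentences having the same truth value. The sharp cutoff fails at the single pair of neighbours straddling 1800 mm.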
To deal with the possibility of truth value gaps, we add that if $x \overset{F}{\approx} y$, then Closeness is satisfied if neither [Fx] nor [Fy] exists, but not if one exists and the other does not. This is not an arbitrary stipulation: for motivation, see the discussion of truth value gaps in §4.2. In many semantic frameworks (for example, classical model theory and fuzzy model theory—see Chapter 2), the same objects serve as membership values for sets and as truth values for sentences, and the truth value of the sentence Fa is whatever value is assigned to the referent of the name a by the membership function of the set which is the extension of the predicate F. In
such a system, what I have non-standardly called the characteristic function of the predicate F is identical to what is standardly called the characteristic function of the set which is the extension of F. When working in such a system, we can also state Closeness this way: If a and b are very close in F-relevant respects, then they are very close in respect of F.
3.4.3 Closeness and Continuity
There is an affinity between Closeness and the idea of continuity. The intuitive idea behind the notion of a continuous function is that a small change in input produces at most a small change in the value of the function. Thus it might seem that we could say that a predicate is vague just in case its characteristic function is continuous.³⁸ While attractively neat, ultimately, this proposal will not work. The first issue with the vagueness-as-continuity proposal stems from the fact that in order for the notion of continuity to be well-defined in this context, we need to suppose that there are topologies defined on the domain and codomain of the characteristic function for F.³⁹ The topology on the codomain (which is the set of truth values of our logical framework, whatever that is) codifies the notion of closeness in respect of truth ($\approx_T$). On the domain we need not just one topology, but one topology for each predicate F, with the topology associated with F codifying the notion of closeness in F-relevant respects ($\overset{F}{\approx}$). We then say that F is vague if its characteristic function—from the domain of discourse endowed with the F topology to the topological space of truth values—is continuous. The issue is this: Is the requirement that there be a topology on the domain for each predicate F onerous? Where do these topologies come from? One idea is as follows. I am already assuming that there is a three-place similarity relation on the domain for each predicate F—and such a relation yields a basis for
³⁸ Cf. Novák et al. 1999, 4–5, and Goguen 1979, 52.
³⁹ A topology T on a set S is a set of subsets of S (known as ‘open sets’) which satisfies the conditions that ∅ and S are in T, and T is closed under finite intersections and arbitrary unions. A set S for which a topology has been specified is called a ‘topological space’. A function from a topological space S to a topological space T is continuous just in case the pre-image (under the function in question) of every open set in T is open in S. (The pre-image of a subset Y of T under the function f : S → T is the subset X of S which contains every element of S which is mapped by f to an element of Y.)
a topology,⁴⁰ via the stipulation that a subset S of the domain is a basis element iff it satisfies the following condition:
$$(x \overset{F}{\leq}_z y \wedge y \in S \wedge z \in S) \rightarrow x \in S$$
i.e. if x is at least as close to z as y is, and y and z are both in S, then x is in S. It is easy to see that this procedure always yields a topology on the domain: by reflexivity of the relative similarity relation ($\forall x, y\,(x \leq_y x)$), every singleton of a member of the domain satisfies the above condition and hence is a basis element; and then, taking arbitrary unions of singletons gives us every subset of the domain. Thus the topology resulting from the above procedure will always be the discrete topology: that is, the one in which every subset of S is open. This is a big problem, however—for any function from a set endowed with the discrete topology is continuous.⁴¹ So if the topology on the domain associated with the predicate F is discrete, then it does not matter what F’s characteristic function is like: it will automatically be continuous. Thus, on the present proposal, every predicate F will turn out to be vague—a disastrous result. So we need a different proposal concerning the origin of the topologies on the domain—and I do not know what such a proposal would look like.⁴² In the absence of such a proposal, we are left with the bare stipulation that for each predicate F, there is an associated topology on the domain of discourse. Is this any worse than my own requirement that for each predicate F, there is an associated three-place relative similarity relation on the domain and an associated two-place absolute similarity relation? Yes, it is worse. For my relations—unlike the stipulated topologies—are firmly grounded in ordinary experience and practice. We can all see that a twenty-grain pile of sand is at least as close as a pile of ten olives to a twenty-one-grain pile of sand, in respects relevant to whether something is a heap, and that
⁴⁰ A basis (or base) for a topology T on S is a set B of open subsets of S such that every open subset of S is a union of sets in B.
⁴¹ Recall that f : S → T is continuous just in case the pre-image of every open set in T is open in S. If every subset of S is open, this will automatically be the case, regardless of what f and T are like.
⁴² Bringing in the two-place absolute similarity relation does not seem to help. If we stipulate that an open set X must satisfy the condition that if $x \in X$ and $x \overset{F}{\approx} y$ then $y \in X$, then whenever the domain D consists of a Sorites series for F, the only open sets will be ∅ and D; i.e. we end up with the trivial topology. If we stipulate that an open set X must satisfy the condition that for all x and y in X, $x \overset{F}{\approx} y$, then except in the special case where the domain consists of objects all of which are very similar to one another in F-relevant respects, the open sets will not be closed under unions, and hence we do not get a topology at all.
furthermore the twenty-grain pile of sand is very close to the twenty-one-grain pile of sand in heap-relevant respects. As noted, it is precisely such judgements of closeness that underlie the force of Sorites paradoxes.⁴³ The second issue for the continuity proposal is that sometimes domains really are discrete—indeed, in most of the standard examples of Sorites paradoxes, we have discrete domains. We have a finite sequence of men, each differing by one hair (rather than a continuum of men, differing continuously in percentage of scalp covered by hair); we have a finite series of colour patches, each of a slightly different shade of red (rather than a strip of paper, differing continuously in shade); we have a finite sequence of piles of beans, each differing by one bean (rather than a continuum of piles, differing continuously in volume); and so on. The problem here for the continuity view is that even when we have discrete domains, we still want to distinguish between vague and precise predicates defined over these domains—yet as we have seen, every function from a discrete domain is continuous, and so if ‘vague’ means ‘has a continuous characteristic function’, then every predicate is vague relative to a discrete domain. Consider a concrete example. Suppose our domain consists of a line of men, from one with no hair up to one with a full head of hair, each differing from the next by just a hair. Consider the precise predicate ‘has 100 or less hairs on his head’ (as opposed to the vague predicate ‘is bald’). Its characteristic function assigns True to the first 101 men, and then jumps to False for the rest of the men. But this jump is not enough to make this function discontinuous—so on the proposal in question, this predicate comes out as being vague. That, it seems to me, is a flat-out refutation of the continuity view.⁴⁴ Note that it is not the case that every predicate F automatically satisfies Closeness relative to a discrete domain of discourse.
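The two verdicts on a discrete domain can be compared directly in a toy case. The sketch below is only illustrative and makes several assumptions of mine: a domain of eight men (man n has n hairs), a cutoff at 3 hairs standing in for the cutoff at 100, "differ by at most one hair" as closeness in the relevant respects, and, since only the classical values True and False are in play, identity of truth value as closeness in respect of truth.

```python
from itertools import combinations

def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    }

def is_continuous(f, domain, open_in_domain, open_in_codomain):
    """Topological continuity: the pre-image of every open set is open."""
    return all(
        frozenset(x for x in domain if f(x) in target) in open_in_domain
        for target in open_in_codomain
    )

men = range(8)  # toy domain: man n has n hairs

def at_most_3_hairs(n):
    """Toy stand-in for the precise predicate 'has 100 or less hairs on his head'."""
    return n <= 3

discrete_topology = powerset(men)          # every subset of the domain is open
truth_topology = powerset([True, False])   # assumed topology on the truth values

continuous = is_continuous(at_most_3_hairs, men, discrete_topology, truth_topology)

# Closeness, with the assumptions stated above: neighbours must agree in truth value.
closeness_holds = all(
    at_most_3_hairs(a) == at_most_3_hairs(b)
    for a in men
    for b in men
    if abs(a - b) <= 1
)

print("continuous on the discrete domain:", continuous)
print("satisfies Closeness:", closeness_holds)
```

Continuity is secured by the discrete topology alone, whatever the function is like, so it cannot separate this predicate from ‘is bald’. The Closeness check fails at the pair of men on either side of the cutoff, which is the verdict one wants for a precise predicate.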
The reason for this difference between the Closeness and continuity proposals is that Closeness makes use of the notion of absolute similarity between elements of the domain. This notion features in the intuitive statement of continuity (a ⁴³ I said that sometimes our similarity judgements arise from our discernment of an underlying metric structure. In these cases the continuity proposal is on safe ground. For a metric yields a topology—or we could simply work with the metric directly, using the metric space definition of continuity rather than the more general topological definition. ⁴⁴ It would be no better simply to stipulate that every predicate is precise relative to a discrete domain—for ‘is bald’ is vague relative to the discrete domain just considered, and ‘is large’ as a predicate of natural numbers is vague.
small change in input produces at most a small change in the value of the function), but not in the final official definition of continuity, which says roughly that for any positive-sized target area in the codomain (whether or not we would ordinarily regard it as ‘small’) we can find a positive-sized launch area in the domain (which again need not be ‘small’ in the ordinary absolute sense) such that everything sent by the function from that launch area lands in that target area. The notion of absolute closeness of items in the domain, and of values in the codomain, plays no role here. This is a good thing in the context of a highly general mathematical definition, but it is not a good thing in a definition of vagueness—for the reasons we have seen (and see further n. 57 below). On the discrete domain of men considered in the previous paragraph, the predicate ‘has 100 or less hairs on his head’ comes out as precise by the Closeness definition, because its characteristic function assigns True to man 100 and False to man 101, who differ by just a hair. These two men are very close, in absolute terms, in hair count, and hence in respects relevant to whether someone has n or less hairs on his head (for any n). Yet our predicate is flat-out true of one of them, and flat-out false of the other. That makes the predicate precise by my definition—and that is clearly the right result, in contrast to the continuity view, which, as already discussed, deems the predicate vague.⁴⁵ 3.4.4 Partially Vague Predicates Now for an important qualification. 
For a start, note that we would not want to say that a predicate F is vague if it satisfies Closeness trivially—that is, either because there are no a and b such that a and b are very close in F-relevant respects, or because for every a and b, Fa and Fb are very close in respect of truth (a sub-type of the latter type of case is where the characteristic function for F is a constant function—for example, F applies to everything or to nothing). Furthermore, note that we would not want to say that in order for a predicate to be vague, it must satisfy Closeness non-trivially across the entire domain of discourse. For consider the predicate ‘is tall or exactly four feet in height’ (abbreviated E)—or the predicate ‘is in his early thirties’.⁴⁶ If Bob is four feet in height, and Bill is one nanometre taller, then Bob and Bill are very close in respects relevant to whether ⁴⁵ Thanks to John Cusbert for helpful feedback on the material in this section. ⁴⁶ Thanks to Kit Fine for the first example, and to Brian Weatherson for the second.
a thing is E. Yet ‘Bob is E’ is true, while ‘Bill is E’ is false, and hence these two sentences are not very similar in respect of truth. So Closeness is violated here—and yet one might have the intuition that E is vague. Similarly, if Bill’s thirtieth birthday is tomorrow, and Ben’s is today, then Bill and Ben are very close in respects relevant to the application of the predicate ‘is in his early thirties’; yet ‘Bill is in his early thirties’ is false, while ‘Ben is in his early thirties’ is true—so Closeness is violated here, and yet, intuitively, ‘is in his early thirties’ is vague. Contrast the predicate ‘is tall or greater-than-or-equal-to exactly four feet in height’ (abbreviated O): this predicate also fails to satisfy Closeness, but intuitively it is not vague. So what is the difference between E and ‘is in his early thirties’, on the one hand, and O, on the other? A first thought is that there is some set of things such that for any objects a and b in this set, it is (non-trivially) the case that if a and b are very similar in E-relevant respects, then Ea and Eb are very similar in respect of truth, while there is no set of things such that for any objects a and b in that set, it is (non-trivially) the case that if a and b are very similar in O-relevant respects, then Oa and Ob are very similar in respect of truth. However this is not correct: there is a subset of the domain of discourse over which O—and other intuitively precise predicates such as ‘is greater-than-or-equal-to exactly four feet in height’—non-trivially satisfies Closeness: the subset consisting of people who are either less than three feet in height or more than seven feet in height. We can avoid this problem—and at the same time avoid the need for a separate non-triviality clause—as follows.
Say that a set S is F-connected iff for any two objects in S, either they are very close in F-relevant respects, or they can be connected by a chain of objects—all of which are in S —with adjacent members of the chain being very close in F-relevant respects. Say that a set S is F-uniform iff for every a and b in S, Fa and Fb are very similar in respect of truth. If a set is not F-uniform, say that it is F-diverse. Say that a predicate satisfies Closeness over a set S iff it satisfies Closeness when the initial quantifiers ‘for any objects a and b’ in Closeness are taken as ranging only over S. Now we arrive at our final definition of vagueness: Vagueness as Closeness A predicate F is vague iff there is some F-connected, F-diverse set S of objects such that F satisfies Closeness over S.
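On a small finite domain, the final definition can be tested by brute force over all subsets. The sketch below is a toy under loudly stated assumptions: the domain is ten abstract height steps numbered 0 to 9, objects count as very close in the relevant respects when they differ by at most one step, and truth values count as very close when they differ by at most 0.25 (a coarse tolerance chosen only so that a ten-element domain is enough). The four degree assignments are hypothetical analogues of ‘tall’, of a sharp cutoff, and of the predicates E and O discussed above.

```python
from itertools import combinations

def subsets(domain):
    """Every subset of a finite domain (exponential; toy domains only)."""
    items = list(domain)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

def is_connected(S, close):
    """F-connected: any two members are linked by a chain inside S whose
    adjacent members are very close in F-relevant respects."""
    if not S:
        return True
    start = next(iter(S))
    seen, frontier = {start}, [start]
    while frontier:
        x = frontier.pop()
        for y in S:
            if y not in seen and close(x, y):
                seen.add(y)
                frontier.append(y)
    return seen == S

def is_diverse(S, truth, close_in_truth):
    """F-diverse: not every pair in S is very similar in respect of truth."""
    return any(not close_in_truth(truth(a), truth(b)) for a in S for b in S)

def closeness_over(S, truth, close, close_in_truth):
    """Closeness with its quantifiers restricted to S."""
    return all(close_in_truth(truth(a), truth(b)) for a in S for b in S if close(a, b))

def is_vague(domain, truth, close, close_in_truth):
    """Vague iff some F-connected, F-diverse set satisfies Closeness."""
    return any(
        is_connected(S, close)
        and is_diverse(S, truth, close_in_truth)
        and closeness_over(S, truth, close, close_in_truth)
        for S in subsets(domain)
    )

DOMAIN = range(10)
close_steps = lambda a, b: abs(a - b) <= 1    # assumed closeness in relevant respects
close_truth = lambda s, t: abs(s - t) <= 0.25  # assumed toy tolerance for truth values

ASSIGNMENTS = {  # hypothetical degree assignments, indexed by height step
    "tall": [0, 0, 0, 0.2, 0.4, 0.6, 0.8, 1, 1, 1],
    "at least step 5": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "E (tall or exactly step 1)": [0, 1, 0, 0.2, 0.4, 0.6, 0.8, 1, 1, 1],
    "O (tall or at least step 1)": [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
}

verdicts = {
    name: is_vague(DOMAIN, values.__getitem__, close_steps, close_truth)
    for name, values in ASSIGNMENTS.items()
}
for name, verdict in verdicts.items():
    print(name, "is vague:", verdict)
```

In this toy, the analogue of E counts as vague because steps 2 to 9 form a connected, diverse set over which Closeness holds, even though Closeness fails over the whole domain. The analogue of O does not, since any connected set containing both a clear non-instance and a clear instance must include the neighbouring pair at which the truth value jumps.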
We can now classify vague predicates as follows. A predicate is totally vague iff it is vague (as just defined), and furthermore satisfies Closeness over every F-connected, F-diverse set of objects. A predicate is partially vague iff it is vague but not totally vague. Thus ‘tall’ is totally vague, E and ‘is in his early thirties’ are partially vague, and ‘is exactly four feet in height’ and O are not vague. Intuitively, these classifications are exactly the right ones.
3.4.5 Fundamental Properties
What of predicates F ascribing fundamental properties—for example, ‘spin up’—where there are no F-relevant respects?⁴⁷ We have two choices: we can say that every x and y are such that $x \overset{F}{\approx} y$, or that no x and y are such that $x \overset{F}{\approx} y$. It makes no difference: either way, any such predicate F will count as non-vague on my definition.⁴⁸ This is a consequence I am happy to accept. I am also happy to accept the consequence of my definition that any predicate which applies equally to everything is non-vague.⁴⁹
3.4.6 Beyond One-Place Predicates
The Closeness definition of vagueness is framed in terms of one-place predicates, but can be generalized.⁵⁰ First, the generalization to many-place predicates. The n-place predicate R is vague if and only if the following holds: if the n-tuples $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$ are very close in R-relevant respects, then $R(x_1, \ldots, x_n)$ and $R(y_1, \ldots, y_n)$ are very close in respect of truth.⁵¹ For example, if (Bill, Ben) and (Bob, Maisy) are very close in respects relevant to whether the first-mentioned person loves the second-mentioned person, then ‘Bill loves Ben’ and ‘Bob loves Maisy’ are very close in respect of truth. Second, the account applies not just to predicates, but to their worldly counterparts: namely properties and relations; see the formulation of Closeness at the end of
⁴⁷ Possessing the property F itself is not counted as an F-relevant respect.
⁴⁸ Case (i): $\neg\exists x, y\,(x \overset{F}{\approx} y)$. Then there are no F-connected sets. Case (ii): $\forall x, y\,(x \overset{F}{\approx} y)$. So every set is F-connected. Now pick an arbitrary F-diverse set of objects. By definition of F-diversity, there are a and b in this set such that Fa and Fb are not very similar in respect of truth. But by hypothesis of this case, a and b are very similar in F-relevant respects. So F does not satisfy Closeness over this set.
⁴⁹ If F applies equally to everything, then there are no F-diverse sets.
⁵⁰ This is an important advantage over some other proposed definitions—e.g. Greenough’s, which is limited (a) to predicates which (b) exhibit only one-dimensional vagueness.
⁵¹ Thanks to Gideon Rosen for this improvement of my original formulation.
§3.4.2. Finally, what about the idea that there might be vague objects? I argue elsewhere that for there to be vague objects is for certain special properties or relations to be vague: for example, the part–whole relation, the existence-at-a-world relation (which holds between an object and the worlds at which it exists),⁵² and the occupation relation (which holds between an object and the spacetime points it occupies).⁵³ Therefore the Closeness definition applies, by extension, to vague objects: their vagueness is a matter of the vagueness of certain properties and relations, and the vagueness of these is handled by the Closeness definition. The Closeness definition does not cover everything to which the term ‘vague’ has ever been applied. I regard this as a virtue, not a vice. As discussed in §3.1, a definition of vagueness should get to the essence of the phenomenon that philosophers have been investigating under the heading ‘vagueness’. It should classify as vague something which has at some time been called ‘vague’ by someone only if that thing is relevantly like central examples of the core phenomenon under investigation. Weatherson (2006, 3, 11) objects to the Closeness definition on the grounds that it does not seem to apply to predicate modifiers such as very—i.e. it does not classify them as vague or as non-vague. Weatherson is not arguing that ‘very’ is vague; but he does think that the question whether it is vague or not is a good one, and that it is a requirement on a definition of vagueness that it allow the question to be asked. Now at first sight it might indeed seem as though ‘very’ could well be vague in the sense in which we are interested, and so should be handled by my account. After all, isn’t ‘very tall’ vague in the same sort of way as ‘tall’? Yes, the latter seems true—but it does not imply the former.
We have intuitions concerning the vagueness of ‘tall’, the vagueness of ‘very tall’, and perhaps the relationship between these (‘very tall’ might perhaps seem to be less vague than ‘tall’). But do we have any intuition concerning the vagueness of ‘very’, all by itself? I don’t; contra Weatherson, I cannot even make clear sense of the question whether ‘very’ (by itself) is vague. The genuine question in this area seems to be: What in general is the relationship between the vagueness of F and the vagueness of ‘very F’? The Closeness definition has the resources to ⁵² See Smith 2005a.
⁵³ But not the identity relation; see Smith 2008.
make sense of this question. Is there a separate question as to whether ‘very’ by itself is vague? I think not, and so I am not concerned if the Closeness definition does not have the resources to make sense of such a question.
3.5 The Advantages of Closeness Having presented a definition of vagueness, I shall now present the positive case for accepting it as correct. The argument proceeds by showing that predicates which, by supposition, conform to the Closeness definition behave in just the ways that vague predicates actually do behave—across a range of cases and to a rich level of detail. This gives us reason to think that vague predicates have the same underlying nature as predicates which conform to the Closeness definition—i.e. it gives us reason for thinking that Closeness is the correct definition of vagueness. (As noted at the end of §3.4.1, the case for Closeness will not have been completely made until we show, in Chapter 4, that alternative definitions, which might be proposed by non-degree theorists in light of my discussion of Closeness, do not share these advantages.) 3.5.1 Tolerance Intuitions without Incoherence Wright has identified a certain position in the philosophy of language which he calls the governing view. It has two parts. First, there is the thesis that mastery of a language consists in the internalization of a set of semantic and syntactic rules that are definitive of that language. Second, there is the thesis that masters of a language can gain an explicit knowledge of the rules of which—according to the first thesis—they have an implicit understanding, by reflecting on such things as speakers’ known limitations, for example of perception and memory; standardly accepted criteria of misunderstanding a given expression; the generally accepted purpose of certain expressions; and the standard ways in which new users are trained to use certain terms. Wright argues that if the governing view is correct, then vague predicates are tolerant, where a predicate F is tolerant with respect to φ if there is some positive degree of change in respect of φ that things may undergo, which is ‘‘insufficient ever to affect the justice with which F is applied to
a particular case’’ (Wright 1975, 334). In present terms, we can express a close relative of this idea as follows:

Tolerance: If a and b are very close in F-relevant respects, then ‘Fa’ and ‘Fb’ are identical in respect of truth.⁵⁴

The great problem with Tolerance is that, when conjoined with the claim that we can construct a Sorites series for the predicate F, it leads to contradiction (in particular, to the claim that each object in the Sorites series both is and is not F). Nevertheless, Tolerance itself—considered apart from its unpalatable consequences—has been found very appealing: whether or not they accept the full-blown governing view, many have been strongly inclined to accept Wright’s use of this view in deriving the conclusion that vague predicates are tolerant. In fact, all the considerations in favour of the idea that vague predicates conform to Tolerance are, equally, considerations in favour of the idea that vague predicates conform merely to Closeness (and not Tolerance as well—note that Tolerance is a special case of Closeness); and Closeness (without Tolerance) generates no contradictions.⁵⁵ Thus one of the great advantages of Closeness is that it gives us tolerance intuitions without incoherence. Let us see how this works.

Wright’s argument that, given the governing view, vague predicates are tolerant, proceeds by way of examples.⁵⁶ First, we have the case of ‘heap’. If we look at the occasions of use of this predicate, it seems it must be tolerant with respect to small changes in number of grains. For ‘heap’ is ‘‘essentially a coarse predicate’’: we use it on the basis of ‘‘rough and ready judgement’’, and in these contexts would have no use for ‘‘a precisely demarcated analogue’’.
⁵⁴ That is, they have the same truth value, or they both lack a truth value. Note that where Wright is concerned with applying predicates, Tolerance is concerned with the truth of sentences which express such applications; this difference has no significance for what follows.
⁵⁵ If a Sorites series consists of objects x₁, …, xₙ, then Closeness tells us that Fxᵢ and Fxᵢ₊₁ must always be very similar in respect of truth—which is quite compatible with Fx₁ being true simpliciter and Fxₙ being false simpliciter. Note, however, that given bivalence, Closeness reduces to Tolerance. So Closeness plus bivalence plus a Sorites series leads to contradiction. This issue will be discussed in detail in Ch. 4.
⁵⁶ I shall discuss the examples in Wright 1975, 331–8 (see also Forbes 1983, 239–41 for a hearty endorsement of Wright’s views). Wright 1987, §VII introduces a number of qualifications, and draws (somewhat) less far-reaching conclusions.
In short, ‘‘Our conception of the conditions which justify calling something a heap of sand is such that the justice of the description will be unaffected by any change which cannot be detected
by casual observation’’ (p. 335). Second, we have the case of predicates such as ‘infant’, ‘child’, ‘adolescent’, and ‘adult’. If we look at the social importance of these predicates, we see that they must be tolerant with respect to small changes in maturity. For ‘‘it would be irrational and unfair to base substantial distinctions of right and duty on marginal—or even non-existent—such differences’’ (pp. 336–7). Third, we have the case of colour terms. If we look at the means of acquisition of these predicates, we see that they must be tolerant with respect to small changes of shade. We learn such terms by ostension, so it must be the case that ‘‘changes too slight for us to remember . . . never transform a case to which such a predicate applies into one where such is not definitely correctly the right description. The character of . . . basic colour training . . . presupposes the total memorability of the distinctions expressed by our basic colour predicates; only if single, unmemorable changes of shade never affect the justice of a particular, basic colour description, can the senses of these predicates be explained entirely by methods reliant upon our capacity to remember how things look’’ (p. 336). Overall, the idea is that if we pay attention to the sorts of consideration which the second thesis of the governing view deems relevant to determining the meanings of our terms, we must conclude that vague predicates are tolerant. Wright sums up the lesson of the examples as follows: ‘‘Our embarrassment about where to ‘draw the line’ with these examples is . . . a reflection . . . of the tolerance of the predicates in question’’ (p. 337). However, in all cases, the facts cited support not only the view that vague predicates are tolerant, but also the view that vague predicates conform merely to Closeness (and not Tolerance). Consider the first example. 
If ‘heap’ is a predicate of casual observation, then certainly there cannot be a difference of just one grain between a thing to which ‘heap’ clearly applies and a thing to which ‘heap’ clearly does not apply, for such a difference would not be noticeable to casual observers. It could be the case, however, that a negligible or insignificant difference (say, of one grain) between two objects makes a negligible or insignificant difference (i.e. one which we are entitled to ignore for all practical purposes) to the applicability of the word ‘heap’, and also that many insignificant differences add up to a significant one: this does not conflict with casual observationality, because many insignificant differences put together are noticeable to casual observers. The crucial point here is this: in order for a predicate to be usable
in a context of casual observation, there does not need to be no difference in application of the predicate to objects that cannot be told apart by casual observation—there just needs to be no such difference in application that cannot be safely ignored in the context of casual observation. Of course, negligible differences in application add up and cannot be ignored—but that is not a problem, because negligible differences in the objects to which they apply also add up, and can be detected by casual observation.⁵⁷ Similarly, the social importance of predicates of degree of maturity is incompatible with there being a heartbeat that makes the difference between someone to whom ‘child’ clearly applies and someone to whom ‘adult’ clearly applies. It is not, however, incompatible with a negligible or insignificant difference (say, of one heartbeat) between two persons making a negligible or insignificant difference (i.e. one which we are entitled to ignore for all practical purposes) to the applicability of the word ‘child’, or with many heartbeats making a significant difference. Thus in these first two cases Wright is unwarranted in saying ‘‘Our embarrassment about where to ‘draw the line’ with these examples is . . . a reflection . . . of the tolerance of the predicates in question’’—for this embarrassment could just as well be a reflection of the fact that these predicates conform to Closeness (and not Tolerance). What about the third example? Could the use of colour predicates be taught ostensively if these predicates were not tolerant, but merely conformed to Closeness? Yes.
⁵⁷ Recall the discussion in §3.4.3 of the idea of defining a vague predicate as one whose characteristic function is continuous. We can now see a further reason why the continuity proposal is inferior to the Closeness proposal. The predicate ‘is six* feet in height’, whose characteristic function assigns 0 to persons whose height differs from 6 ft by 1 nm or more, 1 to persons whose height is exactly 6 ft, and in the intervals (6 ft − 1 nm, 6 ft) and (6 ft, 6 ft + 1 nm) changes smoothly from 0 to 1 and from 1 to 0 respectively—and hence is vague according to the continuity definition, but non-vague according to the Closeness definition—is just as unusable from the point of view of casual observation as the precise predicate ‘is exactly six feet in height’. The fact—which presumably underlies their ubiquity—that vague predicates are essentially useful in contexts of rough and ready observation is thus intimately linked to the inclusion in their definition of an absolute notion of similarity (where two things are very similar in some respect precisely if they are not distinguished in that respect in contexts of casual observation) which would be out of place in a highly general, abstract definition of a mathematical concept such as continuity.
Wright notes that we could not learn colour words ostensively if we did not have the capacity to remember how objects look—but he ignores another of our capacities, which is equally crucial to our being able to learn colour words ostensively: we are able to discern a structure of relationships of closeness in colour-relevant respects on a
given set of objects. Children, for example, will quite naturally sort their coloured pencils into a rainbow pattern, without instruction, and without an actual rainbow pattern to copy: they can just see that red and orange are closer together than red and green (that is, they are closer together in respect of colour, and hence belong closer together in space, when the pencils are sorted in their tin). Now, the ostensive teaching of colour words can proceed by the indication of paradigms and foils for each colour, with the (implicit) instruction that ‘red’, for example, applies to an object in proportion to its similarity to the red paradigms. Thus, if an object is very close to a red paradigm, it is red, or as good as for all practical purposes; if it is very close to an orange paradigm, it is orange, or as good as for all practical purposes; and if it is somewhere between red and orange, then it is to some extent red and to some extent orange. This account of the learning of colour terms in fact fits perfectly with our actual practice with these terms. We do not know how to describe the colour of every object we see, just as the story I have told would predict. The instructions do not fix a particular way of describing every object we might come across: they fix what to say about things very close to a paradigm, but what to say about other objects is more open—although it is not totally open. Things would not work the way they actually do if an undetectable change from a red paradigm could make it false to say ‘That’s red’; but things would work exactly as they do now if such negligible changes made it negligibly less true to say ‘That’s red’. Thus, predicates which satisfy Closeness (but not Tolerance) would exhibit precisely the features Wright notes. 
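The paradigm-and-foil story just told can be made concrete with a small sketch. It is an illustration only, not part of the account itself: it assumes that a colour can be represented by a single hue angle in degrees, that the red and orange paradigms sit at hues 0 and 30, and that the degree to which a colour word applies falls off linearly with distance from its paradigm. The names `hue_distance`, `degree_of`, `degree_red` and `degree_orange`, and all the numbers, are hypothetical.

```python
# Illustrative sketch only: hue angles, paradigm positions and the linear
# fall-off are assumptions made for this example, not claims from the text.

RED_PARADIGM = 0.0      # assumed hue (degrees) of a red paradigm
ORANGE_PARADIGM = 30.0  # assumed hue (degrees) of an orange paradigm


def hue_distance(a: float, b: float) -> float:
    """Distance between two hues on a 360-degree colour wheel."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def degree_of(hue: float, paradigm: float, foil: float) -> float:
    """Degree to which a colour word applies to a hue.

    1 at the paradigm, falling linearly to 0 once the hue is as far
    from the paradigm as the foil is.
    """
    span = hue_distance(paradigm, foil)
    return max(0.0, 1.0 - hue_distance(hue, paradigm) / span)


def degree_red(hue: float) -> float:
    return degree_of(hue, RED_PARADIGM, ORANGE_PARADIGM)


def degree_orange(hue: float) -> float:
    return degree_of(hue, ORANGE_PARADIGM, RED_PARADIGM)


# One-degree steps of hue from the red paradigm to the orange paradigm.
hues = [float(h) for h in range(0, 31)]
reds = [degree_red(h) for h in hues]

# Largest change in the degree of 'red' produced by any single small step.
largest_step = max(abs(a - b) for a, b in zip(reds, reds[1:]))
```

On these assumptions no single one-degree change of hue alters the degree of ‘red’ by more than about one thirtieth, yet thirty such changes lead from a clear case of red to a clear case of non-red, and a hue midway between the paradigms is to some extent red and to some extent orange.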
Wright’s examples do not, then, establish that vague predicates are tolerant: the examples are just as compatible with the view that vague predicates are intolerant (small changes need not make no difference) but conform to Closeness (small changes never make a big difference). Given also that we believe that some things are red and some not, that some men are bald and some not, and so on, one conclusion we might draw is that vague predicates are tolerant, and subject to inconsistent requirements of use. Of course, the more reasonable conclusion to draw is that vague predicates are not subject to inconsistent requirements: they do not conform to Tolerance; they simply conform to Closeness. Accepting Closeness takes us far enough along the road to Tolerance to capture the intuitions which Wright uses to motivate Tolerance, without taking us so far as to run into
contradiction. Closeness is everything that is right about Tolerance, and nothing more.

3.5.1.1 Observational Predicates

Wright has a further argument from the second thesis of the governing view to the conclusion that colour predicates are tolerant. The argument turns on the claim that such predicates are observational, and generalizes to all observational predicates.⁵⁸ Wright (1975, 338) writes: ‘‘The information of one or more senses is decisive of the applicability of an observational concept.’’ Thus, in the case of colour predicates, whether or not such a predicate applies to an object can be determined just by looking: one does not need to use chemical analysis, instruments, or so on. Wright now argues as follows:

Since colour predicates are observational, any pair of objects indistinguishable in point of colour must satisfy the condition that any basic colour predicate applicable to either is applicable to both. It is, however, familiar that we may construct a series of suitable, homogeneously coloured patches, in such a way as to give the impression of a smooth transition from red to orange, where each patch is indiscriminable in colour from those immediately next to it. . . . So, since precise matching is to be sufficient for sameness of colour, we can force the application of ‘red’ to all the patches in the series, some of which are not red but orange. That is: since ‘red’ is observational, its sense must be such that from the premises, that x is red and that x looks just like y, it follows that y is red, no matter what objects x and y may be. This rule enables us to conclude that each successive patch in our series is red, given only the true premise that the first patch is red. (1975, 338–9)
The problem with this argument is that Wright conflates precise matching and indistinguishability. Certainly, if colour predicates are observational, then they must apply equally to objects which are indistinguishable by ordinary observational means. They need not, however, apply equally to objects which precisely match. This is because objects which precisely match (i.e. which cannot be distinguished in colour on the basis of direct comparison) need not be observationally indistinguishable. For ordinary observational means of distinguishing two objects extend beyond direct comparison of them: one can compare each with a third object, compare them under different coloured lights, and so on.
⁵⁸ Wright’s argument here has roots in Dummett 1997 [1975] and in Russell 1997 [1923], 64.
What about the predicate ‘looks red’, as opposed to ‘is red’? Here there are two responses that we might make. First, there is the response that parallels the one given in the case of ‘is red’. Just as ‘is red’ need not apply to exactly the same extent to objects which match precisely, if these objects are distinguishable by ordinary observational means, so too for ‘looks red’. But this option is not quite as attractive in the case of ‘looks red’: it is more natural to say that adjacent patches in the Sorites series are ever so slightly different in respect of redness than it is to say that they look ever so slightly different. However, there is a second possible response, based on Raffman (2000). We need not deny that if two objects match, then when they are viewed side by side in direct comparison, they look the same (in respect of colour). Rather, we claim that whether or not two things look the same (and likewise, whether or not they look red) depends upon the context in which they are viewed. Viewed one pair at a time, adjacent patches in the Sorites series look the same, and ‘looks red’ applies to each to exactly the same extent. Viewed together in series, the first and last patches are easily distinguishable in colour: the first patch looks red and the last patch looks orange. No problem: for when viewed in the context of the whole series, it is not the case that both members of every adjacent pair look the same, and it is not the case that ‘looks red’ applies to each to exactly the same extent. Do not try to focus on two adjacent patches that do not look exactly alike, however: as soon as you look at the pair alone, its two component patches do look exactly the same as one another. 
Whilst it is not plausible to suggest that the context of viewing alone might affect the colour of an object, it is plausible to suggest that the context of viewing affects the look of an object: the look of a thing depends not only on the lighting and so on, but upon what else is in the visual field when the thing is observed. Thus, the fact that adjacent patches look the same when viewed alone provides no basis for concluding that if the first patch in the series looks red, then so does the last.

3.5.2 Blurred Boundaries

In the next three sections I show that predicates which conform to Closeness have the three features which we have taken jointly to provide a surface characterization of vagueness. The Closeness definition may be regarded as a spelling-out of Frege’s blurred boundary metaphor. Closeness is a constraint upon the relationship
between the extension or characteristic function of a vague predicate and the absolute closeness relationships associated with that predicate: the extension cannot impose a big difference in F-ness between two objects which are not very different in F-relevant respects. Now assuming that for any predicate and any possible object, there is another possible object that is very close to the first in F-relevant respects,⁵⁹ consider the set of all possible objects, structured by relationships of closeness in F-relevant respects. Given Closeness, the extension of F amongst this set cannot consist in a sharp line between the F’s and the non-F’s: rather, F-ness must gradually fade away as one travels further from the definite F objects. To take a concrete example, consider the term ‘red’, and suppose that it conforms to Closeness. This term does not cut a sharp band out of the rainbow: as one moves across the points of the rainbow, small steps in red-relevant respects—which in this case correspond to small steps in space—can never, given Closeness, make for big changes in the truth of the claim that the point one is considering is red. By small steps one can move from full-fledged red points to full-fledged non-red points: but there is no sharp boundary between them that can be crossed in one small step. Thus Closeness yields an explanation of the blurred boundaries phenomenon.

3.5.3 Borderline Cases

A predicate which satisfied Closeness would admit of borderline cases. Consider a predicate F, which conforms to Closeness, and a Sorites series x₁, …, xₙ for F. Fx₁ is true, and Fxₙ is false; but given Closeness, it cannot be that there is an i such that Fxᵢ is true and Fxᵢ₊₁ is false (for in a Sorites series for F, adjacent items are always very close in F-relevant respects—and a sentence which is true is not very close in respect of truth to one which is false).
There must then be sentences Fxᵢ which are neither true nor false—and the corresponding objects xᵢ are borderline cases for F. Thus, if we characterize vagueness in terms of Closeness, we can see why vague predicates admit of borderline cases, without being committed to the false converse claim that every predicate which admits of borderline cases is vague.
⁵⁹ See the end of §3.5.4 for further discussion.
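The step from Closeness to borderline cases can be set out a little more formally. What follows is a sketch under assumptions that go beyond what has been said so far: truth values are modelled as real numbers in [0, 1], with 1 for true and 0 for false, and ‘very close in respect of truth’ is read as ‘differing by at most some fixed ε < 1’. The bracket notation [Fx_i] for the truth value of Fx_i is introduced here purely for illustration.

```latex
\begin{align*}
&\text{Assume } [Fx_1] = 1,\quad [Fx_n] = 0,\quad
  \bigl|[Fx_i] - [Fx_{i+1}]\bigr| \le \varepsilon < 1 \text{ for } 1 \le i < n. \\
&\text{Let } k = \min\{\, i : [Fx_i] < 1 \,\}.
  \text{ This exists, since } [Fx_n] = 0, \text{ and } k \ge 2, \text{ since } [Fx_1] = 1. \\
&\text{Then } [Fx_{k-1}] = 1, \text{ so } [Fx_k] \ge 1 - \varepsilon > 0. \\
&\text{Hence } 0 < [Fx_k] < 1:\ Fx_k \text{ is neither true nor false, and } x_k \text{ is a borderline case for } F.
\end{align*}
```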
3.5.4 Sorites Paradoxes

Sorites arguments have two striking and (in combination) perplexing features: they are compelling, and yet we do not accept them. We find ourselves inclined to accept the premisses and the reasoning, but we will not accept the conclusion. We are taken in, but not convinced. Giving rise to Sorites paradoxes—which have this perplexing combination of features—is one of the characteristics of vague predicates. As discussed near the beginning of this chapter, a definition of vagueness should yield an understanding of why vague predicates have this characteristic. One of the key advantages of the Closeness definition is that it yields such an understanding. That is, if we suppose that a predicate conforms to Closeness, we can see both why a Sorites paradox for this predicate is compelling, and also how the paradox is mistaken. Consider a standard version of the Sorites paradox:

1. This 10,000-grain pile of sand is a heap.
2. If we remove one grain of sand from a heap, we still have a heap.
3. So even if we removed 10,000 grains from this heap, we would still have a heap.

In general, say that a Sorites series for a predicate F is a series of objects which begins with an object which is F and ends with an object which is not F, and in which adjacent items are very close in F-relevant respects. Given a Sorites series for F, a typical formulation of a Sorites paradox for F will be as follows:

1. The first object in the series is F.
2. For any object in the series (except the last), if it is F, then so is the next object.
3. Therefore the last object in the series is F.

Distinguish two readings of the second premiss: the Closeness reading and the Tolerance reading. On the Closeness reading, the second premiss expresses (a consequence of) the claim that ‘heap’ (or in general F) conforms to Closeness.
Two piles of sand a and b which differ by just a grain are very similar in heap-relevant respects, thus the two claims ‘a is a heap’ and ‘b is a heap’ must be very similar in respect of truth. So if a is a heap, then for all
practical purposes we can just say that b is a heap too. On the Tolerance reading, the second premiss expresses (a consequence of) the claim that ‘heap’ (or in general F) conforms to Tolerance. Two piles of sand a and b which differ by just a grain are very similar in heap-relevant respects; thus the two claims ‘a is a heap’ and ‘b is a heap’ must be exactly the same in respect of truth. So if a is a heap, then b is a heap too: not just for all practical purposes, but without qualification. On the Tolerance reading, the conclusion follows from the premisses. On the Closeness reading, it does not: each successive statement ‘this is a heap’ (said after removing one grain) must be very similar in respect of truth to the one before, but need not be exactly the same in respect of truth; and so by the end, the final statement may be simply false. Suppose we accept that a predicate F satisfies Closeness, but not Tolerance. The very fact that we accept Closeness will mean that in many ordinary circumstances, we act as if we believe Tolerance. For Closeness tells us that a negligible or insignificant difference between a and b in F-relevant respects makes for at most a negligible or insignificant difference between Fa and Fb in respect of truth—and a negligible or insignificant difference is one which we are entitled to ignore for practical purposes. So for practical purposes, when there is a negligible or insignificant difference between a and b in F-relevant respects, we will simply ignore any difference between Fa and Fb in respect of truth, and so treat them as being identical in respect of truth. This practice is licensed by our acceptance of Closeness. This explains why someone who accepts Closeness, but not Tolerance, will find the Sorites paradox compelling.
Of course, she accepts the first premiss—everyone does.⁶⁰
⁶⁰ Well, some philosophers reject it, e.g. Unger 1979a, b.
She accepts the second premiss—taken as an expression of Tolerance—because, as we have seen, someone who accepts Closeness will in ordinary circumstances regard Tolerance as an acceptable (and indeed useful: we needlessly clutter up our thought if we do not ignore negligible things) approximation of what she accepts. But then the unacceptable conclusion follows: for the argument is valid when the second premiss is read as expressing Tolerance. So that is why the paradox has force. But we can also explain why someone who accepts Closeness, but not Tolerance, will regard the paradox as ultimately mistaken, even though initially compelling. Tolerance is an acceptable approximation of Closeness
only in ordinary circumstances—and a Sorites series is not something we ordinarily encounter. When the person who accepts Closeness, but not Tolerance, encounters a Sorites series and runs through the Sorites reasoning, she will then retreat from Tolerance—which is an approximation of her real belief—to Closeness—which is what she actually accepts—thereby avoiding the unpalatable conclusion (for the Sorites argument is invalid when the second premiss is read as expressing Closeness). Consider an analogy.⁶¹ We do not believe that dust particles are weightless: we believe that the weight of a dust particle is negligible. But this very belief licenses us to accept the claim that dust particles are weightless as a useful approximation of our real belief, in ordinary circumstances. Given that the weight of a dust particle is negligible, we do well to ignore it! We do not demand that the delicatessen assistant remove all specks of dust from the scale arms before weighing our smallgoods, and we do not wash and dry our hair (to remove all dust particles) before weighing ourselves. Nevertheless, the claim that dust particles are weightless is revealed as a mere approximation to our real belief—and not something we actually believe—in certain situations: for example, when we are arranging to empty the bag from the dust extraction system at our carpentry shop, which weighs 85 kg when full (of nothing but dust particles). The claim that a dust particle weighs nothing is a useful approximation to our true belief, except when we come across many dust particles together, at which time we see clearly that the claim is just an approximation to what we really believe, which is that the weight of a dust particle is very very small. 
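The arithmetic behind the analogy can be checked in a few lines. The sketch assumes, purely for illustration, that every dust particle has a mass of one microgram and that the whole 85 kg is dust (the bag itself is ignored). Only the 85 kg figure comes from the example; the per-particle mass and the count of specks on the scale are hypothetical.

```python
# Illustrative arithmetic only. The per-particle mass is an assumed figure.

PARTICLE_MASS_UG = 1   # assumed mass of one dust particle, in micrograms
UG_PER_KG = 10**9      # micrograms per kilogram
BAG_MASS_KG = 85       # mass of the full bag, from the example in the text

# How many such particles would make up the full bag?
particles_in_bag = BAG_MASS_KG * UG_PER_KG // PARTICLE_MASS_UG

# The approximation: treat each particle as weightless.
approx_total_kg = particles_in_bag * 0 / UG_PER_KG

# The real belief: each particle has a very small but non-zero weight.
actual_total_kg = particles_in_bag * PARTICLE_MASS_UG / UG_PER_KG

# A handful of specks on a scale: here the approximation is harmless.
specks_on_scale = 100  # assumed number of specks
error_on_scale_kg = specks_on_scale * PARTICLE_MASS_UG / UG_PER_KG
```

With a hundred specks on the scale the two beliefs differ by a ten-millionth of a kilogram, which no one would notice; with the full bag they differ by all 85 kg.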
⁶¹ Thanks to John Cusbert for first suggesting a similar analogy in this context, and for helpful feedback on the material in this section.
Similarly, the claim that the predicate ‘heap’ applies equally to two things which differ negligibly in heap-relevant respects is a useful approximation of the real belief of someone who accepts Closeness (but not Tolerance), except when she encounters many pairs of things which differ negligibly in heap-relevant respects put together—i.e. a Sorites series—at which time it becomes clear that the claim is just an approximation of what she really believes, which is that the difference in truth value between ‘x is a heap’ and ‘y is a heap’ is at most very small, when x and y are very similar in heap-relevant respects. In sum, then, the Closeness definition yields an explanation of both how Sorites paradoxes are mistaken and why they are nevertheless compelling: if
we believe Closeness, but not Tolerance, we will thereby be licensed to accept Tolerance as a useful approximation of our real belief. Being thus accustomed to working with Tolerance as a useful approximation, we will be inclined to accept the second premiss of the Sorites reasoning, when it is read as saying that ‘heap’ conforms to Tolerance. That—together with the fact that the reasoning is valid when the second premiss is given the Tolerance reading, and the fact that the first premiss is obviously true—explains the force of the paradox. However, the reasoning itself shows us that this is one of those contexts in which the approximation is inappropriate, and needs to be replaced by our real belief: that ‘heap’ conforms to Closeness (not Tolerance). And with the second premiss of the Sorites reasoning read as saying that ‘heap’ conforms to Closeness, the argument is invalid. That explains why the paradox is mistaken. I have shown that someone who accepts that F satisfies Closeness (and not Tolerance) will initially be taken in by a Sorites paradox for F, but will not ultimately be convinced of the conclusion. This behaviour matches perfectly the reactions of ordinary speakers to Sorites paradoxes involving vague predicates—and this gives us reason to believe that ordinary speakers do accept that vague predicates satisfy Closeness (and not Tolerance). As one presents the paradox, a typical audience agrees that removing a grain of sand from a heap leaves a heap, that removing a hair from a hirsute man leaves him hirsute, and so on. That is, they assent to claims of tolerance on the part of the vague predicates in question. This is compatible with the hypothesis that they really believe that vague predicates are tolerant—but it is also compatible with the hypothesis that they do not really believe this, but assent to it as a useful approximation of what they really believe, which is that vague predicates conform to Closeness.
When one then says, ‘‘But then, by your reasoning, one grain of sand is a heap!’’, the audience invariably baulks. They say we can remove one grain, or two grains, or even quite a few grains, but we cannot go on removing grains indefinitely and still have a heap. So they accept the major premiss of the Sorites paradox, but then deny commitment to the conclusion that all men are bald (etc.). Unless we suppose that our audience is simply very confused—and we should not suppose that, unless we can find no better explanation of what they say—we must conclude that what they really believe is not Tolerance, but Closeness, and that they accepted the major premiss of our argument not as a literal truth, but as a useful approximation to one.
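The difference between the two readings of the second premiss can be illustrated numerically. The sketch below is not a model to which the argument so far is committed: it assumes that truth comes in degrees between 0 and 1, that the degree to which an n-grain pile is a heap is simply n / 10,000, and that ‘very similar in respect of truth’ means ‘differing by at most 0.001’. The function names and the threshold are hypothetical.

```python
# Illustrative sketch only: the linear degree function and the threshold
# EPSILON are assumptions made for this example.
from typing import Sequence

GRAINS = 10_000   # size of the initial pile, as in the standard paradox
EPSILON = 0.001   # assumed bound on 'very similar in respect of truth'


def heap_degree(grains: int) -> float:
    """Assumed degree of truth of 'this is a heap' for a pile of this size."""
    return grains / GRAINS


def satisfies_closeness(degrees: Sequence[float], epsilon: float = EPSILON) -> bool:
    """Adjacent claims are very similar in respect of truth."""
    return all(abs(a - b) <= epsilon for a, b in zip(degrees, degrees[1:]))


def satisfies_tolerance(degrees: Sequence[float]) -> bool:
    """Adjacent claims are exactly the same in respect of truth."""
    return all(a == b for a, b in zip(degrees, degrees[1:]))


# The Sorites series: remove one grain at a time, from 10,000 grains to none.
series = [heap_degree(g) for g in range(GRAINS, -1, -1)]

# What Tolerance would force: every claim has the same value as the first.
tolerant_series = [series[0]] * len(series)
```

On these assumptions `series` satisfies Closeness but not Tolerance: it begins at 1 and ends at 0 although no single step changes the value by more than 0.0001. The only series that begins at 1 and satisfies Tolerance is `tolerant_series`, in which the claim that no grains at all make a heap also receives the value 1.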
Contrast the proposal that vague predicates conform to Tolerance. Someone who accepted that F satisfies Tolerance would find the Sorites argument for F compelling all right: so compelling that she would have to regard it as a sound argument which establishes its conclusion. But of course that response does not match the reaction of ordinary speakers to actual Sorites paradoxes involving vague predicates. In Chapter 4 we shall see examples on the other side: views about what vagueness is which can explain why Sorites paradoxes are mistaken but not why they are compelling. Only the Closeness definition can run the gauntlet here, explaining why vague predicates give rise to Sorites arguments which are at the same time compelling yet unconvincing. Note that the definition of vagueness in terms of Closeness leaves open the possibility of a predicate which is vague but not Sorites-susceptible. For what we have shown is that if S believes that F conforms to Closeness, then given a Sorites series for F, we can construct a Sorites paradox for F which S will find compelling (and mistaken). But sometimes it is difficult to imagine a Sorites series for F: a series of (possible) things ranging from one which is F to one which is non-F, with adjacent items in the series being very close in F-relevant respects. This is the possibility that Soames (1999, 217) has in mind when he denies that all vague predicates are Sorites predicates, and as he says, in this case ‘‘compelling versions of the Sorites paradox are much harder to construct’’.⁶² Nevertheless, they can usually be constructed by artificial devices. For example, imagine beginning with a paradigm F object, and removing one microscopic speck of matter at a time until eventually one reaches an object that is non-F (Unger 1979a). 
⁶² See also Shapiro 2006, 4.
Alternatively, begin with a paradigm F object a and a paradigm non-F object b, and take as the intermediate objects in the series the things depicted by the frames of a piece of movie footage which consists in a slow-motion morph from a to b. These ideas do not cover every possible case, but there are other tricks that can be employed. Thus I think that vagueness and Sorites-susceptibility are closely bound together. Nevertheless, Sorites-susceptibility does not belong in the definition of vagueness. The Closeness definition puts Sorites-susceptibility in its proper place: a symptom, rather than a constitutive feature, of vagueness. For if a Sorites series exists for a predicate F, then supposing that F conforms to Closeness, we can
explain why the associated Sorites argument is compelling (and mistaken). However, if F is not susceptible to a Sorites paradox, this does not mean that F is not vague: it might be that F is vague, but there is no readily imaginable Sorites series for F.⁶³

3.5.5 Higher-Order Vagueness

A prominent issue in the vagueness literature is that of higher-order vagueness. (I said in Chapter 2 that in my view, two distinct issues have been conflated in the literature under this heading. One issue concerns how the meanings of vague predicates could be determined by our usage—the location problem. This problem was discussed at several points in Chapter 2, and will be discussed further in Chapter 6. In the present section I am talking about the other issue—the jolt problem.) This issue arises from the view that a predicate is vague if it has borderline cases. We begin by saying that a vague predicate divides objects into three sets: the positive cases, the negative cases, and the borderline cases. But then, as Sainsbury (1991, 168) notes, ‘‘it soon appears that the idea that there is a sharp division between the positive cases and the borderline ones, and between the borderline cases and the negative ones, can no more be sustained than can the idea that there is a sharp division between positive and negative cases’’. So we now posit two new sorts of borderline case: between the original borderline cases and the positive cases, and between the original borderline cases and the negative cases, thus dividing objects into five sets. We may now generalize in an obvious way: in Sainsbury’s terminology, a predicate is vagueₙ if it divides objects into 2ⁿ + 1 sets; a vague predicate is vagueₙ for some n > 0; a higher-order vague predicate is vagueₙ for some n > 1; and a radically higher-order vague predicate is vagueₙ for all n. As discussed in §3.2, merely possessing borderline cases is not enough to make a predicate vague.
Hence Sainsbury’s comment quoted in the previous paragraph, and later on in the same paper: ‘‘it is a theoretical possibility that there be predicates which are vague₁ without being higher-order vague. Arguably, some predicates meet this condition, but, intuitively, meeting it is inconsistent with being a paradigm of vagueness’’ (1991, 169; see also p. 173). On a similar note, Fara and Williamson (2002, p. xxii) write: ‘‘For those unwilling to accept epistemicism, it might seem that vagueness just is higher-order vagueness.’’⁶⁴ It seems, then, that higher-order vagueness as just described is inaptly named: a predicate which is—according to the conception just outlined—vague without being higher-order vague is not, intuitively, vague at all. What is going on here is that a poor characterization of vagueness is accepted (i.e. vagueness as possession of borderline cases), and then our intuitive reservations about the characterization are given outlet in the positing of an additional phenomenon, over and above mere vagueness—i.e. higher-order vagueness. (This process can be seen, for example, in Fine 1997 [1975]. Fine asserts that the natural number predicate ‘nice₁’ is vague, where the meaning of this predicate is given by the two clauses (i) n is nice₁ if n > 15 and (ii) n is not nice₁ if n < 13, and writes that ‘‘A predicate is . . . vague if it has borderline cases’’ (p. 120). He then devotes §5 of his paper to a discussion of higher-order vagueness, which he says is a ‘‘distinctive feature of vagueness’’ (p. 140).) The proper thing to do would be to accommodate these residual intuitions within the definition of vagueness itself. That is exactly what the Closeness definition does.
⁶³ Note that if a predicate F is vague according to the Closeness definition, then there is an F-connected, F-diverse set S of objects such that F satisfies Closeness over S. But an F-connected, F-diverse set S of objects does not automatically yield a Sorites series for F. An F-diverse set S is one for which it is not the case that for every a and b in S, Fa and Fb are very similar in respect of truth. Thus an F-diverse set need not contain an object which is clearly F and an object which is clearly not F—whereas a Sorites series begins with an object which is clearly F and ends with an object which is clearly not F.
The demand for higher and higher orders of borderline cases is just the demand for a gradual transition from the cases where the predicate clearly applies to the cases where it clearly does not apply, rather than a transition marked by a series of jolts—whether one big jolt or two smaller jolts or four even smaller jolts, and so on—and this idea of a gradual transition is captured in Closeness: as we proceed along the Sorites series for F, taking small steps in F-relevant respects as we move from one object x to the next (e.g. a difference of one hair in the case of ‘bald’, a difference of one nanometre in the case of ‘tall’, etc.), the sentence Fx takes correspondingly small steps in respect of truth. Thus, in light of the Closeness definition, the phenomenon of ‘higher-order vagueness’ may be seen as really being part of plain old vagueness—which is a good thing, given the earlier observation that ⁶⁴ For a similar view see Keefe 2000, 31. Cf. also Burns 1995, 29.
predicates which are vague but not higher-order vague (on the borderline case conception) are not vague at all (in the ordinary sense).⁶⁵ Another advantage of the Closeness definition of vagueness is, then, that it incorporates the residual intuitions which the borderline case definition leaves out, and which lead—when vagueness is defined in terms of possession of borderline cases—to the problem of higher-order vagueness. The Closeness definition thus gives us a clear understanding of the intimate relationship between vagueness and so-called higher-order vagueness: properly understood, the phenomenon of higher-order vagueness is simply part of vagueness.⁶⁶
3.5.6 Formal Properties
Last but not least among its advantages, the Closeness definition is clear and easy to grasp, and does not define vagueness in terms of other contentious concepts. The Closeness definition thus has all the features we sought in a definition of vagueness. It is clear and perspicuous, and it gets to the heart of the matter, offering a crisp account of the essence of vagueness, in terms of which we can understand why vague predicates have not only the features that figure in our informal characterizations of vagueness, but also a number of other distinctive features as well.
⁶⁵ Cf. Sainsbury 1991, who rejects the borderline case definition and characterizes vagueness in terms of what Sainsbury calls ‘boundarylessness’. He writes: ‘‘The phenomena which, from a classical viewpoint, lead to notions of ‘higher-order vagueness’ are accounted for by boundarylessness. But these phenomena are not bolt-on options; they are integral to the very nature of vagueness’’ (p. 179).
⁶⁶ Note that a predicate which satisfies Closeness is not necessarily radically higher-order vague, i.e. vagueₙ for all n (see p. 189 for further discussion of this point). Thus I am not forced to disagree (or agree) with Burgess 1990, 1998, who argues that higher-order vagueness peters out at a finite level.
Nor am I required to rebut Wright’s 1992 argument to the conclusion that higher-order vagueness is paradoxical (for replies to Wright see Edgington 1993 and Heck 1993). This is because Wright’s argument turns on a certain principle concerning the logic of the definitely operator. Higher-order vagueness has traditionally been discussed in terms of such an operator—but the Closeness definition involves no such operator, and hence, while it accommodates the intuitions which motivate the conception of ‘higher-order vagueness’ discussed in this section, it does not inherit the problems associated with this operator.
4 Accommodating Vagueness
In this chapter, I ask which of the types of theory of vagueness discussed in Chapter 2 can accommodate vague predicates—where ‘vague’ is understood in terms of the definition of Chapter 3. That is, I ask which types of theory of vagueness can allow for the existence of predicates that satisfy the Closeness definition. I show that only those theories that countenance degrees of truth can do so. More precisely, I argue that given the existence of a Sorites series for the predicate F, there is no way to accommodate the claim that F conforms to Closeness without accepting the idea that truth comes in degrees. The upshot is that we need a theory of vagueness that countenances degrees of truth—provided, of course, that vagueness is correctly defined in terms of Closeness. I provided the positive argument in favour of the latter claim in the previous chapter. In this chapter, I conclude the case for Closeness by providing negative arguments to the effect that alternative definitions which might be proposed as its replacement do not share its advantages.
4.1 Epistemicism
No matter how sophisticated her epistemology, as long as her semantics is classical, the epistemicist cannot accommodate predicates which satisfy the Closeness definition—that is, given the correctness of the Closeness definition, she cannot allow that there are any vague predicates. Suppose that we have a Sorites series x1, . . . , xn for the predicate F. If bivalence—thought of as the claim that not only are there only two truth values, but also that every declarative sentence has exactly one of them—is true, then Closeness must be violated here. Fx1 is true. x1 and x2 are very similar in F-relevant respects, so if F conforms to Closeness, then Fx1 and Fx2 must be very
[Figure 4.1. The strip according to the epistemicist. Vertical axis: truth value of ‘Point x is red’ (True, False); horizontal axis: points x on the strip (red points at left), with point k marked.]
similar in respect of truth. Given bivalence, the only way this can be the case is if Fx2 is true too. Similarly on down the series. But by the end, when we get to xn , it must be the case that Fxn is false. Thus if Closeness is to be respected, we need to abandon bivalence. Another way to make the point is to note that given bivalence, Closeness reduces to Tolerance: the only way in which two sentences can be very similar in respect of truth is by having the same truth value. So to capture Closeness without Tolerance—and hence without incoherence—we must reject bivalence. Before considering possible epistemicist responses to this objection, let us consider a concrete example. (We shall return to this example below; it will help us to focus ideas as we discuss the various different theories of vagueness.) Consider a strip of paper which is red at the left end and orange at the right end, and in between changes colour continuously from red to orange (so that points on the strip which are very close together in space are very similar in respect of colour). Now consider the sentence ‘Point x is red’ for each point x on the strip. For a point near the left end of the strip, the corresponding sentence is straightforwardly true, and for a point near the right end, the corresponding sentence is straightforwardly false. For points in the middle, the truth values of the corresponding sentences are unclear. According to the epistemicist, this unclarity is purely epistemic: every one of the sentences is either true or false. Figure 4.1 represents the situation according to the epistemicist. This flouts Closeness: the classical truth values True and False are not close (in respect of truth), and yet according to the classical semantic theory which underlies the epistemic theory of vagueness, there are points k (shown on the diagram) and k +
(a point to the right of, and arbitrarily close to, k) whose colours are very close together, while the sentence ‘Point k is red’ is True and the sentence ‘Point k + is red’ is False. This sort of flouting of Closeness is unavoidable for the epistemic theory. This objection to epistemicism is often phrased as follows: Contrary to what the epistemic theory tells us, there is no last point on the strip which is red. This is an unfortunate way of phrasing a good objection.¹ Point k is indeed the last point which is red (or else k + is the first point which is not red), according to the epistemicist. But this is not the problem with the epistemic account. On the contrary: if there were no last red point (and no first non-red point), then all the points in the series would be red—even the points at the end, which are clearly orange. This is absurd, and thus it is in fact the denial of the claim that there is a last red point or a first non-red point that is problematic. The real problem with the epistemic picture is that point k is a jump point of the characteristic function of the predicate ‘is red’ (the function which assigns to each point x on the strip the truth value of the claim ‘Point x is red’)—i.e. a point at which the value of the function suddenly jumps significantly—while k is not a location at which adjacent points on the strip have significantly different colours. Thus, the idea that if points x and y are very close in respect of colour, then the sentences ‘Point x is red’ and ‘Point y is red’ should be very close in respect of truth, is violated. The epistemicist might respond to this objection in several ways. First, she might note that I have been assuming that there are Sorites series associated with the vague predicates under discussion. If, however, there is no Sorites series for the predicate F, then my demonstration that the epistemicist account of the meaning of F violates Closeness does not go through. 
This would not get the epistemicist very far, however. For even if there are some vague predicates with no associated Sorites series, these are a special case (cf. §3.5.4). Certainly, the epistemicist cannot allow that the standard examples of vague predicates—for which there are associated Sorites series—are vague (i.e. satisfy Closeness), and this is problem enough. Second, the epistemicist might object to the use of a continuum of points in the example of the coloured strip. She might attempt to argue that in
¹ I am being charitable. Some people who say this about epistemicism may well be phrasing a bad objection accurately!
[Figure 4.2. The patches according to the epistemicist. Vertical axis: truth value of ‘Patch x is red’ (True, False); horizontal axis: patches x (red patches at left), with patch k marked.]
all actual cases, we deal with finite, or at least countable, sets of objects. Setting aside the question of whether this is plausible, it would not help the epistemicist if it were true. Consider a finite series of distinct uniformly coloured patches (rather than a continuous strip of paper), with each patch very similar in colour to the next—but not indistinguishable from it—and the first patch red and the last patch orange (Fig. 4.2). Given certain natural assumptions discussed in §3.4.3, the epistemicist can correctly claim that the characteristic function of ‘is red’ in this situation is continuous. Yet while point k is not now a point of discontinuity of the characteristic function of the predicate ‘is red’ (as it was in the case of the continuous strip), it is still a jump point —that is, a point at which the value of the function suddenly jumps significantly—while k is not a point at which adjacent patches have significantly different colours. Thus Closeness is still violated. (Recall the discussion in the second half of §3.4.3, where I explicitly took cases of this sort as reason to define vagueness in terms of Closeness rather than continuity.) The epistemicist’s next move might be to argue that her framework can accommodate vague predicates in the following way. Closeness is satisfied by constant characteristic functions: functions which map every object to one value, in this case to True or to False. However, saying that vague predicates are true (or false) of everything is absurd. One of our most basic intuitions—one which contributes crucially to making Sorites paradoxes compelling—is precisely that some things are heaps and some are not, that some things are tall and some are not, that some things are red and some are not, and so on. The current proposal retains the intuitive idea
that one grain of sand never makes the difference between a full-fledged heap and a full-fledged non-heap, but only by saying either that everything is a full-fledged heap or that nothing is—that is, at the expense of denying either the claim that one grain of sand is not enough to make a heap or the claim that 10,000 grains of sand is enough to make a heap. In any case, the proposal falls foul of the final version of the Closeness definition in §3.4.4: a predicate F is vague iff there is some F-connected, F-diverse set S of objects such that F satisfies Closeness over S. If the characteristic function of F is constant, then there are no F-diverse sets, and so F is not vague according to the Closeness definition. (Indeed, part of the point of modifying the initial version of the Closeness definition was precisely to avoid classifying predicates which are true, or false, of everything as vague.) As her next move, the epistemicist might claim that for any vague predicate F, the structure of relationships of similarity in F-relevant respects is impoverished, in the following ways. First, the ternary relation ‘x is at least as close to z as y is, in F-relevant respects’ divides any set of objects into just two subsets: the F’s and the non-F’s. The members of any one of these two subsets are indistinguishable in terms of this relation: for any two objects v and w in one of these subsets, the relation holds with v in one of its three places (and objects a and b in the other two places) if and only if it holds with w in that place (and a and b in the other two places). The relation distinguishes v and w only if v is in one subset and w is in the other. Second—and more important in the present context—x is very close to y in F-relevant respects if and only if x and y are in the same subset, i.e. if and only if x and y are both F’s, or both non-F’s. 
The latter move means that epistemicism no longer violates Closeness: although it posits a jump in a Sorites series between the last F and the first non-F—it is true of the former that it is F, and false of the latter—this is now compatible with Closeness, because these two objects are not very close in F-relevant respects. Under this proposal, whether or not something is F becomes an F-relevant respect—that is, a respect relevant to whether or not that thing is F. This is something I earlier ruled out (see Ch. 3, n. 47)—for it leads to circularity and ungroundedness in the application of the predicate F to any object. Even apart from leading to this undesirable result, however, the proposal under consideration can be seen to be untenable by direct inspection of what it says about familiar cases. Consider piles of sand, for
example. The proposal tells us that as we remove grains from a large pile of sand, we make no difference at all to the pile, in any respect relevant to the application of the predicate ‘heap’—until we remove the single grain which renders the heap a non-heap, at which point we make a big difference to the pile both in respects relevant to the application of ‘heap’ and in respect of application of the predicate ‘heap’. This is absurd. Removing a grain of sand from a heap does make a difference to it, in a respect relevant to the application of the predicate ‘heap’. It might not make it a non-heap—that is, it might not stop the predicate applying. Indeed, it might not affect the application of the predicate at all. But it does change the heap in a respect that is relevant to the application of the predicate. That, when we think about it, is undeniable. Indeed, if it was not the case, then Sorites paradoxes would never be compelling: for the paradox turns precisely on having a series of objects in which adjacent members—every pair of adjacent members—are very similar in respects relevant to the application of the predicate in question.² Next, rather than try to accommodate it, the epistemicist might claim that the Closeness definition of vagueness is (subtly) incorrect. Of course, this reply is no good by itself: she must propose an alternative definition. I have defined vagueness in terms of the following principle:
Closeness: If a and b are very similar in F-relevant respects, then ‘Fa’ and ‘Fb’ are very similar in respect of truth.
The epistemicist might claim that the true idea to be captured is somewhat different:
JA-Closeness: If a and b are very similar in F-relevant respects, then ‘Fa’ and ‘Fb’ are very similar in respect of justified assertibility (i.e. to whatever extent it is justifiable (reasonable, warranted . . . ) to assert (believe, judge . . . ) that Fa, it is to a very similar extent justifiable to assert that Fb).
² Note that I am ruling out as absurd and untenable the denial of (A): in a Sorites series, every object is very similar to its neighbour(s) in respects relevant to the application of the predicate in question. I am not ruling out as absurd and untenable the denial of (B): in a Sorites series, every object is very similar to its neighbour(s) in respect of application of the predicate in question. To say that any two objects which differ by just one grain of sand are very similar in heap-relevant respects is not to say that there is no grain of sand that takes us all the way from full-fledged heaps to full-fledged non-heaps. I am, in effect, in the process of arguing in this chapter that (B) must be true, on the grounds that given (A) (which, I have just been saying, is obviously true), denying (B) means denying that vague predicates satisfy Closeness; but I do not regard denying (B) as absurd.
Thus the epistemicist’s claim would be that the true idea to be captured by a theory of vagueness is an epistemic/pragmatic one, rather than—as the Closeness definition would have it—an alethic/metaphysical one. As things currently stand in the literature, epistemicists have not shown that their theories can accommodate predicates which satisfy JA-Closeness. This does, however, seem plausible. Williamson (1994, 244–7) argues for the following principle: If it is reasonable to believe that n grains make a heap, then it is not reasonable to believe that n − 1 grains do not make a heap. This is not yet JA-Closeness. But let us set aside the issue of how good Williamson’s arguments for this principle are, and the issue of getting from this principle to JA-Closeness (or of getting to JA-Closeness in some other way), and grant for the sake of argument that the epistemicist can accommodate JA-Closeness. The point that I wish to stress is that JA-Closeness does not yield an adequate definition of vagueness. There are at least two problems with defining vagueness in terms of JA-Closeness: this definition does not explain why vague predicates draw blurred boundaries, or why they are Sorites-susceptible. Recall that a definition of vagueness is supposed to provide a statement of the essence of vagueness, in light of which we can understand why vague predicates behave in the characteristic ways they do. Defining vagueness in terms of Closeness meets this desideratum, as we saw in the previous chapter. Defining vagueness in terms of JA-Closeness does not: a predicate which satisfied JA-Closeness would not thereby draw blurred boundaries, or generate a compelling Sorites paradox. Consider first the point about blurred boundaries. Suppose that we have a Sorites series for F, and someone asserts that Fxi and not Fxi+1 , where xi and xi+1 are adjacent items in the series. Intuitively there is a problem here. 
The epistemicist now tells us that this person has violated epistemic or pragmatic norms: she has unjustifiable beliefs, or she has asserted something which she does not believe. But the problem seems deeper: there is a problem with the very idea that there is a sharp jump between the F’s and the non-F’s, not merely with the idea that someone might believe or assert that the jump is in this or that particular place. Someone who asserts that such a jump does exist, in such-and-such place, seems not merely to have violated epistemic or pragmatic norms, but to
have violated the norm of truth. This comes out clearly if we imagine someone guessing that the jump between the F’s and the non-F’s comes between xi and xi+1. Someone who makes such a guess seems to have misunderstood the nature of vagueness just as much as someone who makes the corresponding assertion. Yet the guesser—unlike the asserter—has not violated JA-Closeness. Thus JA-Closeness does not capture the ordinary conception of vagueness. Epistemicism plus JA-Closeness yields a view on which the extensions of vague predicates are sharp in themselves, but we cannot know or justifiably believe that their borders are in this or that particular place—and so they appear blurry to us. But the problem with the above guess seems to be that it could not be true: because vague predicates have genuinely blurry boundaries—blurry in themselves—not sharp but unknowable ones. Phenomenologically, there is an enormous difference between a proposition such as ‘The jump from baldness to non-baldness comes at the four hundredth hair’ and a proposition such as ‘The least upper bound of velocities reached by polar bears on 11th January 2004 was 31.35 kilometres per hour’, which may well be true—and could certainly be guessed to be true, if one wanted—but should not be asserted or accepted, as no one could justifiably believe it to be true. Satisfaction of JA-Closeness is not, then, of the essence of vagueness. In contrast to the foregoing, Greenough (2003, 272–4) writes: ‘‘Arguably, our naïve intuitions concerning vagueness are not sophisticated enough to make the distinction between tolerance and epistemic tolerance, between lacking sharp boundaries and lacking known boundaries . . . arguably the phenomenological data merely supports a thesis of epistemic tolerance and not a thesis of boundarylessness: close inspection simply shows that there is no clear or known boundary . . . 
not that there is no boundary.’’ But if there is indeed an argument to back up Greenough’s position here, it would seem that the argument must rest on a conflation of the content of an utterance and its assertibility conditions. Suppose a witness tells us that there was no one there. This is incompatible with the claim that there was someone there. The barrister responds ‘‘Ah, you are saying that from where you were positioned you could not see anybody’’—a claim which is compatible with there having been someone there. This is clearly an insidious move: the witness said what he said because, from where he was positioned, he could not see anybody; but this is not what he said, which is that there was nobody there. Likewise in
the case of vagueness. We can all agree that the ordinary speaker says that there is no sharp jump from red to orange because she cannot see a sharp jump; but it is illegitimate to move from this to the claim that what she says is that there is no clearly visible jump from red to orange. What any ordinary speaker, when faced (say) with an unobscured, not-too-distant rainbow—and whilst wearing her glasses or contact lenses and not under the influence of drugs or alcohol, etc.—would say is not the absurdly timid ‘‘I cannot perceive a clear boundary between red and orange’’, but the stronger claim that there is no sharp boundary. This should not be controversial. If ordinary speakers made only the weaker claim, then epistemicism would not have been met with howls of disbelief from both its opponents and some of its proponents³ when it first came onto the philosophical scene, and furthermore Frege—to whom the blurred boundaries idea can be traced—would not have thought that vagueness has no place in a logically perfect language (i.e. a language with, among other things, a bivalent semantics). (A reminder about the dialectic here: I am not objecting to epistemicism on the grounds that it is counter-intuitive; I am objecting to the definition of vagueness in terms of JA-Closeness, on the grounds that it does not capture an essential feature of the ordinary conception of vagueness: namely, that vague predicates draw blurred boundaries.) Second, the point about Sorites-susceptibility. Suppose that we have a predicate F which conforms to JA-Closeness, but not Closeness. (If a predicate conforms to Closeness, then it will also conform to JA-Closeness, given some very natural assumptions. We know that a predicate which conforms to Closeness will be Sorites-susceptible. What we want to know now is whether JA-Closeness has the advantages of Closeness. Thus we need to suppose JA-Closeness by itself.) Suppose that we have a Sorites series for F. 
Will the corresponding Sorites paradox be compelling? We have no reason to think so. Given JA-Closeness, and the fact that adjacent objects in the Sorites series are very close in F-relevant respects, we know that for any object in the series, to whatever extent it is justifiable, reasonable, or warranted to assert, believe, or judge that it is F, it is to a very similar extent justifiable, reasonable, or warranted to assert, believe, or judge that
³ See e.g. Williamson 1994, p. xi: ‘‘This book originated in my attempts to refute its main thesis. . . . For years I took this epistemic view of vagueness to be obviously false, as most philosophers do.’’
the next object is F. But we have no reason at all to conclude from this that our Sorites premiss is true, i.e. that for any object in the series, if it is F, then so is the next object. Whether or not this is compelling depends crucially on why we think JA-Closeness holds for F. If it holds because F satisfies Closeness, then the Sorites premiss does become compelling, as we have seen—but as we have also seen, this is of no help to the epistemicist in the present context. If, on the other hand, F satisfies JA-Closeness because F works in the way in which the epistemicist thinks vague predicates work—i.e. it draws sharp but unknowable boundaries—then obviously we have no reason at all to accept the Sorites premiss, and so the Sorites argument will not be compelling in the slightest. But this shows that JA-Closeness—by itself—yields no account of Sorites-susceptibility. If we define vague predicates in terms of JA-Closeness, we are left with no understanding of why they generate Sorites paradoxes. What about the epistemicist’s proposed solution of the Sorites paradox (§2.1)? The idea was that the paradox is mistaken because the inductive premiss is false—there is a sharp cut-off between the F’s and the non-F’s—and that we are nevertheless taken in by the paradox because our ignorance of where the cut-off is makes us think that there is no cut-off at all: that is why we are inclined to accept the inductive premiss, even though, according to the epistemicist, it is false. We can see now that this solution is not robust, in the following sense: it provides an account of how the paradox could be compelling and yet mistaken which is compatible with the truth of epistemicism; but it does not provide an account of how the paradox could be compelling to someone who had come to accept epistemicism. 
If we become epistemicists, we should no longer find the paradox compelling at all.⁴ It is therefore part of the epistemicist’s account that we were taken in by the paradox only because we did not realize how vague predicates work. We must, then, have thought they worked differently from the way the epistemicist says they work. So how did we think they work? Well, not according to JA-Closeness: we did not think that JA-Closeness was the defining feature of vague predicates, because if we ⁴ If we define vagueness in terms of Closeness, on the other hand, we have a robust explanation of Sorites-susceptibility: for we saw that accepting Closeness licenses being taken in by the paradox (and also finding it ultimately unconvincing). I think this gets the phenomenology of the Sorites paradoxes right: their pull is more resilient than the epistemicist can allow. Thanks to Mark Colyvan for helpful discussion here.
had thought that, we would not—as we have just seen—have found the Sorites paradox compelling. This brings us back to the point that the definition of vagueness in terms of JA-Closeness does not capture a second essential feature of the ordinary conception of vagueness: namely, that vague predicates generate compelling Sorites paradoxes. Finally, in light of the problems with the JA-Closeness definition of vagueness, the epistemicist might try to repackage her view as an error theory of vagueness. The idea would be that the correct way to define vagueness is indeed in terms of Closeness; but in fact there are no predicates which satisfy Closeness—i.e. there are no vague predicates. There are only predicates which satisfy JA-Closeness, and of which the epistemicist theory is correct—i.e. predicates which draw sharp but unknowable boundaries. Thus the view is not that ‘tall’, ‘bald’, etc. are vague and work the way the epistemicist says they do—vagueness being characterized not by Closeness but in some other way. That line of approach has just been discussed, and rejected. The current approach is to agree that vagueness is correctly defined in terms of Closeness, and yet to maintain that ‘tall’, ‘bald’, etc. do not satisfy Closeness—for they work the way the epistemicist says they do—and hence are not vague. Why would anyone adhere to such a view? The view that Closeness is of the essence of vagueness is a result, in part, of reflection on intuitions about canonical examples of vague predicates, such as ‘tall’ and ‘bald’. Turning around at the end of such reflection and saying that these canonical examples are in fact not vague at all—indeed, that no predicates are vague—thus leaves one in an uneasy position. 
The only legitimate reason for believing such a view would be that the alternatives—the theories of vagueness which do accommodate Closeness—are no good.⁵ For example, one might have general reasons for thinking that no non-classical semantic theory is acceptable—and to the extent that these reasons were good ones, this would mitigate the unease of the error theory under consideration. The key point is that one could not adequately defend this sort of error theory without attacking its rivals: for if there is an alternative view which is quite acceptable in itself and which allows us to accommodate both the view that vagueness is correctly defined in terms of Closeness and the view that apparently canonical examples of vague predicates really are vague, then this will, by default, be more attractive than a view which accepts that vagueness is correctly defined in terms of Closeness, but denies that apparently canonical examples of vague predicates are vague. The way to argue against an epistemicist error theory of the sort under consideration is, then, to argue in favour of an alternative theory of vagueness which does accommodate Closeness. I turn to the task of defending such an alternative theory in Part III.

⁵ Recall the discussion in §3.1 of Williamson’s 1994 dialectical strategy.
4.2 Additional Truth Values and Truth Gaps

If one sentence is True and another False, then they are as far apart as can be in respect of truth—and furthermore, they are in an absolute sense very far apart in respect of truth.⁶ Given that Truth and Falsity are poles apart in this way, no third truth status can be very close to both of them. For if one thing is very similar to each of two other things in some respect, then those two things must at the very least be reasonably similar to one another in that respect—yet Truth and Falsity are not similar at all in respect of truth. Thus, to the extent that a sentence is very close to True, it is not very close to False, and vice versa. This means that ‘third status’ views—views which posit truth value gaps, or a third truth value—cannot accommodate predicates which satisfy Closeness (and for which there exists a Sorites series). For on these sorts of views, as we move along our Sorites series x₁, . . . , xₙ for the predicate F, there will come a point at which Fxᵢ is true and Fxᵢ₊₁ has the third status, and another point at which Fxⱼ has the third status and Fxⱼ₊₁ is false. Every adjacent pair of objects in the Sorites series is very close in F-relevant respects, but as we have seen, a sentence with the third status cannot be very close in respect of truth both to a true sentence and to a false sentence, and so Closeness must be violated at at least one of these two points (i.e. xᵢ/xᵢ₊₁ and xⱼ/xⱼ₊₁). (Given the natural assumption that the third status is symmetric with respect to truth and falsity, Closeness will be violated at both points.)

⁶ These are separate points: two cats in opposite corners of a small room are as far apart as can be, but they are not very far apart.

Consider our strip of paper again (recall Fig. 4.1 and the surrounding discussion). Figure 4.3 represents the situation according to the truth gap view, and Figure 4.4 represents the situation according to the third value view. According to these views, point l₁ is the last point which is clearly red, and point l₂ is the first point which is clearly non-red. But, as in the case of the epistemicist, this is not the problem. The real problem is just as before. Points l₁ and/or l₂ are jump points of the characteristic function of the predicate ‘is red’ (the function which assigns to each point x on the strip the truth value of the claim ‘Point x is red’)—that is, points at which the value of the function jumps significantly—while they are not locations at which adjacent points have significantly different colours. Thus, the idea that if points x and y are very close in respect of colour, then the sentences ‘Point x is red’ and ‘Point y is red’ should be very close in respect of truth, is violated at one or both of l₁ and l₂.

Figure 4.3. The strip according to the gap view. [Vertical axis: truth value of ‘Point x is red’ (True, False); horizontal axis: points x on the strip (red points at left), with l₁ and l₂ marked.]

Figure 4.4. The strip according to the third value view. [Vertical axis: truth value of ‘Point x is red’ (True, ∗, False); horizontal axis: points x on the strip (red points at left), with l₁ and l₂ marked.]
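The structural point about jump points can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the numeric encoding of the three statuses (1, 1/2, 0), the tolerance of 1/100 for ‘very close in respect of truth’, and the helper names are all introduced here and are not part of the text; a truth gap in particular is not literally a number.

```python
from fractions import Fraction

# Hypothetical numeric encoding of the three statuses, for illustration only.
TRUE, THIRD, FALSE = Fraction(1), Fraction(1, 2), Fraction(0)

def jump_points(values, tolerance):
    """Indices i at which values[i] and values[i + 1] differ by more than tolerance."""
    return [i for i in range(len(values) - 1)
            if abs(values[i] - values[i + 1]) > tolerance]

def three_status_series(n, last_true, first_false):
    """Statuses along a series of n objects: true up to last_true,
    false from first_false onwards, third status in between."""
    return [TRUE if i <= last_true else FALSE if i >= first_false else THIRD
            for i in range(n)]

series = three_status_series(n=10, last_true=3, first_false=7)
print(jump_points(series, tolerance=Fraction(1, 100)))  # [3, 6]
```

On this toy encoding the two reported indices play the role of l₁ and l₂: they are the places where the assigned status changes abruptly even though neighbouring objects are stipulated to be very similar.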
Figure 4.5. The strip according to the fuzzy view. [Vertical axis: truth value of ‘Point x is red’ (from 0 to 1); horizontal axis: points x on the strip (red points at left), with m marked.]
Can the fuzzy theory accommodate Closeness? Yes. The fuzzy theory takes as its set of degrees of truth the real numbers between 0 and 1 inclusive. Now there are certainly pairs x and y of reals in [0, 1] such that if sentence S’s truth value is x and sentence T’s truth value is y, then S and T are very similar in respect of truth (e.g., let |x − y| = 0.01). Thus, the fuzzy framework can accommodate Closeness: it has a sufficiently rich structure of truth values to allow arbitrarily small steps in F-relevant respects to correspond to arbitrarily small steps in truth. Consider our strip of paper again. Figure 4.5 represents the situation according to the fuzzy view. If we are accustomed to phrasing what was in fact a good objection to the epistemic, truth gap, and third value accounts of vagueness in a misleading way—i.e. in terms of last points rather than jump points—then we might think that this objection applies also to the fuzzy account of vagueness. For point m is indeed the last point which is definitely red, according to the fuzzy view. But this is not a problem—it never was. The true problem was that in previous cases, points k, and l₁ and/or l₂, were jump points of the function which assigns truth values to sentences concerning points on the strip—i.e. points at which the value of the function jumps significantly—and in the present case, m is not such a jump point. (The value of the function does change at m—but the change is gradual.) The existence of a last red point does not, in and of itself, necessitate a violation of Closeness.⁷

⁷ Some might still think that the existence of a last red point is a problem for the fuzzy view. This objection will be discussed in §6.2.
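The contrast between a last point and a jump point can be made concrete with a short Python sketch. The particular membership function below (fully true up to m = 2/5, falling linearly to 0 at 3/5) and the sampling of the strip at 1,001 points are assumptions chosen purely for illustration; nothing in the text fixes these values.

```python
from fractions import Fraction

def fuzzy_red(x, m=Fraction(2, 5), end=Fraction(3, 5)):
    """Degree of truth of 'Point x is red' for a position x in [0, 1].

    Hypothetical shape: fully true up to m, then falling linearly to 0 at end.
    """
    if x <= m:
        return Fraction(1)
    if x >= end:
        return Fraction(0)
    return (end - x) / (end - m)

def largest_step(truth, samples):
    """Largest change in truth value between neighbouring sample points."""
    points = [Fraction(i, samples - 1) for i in range(samples)]
    return max(abs(truth(a) - truth(b)) for a, b in zip(points, points[1:]))

print(fuzzy_red(Fraction(2, 5)))       # 1: m is the last fully red point
print(largest_step(fuzzy_red, 1001))   # 1/200
```

Here m is a last point of full truth, yet no neighbouring pair of sampled points differs by more than 1/200 in degree of truth, so m is not a jump point in the sense at issue.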
Figure 4.6. The patches according to the fuzzy view. [Vertical axis: truth value of ‘Patch x is red’ (ticks include 1 − ∆ and 0); horizontal axis: patches x (red patches at left), with m and m′ marked.]
So, if we have two or three truth values, we cannot accommodate Closeness, while if we have a continuum of degrees of truth, we can. A continuum of degrees of truth is not, however, necessary for accommodating Closeness. A large finite number of truth values would be sufficient.⁸ For suppose that we replace the continuous strip of paper depicted in Figure 4.5 with a discrete series of coloured patches, as in Figure 4.6. Patches m and m′ are very similar in respect of colour. One of the advantages of the fuzzy approach is that it can allow that ‘m is red’ and ‘m′ is red’ are very similar in respect of truth. The truth value of ‘m is red’ is 1, and the truth value of ‘m′ is red’ is slightly less: 1 − ∆. But whether or not two sentences are very close in respect of truth is not meant to depend upon what they say: it is meant to be a function of how true they are. This means that any two sentences whose degrees of truth are within ∆ of each other are very close in respect of truth.⁹ And this means that an account on which our set of truth values is the large but finite set {0, ∆, 2∆, . . . , 1 − 2∆, 1 − ∆, 1} can accommodate Closeness.¹⁰ Figure 4.7 represents our strip of paper according to this account. On this view, for any two points x and y on the paper which are very close in respect of colour, the sentences ‘x is red’ and ‘y is red’ differ in degree of truth by at most ∆, and are thus—we have already agreed—very similar in respect of truth. Closeness is thus accommodated.

⁸ This is why I said in Ch. 3, n. 66, that a predicate which satisfies Closeness is not necessarily radically higher-order vague.

⁹ Strictly speaking, we could say that S and T are very close in respect of truth if S’s truth value is 1 and T’s is 1 − ∆, but not if S’s truth value is, say, 0.5 and T’s is 0.5 − ∆. I can, however, see no motivation for such a view.

¹⁰ Given the way the case was set up, we may assume without loss of generality that ∆ divides evenly into 1.

Figure 4.7. The strip according to the large finite degrees of truth view. [Vertical axis: truth value of ‘Point x is red’ (from 0 to 1); horizontal axis: points x on the strip (red points at left).]

I do not know exactly how many degrees of truth we need in order to accommodate Closeness. The point is simply that we need a significant number of them. In our Sorites series x₁, . . . , xₙ for the predicate F, we need it to be the case that Fx₁ is True and Fxₙ is False, and that Fxᵢ and Fxᵢ₊₁ are always very similar in respect of truth. This can happen only if there are gradations or degrees of truth in between Truth and Falsity, such that two sentences can have different gradations of truth and yet it still be the case that the two sentences are very similar in respect of truth. As far as accommodating Closeness goes, we might have a large finite number of degrees of truth (as discussed at the beginning of §2.2.1), or we might have continuum-many degrees of truth (as in the fuzzy picture).¹¹

¹¹ Recall from Ch. 2, n. 40, however, that we do not want to take the rationals between 0 and 1 as our set of truth values.

One might think that the fuzzy approach has advantages over the large finite approach. First, the fuzzy approach allows predicates such as ‘is red’ to have continuous characteristic functions over continuously varying domains such as our strip of paper, whereas the finite approach does not make room for this possibility. However, while one might feel that this is nice—I do—it is not really clear why it should be thought desirable. Certainly, the nature of vagueness does not require that vague predicates have continuous characteristic functions over continuously varying domains. What Closeness requires is that the characteristic functions have no jump points, not that they have no points of discontinuity. Second, there might seem to be a problem of arbitrariness. On the finite approach, we need to suppose that there is some specific number n of truth values. Suppose, for the sake of argument, that n is 1,000. Well then, why not 1,001 or 999? Why, for that matter, not 100 or 1,000,000? It might seem as though there could not be any principled reason why the number of truth values should be n rather than some other number. In fact, however, I think there is a principled choice of n. I am assuming that we are trying to choose one member of the family of sets of truth values of the form {0, 1/(n−1), . . . , (n−2)/(n−1), 1} (see the beginning of §2.2.1): so we just need a reason to fix on some particular value of n.¹² We have seen that some n are too small: for example, three truth values is not enough to accommodate Closeness. We have also seen that there are some n that are large enough. Given that the numbers we are referring to here are natural numbers, and that the relation ‘very close in respect of truth’ is precise (see p. 145), it follows that there is a least n which is large enough. This is a non-arbitrary choice: it yields a set of truth values which is large enough to accommodate Closeness, and no larger.¹³
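The existence of a least adequate n can be sketched computationally. The sketch assumes, purely for illustration, that ‘very close in respect of truth’ is modelled as differing by at most some fixed threshold; the threshold of 1/100 and the function names are hypothetical and are not values given in the text.

```python
from fractions import Fraction
from itertools import count

def truth_values(n):
    """The n evenly spaced truth values {0, 1/(n-1), ..., (n-2)/(n-1), 1}."""
    return [Fraction(k, n - 1) for k in range(n)]

def steps_are_close(n, threshold):
    """True when neighbouring truth values in the n-membered set differ
    by at most threshold (a stand-in for 'very close in respect of truth')."""
    values = truth_values(n)
    return all(b - a <= threshold for a, b in zip(values, values[1:]))

def least_n(threshold):
    """Smallest n >= 2 whose truth value set has close neighbouring values."""
    return next(n for n in count(2) if steps_are_close(n, threshold))

print(steps_are_close(3, Fraction(1, 100)))  # False: three values are too few
print(least_n(Fraction(1, 100)))             # 101
```

Because the candidates are natural numbers and the closeness test is precise, the search is guaranteed to stop at a unique least n once any adequate n exists; which n that is depends entirely on the assumed threshold.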
¹² A little while ago I mentioned a set of truth values of the form {0, ∆, 2∆, . . . , 1 − 2∆, 1 − ∆, 1}. This fits the present pattern: ∆ = 1/(n−1).

¹³ Of course I do not know what the relevant number n is, but that was not the problem. The problem was supposed to be that any choice of number of truth values is arbitrary. I have just argued that there is one choice of number that is not arbitrary. It is irrelevant in this context that I do not know which number it is.

4.3 Supervaluationism

The (standard) supervaluationist view cannot accommodate vagueness (defined in terms of Closeness). My comments at the beginning of §4.2 about truth gaps and third truth values made no reference to whether the truth values/gaps of compound sentences were assigned recursively, or according to the supervaluationist method: those comments apply equally to supervaluationist and recursive versions of truth gap and three-valued views. Furthermore, the comments apply to any third truth status, whether it is a truth gap, a third value, or a truth glut—thus subvaluationism is in the same boat as supervaluationism here.

Nevertheless, the supervaluationist view uses more machinery than the recursive view—in particular, the admissible classical extensions of the intended partial (or three-valued) interpretation—and it might be thought that these resources will somehow allow the supervaluationist to accommodate Closeness in a way that recursive views cannot. For example, someone might think that even if ‘Bill is bald’ is true in the partial (or three-valued) base interpretation and ‘Ben is bald’ lacks a truth value (or has the value ∗), nevertheless these two sentences are close in respect of truth, because they are classically true/false on almost exactly the same admissible extensions of the base interpretation. However, this confuses similarity in respects relevant to truth with similarity in respect of truth. Assuming the supervaluationist framework, if two sentences are classically true/false on almost exactly the same admissible extensions of the intended base interpretation, then they are very similar in respects relevant to whether a sentence is true (i.e. true simpliciter, on the base interpretation); but if one sentence is true and the other lacks a truth value (or has the value ∗), then they are not very similar in respect of truth. The admissible extensions are used to determine the truth values/gaps of sentences on the unique intended base interpretation. Those truth values/gaps themselves then represent the complete semantic story of those sentences. Similarity of truth values on the admissible extensions thus makes two sentences similar in the respects that determine truth; but if one of the sentences is, in the final analysis, true, while the other lacks a truth value (or has the value ∗), then—as discussed at the beginning of §4.2—the sentences are not similar in respect of truth.¹⁴ Claiming otherwise would be analogous to claiming that in 2001, Al Gore was very similar to George Bush in respect of being President of the USA. This is false, because Bush was (definitely, totally) President then, and Gore was not (at all, in any way).
What is true is that Gore had been very similar to Bush in the respects that determine who is President.¹⁵

¹⁴ Here, and elsewhere, for ease of exposition I assume symmetry of the third status with respect to truth and falsity (see p. 186). More cautiously, the point is that a sentence with the third status cannot be very similar in respect of truth both to true and to false sentences.

¹⁵ The above is the situation regarding sentences whose truth values are given by the supervaluation. If atomic sentences get their truth values directly from the base model, not from the supervaluation (see p. 78), then even this much isn’t true. If ‘Bill is bald’ and ‘Ben is bald’ get their truth values from the base model, and ‘Bill is bald’ is true and ‘Ben is bald’ lacks a truth value, then even if these sentences are classically true/false on almost exactly the same admissible extensions of the base interpretation, they are still not similar even in respects relevant to truth, let alone in respect of truth.

I have said that the problem of ‘higher-order vagueness’ divides into two problems: the location problem and the jolt problem. The latter is the topic of the present chapter. Supervaluationists have proposed treatments of ‘higher-order vagueness’. Can these help with the jolt problem? No: as we shall now see, they do not help at all.

The basic supervaluationist claim is that a sentence is true/false (i.e. in the intended base interpretation) just in case it is (classically) true/false in all the admissible classical extensions of that base interpretation. This leads to a situation whereby in a Sorites series for a vague predicate F, there are objects x in the middle of the series such that Fx is neither true nor false, and—the important point in the present context—there is a semantic jolt as we progress down the series from an object a such that Fa is true to an adjacent object b such that Fb is neither true nor false. Thus we have two objects a and b which—being adjacent in a Sorites series for F—are very close in F-relevant respects, and yet Fa and Fb are not very close in respect of truth. Now the fundamental idea behind all the supervaluationist treatments of higher-order vagueness is to assert that there are borderline cases of admissible extensions. Here is how this would seem to help with our present problem:

There could thus be a borderline case admissible specification.¹⁶ Sentences that are false in such a specification but true in all definitely admissible specifications will be borderline cases of ‘true on all complete and admissible specifications’; and ‘“p” is true on all complete and admissible specifications’ will then be indeterminate. The truth-conditions provided by supervaluationism will not then definitely determine any truth-value for p or definitely determine that the sentence lacks a truth-value. For the indeterminacy over whether the truth-condition is fulfilled implies that we should not conclude that p is true, but nor should we call p neither true nor false as we would if the condition (determinately) failed to be fulfilled. The truth-value status of p (whether it is true, false or lacks a value) remains unsettled.
And sharp boundaries between the true predications and the borderline cases of ‘tall’ are avoided. . . . A sorites series can be described as starting with true predications, having borderline ones in the middle, and false ones at the end but being such that there are no sharp boundaries between those categories. (Keefe 2000, 203)
¹⁶ Keefe uses ‘specification’ where I use ‘extension’.

In fact, however, when we begin to spell out the basic idea (that there are borderline admissible extensions) in any detail, we see that it fails to solve the jolt problem. There are two main ways to spell out the idea.

One way, advocated by Keefe (2000, ch. 8, §I), is to keep the standard supervaluationist story just as it was originally told, but then to say that certain predicates in the language of the story (i.e. the supervaluationist’s metalanguage)—most notably ‘admissible’—are vague, and are to be treated in accordance with the supervaluationist story itself. So we tell a semantic story, and then say that this story is to be applied to some of the language used to tell the story. Keefe notes—and attempts to address—the worry that this procedure is circular and/or uninformative. I think—Keefe’s arguments notwithstanding—that the procedure is problematically uninformative. But my reasons for thinking this are precisely analogous to those given in §6.2.1 for rejecting the idea of presenting fuzzy semantics in a fuzzy metalanguage, and I will not duplicate these reasons here.

The second way of spelling out the idea that there is not just one, unique set of admissible extensions is due to Fine (1997 [1975], §5).¹⁷ Instead of keeping the simple supervaluationist semantic picture the same, but trying to apply it to parts of itself, we make the story more complex, but tell it just once (without iteration). The base interpretation remains exactly as before, but we replace the admissible extensions of the base interpretation—which previously were classical models—with what Fine calls ‘ω-order boundaries’. Then, just as before, we say that a wf is true/false in the base interpretation if it is true/false in all admissible ω-order boundaries. So we need now to say what an ω-order boundary is, and under what conditions a wf is true/false in one. An ω-order boundary is an infinite sequence s₀ s₁ s₂ . . . , where s₀ is a classical model, s₁ is a set of classical models, s₂ is a set of sets of classical models, and so on, and with sᵢ ∈ sᵢ₊₁ for each i. The idea is this. In place of a set of admissible classical extensions of our base interpretation, we now have a set of admissible boundaries, where each boundary itself makes a ruling as to which extensions are admissible, and where, in general, these rulings differ from boundary to boundary.
A boundary says first—in its first term s₀, which is a classical model—‘This is the way I precisify the language (whose intended interpretation is given by the base model)’. But the boundary also admits that this way of precisifying the language is not the only admissible one, and so it then says—in its second term s₁, which is a set of classical models—‘These are the ways of precisifying which I regard as admissible’. But the boundary also admits that its ruling as to what is admissible is not the only admissible ruling, and so it then says—in its third term s₂, which is a set of sets of classical models—‘These are the ways of ruling which ways of precisifying are admissible which I regard as admissible’. And so on. The requirement that each sᵢ ∈ sᵢ₊₁ is then the requirement that the boundary is internally consistent: it must regard its lower-level rulings as admissible by its own higher-level lights. We can use the latter idea to define a relation R which holds between boundaries s and t when s regards all t’s lower-level rulings as admissible by s’s higher-level lights. That is, for boundaries s = s₀ s₁ s₂ . . . and t = t₀ t₁ t₂ . . . , sRt iff tᵢ ∈ sᵢ₊₁ for each i.

¹⁷ Fine’s approach was later taken up by Williamson 1994, §5.6; 1999; 2003, 698–9. Williamson 1994, 160–1 presents reasons for thinking that this second approach needs to be supplemented by the idea that supervaluationism ‘‘must conduct its business in a vague metalanguage’’ (p. 161), but the latter idea—which, as I have already said, I reject—is an optional addition to the second approach, rather than an inherent part of it.

Now that we know what a boundary is, we can give the conditions under which a wf is true at a boundary. Wfs which do not contain ‘definitely’ (on which more in a moment) are true/false at a boundary just in case they are (classically) true/false at s₀ (which is a classical model). Syntactically, ‘definitely’, or D, is a one-place sentential connective: if A is a wf, then so is DA. Semantically, DA is true at a boundary s just in case A is true at every boundary which s regards as admissible—that is, at every boundary t such that sRt. If we consider just the admissible boundaries (without the base model), what we have here is a frame semantics, with the points in the frame (the analogues of the possible worlds in the frame semantics for modal logic) being boundaries, the frame relation (the analogue of the accessibility relation in the frame semantics for modal logic) being the admitting relation R defined above, and D (the analogue of the necessity operator in modal logic) functioning as a universal quantifier over points (in the sense that DA is true at a point x just in case A is true at all points which x can see).
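The truth conditions for D can be tried out on a toy frame in Python. This is a heavily simplified sketch, not a rendering of the full construction: boundaries are truncated after their second term (so each is just a pair s₀, s₁), a classical model is encoded as a cutoff k such that F holds of object i iff i < k, and the three-boundary frame is invented for illustration. All of these are assumptions introduced here.

```python
from typing import Callable, FrozenSet, List, NamedTuple

class Boundary(NamedTuple):
    s0: int             # classical model, encoded as a cutoff k: F(x_i) iff i < k
    s1: FrozenSet[int]  # classical models this boundary regards as admissible

Formula = Callable[[Boundary, List[Boundary]], bool]

def is_boundary(b: Boundary) -> bool:
    """Internal consistency at the truncated level: s0 must belong to s1."""
    return b.s0 in b.s1

def admits(s: Boundary, t: Boundary) -> bool:
    """Truncated version of sRt: t's first term is admissible by s's lights."""
    return t.s0 in s.s1

def F(i: int) -> Formula:
    """Atomic formula 'F x_i', evaluated classically at the first term."""
    return lambda b, frame: i < b.s0

def D(a: Formula) -> Formula:
    """'Definitely a': a holds at every boundary in the frame that b admits."""
    return lambda b, frame: all(a(t, frame) for t in frame if admits(b, t))

# Hypothetical frame of three boundaries.
frame = [Boundary(3, frozenset({3, 4})),
         Boundary(4, frozenset({3, 4, 5})),
         Boundary(5, frozenset({4, 5}))]

A = F(3)
print([A(b, frame) for b in frame])        # [False, True, True]
print([D(A)(b, frame) for b in frame])     # [False, False, True]
print([D(D(A))(b, frame) for b in frame])  # [False, False, False]
```

In this toy frame every boundary admits itself, but the first boundary does not admit the third even though it admits the second and the second admits the third; accordingly DA holds at the third boundary while DDA does not.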
Given the definitions of R and of a boundary, R has to be reflexive. Thus the analogue of the modal T axiom holds: DA → A will be true at every boundary, and hence true in the base model. However, nothing has been said to ensure that R will be transitive, and so the analogue of the modal S4 axiom will not hold: in general, DA → DDA will not be true at every boundary, and hence will not be true in the base model. Moreover, the analogue of the modal S5 axiom also will not hold: in general, ¬DA → D¬DA will not be true at every boundary, and hence will not be true in the base model. This is supposed to allow for higher-order vagueness. The thought is that first-order vagueness consists in the fact that A is neither true nor false; second-order vagueness consists in the fact that DA is neither true nor false; third-order vagueness consists in the fact that DDA is neither true nor false; and so on. If R were transitive and symmetric, as well as reflexive, then the logic of D would be S5, and while it might be the case that A was true at some points and not others, in this case DA, DDA, and so on would be false at all points. However, given that R is not transitive, it might be the case that not only A, but also DA, DDA, and so on are true at some points and false at others.

There is a deep conceptual problem with this proposal. The relative admissibility relation R between boundaries is supposed to model the idea that each precisification makes a ruling as to which other precisifications are admissible. But the background set on which R is defined—the set of points in the frame, the set of boundaries—is already the set of admissible precisifications. We cannot allow every boundary into our frame. That is, we cannot allow every classical model to be the first level of some boundary in our frame. For if we did, then every D-free compound formula which was not a classical logical truth or logical falsehood would come out as neither true nor false in the base model—including, for example, statements expressing penumbral connections. So we must restrict the classical models which can be the first level of a boundary in our frame to the admissible extensions. But then it is entirely irrelevant if a boundary comes along later and tells us that it does not regard one of these models as admissible. ‘So what?’, we should reply: ‘What does your opinion matter?’ For the set of all classical models which figure as the first level of some boundary in our frame simply is the set of admissible extensions, and whether or not some boundary agrees is beside the point.
But of course, the set of all classical models which figure as the first level of some boundary in our frame is a unique, precise set—and so the machinery of boundaries has ultimately taken us no way towards implementing the initial idea that there should be borderline admissible extensions.

Even setting this problem aside, however, the current proposal does not help with the jolt problem. The problem was that there are adjacent objects a and b in the Sorites series for F such that Fa is true and Fb is neither true nor false (or has the value ∗). We are now told that this is in fact all right, because while DFa is true, it is not the case that DFb is false: rather, DFb is neither true nor false. But this does not help at all with the problem that Fa and Fb are not very close in respect of truth—because it does not alter the truth values of Fa and Fb. On the modified supervaluationist proposal, it is still the case that Fa is true and Fb is neither true nor false (or has the value ∗)—and so it is still the case that these two sentences are not very close in respect of truth, even though a and b are very close in F-relevant respects. The modified supervaluationist approach is simply on entirely the wrong track (as far as solving the jolt problem goes). For we cannot render two sentences close in respect of truth by changing the truth values which we assign to other sentences. Distributing our original three truth values to more sentences (ones containing ‘definitely’) does nothing; what we need to do is countenance more truth values, and distribute them to our original sentences (Fa, Fb, and so on).¹⁸ That, of course, is exactly what the degree-theoretic form of supervaluationism does. For reasons that will now be obvious, the form of supervaluationism where the base interpretation is fuzzy and we have a measure over the set of admissible classical precisifications of the base interpretation can accommodate vagueness (defined in terms of Closeness)—that is, it does solve the jolt problem. On this sort of view, there need be no point in the Sorites series for F at which we encounter adjacent objects a and b such that Fa and Fb are not very close in respect of truth.
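The difference between the standard and the degree-theoretic forms of supervaluationism can be illustrated with a small Python sketch. It assumes, as one natural reading rather than a definition taken from the text, that the degree of truth of Fx is the measure of the set of admissible precisifications on which Fx is classically true. The encoding of precisifications as cutoffs, the uniform measure, and the series of 101 objects are all hypothetical.

```python
from fractions import Fraction

def supervaluation(i, cutoffs):
    """Standard verdict on 'F x_i': true (false) iff true (false) on every
    admissible precisification, where a cutoff k makes F x_i true iff i < k."""
    verdicts = {i < k for k in cutoffs}
    if verdicts == {True}:
        return "true"
    if verdicts == {False}:
        return "false"
    return "neither"

def degree(i, measure):
    """Measure of the set of precisifications on which 'F x_i' is true."""
    return sum((w for k, w in measure.items() if i < k), Fraction(0))

cutoffs = range(1, 101)                            # hypothetical admissible cutoffs
measure = {k: Fraction(1, 100) for k in cutoffs}   # hypothetical uniform measure

print(supervaluation(0, cutoffs), supervaluation(1, cutoffs))  # true neither
print(degree(0, measure), degree(1, measure))                  # 1 99/100
print(max(degree(i, measure) - degree(i + 1, measure) for i in range(100)))  # 1/100
```

On the standard verdicts there is a jolt between the first two objects, whereas on the measured degrees no adjacent pair in the series differs by more than 1/100.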
¹⁸ Similar comments apply to Fine’s other proposal for handling higher-order vagueness in the supervaluationist framework, which is to distribute the three supervaluationist truth values not to sentences containing the new operator ‘definitely’, but to sentences containing new truth predicates (Fine 1997 [1975], 146–7).

4.4 Plurivaluationism

I have said that the problem of ‘higher-order vagueness’ divides into two problems: the location problem and the jolt problem. I have also argued that epistemicism—based as it is on the view that each vague discourse has a unique intended classical interpretation—can solve neither of these problems. The location problem—how could our practice with vague terms single out one classical interpretation as uniquely correct?—was discussed in Chapter 2. The jolt problem was discussed in §4.1: the epistemicist must posit a sharp jump in a Sorites series between some persons a and b, who are adjacent in the series and thus very similar in F-relevant respects, by deeming Fa True and Fb False—and thus cannot accommodate vagueness. I argued in Chapter 2 that plurivaluationism—the view that each vague discourse has many acceptable classical interpretations, rather than a unique intended interpretation—solves the location problem. By moving to a multiplicity of interpretations, we solve the problem of how our practice could single out a unique interpretation as correct—by denying that it does so. However, plurivaluationism does not solve the jolt problem—it does not make room for vague predicates, i.e. predicates which satisfy Closeness. Any acceptable interpretation of our language must make F true of the things at the beginning of the Sorites series for F, and false of the things at the end. But (on the plurivaluationist view) every acceptable interpretation is classical. Thus, on every acceptable interpretation, F will violate Closeness (for reasons seen in §4.1). Now multiplying interpretations relative to each of which Closeness is violated goes no way at all towards creating a situation in which Closeness is accommodated. If Closeness is violated on every acceptable interpretation, then—on the plurivaluationist picture—it is violated everywhere. There are only the acceptable interpretations, and so there simply is nowhere else for Closeness to be accommodated.

In response, a plurivaluationist might argue as follows. For any adjacent persons Bill and Ben in the Sorites series for ‘is tall’, ‘Bill is tall’ and ‘Ben is tall’ are very similar in respect of truth—because they are classically true/false on almost (if not exactly) the same acceptable interpretations (mutatis mutandis for other vague predicates). The supervaluationist tried a similar move in §4.3, and there we pointed out that she was mistaking similarity in respects relevant to truth for similarity in respect of truth. The plurivaluationist is in an even worse position.
In the plurivaluationist framework (as opposed to the supervaluationist framework), the fact just pointed out—that ‘Bill is tall’ and ‘Ben is tall’ are classically true/false on almost exactly the same acceptable interpretations—does not even make these sentences similar in respects relevant to truth, let alone similar in respect of truth. For similarity at this level, in the plurivaluationist framework, is similarity in a semantically irrelevant respect. I pointed out in Chapter 2 that plurivaluationism, unlike supervaluationism, has no further semantic machinery, besides its classical interpretations. We can talk about what is going on in all of them, but this is merely a convenient way of speaking: we are not describing a further level of semantic fact. The semantic facts are exhausted by what is happening in the classical interpretations. Closeness is violated in each of them. Multiplying the classical interpretations does not bring back Closeness at a higher semantic level, because (in contrast with supervaluationism) there is no such higher semantic level in the plurivaluationist view.

We can put the current objection to plurivaluationism the other way around. A constraint on any correct interpretation of our vague language—a condition any interpretation must satisfy if it is to be an acceptable interpretation—is that vague predicates do indeed come out vague on that interpretation (i.e. they satisfy Closeness on that interpretation). Thus no classical interpretation can be an acceptable interpretation of vague discourse, because every classical interpretation renders that discourse nonvague. So the plurivaluationist ends up with no acceptable interpretations, and hence a view according to which vague discourse means nothing at all. Put this way, the objection to plurivaluationism is similar to an argument of Fodor and Lepore (1996).¹⁹ However, Fodor and Lepore direct their argument against supervaluationism, not plurivaluationism—and in fact supervaluationism is not prey to this objection.²⁰ The crucial difference here between supervaluationism and plurivaluationism concerns the different roles they give to the classical models they both countenance. The supervaluationist says that our practice fixes a unique intended partial/three-valued interpretation of the language. She then considers various classical extensions of this interpretation. She does not think of these classical models as correct interpretations of vague language (i.e. as giving the content of vague language, as it is), but merely as admissible extensions of a uniquely correct non-classical interpretation (i.e. as giving the content that vague language would have, were it precisified).
The plurivaluationist, on the other hand, bypasses the unique intended partial/three-valued interpretation, and says that each of what the supervaluationist calls the admissible extensions of the base interpretation is in fact an acceptable interpretation of the vague language, as originally used (i.e. in its actual, vague state). This is where the problem lies. A model cannot be a correct interpretation of vague language—in its actual, vague state—if the language comes out as non-vague relative to that model. But such a model can, of course, represent an admissible precisification of vague language: it can give a correct account of how the language would be were all traces of its vagueness removed.²¹
A plurivaluationist might respond to the objection that no classical model is an acceptable interpretation of vague language by arguing that this disadvantage of classical models is outweighed by the disadvantages of going non-classical. She might agree that classical models are not good interpretations of vague discourse, for precisely the reason that, relative to each of them, vague predicates come out as precise. Nevertheless, she might argue on broader grounds that classical models are the only kind we should countenance. How could such a strong position against non-classical semantics be motivated? Only by arguing against non-classical alternatives. I shall defend such an alternative against all known objections in Part III. The upshot for the present discussion will be that the plurivaluationist’s violation of Closeness is not a necessary evil, but a good reason to reject plurivaluationism.
¹⁹ Cf. also McGee and McLaughlin 1995, 227.
²⁰ The latter point has been noted before—although not, of course, in a context of distinguishing supervaluationism and plurivaluationism. For a correct reply to the Fodor–Lepore objection qua objection to supervaluationism, see McGee 1997, 155; cf. also Keefe 2000, 190.
²¹ Shapiro 2006, 71 n. 6 says that the Fodor–Lepore argument is related to a criticism of supervaluationism by Tye (1989) and Sanford (1976), who argue that the truth of a sentence involving a vague predicate cannot be determined by the truth values of other sentences involving other (sharp) predicates. (Cf. also McGee and McLaughlin 1995, 227–8.) Clearly the two arguments are in some sense related; but it is worth noting that there is a crucial difference between them. The Fodor–Lepore argument turns on (and fails as an objection to supervaluationism because it turns on) the idea that the classical models are interpretations of vague language (i.e. they give the meaning of vague language, as it is). The Tye–Sanford argument does not make use of such an idea: it holds that whatever role the classical models are supposed to be playing, they are simply irrelevant to the semantics of vague language—in other words, that they can play no role in giving the semantics of vague language.
4.5 Contextualism
It is evident that whether or not a predicate satisfies Closeness at any given point in time or in any given context turns entirely on the nature of the base models on which the contextualist story is overlaid. For what is distinctive about the contextualist story is what it says about how the intended interpretation changes over time, from context to context. Within a given context, the contextualist’s semantic story is identical to that of the proponents of the non-contextualist version of the semantic picture which goes with the type of base model (classical, partial, three-valued, fuzzy, etc.) chosen by the contextualist. Thus, for reasons seen in earlier
sections of this chapter, only versions of contextualism built on models which countenance degrees of truth can accommodate vagueness within a context. For example, consider contextualism built over a base of partial interpretations, and consider the predicate ‘is tall’. Assuming the domain of discourse contains enough objects of different heights—as it will if it contains the members of a Sorites series for ‘tall’—at any stage of the discourse, on the interpretation which is correct at that stage, there will be objects which are very close in tall-relevant respects—say, Bill and Ben, whose heights differ by one nanometre—such that ‘Bill is tall’ is true, and ‘Ben is tall’ is neither true nor false, and hence these two sentences are not very similar in respect of truth. A non-degree-theoretic contextualist might try to argue that nevertheless ‘Bill is tall’ and ‘Ben is tall’ are very similar in respect of truth, because Bill and Ben are very similar in tall-relevant respects; and so according to the contextualist story, if at any stage of the discourse either of these men is explicitly characterized as tall, it will become true of both that they are tall. This is certainly a respect of similarity between the sentences ‘Bill is tall’ and ‘Ben is tall’. However, it is similarity in a respect relevant to truth, not similarity in respect of truth. In the contextualist framework, whether or not ‘Bill is tall’ is true depends on Bill’s height, and also on the classifications that have been made earlier in the conversation or discourse. The similarity noted above—wherever one of Bill and Ben goes directly (i.e. into the extension or antiextension of ‘is tall’, if he is explicitly classified as tall or as non-tall respectively), the other goes indirectly (because of his similarity, in tall-relevant respects, to the other man)—is therefore a similarity in a respect relevant to truth. But it is not similarity in respect of truth.
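The claim that a partial interpretation violates Closeness at every stage of the discourse, wherever the cutoffs happen to lie, can be checked on a toy model. The following is a minimal sketch only: the representation of a context as a pair of cutoffs, the names `status` and `sharp_jumps`, and all the numbers are assumptions made for illustration, not part of any contextualist theory discussed here.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for the intended partial interpretation at one stage:
# heights at or above `upper` make 'x is tall' true, heights at or below
# `lower` make it false, and heights in between leave it without a truth value.
Context = Tuple[int, int]  # (lower, upper), in nanometres above a nominal baseline


def status(height_nm: int, ctx: Context) -> Optional[bool]:
    """Truth status of 'x is tall' for an object of the given height."""
    lower, upper = ctx
    if height_nm >= upper:
        return True
    if height_nm <= lower:
        return False
    return None  # neither true nor false


def sharp_jumps(series: List[int], ctx: Context) -> List[Tuple[int, int]]:
    """Adjacent pairs in the series whose truth statuses differ."""
    return [(a, b) for a, b in zip(series, series[1:])
            if status(a, ctx) != status(b, ctx)]


# Toy Sorites series: adjacent heights differ by one nanometre.
SERIES = list(range(1000, 2001))

# Three stages of a discourse; the cutoffs shift from stage to stage.
STAGES = [(1400, 1600), (1450, 1650), (1300, 1500)]

for ctx in STAGES:
    print(ctx, sharp_jumps(SERIES, ctx))
```

At each stage the function reports two adjacent pairs, one nanometre apart in height, whose members differ in truth status. Shifting the context moves the pairs but never removes them.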
It may happen at some stage of the discourse that Bill goes one way (indirectly—i.e. because of his similarity to a third party, who is explicitly classified as tall or as non-tall) and Ben goes nowhere—in which case ‘Bill is tall’ will be true (or false) and ‘Ben is tall’ will lack a truth value. And when this happens, ‘Bill is tall’ and ‘Ben is tall’ are not similar in respect of truth—for reasons discussed in §4.2. In the contextualist framework, for two sentences to be similar in respect of truth, they must have similar truth statuses in the current intended interpretation. For there is no further level of semantic fact, beyond the various intended interpretations, at which level two sentences may be similar in respect of truth, even though they do not have similar truth statuses in the current intended interpretation. There is a whole further
part to the contextualist story, a whole level of machinery above and beyond the various intended interpretations—and it is machinery which is importantly relevant to the semantic facts—but it is not itself a further level of semantic fact. Rather, the extra machinery concerns semantic dynamics: how the semantic facts change over time, from context to context. Compare a law which gives the position of a projectile at any time during its flight. Facts about how the position of a projectile is changing at a given time are further facts, beyond the facts about its position at that time—and they are facts which are relevant to the facts about its positions at various times, but they are not further positional facts. Thus, suppose we are interested in whether the projectile ever passes through a certain region of space. If we check its position at every time of its flight and it is never in this region, then our question is settled, and no checking of the facts about (for example) the projectile’s velocity at each point of its flight will change the answer. Similarly in the case of contextualism. When we know the truth statuses of all our sentences at each stage of the discourse, and we see that at each stage, there are no vague predicates—i.e. Closeness is violated at every stage—checking the semantic dynamics can make no difference. Whether or not Closeness is satisfied turns on the semantic facts, not on how the semantic facts change over time. Thus the contextualist’s distinctive contribution—a semantic dynamics—cannot bring back satisfaction of Closeness, if the chosen kind of base model rules out such satisfaction in the first place.
Rather than try to accommodate it, the contextualist might claim that the Closeness definition of vagueness is (subtly) incorrect. Of course, this reply is no good by itself: she must propose an alternative definition.
Recall Bill and Ben: they are very close in tall-relevant respects (they differ in height by a nanometre), and yet on the interpretation that is the intended one at the current stage S of the discourse, ‘Bill is tall’ is true and ‘Ben is tall’ lacks a truth value (and hence these claims are not very close in respect of truth, and so Closeness is violated). The contextualist will note at this point, however, that if we turn our attention to Bill, and say that he is tall, then the context shifts in such a way that both ‘Bill is tall’ and ‘Ben is tall’ are true in the newly created context. Thus the contextualist, unlike the epistemicist (recall §4.1), can accommodate the thought that we can never say truly that Bill is tall but Ben is not. The contextualist might therefore claim that the Closeness definition of vagueness is incorrect,
and that we should rather define vagueness in terms of the following principle:
TA-Closeness If a and b are very similar in F-relevant respects, then it can never truthfully be asserted that Fa but not Fb.
This is not the same as JA-Closeness. JA-Closeness tells us that we can never justifiably assert or believe that Fa but not Fb—but that does not rule out the possibility that we may yet, by luck, do so truthfully. TA-Closeness, on the other hand, tells us that we can never truthfully assert that Fa but not Fb.
There are at least two problems with defining vagueness in terms of TA-Closeness: this definition does not explain why vague predicates draw blurred boundaries, nor why they are Sorites-susceptible. Recall that a definition of vagueness is supposed to provide a statement of the essence of vagueness, in light of which we can understand why vague predicates behave in the characteristic ways they do. Defining vagueness in terms of Closeness meets this desideratum, as we saw in the previous chapter. Defining vagueness in terms of TA-Closeness does not: a predicate which satisfies TA-Closeness will not thereby draw blurred boundaries, or generate a compelling Sorites paradox.
Consider first the point about blurred boundaries. We just noted that JA-Closeness does not rule out the possibility that we may truthfully assert that Fa but not Fb, while TA-Closeness does rule this out. Even this, however, does not rule out the possibility that it may yet be true on many occasions that Fa but not Fb (just not on occasions when we assert this)—and this means that TA-Closeness does not capture the idea that the extensions of vague predicates have blurry boundaries. According to the contextualist, the following claim is true:²²
If uttered using ‘tall’ in the sense in which I used it at stage S, the sentence ‘Bill is tall’ would be true, while the sentence ‘Ben is tall’ would lack a truth value.
This conflicts with the idea that vague predicates draw blurred boundaries.
The claim that I cannot in fact utter the two sentences in the sense specified does not solve the problem: it simply adds insult to injury. The injury is that on any occasion on which I use a vague term such as ‘tall’, I cleanly divide the domain of discourse into three sets, in such a way that some objects which are very close in tall-relevant respects are not assigned nearby values by the characteristic function with which ‘tall’ is associated on that occasion of use. The insult is that when I turn to the offending cases, the context shifts in such a way that these cases no longer offend—now other cases offend, but if I turn to them, the offence will be yet elsewhere, and so on. Contextualism plus TA-Closeness yields a view on which, rather than drawing blurred boundaries, the extensions of vague predicates are sharp but shifty: at any instant we have a sharp boundary to the extension of a vague predicate (the injury), but the boundary moves if we try to get a fix on where it is (the insult).
²² I am writing in terms of contextualism built over a base of partial interpretations, but my comments apply, mutatis mutandis, to all non-degree-theoretic forms of contextualism.
Taking her cue from the passage from Greenough (2003) quoted on p. 182 above (in the discussion of my criticism that epistemicism plus JA-Closeness allows only sharp but unknowable boundaries, not genuinely blurry boundaries), the contextualist might respond that ordinary speakers cannot tell the difference between sharp but shifting boundaries and genuinely blurred boundaries. However, this would not be plausible at all. If blurriness and shiftiness were indistinguishable to ordinary speakers, then we could not think—as surely we do—that the boundary of our visual field is not just shifty (i.e. we cannot focus on it) but blurry as well. Phenomenologically, there is an enormous difference between the (blurry) boundary of the red region of the rainbow and the (sharp but shifty) boundary between, say, the water and the concrete boat ramp as the waves lap against it. It would be absurd to accuse ordinary speakers of being unable to make this sort of distinction.
Soames (1999, 216–17) considers a similar objection to the one I have just made, and responds: “The critic wrongly takes the inability to display a sharp dividing line between things that are F and things for which the predicate is undefined, according to some conversational standard, to imply that there is no sharp dividing line, according to that standard. This ignores the dynamic feature of the model. Once this feature is recognized, the objection is defused.” But my objection is not defused by this: for the objection does not ignore the dynamic feature of the contextualist view; rather, the objection is precisely that the contextualist view ignores the static aspects of vagueness. The idea that vague predicates draw blurred boundaries concerns any particular use of a vague predicate, not a collection
of uses over time. According to this idea, a vague term is one which does not draw sharp boundaries—not one which draws a succession of different sharp boundaries in such a way that we can never focus attention on just one of them. The TA-Closeness definition does not, then, capture an essential feature of the ordinary conception of vagueness—the blurred boundaries idea.
Nor does it capture the idea that vague predicates generate Sorites paradoxes. Suppose that we have a predicate F which conforms to TA-Closeness, but not Closeness. (If a predicate conforms to Closeness, then it will also conform to TA-Closeness. We know that a predicate which conforms to Closeness will be Sorites-susceptible. What we now want to know is whether TA-Closeness has the advantages of Closeness. Thus we need to suppose TA-Closeness by itself.) Suppose that we have a Sorites series for F. Will the corresponding Sorites paradox be compelling? We have no reason to think so. Given TA-Closeness, and the fact that adjacent objects in the Sorites series are very close in F-relevant respects, we know that if we were to assert of some object in the series that it is F, then we could not truthfully assert, in the same breath, that the next object was not F. Thus TA-Closeness does yield an understanding of why the forced march Sorites is compelling. But it yields no account of why the regular Sorites argument is compelling. For, given only TA-Closeness, we have no reason at all to think that the Sorites premiss is true: i.e. that for any object in the series, if it is F, then so is the next object. Indeed, if we have come to accept the contextualist story, we will see immediately that this is false (even though we cannot point to a pair of objects in the series which provides a counterexample). Thus TA-Closeness is no better than JA-Closeness when it comes to yielding an understanding of why vague predicates generate compelling Sorites arguments.
Non-degree-theoretic versions of contextualism might, at this point, try to stay in the game as error theories. The idea would be that the correct way to codify our ordinary conception of vagueness is indeed in terms of the Closeness definition. But lo and behold, there are no predicates out there which satisfy Closeness—i.e. there are no vague predicates. There are only predicates which satisfy TA-Closeness, and of which a non-degree-theoretic contextualist theory is correct—i.e. predicates that draw a succession of different sharp boundaries in such a way that we can never focus attention
on just one of them. Exactly the same comments apply to this proposal as applied to the proposal for an epistemicist error theory in §4.1.
The upshot of the present chapter is as follows. If we understand ‘vague’ in terms of the Closeness definition, then if we want to say that there exist vague predicates with associated Sorites series, we must adopt a theory of vagueness that countenances degrees of truth. Furthermore, we should understand ‘vague’ in terms of the Closeness definition. We saw the advantages of doing so in the previous chapter; in the present chapter we have seen that rival definitions do not share these advantages.
PART III
Degrees of Truth
5 Who’s Afraid of Degrees of Truth?
I argued in Part II that we need a theory of vagueness that countenances degrees of truth. In Chapter 2, we encountered three main sorts of degree theory: theories with a large finite number of truth values; the (recursive) fuzzy theory; and the degree-theoretic form of supervaluationism. I argued there that the last of these faces problems numerous and deep enough to render it a non-starter. That leaves us with large finite and fuzzy theories. Yet the former have been largely ignored in the literature, while the latter has been subjected to a large amount of criticism. So is this really where we want to end up? In this third part of the book, I consider objections to the fuzzy view of vagueness in particular, and to degree-theoretic treatments of vagueness in general. In some cases, we can see that the objections do not carry weight. In other cases, I shall propose certain modifications and/or additions to the fuzzy view, in order to overcome the objections.
The main objections covered in the present chapter are as follows. The very idea of truth coming in degrees is in some way confused or mistaken (§5.1). The fuzzy theory involves an objectionable violation of classical logic—in response to this objection, I propose a new account of fuzzy consequence which allows us to combine degree-theoretic semantics with classical logic (§5.2). Degrees of truth cannot be integrated with key developments elsewhere in philosophy of language, outside the study of vagueness—as part of my response to this objection, I propose a new account of the relationship between degrees of truth and degrees of belief (§5.3). Degree theories which treat the logical connectives as truth functions cannot account for ordinary usage of, and/or intuitions about the truth and/or assertibility of, compound sentences about borderline cases (§5.5). And finally, denying bivalence leads to contradiction (§5.6).
In Chapter 6, I turn to the major remaining objections to the fuzzy view: the problems of artificial precision and sharp boundaries.¹
5.1 On the Very Idea of Degrees of Truth
It is sometimes said that the very idea of degrees of truth is confused. There are three variants of this objection, which I shall discuss in turn.
First, there is the blunt objection that truth is simply an all-or-nothing matter: either a sentence is true simpliciter, or it is false simpliciter, and while it makes sense to say that Bob is taller than Bill, it would be sheer nonsense to say that ‘Bob is tall’ is truer than ‘Bill is tall’.² Now why might someone believe that the very idea of degrees of truth is nonsensical? Perhaps there is a picture at work here, of a sentence as a template fitting the world: either the template fits or it does not, and if it does not, then while it may be very far from fitting, or very close to fitting, the fact is that it does not fit, and that is all there is to it.³ The objection under consideration involves no argument: it involves an assertion, guided by a picture. In order to respond to it, we need to provide an alternative picture, according to which the idea of degrees of truth makes sense. We might continue to talk of templates, but allow that some of them are spongy: such a template might fit the part of the world it is intended to fit to a greater or lesser degree, where fitting to a high degree is not the same as definitely not fitting but being close to doing so. Alternatively, we might discard this obscure template talk altogether, in favour of the much clearer way of thinking about the relationship between language and the world given to us by set-theoretic model theory. That is what Part I was all about, and §2.2.1 shows that, contra the present objection, the fuzzy picture is perfectly coherent.
¹ In this chapter and the next I generally write in terms of the fuzzy view, rather than large finite degree theories, simply because, as noted on p. 190, I find the idea that there are continuum-many truth values more appealing than the idea that there are only finitely many. Most everything I say in these chapters could be applied—in some cases with minor changes of wording or detail, in the remaining cases as is—to large finite degree theories.
² Ramsey 1990 [1926], 83 is often cited as an authority here—e.g. by Haack 1996 [1980] and Sainsbury (1986, 97)—although the textual evidence does not seem so clear to me.
³ The identification of this picture as a motivating force behind the general opposition to degrees of truth is due to Gideon Rosen, in conversation. (He identified the picture, but did not endorse it.)
The second version of the objection is weaker: rather than claiming that the idea of degrees of truth makes no sense, the objector claims not to know what degrees of truth are, and puts the onus on the degree theorist to explain them properly. For example, Fara (2000, 54) complains of “the absence of some substantial philosophical account of what degrees of truth are”. Given the generally sketchy level of description of the fuzzy view in philosophical discussions, this objection is fair enough. At this point in the present book, however, the objection has already been answered. To recapitulate: The classical picture begins with the idea that objects may possess or lack properties, and sentences may be true or false—and if (for example) Bob possesses the property baldness, then ‘Bob is bald’ is true. These ideas are captured by positing two truth values, 0 and 1: properties are modelled by subsets, which are functions from a background set of objects to the set of values; sentences are assigned the value 0 or 1 according to whether they are false or true; and the link between truth and property possession is captured in the model theory. Now the fuzzy picture begins with the thought that there is something wrong here: the classical picture is adequate in the realm of precise discourses such as mathematics (for which the picture was developed), but not in contexts in which there is vagueness. The thought is that some objects possess some properties to intermediate degrees: for example, Bob is neither bald simpliciter nor non-bald simpliciter, he is somewhere in between. From here, the basic correspondence intuition about truth tells us that if I say that Bob is bald, my statement will be neither true simpliciter nor false simpliciter—the truth of my statement mirrors Bob’s baldness.
Now all these ideas are captured by positing a continuum of truth values, [0, 1], in place of the two classical truth values, {0, 1}, and then modifying the rest of the classical picture in ways dictated by this basic alteration. The result is the fuzzy picture: properties are modelled by fuzzy subsets, which are functions from a background set of objects to the set of values; sentences are assigned a greater or lesser value according to whether they are more or less true; and the link between truth and property possession is captured in the model theory—for example, if Bob’s degree of baldness is 0.3, then ‘Bob is bald’ is 0.3 true. Thus, in place of the classical picture, we get a richer picture, motivated by the basic idea that outside precise realms such as mathematics, objects may possess properties to intermediate degrees, in between complete possession and complete lack thereof.
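The contrast between the two pictures can be made concrete in a few lines. This is a minimal sketch under stated assumptions: the three-member domain, the names ‘Yul’ and ‘Fabio’, the classical verdict on Bob, and the helper functions `truth_value` and `is_classical` are all hypothetical; only Bob and his degree of baldness 0.3 come from the text.

```python
from typing import Dict, Union

Number = Union[int, float]

# A subset of the domain is modelled by its characteristic function,
# here written as a dictionary from objects to truth values.

# Classical picture: values drawn from {0, 1}, so Bob must be put on one
# side or the other (placing him at 0 is an arbitrary choice for the sketch).
classical_bald: Dict[str, Number] = {"Yul": 1, "Bob": 0, "Fabio": 0}

# Fuzzy picture: values drawn from [0, 1].
fuzzy_bald: Dict[str, Number] = {"Yul": 1.0, "Bob": 0.3, "Fabio": 0.0}


def truth_value(extension: Dict[str, Number], name: str) -> Number:
    """Value of the atomic sentence '<name> is bald': the degree to which
    the named object belongs to the (fuzzy) subset modelling baldness."""
    return extension[name]


def is_classical(extension: Dict[str, Number]) -> bool:
    """True if the characteristic function only ever takes the values 0 and 1."""
    return all(value in (0, 1) for value in extension.values())


print(truth_value(fuzzy_bald, "Bob"))  # degree of truth of 'Bob is bald'
print(is_classical(classical_bald), is_classical(fuzzy_bald))
```

On this way of setting things up a classical subset is simply a fuzzy subset whose characteristic function never takes an intermediate value, which is one way of seeing the fuzzy picture as an enrichment of the classical one.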
Suppose that now the objection is raised, “But what are these degrees of truth?” In this context, what we are being asked is, “What are these truth values of which you speak?” Now often, amongst philosophers, talk of the truth values of sentences is simply a stylistic variant of talk of whether sentences are true. This is fine in itself, but it should not blind us to the fact that truth values, properly so-called, are not mere façons de parler: they are objects. There is a particular point to positing these objects: we wish to bring certain useful mathematical machinery—most notably the machinery of functions—to bear on the analysis of phenomena such as truth and validity. Using this machinery, we can achieve a very elegant and useful picture of language and its relationship to the world, and hence we do not baulk at positing the objects required to get the picture off the ground: we need objects to serve as the arguments and values of functions, and in particular, we need certain objects called truth values. Depending upon our antecedent ideas about the phenomena we wish to model—for example, whether properties may be possessed to intermediate degrees—we will posit different sets of truth values, with different structural properties. Now, to return to our question as to what these truth values are: this is not a special question for the fuzzy theorist. If we want to ask, “What are these fuzzy truth values—these degrees of truth?”, then we should also ask, “What are these classical truth values, 1 and 0?” In both cases, the answer is that the truth values are elements in a particular sort of algebraic structure—and what matters is the structural properties of the latter, not the intrinsic nature of its elements.
There are alternative pictures available here. According to the Fregean picture, the classical truth values are two definite, particular objects. What are these objects like in themselves? Well, we just don’t care—we do not ask this question.
It is not that there is no correct answer—it is just that beyond the structural properties of the set of truth values, we do not know what they are like; but nor do we need to know. This goes not just for classical truth values, but for fuzzy ones as well: they are particular objects, but the question as to what intrinsic properties these objects have (as opposed to what structural properties the set of all of them has) is beside the point. The fuzzy theorist no more owes us an answer to this question than the classical theorist owes us an answer to the corresponding question about the classical truth values. According to the structuralist picture, on the other hand, there is not a unique set of objects which are our truth
values. Rather, there are many different sets of objects, any of which is suited to play the role of the truth values, and none of which is singled out as the unique player of this role. What suits a set of objects to play this role is precisely that it has the right structural properties. Thus, once again, beyond the structural properties of the set of truth values, we should not ask what the truth values are like in themselves: not because, as in the Fregean picture, the answer exists but is unimportant, but because what is true of the truth values in general is what is true of any and all sets of objects which are suited to play the role of the truth values, and this extends no further than structural properties.⁴ The Boolean algebra {0, 1} of classical truth values (whether thought of in the concrete Fregean way or in the abstract structuralist way) serves very well as the foundation of an elegant and useful theory which captures the classical intuitions. The fuzzy theorist asserts that the Kleene algebra [0, 1] of fuzzy truth values (whether thought of in the concrete Fregean way or in the abstract structuralist way) serves very well as the foundation of an elegant and useful theory which captures the fuzzy intuition that, outside mathematics, objects may possess properties to intermediate degrees, and hence that sentences may be true to intermediate degrees. Now there can certainly be legitimate disagreements concerning which intuitions are correct and hence worthy of being captured in our formal picture, and legitimate disagreements concerning how best to capture a given set of intuitions. Disagreements of the latter sort may lead to the positing of new algebras of truth values with associated set theories and logics. But once such an algebra of truth values has been described, we have said all that needs to be said about what the truth values are—and hence, in the fuzzy case, about what degrees of truth are. 
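The structural properties at issue can be spot-checked mechanically. The sketch below assumes the standard fuzzy operations on [0, 1] (minimum for conjunction, maximum for disjunction, 1 − x for negation); the function names and the finite sample of values are hypothetical, and a check on a finite sample illustrates the laws rather than proving them.

```python
from fractions import Fraction
from itertools import product
from typing import Iterable, List


def conj(x: Fraction, y: Fraction) -> Fraction:
    return min(x, y)


def disj(x: Fraction, y: Fraction) -> Fraction:
    return max(x, y)


def neg(x: Fraction) -> Fraction:
    return 1 - x


def satisfies_kleene_laws(values: Iterable[Fraction]) -> bool:
    """Check double negation, De Morgan, and the Kleene condition
    (x and not-x) <= (y or not-y) on every pair of sampled values."""
    values = list(values)
    for x, y in product(values, repeat=2):
        if neg(neg(x)) != x:
            return False
        if neg(conj(x, y)) != disj(neg(x), neg(y)):
            return False
        if conj(x, neg(x)) > disj(y, neg(y)):
            return False
    return True


def excluded_middle_is_always_1(values: Iterable[Fraction]) -> bool:
    """The extra Boolean law: (x or not-x) always takes the value 1."""
    return all(disj(x, neg(x)) == 1 for x in values)


# Exact rationals avoid floating-point noise in the equality checks.
FUZZY_SAMPLE: List[Fraction] = [Fraction(i, 10) for i in range(11)]
CLASSICAL: List[Fraction] = [Fraction(0), Fraction(1)]

print(satisfies_kleene_laws(FUZZY_SAMPLE), satisfies_kleene_laws(CLASSICAL))
print(excluded_middle_is_always_1(FUZZY_SAMPLE), excluded_middle_is_always_1(CLASSICAL))
```

Both sets of values pass the Kleene checks, while only the two classical values also satisfy the Boolean law. This is a fact about values in the two algebras, and by itself it settles nothing about which account of consequence should accompany the fuzzy semantics.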
Any further question as to what the truth values are like is out of place: all we need to know (and, if we are structuralists, all there is to know in general) is the structural properties of the set of all of them.
⁴ The structuralist view of truth values is analogous to the view of numbers presented in Benacerraf 1965. There is also a hybrid picture, in which truth values are particular objects—but shadowy, insubstantial ones, with no properties aside from those they have in virtue of being part of a certain sort of structure. Thanks to Amitavo Islam, in conversation and in his 1996, for clarification of the issues in philosophy of mathematics alluded to in this paragraph.
The third version of the objection is that, rather than being confused or inadequately explained, the idea of degrees of truth is unmotivated. In particular, the claim is that the fuzzy theorist is confused on a particular point, and that once the confusion is cleared up, we see that we have no need for degrees of truth. This objection occurs in a number of places; I shall take the presentation in Keefe 1998b as my starting point, because it is particularly clear.⁵ Keefe begins by noting that “In many paradigm cases of a vague predicate F there is a corresponding measurable attribute related to F in such a way that the truth-value status of Fx . . . is determined by x’s quantity of that attribute. For example, the truth-value status of ‘a is tall’ is determined by, or supervenes on, a’s height . . . ; similarly for the relation between ‘a is hot’ and a’s temperature” (p. 575). I agree: these underlying attributes associated with F are what I have called F-relevant respects.⁶ Keefe continues:
But although the measure of the underlying quantity may determine the applicability of the vague predicate, it does not follow that this measure is reflected in nonclassical numerical truth-values. . . . Are degree theorists thus mistaken in claiming that vague predicates come in degrees? I suggest that there is a sense in which F can be said to come in degrees—call it coming in degreesₘ—whenever there is a measure of the attribute F-ness, and where things have different degreesₘ of F-ness by having more or less of the attribute. The degreeₘ of heat of an object will be a matter of its quantity of heat and we happen to call the measure degrees Celsius. . . . But the fact that many vague predicates come in degreesₘ is not enough for the degree theorist, who needs there to be implications for truth-values or degrees of truth, so that if F comes in degrees, predications of F can be true to intermediate degrees . . . coming in degreesₘ is not the sense of “coming in degrees” required by the degree theorist. (1998b, 575–6)
Again, I agree: the distinctive claim of the fuzzy theory (and of degree-theoretic treatments of vagueness in general) is that there are degrees of truth of predications of F, as well as degreesₘ of F. So far, however, we have no objection to the fuzzy view: we have a warning not to confuse degrees of truth with degreesₘ, but we have no argument to the effect that the fuzzy theory is involved in such confusion. In fact the fuzzy view—at least as I have presented it—is not confused on this point at all. Recall:
Closeness: If a and b are very similar in F-relevant respects, then ‘Fa’ and ‘Fb’ are very similar in respect of truth.
⁵ Keefe 1998b appears in a slightly revised form as Keefe 2000, ch. 5.
⁶ The converse does not hold, however: the F-relevant respects need not be measurable attributes, for every predicate F.
Suppose we are dealing with the property tallness, which is associated with the underlying attribute height. The truth status of the antecedent of Closeness depends upon the heights of a and b—in Keefe’s terminology, on their degreesₘ of tallness. The truth status of the consequent of Closeness depends upon the truth values of the claims Fa and Fb. I have argued that unless we have degrees of truth, it cannot be the case that the consequent is true whenever the antecedent is true, and hence we need to countenance degrees of truth. There is no confusion or conflation here of degreesₘ and degrees of truth. I do not slide from the commonplace observation that there are degreesₘ of tallness to the claim that we need degrees of truth. Rather, I make a substantive claim, defended in detail in Part II—namely, that vagueness is correctly understood in terms of Closeness—which provides the link between the two sorts of degrees. The fact that there is no conflation here is underscored by my claim that precise predicates do not conform to Closeness. Consider the predicate ‘is exactly six feet tall’. This predicate comes in degreesₘ, because its application to an object a is determined by a’s height. However, I deny that the truth of claims to the effect that some object is exactly six feet tall comes in degrees. Thus there is no slide from degreesₘ to degrees of truth: this move is not always made, and where it is made, it is made on the basis of the Closeness definition, which is a substantive claim about how vagueness is to be understood correctly.⁷
⁷ Keefe has warned against confusing or conflating degreesₘ and degrees of truth. I have agreed that this confusion is to be avoided, and have avoided it. So far, then, there is no problem for the fuzzy view. However, some fuzzy theorists do apparently conflate the things Keefe warns us to keep apart. Keefe 1998b, 576–7 attributes the following argument to Forbes 1983, 241–2; see also Forbes 1985, 170: Consider a pair of people, a and b, such that
(1) a is taller than b.
We can infer
(2) a is tall to a greater degree than b; so
(3) a satisfies the predicate ‘is tall’ to a greater degree than b; and hence
(4) ‘a is tall’ has a higher degree of truth than ‘b is tall’.
This argument does not go through, for the reasons Keefe notes: with the ‘degreesₘ’ sense of ‘degree’ in play, (2) follows from (1), but (4) does not follow from (2); whereas with the ‘degrees of truth’ sense of ‘degrees’ in play, (4) follows from (2), but (2) does not follow from (1).
Keefe’s core claim against degree theories is this:
Take the vague predicate ‘tall’: I claim that any numbers assigned in an attempt to capture the vagueness of ‘tall’ do no more than serve as another measure of height. More generally, in so far as it is possible to assign numbers which respect certain truths about, for example, comparative relations, this is no more than a measure of an attribute related to, or underlying, the vague predicate. (Keefe 1998b, 575)
This claim is incorrect. The numbers that we assign to objects to measure their heights serve a purpose quite distinct from that of the numbers that the fuzzy theory assigns to objects to measure their degrees of tallness. Consider the two words ‘tall’ and ‘taller’.⁸ There is certainly some important connection between these two words, but it is not totally straightforward. It is certainly not the case that if a is taller than b, then ‘a is tall’ is truer than ‘b is tall’. Nevertheless, there is still room for the idea that sentences of the form ‘a is tall’ might be true to intermediate degrees.
I think that the clearest way to think of matters in this area is as follows. First, there are objects that have heights: persons, mountains, and so on. Then, there are the heights that these things have: these heights are also objects. So we have two sets: a set O of persons, mountains, and so on; and a set H of heights, which is equipped with an ordering relation ≤. There is a mapping h from O to H, which assigns to each object its height. There is also a third set of objects: the set R of real numbers. There are various mappings from the set of heights to the set of real numbers; each of these mappings may be thought of as giving a name to each height. Suppose that Bob’s height is x, i.e. h(Bob) = x. One mapping f from the set of heights to the set of reals assigns x the number 6; intuitively, f(h(Bob)) is Bob’s height in feet. Another mapping m from the set of heights to the set of reals assigns x the number 1.8; intuitively, m(h(Bob)) is Bob’s height in metres. A third mapping c from the set of heights to the set of reals assigns x the number 180; intuitively, c(h(Bob)) is Bob’s height in centimetres; and so on. There are familiar relations between these mappings, e.g. c(x) = 30f(x) (“There are thirty centimetres in a foot”). Now, the situation with regard to ‘taller’ is straightforward.
For any objects x and y in O, x is taller than y just in case h(y) < h(x).⁹ But what about ‘tall’? As a first try, we might say that there is a distinguished subset T of H, such that for any object x in O, x is tall just in case h(x) ∈ T. (See Fig. 5.1.)
[Figure 5.1. Tall and taller (I). Labels: O (persons), with the tall persons marked; H (heights), with subset T; R (reals); key: maps h and f.]
⁸ The following discussion applies, mutatis mutandis, to all such pairs of words, e.g. ‘loud’ and ‘louder’, ‘heavy’ and ‘heavier’, etc.
⁹ I.e. h(y) ≤ h(x) and h(x) ≰ h(y). (Recall that H is equipped with an ordering relation ≤.)
The idea is that x is tall just in case x is of a
sufficient height. Implicit in the word ‘sufficient’ here is the idea that T should be closed upwards: for any x and y in H, if x ≤ y and x ∈ T, then y ∈ T. This immediately gives us an important relation between ‘tall’ and ‘taller’: for any x and y in O, if x is taller than y and y is tall, then x is tall.¹⁰ So far, so good—and no degrees of truth in sight (only degreesₘ). But (as discussed in Chapter 4) there is something wrong with this model: it ignores the vagueness of ‘tall’. If two objects a and b in O are very close in respect of height, then ‘a is tall’ and ‘b is tall’ should be very close in respect of truth. In the picture outlined above, however, assuming that O contains a series of objects ranging from one which is not tall to one which is tall, in very small steps of height, there will be a pair of things a and b in O whose heights are very close, one of which is tall and the other not—i.e. ‘a is tall’ is true, and ‘b is tall’ is false. Thus the proposed picture does not allow for the vagueness of ‘tall’.
¹⁰ One might think that nothing is tall simpliciter, but rather tall for an F. In order to accommodate this observation, we would need—instead of a single distinguished subset T of H—different subsets T_F for different kinds F of thing, with x being tall for an F just in case h(x) ∈ T_F.
In response to this problem, the fuzzy theorist proposes that we replace the classical subset T of H with a fuzzy subset T, and modify the requirement that T be closed upwards to the requirement that for any x and y in H, if x ≤ y then x’s degree of membership in T is less than or equal to y’s degree of membership in T. (See Fig. 5.2. For clarity, [0, 1] is drawn separately from R.)
[Figure 5.2. Tall and taller (II). Labels: O (persons); H (heights); R (reals); [0, 1], with endpoints 0 and 1; key: maps h, f, and T.]
Now, ‘a is tall’ will be true to whatever degree h(a) is in T, and thus we have the following important relation between ‘tall’ and ‘taller’: for any x and y in O, if x is taller than y, then the degree of truth of ‘x is tall’ is at least as great as the degree of truth of ‘y is tall’. We now have the resources to accommodate the vagueness of ‘tall’ (if a and b in O are very close in respect of height, then it can now be the case that ‘a is tall’ and ‘b is tall’ are very close in respect of truth), and we are not committed to the idea that if a is taller than b, then ‘a is tall’ is truer than ‘b is tall’. That is, we can also accommodate the idea that while Kareem Abdul Jabbar is taller than Larry Bird, ‘Kareem Abdul Jabbar is tall’ is not truer than ‘Larry Bird is tall’, for both sentences are 1 true (i.e. true to degree 1).¹¹
¹¹ Kareem Abdul Jabbar is 7 ft, 2 in. in height, and Larry Bird is 6 ft, 9 in. in height. I owe this example to Scott Soames.
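The second picture can be made concrete in a short sketch. The following Python fragment is illustrative only: heights are represented directly by their measure in centimetres, and the piecewise-linear shape of T, its thresholds of 160 cm and 190 cm, the objects a and b, and the names `h`, `T`, `taller`, and `degree_tall` are assumptions introduced here. Only the two basketball heights come from the text (n. 11).

```python
# Illustrative sketch. The thresholds and the linear shape of T are assumptions.

INCH_CM = 2.54  # centimetres per inch

# h maps objects in O to heights, represented here by their measure in cm.
h = {
    "Kareem Abdul Jabbar": (7 * 12 + 2) * INCH_CM,  # 7 ft 2 in
    "Larry Bird": (6 * 12 + 9) * INCH_CM,           # 6 ft 9 in
    "a": 175.0,                                     # hypothetical borderline case
    "b": 175.1,                                     # hypothetical, 1 mm taller than a
}


def taller(x: str, y: str) -> bool:
    """x is taller than y just in case h(y) < h(x)."""
    return h[y] < h[x]


def T(height_cm: float, lo: float = 160.0, hi: float = 190.0) -> float:
    """Degree of membership of a height in the fuzzy set T (non-decreasing)."""
    if height_cm <= lo:
        return 0.0
    if height_cm >= hi:
        return 1.0
    return (height_cm - lo) / (hi - lo)


def degree_tall(x: str) -> float:
    """The composite map T after h: the degree of truth of 'x is tall'."""
    return T(h[x])
```

On these assumptions `taller` distinguishes the two players while `degree_tall` assigns both the value 1, and the hypothetical objects a and b, which differ in height by a millimetre, receive degrees of tallness that differ only slightly.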
In the first picture (where T is a classical subset of H), we have degreesₘ of height and no degrees of truth. In the second picture (where T is a fuzzy subset of H), we have degreesₘ of height and degrees of truth of sentences of the form ‘a is tall’. Thus, in the second picture we have what Keefe says we cannot have: numbers assigned in an attempt to capture the vagueness of ‘tall’ which do not simply serve as another measure of height. In the second picture, we have maps f from H to R, and then composite maps f ◦ h from O to R which serve as measures of height. We also have something entirely distinct: a fuzzy subset T of H, or (identifying T with its characteristic function) a map T from H to [0, 1], and then a composite map T ◦ h from O to [0, 1], which captures the vagueness of ‘tall’, and respects the comparative relation that if x is taller than y, then the degree of truth of ‘x is tall’ is at least as great as the degree of truth of ‘y is tall’. These maps are formally and conceptually distinct, and there is no reason why we cannot have both. Note that one cannot claim at this point that T ◦ h is simply another measure of height: for T ◦ h(Kareem Abdul Jabbar) = T ◦ h(Larry Bird) = 1, but it is not the case that Kareem Abdul Jabbar and Larry Bird have the same height.¹²
Note that in the theory of measurement, the set H of heights is often ignored: we deal directly with mappings from the set O of objects that have heights to the set R of real numbers. It is not my concern here to debate whether objects such as heights really exist. My aim has been to get a clear picture of the relationship between ‘tall’ and ‘taller’, and it seems to me that the picture becomes less clear if we leave heights out of it.¹³ Nevertheless, my point can still be made without countenancing heights as objects. If we ignore the set H of heights, then we have maps from O to R, which assign heights to objects (these heights now being thought of simply as real numbers).
There are various maps, one giving the heights of objects in metres, one giving the heights of objects in feet, and so on. For the sake of convenience, let us fix on one such map h. In Keefe’s terminology, h assigns degreesₘ of tallness to objects. The situation with regard to ‘taller’ is now straightforward. For any objects x and y in O, x is taller than y just in case h(y) < h(x).¹⁴ Turning to ‘tall’, as a first try we might say that there is a distinguished subset T of R, such that for any object x in O, x is tall just in case h(x) ∈ T. T will be closed upwards, so that for any x and y in O, if x is taller than y and y is tall, then x is tall. For reasons discussed above, however, this picture ignores the vagueness of ‘tall’, and so the fuzzy theorist proposes replacing the classical subset T of R with a fuzzy subset T: that is, a map T from R to [0, 1]. The requirement that T be closed upwards becomes the requirement that for any x and y in R, if x ≤ y then T(x) ≤ T(y). In this picture we have two maps, h : O → R which assigns degreesₘ of tallness to objects, and T ◦ h : O → [0, 1] which assigns degrees of tallness (in the degrees of truth sense) to objects. As before, these maps are formally and conceptually distinct, and there is no reason why we cannot have both.
¹² A similar point is made (independently) by Sainsbury 1986, 103–4.
¹³ Thanks to Amitavo Islam for first helping me to appreciate the role of objects such as heights and lengths.
¹⁴ It does not matter here that we fixed on one particular map h, because height is measured on a ratio scale, meaning that for any height-measuring map j : O → R, there is a positive real number α such that for any object x in O, j(x) = αh(x); thus if h(y) < h(x), then it is also the case that j(y) < j(x). For more details on measurement theory see §6.1.5.
5.2 Classical Logic
One reason why some philosophers do not like the fuzzy view is that it involves a rejection of classical logic.¹⁵ Some adherents of the fuzzy view might meet this objection with proud defiance, claiming that departure from classical logic is one of the great advantages of the fuzzy position. I do not agree with this: while we have seen that in order to accommodate vagueness, we must abandon classical semantics—in particular, we need to countenance degrees of truth—I see no advantages in departing from classical logic: that is, in denying the validity of some classically valid formulae or inferences (or asserting the validity of some classically invalid ones). Fortunately, while the standard version of the fuzzy view does depart from classical logic, we do not have to make such a departure, in order to accept fuzzy semantics.
¹⁵ See e.g. McGee and McLaughlin 1995, 237, and Williamson 1994, 118.
In §2.2.1, we looked at the fuzzy semantic picture. In order to determine whether the fuzzy view violates classical logic, we need a definition of consequence or validity: we can then determine whether all classically valid inferences are valid according to this definition. The standard way of defining validity in many-valued logics involves the idea of preservation of
designated values. In classical logic, we have two truth values—True and False, or 1 and 0—and an inference is valid if there is no interpretation on which the premisses are all true, or all have the value 1, and the conclusion is false, or has the value 0. We can think of this in terms of 1 being the only designated value out of the two truth values 0 and 1, and of valid inference as requiring preservation of designated values from premisses to conclusion. That is, whenever the premisses all have designated truth values, the conclusion must also have a designated truth value. A tautology will be a formula which has a designated value on every interpretation. We can now carry this idea of validity as preservation of designated values directly to many-valued logics, although when we have more than two truth values, we have some leeway in deciding which of these values should be designated. In the fuzzy framework, one obvious choice is that only the value 1 should be designated. If we go this way, then neither p ∨ ¬p nor ¬(p ∧ ¬p)—both classical tautologies—will come out as valid, for on an interpretation on which p is 0.5 true, both p ∨ ¬p and ¬(p ∧ ¬p) are 0.5 true. We could say that all values greater than or equal to 0.5 are designated, but then disjunctive syllogism—a classically valid inference form—would not be valid: on an interpretation in which p is 0.5 true and q is 0 true, ¬p ∨ q is 0.5 true, hence both premisses of the inference p, ¬p ∨ q/ ∴ q are designated, but the conclusion is not. The intuitive idea behind the designated values approach is that a valid argument is one where, if the premisses are true enough—i.e. designated—then the conclusion is true enough too. Implicit in the word ‘enough’ here is the idea that the set of designated values is closed upwards: if x is designated, then every value between x and 1 must be designated too. 
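The two counterexamples just given can be checked mechanically. The sketch below assumes the connectives that match the values quoted above (negation as 1 − x, conjunction as min, disjunction as max); the function names, and the representation of an upward-closed set of designated values by its lower bound `k`, are illustrative choices rather than anything fixed by the text.

```python
def neg(x):
    """Negation, assumed to be 1 - x."""
    return 1 - x


def conj(x, y):
    """Conjunction, assumed to be the minimum."""
    return min(x, y)


def disj(x, y):
    """Disjunction, assumed to be the maximum."""
    return max(x, y)


def designated(value, k):
    """True when value lies in the set of designated values [k, 1]."""
    return value >= k


# Excluded middle and non-contradiction on an interpretation where p is 0.5 true.
p = 0.5
excluded_middle = disj(p, neg(p))
non_contradiction = neg(conj(p, neg(p)))

# Disjunctive syllogism (p, not-p or q, therefore q) with p 0.5 true and q 0 true.
q = 0.0
ds_premisses = [p, disj(neg(p), q)]
ds_conclusion = q

# With only 1 designated, neither classical tautology is designated here.
only_one = [designated(excluded_middle, 1), designated(non_contradiction, 1)]

# With [0.5, 1] designated, the premisses are designated but the conclusion is not.
premisses_ok = all(designated(v, 0.5) for v in ds_premisses)
conclusion_ok = designated(ds_conclusion, 0.5)
```

Raising `k` above 0.5 loses excluded middle, and lowering it far enough to rescue disjunctive syllogism in this case means designating 0.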
Given this upward closure requirement, it is clear from the foregoing that the only set of designated values which will make both p ∨ ¬p and p, ¬p ∨ q/ ∴ q valid is the entire interval [0, 1]—but on that choice, p ∧ ¬p will be valid too (indeed, everything will be valid). Hence we cannot, on the ‘validity as preservation of designated values’ approach, make fuzzy validity coincide with classical validity.¹⁶
We do not, however, have to define fuzzy validity in terms of preservation of designated values. I propose the following definition of validity. B is a fuzzy consequence of a set Γ of wfs just in case there is no interpretation M such that [A]_M > 0.5, for every A in Γ, and [B]_M < 0.5. That is, B is a consequence of Γ just in case on any interpretation on which the value assigned to every A in Γ is strictly greater than 0.5, the value assigned to B is greater than or equal to 0.5. Correspondingly, I shall say that B is a fuzzy tautology just in case there is no interpretation M such that [B]_M < 0.5. That is, B is a tautology just in case it is greater than or equal to 0.5 true on every interpretation. It is not hard to show that the fuzzy consequence relation just defined on our standard first-order language is identical to the classical consequence relation on that language.¹⁷ Thus, all classically valid formulae and inferences are fuzzy-valid, and vice versa. Given this result about consequence, the question of proof theory for fuzzy logic becomes entirely straightforward. Any proof theory for our standard first-order language that is sound and complete with respect to classical set-theoretic models is sound and complete with respect to fuzzy models: for example, any standard classical proof theory, whether in the axiomatic, natural deduction, tableau, or some other style. Note that our definition of consequence cannot be recast in terms of preservation of designated values. For it to be the case that B is a consequence of A, it is required that on any interpretation on which the value assigned to A is strictly greater than 0.5, the value assigned to B is greater than or equal to 0.5.¹⁸
¹⁶ Another idea is to say that B is a fuzzy consequence of Γ if it is a consequence in the ‘preservation of designated values’ sense no matter what interval [k, 1] we take as our set of designated values (k > 0). A third idea is to say that B is a fuzzy consequence of Γ just in case on every model, the truth value of B is greater than or equal to the infimum of the truth values of the members of Γ. In fact these two ideas yield the same consequence relation (Priest 2001, 216–17)—and it is not classical (e.g. disjunctive syllogism is not valid). Other ideas have also been explored in the literature: for example that an argument is valid if the degree of falsity (i.e. one minus the degree of truth) of its conclusion can never exceed the sum of the degrees of falsity of its premisses (Edgington 1992).
¹⁷ Proof. Let |=_c be the classical consequence relation and |=_f the fuzzy consequence relation. (i) Γ |=_c B ⇒ Γ |=_f B. Given a fuzzy model M_f = (M, I_f), we can construct a corresponding classical model M_c = (M, I_c) which (so to speak) treats numbers ≥ 0.5 as 1 and numbers < 0.5 as 0. More precisely, I_c assigns the same referents as I_f to singular terms, and where I_f assigns the n-place predicate R the function f : Mⁿ → [0, 1], I_c assigns R the function f′ : Mⁿ → {0, 1} defined thus: ∀x ∈ Mⁿ, if f(x) ≥ 0.5 then f′(x) = 1, and if f(x) < 0.5 then f′(x) = 0. By a straightforward induction on complexity of formulae we can show that for any fuzzy model M_f and any closed wf A, if [A]_{M_f} < 0.5 then [A]_{M_c} = 0, and if [A]_{M_f} > 0.5 then [A]_{M_c} = 1. Now suppose there is a fuzzy model M_f on which every member of Γ is > 0.5 true and B is < 0.5 true; then on the corresponding classical model M_c, every member of Γ is true, and B is false. So if there is no classical model of the latter sort, there is no fuzzy model of the former sort. (ii) Γ |=_f B ⇒ Γ |=_c B. A classical model is (a special case of) a fuzzy model (one in which only the extreme values 0 and 1 get assigned). If there is no fuzzy model on which every member of Γ is > 0.5 true and B is < 0.5 true, then a fortiori there is no classical model on which every member of Γ is true and B is false.
¹⁸ This does not mean that the fuzzy consequence relation is not transitive. It is transitive. (It must be, because it is the same relation as the classical consequence relation, and that is transitive!)
What is the motivation for this departure from
standard practice? Part of the answer is simply that my definition yields a classical consequence relation, and this is important. Epistemicists have lorded their classicism over their opponents, and supervaluationists have claimed their greater adherence to classical principles (see p. 82) as an advantage over their fuzzy rivals. An important constraint on a definition of validity is that it counts intuitively valid forms of reasoning as valid—and the classically valid inference forms are all prima facie paradigms of valid reasoning, even in contexts involving vagueness. But this answer is not enough: if it were, we could give whatever semantics we pleased for a language, and then simply say, “B is a consequence of Γ just in case it is a classical consequence”. What would be missing here is a meaningful relationship between the semantics and the definition of consequence. In the present case, however, there is such a meaningful relationship, as I shall now explain.
In vague contexts, there is a natural distinction between inference grade statements and assertion grade statements. A natural response to Sorites reasoning is that as we continue to take the output of one stage of the reasoning and feed it back into the inductive premiss at the next stage of the slide down the series, our conclusions become progressively shakier. ‘Man 1 is bald’ is certainly true. From this and ‘For any n, if man n is bald then man n + 1 is bald’ it follows that man 2 is bald. Now if we feed this output back in, we get ‘man 3 is bald’; if we feed this output back in, we get ‘man 4 is bald’; and so on. But intuitively, our conclusions become more and more shaky. A natural thought is that while an output at one stage may be safe enough to assert, it may not be safe enough to serve as the start of the next stage of reasoning: it’s like a terminating pass, as opposed to a pass which enables you to enrol in the next level of studies.
The idea is that a sentence needs to meet more stringent standards of truth if it is to be used as the basis for further argument than if it is merely to be asserted—just as building codes place more stringent standards of load-bearing capacity on foundations than on superstructures. Thus there is a natural distinction between inference grade statements and assertion grade statements. Now if we take a statement to be inference grade if it is strictly greater than 0.5 true, and assertion grade if it is at least 0.5 true, then we can say that a valid inference in the sense introduced above always yields at least an assertion grade conclusion, when the premisses are all inference grade. This fits perfectly with the intuition that Sorites reasoning
is valid, even though it cannot be continued indefinitely and still yield secure results.¹⁹ Of course, the mere distinction between assertion grade and inference grade statements does not in itself require that ‘inference grade’ be taken to mean strictly greater than 0.5 true, and ‘assertion grade’ to mean at least 0.5 true. We could, for example, take ‘inference grade’ to mean 1 true. The advantage of my proposal, however, is that it is minimal. If a sentence S is at least 0.5 true, then one cannot make a truer statement by asserting the negation of S than by asserting S. What more than this could be required for a statement to be ‘assertion grade’ or ‘true enough to assert safely’? Any higher standard would need further justification, and I cannot see what such justification would consist in.²⁰ Now given that we have set the cut-off for assertion grade statements at 0.5, and want to make the cut-off for inference grade statements strictly higher than this, the minimal cut-off for inference grade statements will be the one I have proposed: they must be more than 0.5 true. Again, any higher standard than this would need further justification, and I cannot see what such justification would consist in. We thus have a well-motivated definition of fuzzy consequence to accompany the fuzzy model theory presented in §2.2.1, and unlike existing definitions in the literature, it validates classical logic. Thus the charge that the fuzzy view of vagueness involves rejecting classical logic (as opposed to classical semantics) simply does not apply to the version of the fuzzy view being presented in this book.
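The definition of fuzzy consequence defended in this section, and its agreement with classical consequence, can be illustrated for a small propositional fragment. The sketch below is a finite spot check, not a proof: it assumes min, max, and 1 − x for the connectives, samples only the five truth values in `GRID`, and represents formulae as Python functions on valuations. All of the names are hypothetical.

```python
from itertools import product


def neg(x):
    return 1 - x


def conj(x, y):
    return min(x, y)


def disj(x, y):
    return max(x, y)


def inference_grade(x):
    """Strictly more than 0.5 true."""
    return x > 0.5


def assertion_grade(x):
    """At least 0.5 true."""
    return x >= 0.5


GRID = [0.0, 0.25, 0.5, 0.75, 1.0]  # sampled truth values (an assumption)


def valuations(atoms, values):
    """Every assignment of the given values to the atoms."""
    for combo in product(values, repeat=len(atoms)):
        yield dict(zip(atoms, combo))


def fuzzy_consequence(premisses, conclusion, atoms, values=GRID):
    """No sampled valuation makes every premiss inference grade
    while the conclusion falls below assertion grade."""
    return not any(
        all(inference_grade(A(v)) for A in premisses)
        and not assertion_grade(conclusion(v))
        for v in valuations(atoms, values)
    )


def classical_consequence(premisses, conclusion, atoms):
    """No two-valued valuation makes every premiss 1 and the conclusion 0."""
    return not any(
        all(A(v) == 1 for A in premisses) and conclusion(v) == 0
        for v in valuations(atoms, [0, 1])
    )


def crispen(v):
    """The classical counterpart used in n. 17: values >= 0.5 become 1, others 0."""
    return {atom: 1 if x >= 0.5 else 0 for atom, x in v.items()}


def P(v):
    return v["p"]


def Q(v):
    return v["q"]


# Sample arguments as (premisses, conclusion) pairs.
ARGUMENTS = {
    "excluded middle": ([], lambda v: disj(P(v), neg(P(v)))),
    "disjunctive syllogism": ([P, lambda v: disj(neg(P(v)), Q(v))], Q),
    "explosion": ([lambda v: conj(P(v), neg(P(v)))], Q),
    "invalid: p or q, so p": ([lambda v: disj(P(v), Q(v))], P),
}

for name, (prem, concl) in ARGUMENTS.items():
    print(name,
          fuzzy_consequence(prem, concl, ["p", "q"]),
          classical_consequence(prem, concl, ["p", "q"]))
```

On these samples the two relations return the same verdict for each argument. The function `crispen` mirrors the construction in the proof of n. 17, so one can also check on the grid that any formula more than 0.5 true on a valuation is 1 on its crisp counterpart, and any formula less than 0.5 true is 0.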
¹⁹ Williamson 1994, 124 also claims that this is an intuitive view of the Sorites.
²⁰ See n. 57 below for an important point of clarification about the notion of a statement being ‘assertion grade’.
5.3 A Gated Community in Theory Space?
A big-picture worry looms for degree-theoretic approaches to vagueness. The worry concerns the integration of degrees of truth with key developments outside the study of vagueness. The semantic notions—most notably truth—of which, for reasons having to do with vagueness, the degree theorist gives a non-classical account play a crucial role right across philosophy of language. Can the idea that truth comes in degrees be made to mesh with important developments in other areas of philosophy of language, where bivalence has hitherto been taken for granted? Or is the degree-theoretic approach to vagueness a gated community in theory space?
At first sight, things look good for degree theories. The degree-theoretic framework, as presented in §2.2.1, is designed to be interoperable with any view which has at its heart ordinary logical, semantic, and set-theoretic machinery. As we saw, any theory built on classical logic and/or set theory can be seen as querying the underlying algebra of classical truth values to return values for this or that function, and to return the results of applying the operations ∨, ∧, and ′ to some given values. If we unplug the Boolean algebra of classical truth values and plug in the Kleene algebra of fuzzy truth values instead, then the higher-level theories will, to be sure, get different answers to their questions; but the crucial point is that everything will still work smoothly. Thus, to take just one type of example, there is no problem extending degrees of truth from the first-order semantics employed in discussions of vagueness to frame semantics for modal logics, and to extensions and variations thereof such as are involved in Stalnaker–Lewis-style treatments of counterfactuals and in Kaplan-style treatments of indexicals. In the modal case, for example, a predicate is assigned an intension: a function which assigns to each possible world a crisp subset of the domain of that world.
To accommodate degrees of truth, we simply suppose that the function assigns to each possible world a fuzzy subset of the domain of that world.²¹ In the case of Kaplan-style semantics, for example, we countenance both contexts of utterance and circumstances of evaluation, and each expression is assigned both a content (a function from circumstances to extensions/referents) and a character (a function from contexts to contents). To accommodate degrees of truth, we simply suppose that the extension assigned by a content is a fuzzy (rather than crisp) subset of the domain of the circumstance.
²¹ For details see Smith 2005a.
So far so good, but there is at least one theoretical viewpoint in philosophy of language which has great explanatory power and hence has gained considerable currency and found many applications, and which has at its heart more than ordinary logical, semantic, and set-theoretic machinery. This is the viewpoint of conversational pragmatics, as spelled out by Stalnaker (1999). On this view, any conversation takes place in a context. A key part of the context is the set of presuppositions of the speakers—the information which, in the context, they take for granted, and which forms the background for the conversation. A conversation is then taken to consist in a series of assertions by the conversationalists, where the purpose of an assertion is to change the context by adding the content of what is asserted to the set of presuppositions. Assertions may be accepted or rejected by the other conversationalists. If an assertion is rejected, the context remains the same.²² If an assertion is accepted, then its purpose is achieved, and the presupposition set is indeed adjusted in the way indicated above (i.e. the content of what is asserted is added to the set of presuppositions). The explanatory power of the approach comes from the two-way interaction between context and assertion: contexts constrain assertions (e.g. one should not assert something which is already presupposed to be true, or which is already presupposed to be false), and assertions modify contexts. The former direction of interaction opens the way to Gricean explanations of how assertibility can diverge from truth (note that presuppositions do not have to be true: they do not even have to be believed by the conversationalists—only assumed for the purposes of the conversation); the latter direction opens the way to fruitful accounts of conversational dynamics.
In Stalnaker’s framework, a proposition is a function from possible worlds to the truth values (True and False, or 1 and 0) or, equivalently, a (crisp) set of possible worlds (the set mapped to True on the first conception). The set of presuppositions in a context is a set of propositions—i.e. a set of sets of worlds—or, “more fundamental[ly]”, a set of possible worlds: the worlds compatible with what is presupposed.
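The set-theoretic core of this picture is easy to sketch. In the fragment below, the four worlds, the two sample propositions, and the function names are all hypothetical; propositions are modelled as crisp sets of worlds, as in the second conception just described, and rejection is modelled as leaving the presuppositions unchanged, setting aside the qualification flagged in n. 22.

```python
WORLDS = frozenset({"w1", "w2", "w3", "w4"})  # hypothetical space of possible worlds

# Propositions as crisp sets of worlds (the worlds mapped to True).
raining = frozenset({"w1", "w2"})
cold = frozenset({"w2", "w3"})


def compatible_worlds(presuppositions, worlds=WORLDS):
    """The worlds compatible with everything presupposed."""
    live = set(worlds)
    for proposition in presuppositions:
        live &= proposition
    return frozenset(live)


def accept(presuppositions, asserted):
    """An accepted assertion adds its content to the set of presuppositions."""
    return presuppositions | {asserted}


def reject(presuppositions, asserted):
    """A rejected assertion leaves the set of presuppositions as it was."""
    return presuppositions


before = frozenset({raining})
after = accept(before, cold)
```

On this modelling, accepting an assertion shrinks the set returned by `compatible_worlds` to its intersection with the asserted proposition. That function computes the set of possible worlds compatible with what is presupposed.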
Stalnaker calls this set of worlds the context set: “the set of possible worlds recognized by the speaker to be the ‘live options’ relevant to the conversation. A proposition is presupposed if and only if it is true in all of these possible worlds” (1999, 84–5). Assertion (i.e. of a proposition) then works by narrowing the context set: if the assertion is accepted, worlds in which the proposition asserted is false are struck out of the context set. That is, the new context set—after the assertion has been accepted—is the intersection of the old context set with the proposition asserted.²³ Each person has her own context set—her own set of worlds which she regards as live options. However, “it is part of the concept of presupposition that a speaker assumes that the members of his audience presuppose everything that he presupposes” (Stalnaker 1999, 85). We say that a context is non-defective if each participant does in fact have the same context set.
²² Subject to the qualification at Stalnaker 1999, 87 n. 9.
Much of this is standard logico-set-theoretic fare—but the notion of assertion poses special problems. For once we have degrees of truth in the picture, we should also countenance degrees of assertion—i.e. degrees of confidence of assertion—and corresponding degrees of belief.²⁴ For suppose we have a Sorites series leading from tall men down to short men. Suppose also that we have accepted a degree-theoretic account of vagueness—so we think that ‘This man is tall’ goes gradually from 1 true, said of men at the beginning of the series, down to 0 true, said of men at the end. Then what attitude should we adopt to (the proposition expressed by) ‘This man is tall’ as we consider various men in the series? Surely we should go from being fully committed to the proposition, both in thought and in talk, at the beginning of the series, to fully rejecting it, in thought and in talk, by the end of the series, via a gradually changing series of intermediate states of partial belief and partially confident assertion, which decrease in degree of confidence as we progress down the series. Here is Schiffer (2000, 223–4) on this issue:
Sally is a rational speaker of English, and we’re going to monitor her belief states throughout the following experiment. Tom Cruise, a paradigmatically non-bald person, has consented, for the sake of philosophy, to have his hairs plucked from his scalp one by one until none are left. Sally is to witness this, and will judge Tom’s baldness after each plucking.
The conditions for making baldness judgments—lighting conditions, exposure to the hair situation on Tom’s scalp, Sally’s sobriety and perceptual faculties, etc.—are ideal and known by Sally to be such. . . . Let the plucking begin. Sally starts out judging with absolute certainty that Tom is not bald; that is, she believes to degree 1 that Tom is not bald and to degree 0 that he is bald. This state of affairs persists through quite a few pluckings. At some point, however, Sally’s judgment that Tom isn’t bald will have an ever-so-slightly-diminished confidence, reflecting that she believes Tom not to be bald to some degree barely less than 1. The plucking continues and as it does the degree to which she believes Tom not to be bald diminishes while the degree to which she believes him to be bald increases. . . . Sally’s degrees of belief that Tom is bald will gradually increase as the plucking continues, until she believes to degree 1 that he is bald. Although I’ll have a little more to say about this later, for now I’m going to assume that the qualified judgments about Tom’s baldness that Sally would make throughout the plucking express partial beliefs. After all, the hallmark of partial belief is qualified assertion, and, once she was removed from her ability to make unqualified assertions, Sally would make qualified assertions in response to queries about Tom’s baldness.
²³ By ‘intersection’ here I mean ordinary set-theoretic intersection, for context sets and propositions are sets of worlds.
²⁴ Cf. Sainsbury 1986.
Other things that we might say about the case—things that would avoid admitting that Sally has degrees of belief—are (i) that Sally fully believes that Tom is not bald until a particular hair is removed, from which point on she fully believes he is bald; (ii) that Sally fully believes that Tom is not bald until a particular hair is removed, at which point she enters an indeterminate state in which she does not believe (to any degree, even 0) that Tom is not bald and does not believe (to any degree, even 0) that Tom is bald, and then when another particular hair is removed Sally comes to fully believe that Tom is bald; and (iii) that Sally does not have attitudes towards propositions such as ‘Tom is bald’, but only towards propositions such as ‘Tom is bald to degree x’, each of which she either fully believes or fully rejects. The problem with these approaches is that they do not fit the phenomena. Contra (i) and (iii), Sally certainly seems to be unsure as to what to believe and say about Tom’s baldness, at various points in the process; and contra (ii), she does not have one catch-all ‘confused state’, which she enters, remains in, then leaves: rather, she seems clearly to become less and less sure that Tom is not bald, and then later more and more sure that he is. The proponent of (iii) may reply that Sally’s qualified assertion that Tom is bald—behaviour which seems clearly to indicate that there is some P such that Sally is unsure whether P —should in fact be understood as a full-on assertion that Tom is bald to an intermediate degree. But whether or not this is plausible, (iii) faces other problems too, most notably that it leads to a strange separation between truth, on the one hand, and belief and assertion, on the other: we have a semantics which assigns degrees of truth to atomic propositions such as ‘Tom is bald’, but we are then told we
cannot believe or assert such propositions—rather, we must believe and assert meta-level propositions of the form ‘ ‘‘Tom is bald’’ is true to degree x’ or, equivalently, propositions about degrees, such as ‘Tom’s degree of baldness is x’. This kind of separation should be regarded as a last resort, to be considered only if it is shown that we cannot, for some reason, adopt what should be the default position: namely, that the very same things both have degrees of truth and are the contents of belief.
So once we have degrees of truth in the picture, we need to countenance corresponding degrees of belief, and—as a means of expressing such beliefs—degrees of confidence of assertion. But we already have degrees of belief in our conceptual repertoire—degrees of belief arising from uncertainty about the (full-on or full-off) truth of propositions, and handled formally by means of probability theory—and corresponding degrees of assertion (typically, the first sign that someone has an intermediate degree of belief that P will be the hesitant or tentative nature of her assertion or denial that P). So now we face the crucial issue of whether, and if so how, degrees of belief arising from vagueness and degrees of belief arising from uncertainty can live together. I shall spend §§5.3.1–5.3.3 exploring this issue, which is of considerable independent interest. Once we have a satisfactory understanding of the relationship between subjective probabilities, degrees of truth, and degrees of belief, I shall return in §5.3.4 to the question of integrating degrees of truth with Stalnakerian pragmatics.
5.3.1 Vagueness-Based and Uncertainty-Based Degrees of Belief
One immediate thought concerning the place of degrees of belief arising from vagueness is that such degrees of belief—the degrees of belief that Sally experiences in the middle of Schiffer’s experiment—just are degrees of belief of the kind we are already familiar with, and hence should be handled by probability theory.
The problem with this—as has been pointed out by Schiffer and others—is that partial beliefs arising from vagueness do not and should not behave in the same ways as partial beliefs of the familiar kind arising from uncertainty. To adapt and augment an example of Schiffer’s: suppose that Sally is about to meet her long-lost brother Sali. She has been told that he is either very tall or very short, but she has no idea which (so she does know that he is not a borderline case), and she has been told that he is either hirsute or totally bald, but she has no idea which (so she does know that he is not a borderline case). As a result of her
uncertainty, she believes both of the propositions ‘Sali is tall’ and ‘Sali is bald’ to degree 0.5. Suppose also that Sally regards these two propositions as independent: supposing one to be true would have no bearing on her beliefs about the other. Then, for familiar reasons, she should believe ‘Sali is tall and bald’ to degree 0.25. Now suppose that midway through Schiffer’s experiment, when Sally’s degree of belief that Tom is bald is 0.5, she also believes to degree 0.5 that Tom is tall—on the basis of looking at him and seeing that he is a classic borderline case of tallness.²⁵ Then what should be her degree of belief that Tom is tall and bald? The answer 0.5 suggests itself very strongly: certainly the answer 0.25 seems wrong. If you don’t think so, then just add more conjuncts (e.g. funny, nice, intelligent, cool, old—where Sally knows of Sali only that he is not a borderline case of any of them, and of Tom that he is a classic borderline case of all of them): the more independent conjuncts you add, the lower the uncertainty-based degree of belief should go, but this is clearly not the case for the vagueness-based degree of belief (Schiffer 2000, 225; MacFarlane 2006, 221–2).
A second thought—Schiffer’s—is that there are two kinds of degree of belief: uncertainty-based degrees of belief, or SPB’s (‘standard partial beliefs’), and vagueness-based degrees of belief, or VPB’s (‘vagueness-related partial beliefs’). In Schiffer’s view, we have two distinct systems of degrees of belief: an assignment of SPB’s to propositions, which obey the laws of probability, and an assignment of VPB’s to propositions, which obey the laws of standard fuzzy propositional logic.²⁶ But there is a grave problem for any proposal which posits two different systems of degrees of belief, where it is allowed that a subject may have a degree of belief of one kind of strength n in a proposition P and a degree of belief of another kind of strength m ≠ n in the same proposition P.
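The contrast between the two cases can be made concrete with a small calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that independent uncertainty-based degrees combine by multiplication while vagueness-based degrees combine by the min rule of standard fuzzy logic, as the Sali and Tom examples suggest.

```python
from math import prod


def uncertainty_based_conjunction(degrees):
    """Degree of belief in a conjunction of independent propositions
    when each degree stems from uncertainty: the product rule."""
    return prod(degrees)


def vagueness_based_conjunction(degrees):
    """Degree of belief in a conjunction when each degree stems from
    a known borderline case: the min rule of standard fuzzy logic."""
    return min(degrees)


# Each conjunct (tall, bald, funny, nice, ...) is believed to degree 0.5.
for n in range(1, 6):
    conjuncts = [0.5] * n
    print(
        n,
        uncertainty_based_conjunction(conjuncts),
        vagueness_based_conjunction(conjuncts),
    )
```

With every conjunct at 0.5, the product keeps halving as conjuncts are added, whereas the minimum stays at 0.5 throughout.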
The problem is that the very idea of degree of belief is made sense of via the thought that a degree of belief that P is a strength of tendency to act as if P. As Ramsey (1990 [1926], 65–6) puts it: the degree of a belief is a causal property of it, which we can express vaguely as the extent to which we are prepared to act on it. . . . it is not asserted that a belief is an idea which does actually lead to action, but one which would lead to action in suitable circumstances . . . The difference [between believing more firmly and believing less firmly] seems to me to lie in how far we should act on these beliefs.
²⁵ Suppose, for the sake of the example, that Tom Cruise is borderline tall.
²⁶ VPB(¬p) = 1 − VPB(p), VPB(p ∧ q) = min{VPB(p), VPB(q)}, and VPB(p ∨ q) = max{VPB(p), VPB(q)}.
But one simply cannot have two different strengths of tendency to act as if P, in a given set of circumstances. Consider, for example, the proposition that Fido is dangerous. When Fido enters the room, one will do some particular thing: for example, sit still or jump and run. When Fido looks at one, one will do some particular thing: for example, tremble or offer him some beef jerky. When Fido barks, one will do some particular thing: for example, scream; and so on. One cannot both back away slowly and run screaming (at the same time), and it cannot both take Fido getting within two metres of one to make one run away and require Fido getting within one metre to make one run. So one cannot both tend strongly to act as if Fido is dangerous and tend weakly to act as if Fido is dangerous—at least, not if there is to be any sort of transparent relationship between these tendencies and the way one actually acts. But given that a degree of belief just is a strength of tendency to act, this means that one cannot have two different degrees of belief in the same proposition. The proponent of two kinds of degrees of belief might offer a number of responses here. (1) She might deny that there is a transparent relationship between tendencies to act and the way one actually acts. So, in the case of Fido, one might have both a strong tendency to act as if Fido is dangerous and a weak tendency, and these interact so as to make one behave in particular ways in particular situations (ways that we would like to describe as indicating that one has a mid-strength tendency to act as if Fido is dangerous—although on the current proposal, we cannot straightforwardly say this). But for this view to get off the ground, we would need to be told how exactly degrees of belief of the two sorts combine to produce certain behaviour; furthermore, the view threatens to make it impossible for us ever to know (even roughly) someone’s degree(s) of belief in a given proposition. 
(2) She might say that although there are indeed two kinds of degrees of belief, they always have the same strength, for every proposition. But clearly this would run us headlong into the problem discussed above, that partial beliefs arising from vagueness
do not and should not behave in the same ways as partial beliefs arising from uncertainty. (3) She might deny that degrees of belief are to be understood in terms of strength of tendency to act. But any view which disconnects degree of belief from tendency to act threatens to undermine the utility of the notion of degree of belief, and furthermore any candidate replacement proposal—for example, the view that the difference between believing more firmly and believing less firmly is a matter of strength of feeling²⁷—would seem to face the very same problem (one cannot have two different intensities of feeling about one proposition). (4) She might claim that one never has both kinds of degree of belief in the same proposition at the same time. For suppose, for reductio, that you have an uncertainty-related degree of belief of 0.3 that Dobbin wins the race and a vagueness-related degree of belief of 0.5 that Dobbin wins the race. How could you have acquired both these beliefs? In order to acquire the first, you would need to lack evidence concerning who wins. In order to acquire the second, you would need to have all the relevant evidence, and see that it—i.e. the world itself—leaves it unsettled who wins.²⁸ So clearly you could not have both these degrees of belief at once. There are still problems for this view, however. First, we need to be told how to reason with several propositions—and compounds thereof—in some of which we have degrees of belief of one type, and in others of which we have degrees of belief of the other type. Second, what justifies saying that we have here two non-interacting systems of degrees of belief, rather than one system, which assigns degrees to all propositions, but where these degrees behave differently in different situations (e.g. sometimes they obey the laws of probability, sometimes they do not)?
²⁷ This is the view with which Ramsey contrasts his own view, in the discussion quoted earlier.
²⁸ I am imagining a case where due to the vagueness of the boundaries of horses, two horses are equally good candidates for having crossed the line first. In practice this would no doubt be deemed a tie, but imagine that we are examining very high-resolution pictures of the finish, and that we are interested not in the practical question of distributing winnings, but purely in the question of which horse in fact crossed the line first.
This is the third possibility regarding the relationship between vagueness-based degrees of belief and uncertainty-based degrees of belief: the suggestion that what we have is one univocal notion of degree of belief—one single system of assignments of degrees of belief to propositions—but where the degrees assigned sometimes behave in accordance
with the laws of probability, and sometimes do not. This is the sort of view I shall advocate in the next section.²⁹
²⁹ Apart from my own view, another view which fits the description just given is that of Field (2000). Field supposes that an agent has a probability function P over propositions; he supposes also that the language includes a determinately operator D; and he then proposes that the agent’s degree of belief Q(α) in any proposition α is given by Q(α) = P(Dα). Thus my degree of belief that α is my subjective probability that determinately α. It may sound, then, as though we do have two different systems of degrees of belief: P-values and Q-values. But Field says that only Q-values are to be thought of as degrees of belief: ‘‘P should be thought of as simply a fictitious auxiliary used for obtaining Q’’ (p. 16); ‘‘P [should] not be taken seriously: except where it coincides with Q, it plays no role in describing the idealized agent’’ (p. 19). Field’s proposal, however, is of no use to us in our project of exploring how degrees of belief arising from degrees of truth are related to degrees of belief arising from uncertainty: the proposal does not employ degrees of truth, and it is hard to see how to add them. I also have some other worries about Field’s proposal. One worry concerns the appearance of a primitive determinately operator within the contents of beliefs. Another worry concerns the downgrading of P: I think Field takes this too far. In my proposal (§5.3.2), subjective probabilities do play an important role in describing an agent, but they are not to be identified with degrees of belief. Field, on the other hand, seems to be in the grip of the view that if subjective probabilities are allowed into the picture at all (as anything beyond fictitious auxiliaries), then they will automatically grab the mantle ‘degrees of belief’. A third worry concerns the consequence of Field’s view (see his p. 17) that Q(α ∨ ¬α) is always 1: I think (and my view will capture this idea) that if Sally has vagueness-based degrees of belief of 0.5 that Tom is tall and of 0.5 that Tom is not tall, then she should also have a degree of belief of 0.5 that Tom is tall or not tall (cf. §5.5 below).
5.3.2 Subjective Probabilities, Degrees of Truth, and Degrees of Belief
The picture I propose has three components: (1) an agent’s epistemic state; (2) the degrees of truth of propositions; and (3) an agent’s degrees of belief in propositions. The agent’s epistemic state is a subjective matter. For any proposition, its degree of truth is an objective matter—determined by the world. The agent’s degree of belief in a proposition is a resultant of her subjective epistemic state and the objective degree of truth of the proposition. Now for the details.
(1) I take an agent’s epistemic state to be (represented by) a probability measure over the space of possible worlds. So, where W is the set of possible worlds, the agent’s epistemic state P is a function which assigns a real number between 0 and 1 inclusive to each subset of W. Intuitively, the measure assigned to a set S of worlds indicates how likely the agent thinks it is that the actual world is one of the worlds in S. Given this understanding of P—together with the convention that assigning a set of worlds measure 1 means that you are absolutely certain the actual world
is in that set, and assigning a set of worlds measure 0 means that you are absolutely certain the actual world is not in that set—the three probability axioms are well-motivated:
P1. For every set A ⊂ W, P(A) ≥ 0
P2. P(A ∪ B) = P(A) + P(B) provided A ∩ B = ∅
P3. P(W) = 1
(2) At each possible world, each proposition has a particular degree of truth. Thus we may regard each proposition S as determining a function S′ : W → [0, 1], i.e. the function which assigns to each world w ∈ W the degree of truth of S at w.³⁰ The relationships between the functions associated with various propositions will be constrained in familiar ways by the logical relationships between these propositions: thus, for example, (S ∨ T)′(w) = max{S′(w), T′(w)}, (S ∧ T)′(w) = min{S′(w), T′(w)}, and (¬S)′(w) = 1 − S′(w).
(3) We have a measure over worlds (the agent’s epistemic state P) and functions from worlds to real numbers (each proposition S). Thus S is a random variable, and I propose that we identify the agent’s degree of belief in S with her expectation (aka expected value) of S.³¹ To get a feel for the proposal, consider the case where there are finitely many possible worlds. One’s probability measure—which assigns a probability to each set of worlds—is in this case determined, via the additivity axiom P2, by the values assigned to singleton sets: P({w1, . . . , wi}) = P({w1}) + . . . + P({wi}). So, one assigns each world a degree of likelihood: a number indicating how likely one thinks it is that that world is the actual world. Each world w itself assigns each proposition S a degree of truth S(w). Now, my degree of belief in S is my expectation for S, i.e. my expected value for S’s degree of truth. Let us denote this E(S). In this finite case, it can be calculated thus, where w1 . . . wn are all the possible worlds:
E(S) = P({w1}) · S(w1) + . . . + P({wn}) · S(wn)
³⁰ For the sake of simplicity of presentation, I may sometimes conflate S and S′, i.e. write of a proposition as being a function from worlds to degrees, rather than as determining such a function.
³¹ Further to n. 30: there can be two distinct random variables which have the same value at every point in the sample space. Of course, considered as functions they will be identical—but I mean that they could still be considered as distinct random variables which happen to coincide. For example, on a given sample space it may turn out that the height of each person in millimetres is the same as her bank balance in dollars.
This is analogous to the calculation of expected utility in decision theory (with worlds playing the role of outcomes of acts, and degree of truth playing the role of utility of outcomes). The proposal meshes perfectly with the intuitive idea of one’s degree of belief that S as a measure of the strength of one’s tendency to act as if S.³²
Consider a simple example. There are three ‘open worlds’ w1, w2, and w3—i.e. three worlds such that one is not certain that one is not in them—i.e. P({w1, w2, w3}) = 1. Suppose that S is the proposition ‘A tall person will win the race’. You don’t know who will win, but you do know that it is either the first man in our Sorites series leading from tall men to short men (this is the situation in w1), or the last man (this is the situation in w2), or the man in the middle (this is the situation in w3). You think that each of these three possibilities is equally likely, i.e. P({w1}) = P({w2}) = P({w3}) = 1/3. In w1, S is 1 true; in w2, S is 0 true; in w3, S is 0.5 true. So your expectation that S is 1/3 · 1 + 1/3 · 0 + 1/3 · 0.5 = 0.5. This seems to be a true measure of the strength of your tendency to act as if S. Suppose you need a tall man for your basketball team, and you have a choice between signing up the race winner (whomever that should turn out to be), or Bill (whom you know to be of the same height as the first man in our Sorites series—hence ‘Bill is tall’ is 1 true, and you know this, and so your expectation of this proposition is 1), or Ben (whom you know to be of the same height as the last man in our Sorites series—hence ‘Ben is tall’ is 0 true, and you know this, and so your expectation of this proposition is 0), or Bob (whom you know to be of the same height as the man in the middle of our Sorites series—hence ‘Bob is tall’ is 0.5 true, and you know this, and so your expectation of this proposition is 0.5).
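The finite-case calculation in the race example can be checked mechanically. The following is a minimal sketch, assuming a finite set of worlds represented as dictionary keys; the helper name `expectation` and the variable names are hypothetical, and exact fractions are used so that the thirds do not introduce rounding error.

```python
from fractions import Fraction


def expectation(prob, truth):
    """E(S) = P({w1}) * S(w1) + ... + P({wn}) * S(wn) over finitely many worlds."""
    return sum(prob[w] * truth[w] for w in prob)


third = Fraction(1, 3)
# Three equally likely open worlds: the first, last, or middle man wins.
prob = {"w1": third, "w2": third, "w3": third}

# Degree of truth of 'A tall person will win the race' in each world.
race_winner_tall = {"w1": Fraction(1), "w2": Fraction(0), "w3": Fraction(1, 2)}

# Bill, Ben and Bob have a known height, so the degree is the same in every world.
bill_tall = {w: Fraction(1) for w in prob}
ben_tall = {w: Fraction(0) for w in prob}
bob_tall = {w: Fraction(1, 2) for w in prob}

for name, truth in [
    ("race winner", race_winner_tall),
    ("Bill", bill_tall),
    ("Ben", ben_tall),
    ("Bob", bob_tall),
]:
    print(name, expectation(prob, truth))
```

On these assumptions the race winner and Bob receive the same expectation, with Bill above them and Ben below.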
It seems to me that you would sooner sign up the race winner than Ben, sooner sign up Bill than the race winner, and be indifferent between signing up the race winner and Bob. Thus, the strength of your tendency to act as if S mirrors your expectation of S.³³
³² It is important to note that I am not claiming that two persons who have the same degree of belief that S will behave in the same ways (or have the same tendencies to behave in certain ways). I am claiming that they will have the same tendency to act as if S—and whether a person’s behaving in a certain way constitutes her acting as if S depends on her preferences (desires, utilities) and on her other beliefs. For example, let S be the proposition that there is an especially fragrant rose in Bob’s garden. For a rose-fancier, approaching Bob’s garden might constitute acting as if S, whereas for a person with an aversion to roses—or a rose-fancier with false beliefs about the location of Bob’s garden—moving away from Bob’s garden might constitute acting as if S.
³³ I am making the assumption here that your preferences regarding team members can be summed up thus: ‘The taller the better.’ If, on the other hand, you wanted only very tall players—so you are just as averse to signing up a borderline tall person as to signing up a short person—then signing up P would not constitute acting as if P is tall; rather, it would constitute acting as if P is very tall (recall n. 32 above). In that case, in the situation described—where your expectation that Bob is tall is 0.5, and your expectation that Bob is very tall is 0—you would have no tendency to sign up Bob. Thanks to Wlodek Rabinowicz for helpful comments here.
The proposal also has the desired feature that sometimes degrees of belief behave like probability assignments, and sometimes not. Before showing this, I shall generalize the picture presented above. For so far we have considered only the special case where we have finitely many possible worlds, but of course we cannot, in general, suppose that there are only finitely many possible worlds—indeed, we cannot suppose that there are only countably many. But if there are uncountably many possible worlds, then (i) we cannot assume that the agent’s probability measure is defined on all subsets of the space of possible worlds,³⁴ and (ii) we cannot assume that every proposition determines a measurable function from worlds to truth values, i.e. a random variable. We shall handle this situation in the standard way. In regards to point (i), we suppose there to be a family F of subsets of the space W of all possible worlds which is a σ-field, i.e. it satisfies the conditions:
1. W ∈ F.
2. For all A ∈ F, W \ A ∈ F.
3. For any countable number of sets A1, A2, . . . in F, $\bigcup_{n} A_n \in F$.³⁵
Our probability measure will be defined on F, i.e. it will assign probabilities to sets in F, and not to other subsets of W; the sets in F will be called the measurable sets of possible worlds.³⁶ In regards to point (ii), for a function S from worlds to the reals to be measurable, i.e. a random variable, it must satisfy the condition that for any real x, {w ∈ W : S(w) ≤ x} ∈ F. If such a function is bounded, it will have a well-defined expectation E(S). All propositions are functions from worlds to [0, 1], and hence bounded. As for the condition that they be measurable, we henceforth restrict our attention to propositions which meet it. This means that we consider only propositions S such that it makes sense to ask, ‘‘How likely do you take it to be that this proposition has a truth value within such-and-such limits?’’
³⁴ In fact, we might not want to assume this anyway—see objection 10 in §5.3.3.
³⁵ Note that by the De Morgan laws, we could equivalently replace union with intersection in condition 3.
³⁶ Once we have made this alteration to our setup, it is standard also to change axiom P2 so that it applies not just to unions of two sets, but to unions of countably many sets: i.e. for any countable collection $\{A_n\}$ of pairwise disjoint sets, $P\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} P(A_n)$.
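For a finite space of worlds, the three σ-field conditions can be tested directly. The sketch below is illustrative and rests on an explicit assumption: because the family is finite, closure under pairwise unions already yields closure under countable unions, so only pairs are checked. The function name `is_sigma_field` is hypothetical.

```python
from itertools import combinations


def is_sigma_field(worlds, family):
    """Check the three sigma-field conditions for a finite family of subsets
    of a finite set of worlds (pairwise unions suffice in the finite case)."""
    W = frozenset(worlds)
    F = {frozenset(A) for A in family}
    if W not in F:                          # condition 1: W is in F
        return False
    if any(W - A not in F for A in F):      # condition 2: closed under complement
        return False
    # condition 3, finite case: closed under union
    return all(A | B in F for A, B in combinations(F, 2))


W = {1, 2, 3}
coarse = [set(), {1}, {2, 3}, {1, 2, 3}]    # a sigma-field on W
broken = [set(), {1}, {1, 2, 3}]            # lacks the complement of {1}

print(is_sigma_field(W, coarse))
print(is_sigma_field(W, broken))
```

The family `coarse` treats worlds 2 and 3 as indistinguishable, which mirrors the idea that a measure need not be defined on every subset of W.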
With the general picture now in place, we can make the following definitions:
Definition (vagueness-free situation). An agent is in a vagueness-free situation (VFS) with respect to a proposition S iff there is a measure 1 set T of worlds (i.e. a set T such that P(T) = 1) such that S(w) = 1 or S(w) = 0 for every w ∈ T. (That is, the agent may not know for sure whether S is true or false, but she does absolutely rule out the possibility that S has an intermediate degree of truth: for she is certain that the actual world is somewhere in the class T, and everywhere in T, S is either 1 true or 0 true.)³⁷ An agent is in a VFS with respect to a set Γ of propositions if she is in a VFS with respect to each of the propositions in Γ.
Definition (uncertainty-free situation). An agent is in an uncertainty-free situation (UFS) with respect to a proposition S iff there is a measure 1 set T of worlds and a k ∈ [0, 1] such that S(w) = k for every w ∈ T. (That is, it is totally ruled out that S has a degree of truth other than k: for the agent is certain that the actual world is somewhere in the class T, and everywhere in T, S is k true.)³⁸ An agent is in a UFS with respect to a set Γ of propositions if she is in a UFS with respect to each of the propositions in Γ.
We can now establish four results which show when degrees of belief behave like probability assignments and when they do not.³⁹
Proposition (degrees of belief equal probabilities in VFSs). If an agent is in a VFS with respect to S, then E(S) = P({w : S(w) = 1}).
Proposition (degrees of belief equal degrees of truth in UFSs). If an agent is in a UFS with respect to S, then E(S) equals the degree of truth which the agent is certain S has.
³⁷ Note that one is in a vagueness-free situation with respect to S, in my sense, if one is subjectively certain that S does not have an intermediate degree of truth. Of course, it might still be the case that in fact S does have an intermediate degree of truth. So a ‘vagueness-free situation’ might better be called a ‘perceived vagueness-free situation’. I shall continue to use the former term, however, for the sake of brevity. Thanks to Peter Milne here.
³⁸ If there is such a k, it is unique. A world in which S is k true is not a world in which S is m true for any m ≠ k. So if there are k and m with k ≠ m such that there is a measure 1 set T of worlds with S(w) = k for every w ∈ T and a measure 1 set U of worlds with S(w) = m for every w ∈ U, then T and U are disjoint, and so P(T ∪ U) = 1 + 1 = 2 by P2, violating P3.
³⁹ The proofs of the following four propositions are straightforward and are left as exercises for the interested reader.
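The two definitions and the first two propositions can be illustrated on a finite model. This is a sketch under stated assumptions rather than a proof: with finitely many worlds, a measure 1 set with the required property exists exactly when every world of positive probability has that property, so the checks below quantify over those worlds only. All function and variable names are hypothetical.

```python
from fractions import Fraction


def expectation(prob, truth):
    """E(S) as the probability-weighted sum of degrees of truth."""
    return sum(prob[w] * truth[w] for w in prob)


def open_worlds(prob):
    """Worlds the agent has not ruled out (positive probability)."""
    return [w for w in prob if prob[w] > 0]


def is_vfs(prob, truth):
    """Vagueness-free: S is 1 true or 0 true at every open world."""
    return all(truth[w] in (0, 1) for w in open_worlds(prob))


def is_ufs(prob, truth):
    """Uncertainty-free: S has one and the same degree k at every open world."""
    return len({truth[w] for w in open_worlds(prob)}) == 1


# World c is ruled out (probability 0), so its degrees of truth do not matter.
prob = {"a": Fraction(1, 4), "b": Fraction(3, 4), "c": Fraction(0)}
crisp = {"a": Fraction(1), "b": Fraction(0), "c": Fraction(1, 2)}
fixed = {"a": Fraction(3, 10), "b": Fraction(3, 10), "c": Fraction(1)}

# VFS case: E(S) should equal P({w : S(w) = 1}).
prob_fully_true = sum(prob[w] for w in prob if crisp[w] == 1)
print(is_vfs(prob, crisp), expectation(prob, crisp), prob_fully_true)

# UFS case: E(S) should equal the degree the agent is certain S has.
print(is_ufs(prob, fixed), expectation(prob, fixed))
```

The ruled-out world `c` is included to show that intermediate or deviant degrees of truth outside the measure 1 set leave the expectation untouched.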
Proposition (degrees of belief behave like probabilities in VFSs). Let Γ be a class of wfs, closed under the operations of forming wfs using our standard propositional connectives ∨, ∧, and ¬, such that one is in a VFS with respect to Γ.⁴⁰ Then one’s degrees of belief (i.e. expectations) of wfs in Γ behave like probabilities, in the sense that they satisfy the following three conditions:
1. For all wfs γ ∈ Γ, 0 ≤ E(γ) ≤ 1.
2. For all tautologies γ ∈ Γ, E(γ) = 1.⁴¹
3. If γ1 ∈ Γ and γ2 ∈ Γ are mutually exclusive, then E(γ1 ∨ γ2) = E(γ1) + E(γ2).⁴²
Proposition (degrees of belief behave like degrees of truth in UFSs). Let Γ be a class of wfs, closed under the operations of forming wfs using our standard propositional connectives ∨, ∧, and ¬, such that one is in a UFS with respect to Γ.⁴³ Then one’s degrees of belief (i.e. expectations) of wfs in Γ behave like degrees of truth, in the sense that they satisfy the following three conditions:
1. E(¬γ) = 1 − E(γ).
2. E(γ1 ∨ γ2) = max{E(γ1), E(γ2)}.
3. E(γ1 ∧ γ2) = min{E(γ1), E(γ2)}.
⁴⁰ The closure requirement is no restriction, because if one is in a VFS with respect to a class of wfs, then one is in a VFS with respect to the closure of that class under the operations of forming wfs using our standard propositional connectives: this follows because whenever the component wfs are 1 true or 0 true, so are the compounds.
⁴¹ There are several possible definitions of ‘tautology’ in fuzzy logic. All we need for the proof is something they all agree on, viz. that a tautology never gets the value 0.
⁴² There are several possible definitions of ‘mutually exclusive’ in fuzzy logic. All we need for the proof is something they all agree on: viz. that two mutually exclusive propositions never both get the value 1.
⁴³ Again, the closure requirement is no restriction, because if one is in a UFS with respect to a class of wfs, then one is in a UFS with respect to the closure of that class under the operations of forming wfs using our standard propositional connectives: if one is certain that S is m true and that T is n true, then one is certain that S ∨ T is max{m, n} true, that S ∧ T is min{m, n} true, and that ¬S is 1 − m true.
Summing up the proposal: an agent’s degrees of belief are the resultant of two things: the agent’s subjective uncertainty about which way the actual world is (represented by a probability measure over the space of all possible worlds—or at least over a σ-field of subsets of this space—with the measure assigned to a set of worlds specifying how likely the agent
thinks it is that the actual world is in that set) and the objective facts about how true each proposition is in each world. Specifically, the agent’s degree of belief in a proposition is the agent’s expected value for its degree of truth: roughly, the average of its truth in all the worlds the agent has not ruled out, weighted according to how likely the agent thinks it is that each of those worlds is the actual one. In some situations, the agent will have ruled out vagueness: she may not know which world is actual, but she is certain that in the actual world, some propositions of interest are either fully true or fully false. In such situations, her degrees of belief will behave just like probabilities (propositions 1 and 3). In other situations, the agent will be free of uncertainty with respect to some propositions of interest: she is certain of exactly how true they are in the actual world. In such situations, her degrees of belief will behave just like degrees of truth (propositions 2 and 4). In situations which are neither vagueness-free nor uncertainty-free—that is, where the agent is unsure of the truth values of some propositions of interest, and cannot rule out vagueness, that is, cannot rule out that they might have intermediate degrees of truth—her degrees of belief in those propositions need not behave like probabilities or degrees of truth. (In situations which are both uncertainty-free and vagueness-free—that is, the agent knows of each of the propositions in question that it is 1 true, or that it is 0 true—degrees of belief behave both like probabilities and like degrees of truth. This is possible because the behaviours of probabilities and degrees of truth coincide in this special case.) In all cases, I maintain that an agent’s expectation of a proposition S’s degree of truth is an accurate measure of her tendency to act as if S, and this is why I identify degrees of beliefs with expectations. 
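The three regimes described in this summary can be compared side by side on a finite model. The sketch below is illustrative only: it assumes three equally likely worlds, the max, min, and 1 − x rules for the connectives, and hypothetical helper names. It checks additivity in a vagueness-free case, the max and min rules in an uncertainty-free case, and shows a mixed case in which the expectation of S ∨ ¬S is neither 1 nor the maximum of the two expectations.

```python
from fractions import Fraction


def expectation(prob, truth):
    """E(S) as the probability-weighted sum of degrees of truth."""
    return sum(prob[w] * truth[w] for w in prob)


def neg(s):
    return {w: 1 - v for w, v in s.items()}


def disj(s, t):
    return {w: max(s[w], t[w]) for w in s}


def conj(s, t):
    return {w: min(s[w], t[w]) for w in s}


third, half, quarter = Fraction(1, 3), Fraction(1, 2), Fraction(1, 4)
prob = {"w1": third, "w2": third, "w3": third}

# Vagueness-free: two crisp, mutually exclusive propositions.
a = {"w1": Fraction(1), "w2": Fraction(0), "w3": Fraction(0)}
b = {"w1": Fraction(0), "w2": Fraction(1), "w3": Fraction(0)}
vfs_additive = expectation(prob, disj(a, b)) == expectation(prob, a) + expectation(prob, b)

# Uncertainty-free: each proposition has the same degree in every world.
c = {w: half for w in prob}
d = {w: quarter for w in prob}
ufs_max = expectation(prob, disj(c, d)) == max(expectation(prob, c), expectation(prob, d))
ufs_min = expectation(prob, conj(c, d)) == min(expectation(prob, c), expectation(prob, d))

# Mixed: the race proposition, uncertain and possibly of intermediate truth.
s = {"w1": Fraction(1), "w2": Fraction(0), "w3": half}
mixed = expectation(prob, disj(s, neg(s)))

print(vfs_additive, ufs_max, ufs_min, mixed)
```

In the mixed case the disjunction is fully true in the two crisp worlds and half true in the borderline world, which is why its expectation falls strictly between the probabilistic and the fuzzy answers.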
My proposal contrasts with views such as the following:
Let our degrees of belief be represented by a probability measure, P, on a standard Borel space (Ω, F, P), where Ω is a set, F is a sigma-field of measurable subsets of Ω, and P is a probability measure on F. (Skyrms 1984, 53)
[By a reasonable initial credence function C] I meant, in part, that C was to be a probability distribution over (at least) the space whose points are possible worlds and whose regions (sets of worlds) are propositions. C is a non-negative, normalized, finitely additive measure defined on all propositions. (Lewis 1986 [1980] 87–8)
The crucial difference between these views and mine is that they equate an agent’s degrees of belief directly with her subjective probabilities.⁴⁴ My view, on the other hand, countenances the subjective probability measure—it models the agent’s epistemic state—but regards degrees of belief as resultants of this state and degrees of truth. In the sort of cases Skyrms and Lewis were considering, in which bivalence was assumed, this difference makes no difference (propositions 1 and 3). However, if we want to add degrees of truth to the mix, then we will run into all sorts of problems if we have already identified degrees of belief with subjective probabilities—for, as we saw at the outset, degrees of truth also give rise to degrees of belief, but these degrees of belief do not behave like probabilities. On the other hand, if we identify degree of belief with expectation of truth even in the bivalent case, then we can generalize smoothly to the case of degrees of truth.
5.3.3 Objections and Replies
I now consider some objections to my proposal. First I consider objections concerned with part (3): the identification of the agent’s degree of belief that S with her expectation that S. Then I consider objections concerned with part (1): the identification of the agent’s epistemic state with a probability measure over possible worlds. (Objections to part (2)—the idea that propositions have degrees of truth at worlds—are being dealt with throughout this and the next chapter.)
(1) One’s expectation that S is not an accurate measure of one’s tendency to behave as if S. Suppose I know that a certain orangey-red autumn leaf is red to degree 0.5. Suppose also that I need a perfectly red leaf. Then I will have no tendency whatsoever to reach for this leaf, even though my expectation that it is red is 0.5. Reply: The problem here is the presence of the word ‘perfectly’.
Of course, if I need a perfectly red leaf, then I will have no tendency whatsoever to reach for the orangey-red one. But this is quite compatible with the foregoing account, because my expectation that the leaf is perfectly red, i.e. red to degree 1, is 0. On the other hand, my expectation that it is red is 0.5; and if I need a red leaf, then I think I would have some tendency to reach for this one: less than for a perfectly red leaf, but more than for a green one. ⁴⁴ I take ‘credence’ to be a synonym for ‘degree of belief ’.
(2) Suppose that Jim is tall to degree 0.5, fat to degree 0.5, and bald to degree 0.5, while his workmate Tim is tall to degree 1, fat to degree 0.5, and bald to degree 0.5—and we know all this. Suppose that we are to award a prize to any man in the office who is tall, fat, and bald. On the fuzzy account, both ‘Jim is tall, fat, and bald’ and ‘Tim is tall, fat, and bald’ are true to degree 0.5, and we know this; so on my account, our degrees of belief in ‘Jim is tall, fat, and bald’ and in ‘Tim is tall, fat, and bald’ are the same—both 0.5. Yet we would sooner give a prize to Tim than to Jim. So one’s expectation of the truth value of P is not an accurate measure of the strength of one’s tendency to act as if P.⁴⁵ Reply: I agree that we would sooner give a prize to Tim than to Jim. I deny that what is driving our preference for Tim over Jim here is our degrees of belief in the conjunctions ‘Jim is tall, fat, and bald’ and ‘Tim is tall, fat, and bald’. I think what drives our preference is the following: with respect to fatness, Tim and Jim are a tie; with respect to baldness, they are a tie; but with respect to tallness, Tim is ahead. So tallness breaks the tie, and makes us prefer Tim. But this is quite different from saying that our degree of belief in the conjunction ‘Tim is tall, fat, and bald’ is higher than our degree of belief in the conjunction ‘Jim is tall, fat, and bald’—something which I deny. (3) If your degrees of belief do not conform to the probability calculus, then you are subject to Dutch book, i.e. you are irrational. Reply: One should not bet at all on a proposition S unless one is in a vagueness-free situation with respect to S; if one does bet in a non-VFS, then it is for that reason alone that one is irrational. Suppose you are not in a VFS with respect to S. Suppose, first, that you know that S is k true, for some k ∈ (0, 1); say k = 0.5 for the sake of argument. Then you should not bet on S. 
For to bet is to agree to an arrangement whereby you get such-and-such if S turns out to be the case. But you already know what is the case—and you know that it is, in the nature of things, indeterminate whether S —hence indeterminate whether you get your payoff. Knowing all this, you should not bet in the first place. Second, suppose that you do not know whether S is true—and you cannot rule out that S has an intermediate degree of truth. In this case again you should not bet, because for all you know, the bet will not—for the sort of reason just seen—be able to be decided. Of course, if there is in place some system for deciding ⁴⁵ Based on objections from Roy Sorensen and Dorothy Edgington (in discussion).
bets on S when S has an intermediate degree of truth—say an umpire who rules one way or the other, or a rule that S will be deemed 1 true if it is more than 0.5 true—then one may enter into a betting arrangement on S. However, in such a case the situation has, in effect, been turned into a VFS, by changing S’s intermediate degrees of truth in some non-ruled-out worlds into 1’s or 0’s.⁴⁶
(4) Continuing the previous objection: Some writers have claimed that “The cunning bettor is simply a dramatic device—the Dutch book a striking corollary—to emphasize the underlying issue of coherence” (Skyrms 1984, 22). The idea is meant to be that one is internally incoherent if one’s degrees of belief do not conform to the probability calculus: the Dutch book idea simply serves to bring this incoherence into the open in a striking way; but even if one is not subject to Dutch book for some reason (e.g. because betting has been made illegal, and this law is enforced absolutely), one is still internally incoherent. Reply: Why is one supposed to be incoherent in such a case? Well, here’s a way of bringing it out. Suppose I think A is 50 per cent likely to occur (in 50 per cent of futures compatible with the present, A occurs); I think B is 50 per cent likely to occur (in 50 per cent of futures compatible with the present, B occurs); I think A and B are incompatible (in no future do A and B both occur); and yet I think ‘A or B’ is not 100 per cent likely to occur—i.e. I think that in (say) 50 per cent, rather than 100 per cent, of futures compatible with the present, ‘A or B’ will be true. When framed in this way in terms of sizes of sets of possible futures, this combination of beliefs is obviously incoherent.
But my view endorses this assessment: in the situation envisaged, the agent is in a VFS (she does not know whether or not A or B will occur, but she assumes neither of them will sort-of occur), and so will not have these ⁴⁶ My comments about not betting in non-VFSs are concerned with standard bets—i.e. bets which do not specify what is to happen (who gets what) when the proposition in question is neither true nor false. Milne 2007 discusses a new type of betting arrangement, tailor-made for vagueness, on which one could legitimately bet in a non-VFS. The basic idea (although this is not the way Milne expresses it) is that if one bets on S, and S is n true, then one receives n times the stake (so in the special case where S is 1 true, one receives all the stake—so one’s net winnings are the stake minus what one put into the stake; and where S is 0 true, one receives none of the stake). Of course, this complements rather than conflicts with my comments above (Milne was not suggesting otherwise). I say that one should not accept an ordinary bet if one thinks that vagueness may be present—for when vagueness is involved, there is no way of deciding such a bet. This does not mean that one should not accept a new kind of bet—one designed precisely to avoid the problem faced by ordinary bets when vagueness is present, by explicitly building in a decision procedure which works even when the proposition on which one is betting has an intermediate degree of truth.
degrees of belief, on my view. On the other hand, I do not think that, in itself, the following combination of degrees of belief is incoherent, even supposing the agent knows that A and B cannot both be fully true: A : 0.5,
B : 0.5,
A or B : 0.5
It all depends on how these degrees of belief arise. If you are in a VFS and have these degrees of belief, then you are indeed incoherent—as can be brought out either by Dutch book reasoning or by reflections on sizes of sets of possibilities. But degrees of belief might arise in other ways—not just as a result of uncertainty; and when they do, this sort of combination can be perfectly reasonable. For example, suppose that A is the proposition that a certain leaf is red, and B is the proposition that it is orange; then A and B cannot both be fully true. Suppose also that the leaf in question is right in the middle of a Sorites series leading from red things to orange things. Then, I submit, the above combination of degrees of belief is perfectly reasonable: intuitively it is just fine, and neither the Dutch book nor the ‘sizes of sets of possibilities’ rationales can get a grip to show that there is something wrong with it. Dutch book reasoning does not get started because I will not bet (there is nothing to bet on—no outcome to wait and see about: I already have all the information about the leaf’s colour before me). Similarly, the ‘sizes of sets of possibilities’ reasoning does not get started, because there is nothing I am uncertain about. (5) An objection arising from my response to the previous two: I claim that in non-VFSs, we have degrees of belief while not being prepared to bet (at all). The objection is that we cannot make sense of the idea of degree of belief except in terms of fair betting quotients or odds.⁴⁷ Reply: We make sense of degree of belief in S in terms of strength of tendency to act as if S, and ‘acting as if S’ can be made sense of more generally than in terms of ‘betting on S’. 
After all, betting is essentially tied up with ignorance or uncertainty—betting gets its life from the fact that we do not know, with certainty, what the outcome will be—but, I have argued, the idea of degree of belief gets a grip in circumstances in which there is no uncertainty at all. Consider again the autumn leaf which is borderline red–orange. You have some tendency to act as if it is red, as discussed ⁴⁷ Based on objections from Andy Egan and Josh Parsons (in discussion).
earlier. But with the leaf in plain sight, you would not accept a bet that it is red at any price: for we can all see quite plainly that the leaf is neither clearly red nor clearly non-red, and so we can see at the outset that the bet will misfire.⁴⁸
(6) Another objection in the ballpark of the previous three: Barnett (2000) claims that some attitudes cannot be beliefs if they do not satisfy the standard laws, i.e. the probability axioms. That is, it is constitutive of beliefs that they behave like probabilities. Reply: What is constitutive of belief is the idea that a belief that S is a tendency to act as if S in appropriate circumstances. It turns out—i.e. it is a fact about beliefs, but not constitutive of them—that in certain situations—viz. VFSs—beliefs do behave like probabilities. But in other situations they do not—and yet they still count as beliefs, because of their connection with tendencies to act.
(7) If my degree of belief measures my tendency to act as if S, how can it be that I might have the same degree of belief that S in two situations, and yet behave very differently in those situations? Say my degree of belief in S is 0.5 because I am uncertain whether S is 1 true or 0 true. Then I might bet on S, if the price and prize are right. But suppose my degree of belief in S is 0.5 because I am certain that S is 0.5 true. Then, for the reasons discussed above, I will not bet on S, no matter what the price and prize. So the same state—a degree of belief of 0.5 in S—leads to different actions. How can this be? Reply: These different actions are the results not of a single belief, but of complexes of beliefs, which are different in the two situations (cf. n. 32 above).
A 0.5-degree belief that S combined with the belief that, whatever further evidence comes in, I will not alter my degree of belief in S, leads to refusing to bet; a 0.5-degree belief that S combined with the belief that further evidence might come in leading me to believe to degree 1 that S, and that further evidence might come in leading me to believe to degree 0 that S, leads to accepting certain bets. What I do want to maintain is that a degree 0.5 belief that S equates to a certain tendency to act as if S, no matter what the source of this degree of belief. Thus I maintain that if, in a given situation, you have a degree 0.5 belief that Bill ⁴⁸ Those who feel strongly that where there are degrees of belief there must be betting quotients can find comfort in the kind of betting arrangement discussed in Milne 2007 (see n. 46 above). Milne shows that the fair betting quotient a rational agent assigns to a bet of his kind on A perfectly matches the agent’s degree of belief in my sense that A, i.e. her expectation of A’s degree of truth.
is bald (because you are in a VFS with respect to this proposition, and are uncertain whether Bill is 1 bald or 0 bald), and a degree 0.5 belief that Ben is bald (because you are in a UFS with respect to this proposition, and are certain that Ben is a dead-central borderline case of baldness), then your tendency to treat Bill as bald will be exactly the same as your tendency to treat Ben as bald. So, for example, if you need to sign up a bald man for a door-to-door sales campaign, you will be indifferent between signing up Bill and Ben.⁴⁹ Now we consider some objections to the idea that the agent’s epistemic state is modelled by a probability measure over possible worlds. First two objections to the possible worlds part; then an objection to the probability measure part. (8) There are agents for whom it is epistemically possible that Hesperus is not Phosphorus, and yet there are no possible worlds in which Hesperus is not Phosphorus. Reply: The agent assigns positive probability to worlds in which there is one heavenly body which is visible both in the evening and in the morning, and positive probability to worlds in which there are two heavenly bodies, one of which is visible in the evening and one of which is visible in the morning. This captures the phenomenon; it simply cannot be described in terms of the agent assigning a positive degree of belief to both ‘Hesperus is Phosphorus’ and ‘Hesperus is not Phosphorus’.⁵⁰ (9) It might be that I know everything about the objective state of the world—i.e. I assign measure 1 to a set containing just one possible world—and yet there are still things I do not know, for example, what time it is, where I am, or who I am. This is impossible on my view: once I assign measure 1 to a set containing just one possible world, there can be nothing left to find out. Reply: If we take seriously these Perry-type worries, we can replace the measure over worlds with a measure over centred worlds (Lewis 1979; Chalmers 2001). 
(10) It is absurd to suppose that an agent’s epistemic state is sufficiently determinate to be modelled by a probability measure over worlds, which assigns a unique real number to every set of worlds. Reply: There are two ⁴⁹ If you need a degree-1 bald man (or a perfectly bald man, or a very bald man), then of course you will not be indifferent. Cf. n. 33 and objection 1. ⁵⁰ Cf. Kripke 1980, 102 ff.
parts to this worry: the generality of the measure (it assigns a number to every set of worlds) and the precision of the measure (it assigns a unique real number to each set of worlds). Regarding generality, we have not in fact said that the probability measure assigns a number to every set of worlds: only to every set of worlds in F. So if we take this worry seriously, we can simply suppose F to be rather coarse; the upshot would be that the agent has degrees of belief in fewer propositions. Regarding precision, if we take this worry seriously, we can replace the single measure with a family of measures; the upshot would be that the agent does not have a unique degree of belief in each proposition, but rather a family of degrees of belief.
5.3.4 Degrees of Truth and Pragmatics
I have now presented and defended a view of the relationship between subjective probabilities, degrees of truth, and degrees of belief; it is time to return to the case of Stalnakerian pragmatics, to see how it might work when we admit degrees of truth. We suppose that each conversationalist is in an epistemic state given by a probability measure over possible worlds. As in Stalnaker’s picture, each conversationalist has a set of presuppositions: but we now suppose presupposition to be a matter of degree. Agents may have intermediate degrees of belief (i.e. expectations) in propositions. The way in which one expresses an intermediate degree of belief is via an unconfident or hesitant utterance. We suppose that there are degrees of confidence of utterances—distinguished by response time, tone of voice, presence of hedging words, and so on—corresponding to the degrees of truth. An agent expresses that her degree of belief in S is n by uttering S with degree of confidence n. But if one can believe and assert propositions to intermediate degrees, one must be able to presuppose them to intermediate degrees too.
So, for example, an agent might presuppose ‘Snow is white’ to degree 1 and ‘Bob is bald’ to degree 0.5. That is, she might fully take for granted that snow is white, but only partially take for granted that Bob is bald—that is, she might, for the purposes of a conversation, assume the attitude towards ‘Bob is bald’ that one would take to this proposition if one knew that Bob was a dead-central borderline case of baldness. The agent’s set of propositions with associated degrees of presupposition determines a set of worlds, that is, a context set: the set of all worlds in which each proposition is true to the degree to which it is presupposed. So in the
above example, the set will contain worlds in which ‘Snow is white’ is true to degree 1 and ‘Bob is bald’ is true to degree 0.5. The sense in which the agent presupposes these propositions is cashed out thus: we suppose that the agent assigns measure 1 to the corresponding context set. In a nondefective context, each agent has the same set of presuppositions—that is, they presuppose the same propositions to the same degrees—that is, they all assign measure 1 to the context set determined by these propositions and degrees.⁵¹ An agent may have a degree of belief of (say) 0.5 that S because she is uncertain as to S’s truth value or because she is certain that S’s truth value is 0.5. In both cases she will express her belief via a 0.5-confident utterance of S. However, only in the latter case will this utterance count as an assertion. Recall Stalnaker’s key idea that an assertion puts forward a set of worlds; if the assertion is accepted, the context set is adjusted by striking out worlds not in the asserted set. Now note that a 0.5-confident utterance born of uncertainty does not specify a set of worlds. If I express that I do not know what S’s truth value is, but that the weighted average of the degrees of truth which, for all I know, it might have is 0.5, then I do not rule out any worlds: for all I have said, we still might be in a world where S has any degree of truth whatsoever. On the other hand, a 0.5-confident utterance born of certainty that S is 0.5 true does specify a set of worlds: the set of worlds in which S is 0.5 true. That is why unconfident utterance of S counts as an assertion only if the utterer is in a UFS with respect to S. But now recall Stalnaker’s basic idea that a conversation consists of assertions, where the purpose of an assertion is to narrow the context set—and where all participants presuppose that this is the purpose of speaking in a conversation. 
Given this, if someone makes an n-confident utterance that S in a conversation, we should interpret her as making an assertion—i.e. as being in a UFS with respect to S, and claiming that the actual world is one in which S is n true. If this assertion is not challenged, how will the context be updated? Well, each conversationalist will conditionalize her probability measure on ⁵¹ This still leaves plenty of room for the conversationalists to be in different epistemic states, i.e. to have different probability measures—it’s just that there is one set of worlds which all their measures must assign 1. Note also that, just as in Stalnaker’s picture a conversationalist need not really believe what she presupposes for the purposes of a conversation, so too in this generalization of his picture a conversationalist may for the purposes of a conversation adopt a probability measure which does not represent her true epistemic state.
the set Sn of worlds in which S is n true. That is, she will update her old probability measure P to the new probability measure P′ given by:⁵²

\[ P'(T) = P(T / S_n) = \frac{P(T \cap S_n)}{P(S_n)} \]

This guarantees that where C is the old context set (so P(C) = 1), P′(C ∩ Sn) = 1—that is, the old context set intersected with the newly asserted set of worlds becomes the new context set; that is, it gets measure 1. To conclude §5.3: I have not (of course) shown one by one that degrees of truth are compatible with every important development that has been made in areas of philosophy of language, beyond the study of vagueness, where bivalence has hitherto been taken for granted. However, given the level of interoperability we have now demonstrated between degrees of truth and other important parts of our conceptual repertoire—logical, semantic, set-theoretic, and now probabilistic machinery—we can certainly see that the degree-theoretic approach to vagueness is very far from being a gated community in theory space.⁵³
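The update on an unchallenged n-confident assertion can be sketched as follows. This is a minimal illustration under stated assumptions: the four worlds, their probabilities, the degrees of truth of S, and the names `prob`, `conditionalize`, `old_P`, and `context_set` are all hypothetical, and the measure is represented simply as a dictionary over finitely many worlds.

```python
from fractions import Fraction

def prob(measure, worlds):
    """Probability that `measure` assigns to a set of worlds."""
    return sum(measure[w] for w in worlds)

def conditionalize(measure, s_n):
    """Return the new measure P'(.) = P(. / S_n). Worlds outside S_n get
    probability 0; worlds inside are rescaled by 1 / P(S_n)."""
    p_sn = prob(measure, s_n)
    if p_sn == 0:
        # The case set aside in n. 52: the update is undefined.
        raise ValueError("P(S_n) = 0, so conditionalization is undefined")
    return {w: (p / p_sn if w in s_n else Fraction(0)) for w, p in measure.items()}

# Illustrative degrees of truth of S at four worlds.
truth_of_S = {
    "w1": Fraction(1, 2),
    "w2": Fraction(1),
    "w3": Fraction(1, 2),
    "w4": Fraction(0),
}

# Illustrative old measure P; the old context set C has measure 1.
old_P = {
    "w1": Fraction(1, 4),
    "w2": Fraction(1, 2),
    "w3": Fraction(1, 4),
    "w4": Fraction(0),
}
context_set = {"w1", "w2", "w3"}

# A 1/2-confident assertion of S puts forward the worlds where S is 1/2 true.
n = Fraction(1, 2)
S_n = {w for w, degree in truth_of_S.items() if degree == n}

new_P = conditionalize(old_P, S_n)
new_context_set = context_set & S_n
print(new_P)
print(prob(new_P, new_context_set))
```

Here the worlds in which S is 1/2 true are w1 and w3, so the new measure gives each of them probability 1/2 and the intersected context set receives measure 1, as the passage states.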
5.4 Truth and Assertibility
As discussed in the previous section, once we have degrees of truth in the picture, we need to countenance degrees of (confidence of) assertion, indicated by tone of voice, by length of pause before speaking, by pace of utterance, by presence or absence of hedging phrases such as ‘sort of’, ‘somewhat’, ‘to a certain extent’, and so on. There still remains the question of assertibility. We have multiplied the number of available speech acts: rather than simply making an assertion, one can now, for any degree n of confidence, make an assertion with that degree of confidence (a degree-n assertion, or n-assertion).⁵⁴ But there is still the issue of under
⁵² This will not be defined if P(Sn) = 0, but that is not a problem here, because a conversationalist should not assert that the world is a way which it was already presupposed not to be.
⁵³ Many thanks to Agustín Rayo, for putting to me both the general challenge of showing that degree-theoretic treatments of vagueness can mesh with developments in philosophy of language outside the study of vagueness, and the case of Stalnakerian pragmatics as a particular instance of this challenge.
⁵⁴ We have taken the degrees of assertion to be represented by the real interval [0, 1], so that they correspond to the degrees of truth. In practice, of course, we cannot discriminate speech acts this finely—just as we cannot, in practice, determine the degrees of truth of utterances as finely as the apparatus of fuzzy semantics allows.
what conditions a given one of these speech acts is acceptable (appropriate, correct, warranted). In the case where we have only one speech act of assertion, we want an account of the conditions under which an assertion of a sentence S is acceptable—or in other words, of the conditions under which S is assertible. Now we need, for each degree of assertion, an account of the conditions under which an assertion of that degree of confidence of a sentence S is acceptable—or in other words, for each degree n of assertion, we need an account of the conditions under which S is n-assertible. One crucial point to note is that while truth and assertion are now matters of degree, the notions of assertibility or acceptability are not: they are pass/fail notions, not graded ones. In the classical case, where we have only one speech act of assertion (rather than degrees of assertion), a sentence is assertible (in a given context) if one can assert it without prompting a (legitimate) challenge from one’s conversational partners—that is, if, precisely, one’s assertion passes or is accepted. If it does not pass then, whether the challenge is mild or vehement, still, one’s assertion has not passed —it has not been accepted —and so (assuming the challenge is legitimate) the sentence that one asserted is not assertible (in that context). This pass/fail character of assertibility carries over to the case where we have a question of assertibility for each degree of assertion. We now want to know when a sentence is n-assertible, for each n: but for each n, n-assertibility is a pass/fail notion, not a graded one. 
For a given sentence, degree of confidence of assertion n, and context, either it would be acceptable to assert that sentence with that degree of confidence in that context—that is, this speech act should pass unchallenged—or it would not be.⁵⁵ With the point clarified that assertion, but not assertibility, is a matter of degree, I now offer my account of when a sentence S is n-assertible. Note that I am concerned here only with the relationship between truth and assertibility: my aim is to come up with a generalization of the classical idea that S is assertible when true to the context of degrees of truth and degrees of assertion. Assertion is also subject to many other norms: one’s ⁵⁵ One might think that in the classical case, assertibility should be regarded as a graded notion. For what proportion of conversationalists have to object, and how strongly, before an utterance counts as not having been accepted? Perhaps there is something to this thought, although I shall not pursue it here. The essential point remains that introducing degrees of assertion does not in itself provide a reason for thinking of assertibility as a graded notion. If there are reasons why we should think of assertibility in this way, they apply equally to the classical case (where we have just one act of assertion) and to the degrees of truth case (where we have one act of assertion for each degree of truth).
assertions should not be irrelevant, or rude, or badly expressed, and so on. I have nothing in particular to say about these other norms, but I do of course assume that they are still in play, alongside the generalized truth norm discussed here.⁵⁶ The generalization is in fact straightforward: S is n-assertible when true to degree n. That is, the degree of confidence which is appropriate in an assertion of S is the one that corresponds to S’s degree of truth. So if S is 0.5 true, it will be 0.5-assertible. This does not mean that an (ordinary, unhesitant) assertion of S would be ‘half right’—we have already said that the assessment of assertions is not graded (i.e. there are no ‘half rights’); it is pass/fail. Rather, it means that assertion of S with degree of confidence 0.5 would be right, while assertion of S with any other degree of confidence would be wrong. (As noted in n. 54, however, we cannot in general distinguish degrees of confidence of utterance, or degrees of truth of sentences, extremely finely, and so in practice the norm governing assertion may as well be that an assertion is acceptable if its degree of confidence roughly corresponds to the degree of truth of its content.)⁵⁷ With this account of assertibility in hand, we are in a position to respond to a challenge to degree theories posed by Wright. I argued in §3.5.1 that Closeness gives us tolerance intuitions without incoherence, and in Chapter 4 that we need degrees of truth in order to accommodate Closeness (without Tolerance). Wright, however, has presented an argument to the effect that degree-based approaches are of no help in avoiding the conclusion that vague predicates are tolerant, and hence incoherent:
faced with a situation and a predicate, we have only two choices—to apply or to withhold. . . . The crucial notion to be mastered for practical purposes is thus
⁵⁶ It is not in dispute that assertion is subject to many norms (although sometimes terms other than ‘norm’ are used—e.g.
‘rules’ or ‘warranted assertibility conditions’). What is in dispute in the literature is whether one norm is primary—in the sense that it is uniquely characteristic of assertion to be subject to this norm, and that the other norms to which assertion is subject are a product of this primary norm and more general rules not specific to assertion—and if so, which norm it is. For example, Williamson (2000, ch. 11) argues against the idea that the truth norm ‘assert p only if p is true’ is primary in this sense, and in favour of the idea that the knowledge norm ‘assert p only if one knows p’ is primary. We need not enter this debate here. In particular, in discussing the truth norm, I do not mean to be claiming (or denying) that it is the primary norm of assertion. ⁵⁷ In §5.2, I said that a sentence is ‘assertion grade’ or ‘true enough to assert safely’ if its degree of truth is greater than or equal to 0.5. This does not mean that if a sentence S has a degree of truth of 0.5 or greater, then an (ordinary, unhesitant) assertion of S is acceptable. Rather, the idea is that a sentence is ‘assertion grade’ or ‘true enough to assert safely’ if the level of confidence appropriate in an utterance of the sentence is at least as high as the level of confidence appropriate in an utterance of its negation.
that of a situation to which the application of F is on balance justified. Without mastery of this notion, no amount of information about the structure of variations in the degree with which F applies entails how the predicate is to be used. Now of this notion may it not still be a feature that it always survives sufficiently small changes?—that if a and b are dissimilar only to some very small extent, then if describing a as F is on balance justified, so is thus describing b? . . . The introduction of a complex structure of degrees . . . has got us no farther; for among these we have still to distinguish those with which for practical purposes the application of the predicate is to be associated; otherwise we have not in repudiating bivalence done anything to replace the old connection between justified assertion and truth. (Wright 1975, 350)⁵⁸
In response, the first point to make is that, faced with a situation and a predicate, we have many more options than simply to apply or to withhold the predicate. We can apply the predicate with varying degrees of confidence or hesitation—i.e. where F is the predicate and a the object, we can assert Fa with varying degrees of confidence. Second, we “replace the old connection between justified assertion and truth” as indicated above: the appropriate (or in Wright’s terms, justified) degree of assertion is the one corresponding to the degree of truth of the sentence in question. And now we can see that tolerance does not threaten. For if a and b are dissimilar only to some very small extent, it does not follow that the degree of assertion appropriate for Fb is the same as the degree of assertion appropriate for Fa: it follows only that these degrees are very similar.
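The norm that S is n-assertible when true to degree n, and the point that small differences between objects yield only small differences in appropriate confidence, can be sketched as follows. The sketch rests on assumptions that are not in the text: a hypothetical 101-member Sorites series in which the degree of truth of Fa falls linearly from 1 to 0, and a hypothetical `margin` parameter standing in for the remark that in practice confidence need only roughly match degree of truth.

```python
from fractions import Fraction

def is_n_assertible(degree_of_truth, confidence, margin=Fraction(0)):
    """Pass/fail verdict: an assertion with the given confidence is acceptable
    when it matches the sentence's degree of truth (exactly by default, or to
    within `margin` for the looser in-practice norm)."""
    return abs(degree_of_truth - confidence) <= margin

# Hypothetical Sorites series a_0, ..., a_100: the degree of truth of 'F a_i'
# is assumed to fall linearly from 1 to 0.
steps = 100
truth_of_F = [1 - Fraction(i, steps) for i in range(steps + 1)]

# Under the truth norm, the appropriate confidence for 'F a_i' is its degree of truth.
appropriate_confidence = list(truth_of_F)

# Differences in appropriate confidence between neighbours in the series.
gaps = [abs(a - b) for a, b in zip(appropriate_confidence, appropriate_confidence[1:])]

print(max(gaps))                                          # largest neighbour-to-neighbour gap
print(appropriate_confidence[0], appropriate_confidence[-1])  # the two ends of the series
print(is_n_assertible(truth_of_F[50], Fraction(1, 2)))    # hesitant assertion of the midpoint case
print(is_n_assertible(truth_of_F[50], Fraction(1)))       # unhesitant assertion of the midpoint case
```

Neighbouring items never differ by more than 1/100 in appropriate confidence, yet the appropriate confidence is 1 at one end of the series and 0 at the other. Each individual verdict from `is_n_assertible` remains pass/fail, which matches the claim that assertion is graded while assertibility is not.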
⁵⁸ Cf. also Dummett 1978.

5.5 Truth-Functionality

In §2.4 (pp. 85 ff.), I discussed the objection to recursive many-valued views that their interpretation of the sentential connectives as truth functions does not cohere with ordinary usage of compound sentences about borderline cases, and/or with intuitions about the truth of such sentences. This objection has been pushed strongly against the fuzzy view of vagueness; Williamson (2003, 694) calls it a ‘‘dark cloud over fuzzy logic’’. The objection proceeds by way of alleged counterexamples: a particular sentence is given, and then it is argued that this sentence obviously has a particular truth value (or assertibility status), whereas on the fuzzy account it has some other truth value (or some truth value which is not compatible with its having this assertibility status—e.g. it is unassertible, but the fuzzy theory gives it a non-zero degree of truth).

One serious problem with all discussions of this issue in the literature is that none of them is based on proper data about ordinary usage. Some authors (myself included) have put informal questionnaires to undergraduate students and non-philosophers; others have simply consulted their own intuitions. No one has done an adequate study of a sort which would meet minimal standards for empirical work in linguistics or psychology.⁵⁹ The lack of reliable data will therefore need to be factored in to our discussion.

The sentences we are interested in include the following, where p is a colour sample midway between clear red and clear orange:⁶⁰

1. p is red.
2. p is not red.
3. p is orange.
4. (a) p is red or p is not red.
   (b) p is or isn’t red.
   (c) p is either red, or it isn’t.
   (d) Either p is red, or it is not red.
   ⋮
5. (a) p is red or p is orange.
   (b) p is red or orange.
   (c) p is either red or orange.
   (d) Either p is red, or it’s orange.
   ⋮
6. (a) It is not the case that p is red or that p is not red.
   (b) p is neither red, nor not red.
   (c) p’s neither red, nor not.
   ⋮
7. (a) p is red and p is not red.
   (b) p is and isn’t red.
   (c) p is red, and not red.
   (d) p is both red and not red.
   ⋮
8. (a) p is red and p is orange.
   (b) p is red and orange.
   (c) p is both red and orange.
   ⋮
9. (a) It is not the case that p is red and p is not red.
   (b) It is not the case that p is and isn’t red.
   (c) It is not the case that p is red, and not red.
   (d) It is not the case that p is both red and not red.
   ⋮
10. (a) It is not the case that p is red, but nor is it the case that p is not red.
    (b) p isn’t red, but it isn’t not red either.
    (c) p’s not red, but neither is it not red.
    ⋮
⋮

⁵⁹ Bonini et al. 1999 have undertaken psychological tests of the attitudes of speakers to certain sentences involving vague predicates, but while their study may be unobjectionable from a methodological point of view, its content is inadequate, for they did not test attitudes to crucial compound sentences, such as ‘Fa or not Fa’ and ‘Fa and not Fa’, where a is borderline F. They simply assert, without testing this claim, that ‘x is red and x is not red’ seems clearly false when x is a borderline case of ‘red’—an intuition I challenged in §2.4.
⁶⁰ Of course we are not specifically interested in colours; in general we are interested in sentences about borderline cases. Conditionals will be discussed in §5.5.1.

If we had proper empirical data, we would know, for each of these sentences and various contexts of utterance, what proportion of ordinary speakers find the sentence assertible with a given degree of confidence in that context. We do not have this data—but we do have some reported sets of intuitions about some of these sentences, some of which have been thought to conflict with the fuzzy view of vagueness. I shall consider the cases that have been thought to cause a problem for the fuzzy view. It will become clear that there is no basis for thinking that the truth-functional degree theorist has a problem in this area. Often we will be able to adapt strategies that have been employed by other theorists of vagueness for explaining away apparent conflicts between the truth assignments that their theories make to certain sentences and ordinary usage of those sentences.
Before proceeding to cases, it will be useful to set up some apparatus. First, I shall frame my discussion in terms of a picture according to which when we utter a sentence, we say or state a wf relative to an interpretation (i.e. the intended interpretation). Thus the content expressed by a sentence in a context is an interpreted wf.⁶¹ I adopt this approach because it employs only the model-theoretic apparatus that I have taken as a common framework for discussing theories of vagueness—but one who prefers to think in terms of propositions (under one conception or another) could easily recast the discussion in those terms: all that matters in what follows is that we distinguish sentences from their contents. The terminology that I shall employ is this: by uttering a sentence, we say or state its content; to say that a sentence expresses a content in a context is to say that to utter the sentence in that context would be to state that content; it is sentences (not contents) that are said to be assertible to such-and-such degree in contexts. In general, I shall assume, in accordance with the discussion in §5.4, that—norms of assertion other than the truth norm to one side—a sentence is assertible in a context to a degree corresponding to the degree of truth of the content that it expresses in that context.

Second, a challenge to a truth-functional degree theory might ultimately be based on considerations of assertibility (e.g. someone might have the intuition that a certain sentence is obviously assertible in some context, or survey data might show that no ordinary speaker will assert a certain sentence, even hesitantly, in some context), or on considerations of truth (e.g. someone might have the intuition that a certain sentence is obviously true). The assertibility challenge takes this form:

1. Sentence S is n-assertible in context C.
2. Given the fuzzy theory, S expresses the wf p in C.⁶²
3. p is not n true.⁶³
4. S is assertible in C to the degree that the content it expresses in C is true.
5. So given the fuzzy theory, S is not n-assertible in C.

⁶¹ For more details on this kind of framework, see Smith 2006, §2.
⁶² I will often, as here, suppress talk of the intended interpretation, for the sake of increased readability, but it should be kept in mind that a wf by itself is not a content: a wf plus an interpretation is a content of a sentence. Which interpretation of a given wf is supposed to be the intended one will always be obvious or immaterial, where not explicitly mentioned.
⁶³ That is, on the interpretation which is intended relative to C.
The truth challenge takes this form:

1. In C, sentence S is n true (i.e. S expresses a content in C which is n true).
2. Given the fuzzy theory, S expresses the wf p in C.
3. p is not n true.
4. So, given the fuzzy theory, S expresses a content in C which is not n true.

In face of a challenge of either sort, there are two basic types of response available: contextualism and a warranted assertibility manoeuvre (WAM).⁶⁴

The contextualist response is this: p is indeed the obvious, ‘surface’ reading of S, and in many contexts S does express p; however, in context C, S in fact expresses q, and q (unlike p) is n true. Thus, whether in response to an assertibility challenge or a truth challenge, the contextualist denies the second premiss.

The WAM is this: S does express p in C, but in C, the truth norm governing assertion is overridden by other norms of assertion—that is, in C, S’s assertibility does not go simply by the truth of its content. If we consider other factors affecting assertibility—apart from truth—we can see why S is n-assertible in C, even though the content it expresses in C is not n true. How exactly this general line of response will play out depends upon whether it is being deployed in response to an assertibility challenge or a truth challenge. As a response to an assertibility challenge, it consists in denying the fourth step—that is, while it accepts that the fourth step accurately states the contribution that truth makes to assertibility, the response reminds us that other factors can play a role in assertibility, resulting in a situation in which our ultimate judgement concerning the warranted assertibility of a sentence does not match its truth.
As a response to a truth challenge, the WAM consists in denying the first step, via the claim that the objector is confusing warranted assertibility with truth: S is indeed n-assertible in C, and this is why the objector thinks that it is n true; but in fact it is not n true, because in this case factors other than truth affect our ultimate judgement that S is n-assertible.

⁶⁴ The latter term is due to DeRose 1999; see also DeRose 2002.

There has been debate in the literature concerning the relative merits of contextualism and WAMs, both as general philosophical strategies and in relation to particular issues—for example, in epistemology (see e.g. the papers cited in n. 64). We do not need to enter this debate here. The way I shall proceed is as follows. Given a sentence S which is intuitively n true and/or n-assertible in a context C, and where the wf p—which, it seems, the fuzzy theory must regard as the content expressed by S in C—is m true (m ≠ n), I will content myself with finding a wf q which is n true, and which can be regarded as a plausible reading of S in C—i.e. which is such that it is plausible to say that ordinary speakers hear S as q in C. I shall leave open the further issue of whether to spell out this idea that speakers read or hear S as q in C via contextualism or a WAM. The contextualist approach would be to say that in C, S actually expresses q. The WAM would be to say that while S expresses p in C, its warranted assertibility condition in C is such that it is assertible to the degree that q (not p) is true. I think that in some of the cases to be discussed, a contextualist approach is more plausible, and in others a WAM is more plausible. However, as long as at least one approach is plausible in each case, it is unnecessary for present purposes—i.e. defending the fuzzy view against the truth-functionality objection—to pursue the question of which approach is more plausible in each case.

So, to the cases. Consider sentence (8a): ‘‘p is red and p is orange.’’ According to Williamson’s intuitions, (8a) is clearly incorrect.⁶⁵ According to my observations, speakers generally hedge over (8a) just as they do over (1) (‘‘p is red’’) and (3) (‘‘p is orange’’), when p is borderline red/orange (recall §2.4).⁶⁶ Given these conflicting views, I think we must accept that proper empirical studies might show that almost all speakers find (8a) to be 0-assertible, or that almost all speakers find (8a) to be 0.5-assertible, or that significant numbers of speakers go each way (in a given context), or that all speakers go the same way in each context—but different ways in different contexts.
Yet, whatever happens, there is no prospect of a problem here for the truth-functional degree theorist—for there is a plausible reading of (8a) which is 0.5 true, and another plausible reading which is 0 true. The former is simply the surface reading Rp ∧ Op. In order to state the latter reading, we need to add some symbols to our formal language. Let us use angle brackets as quotation marks—so where A is a wf, ⟨A⟩ is a singular term whose intended referent is the wf A. Let us use T₁ as a one-place predicate which we can read as ‘is true to degree 1’. Thus, where A is a wf, T₁⟨A⟩ is a wf which is 1 true if A is 1 true, and 0 true if A is true to any degree other than 1.⁶⁷ Now the second plausible reading of (8a) is T₁⟨Rp⟩ ∧ T₁⟨Op⟩.⁶⁸ This wf says that both Rp and Op are definitely true—that is, true to degree 1—in the sense that it is true to degree 1 if Rp and Op are each true to degree 1, and it is true to degree 0 otherwise. In the situation under consideration, in which Rp and Op are each 0.5 true, it is 0 true. Thus the fuzzy theorist can say the following about (8a): if the (currently unknown) fact is that almost all speakers find this sentence to be 0-assertible or 0 true (when p is borderline red/orange), then the explanation of this fact is that (in these contexts) almost all speakers hear this sentence as T₁⟨Rp⟩ ∧ T₁⟨Op⟩ (relative to the obvious intended interpretation—recall n. 62); if the fact is that almost all speakers find (8a) to be 0.5-assertible or 0.5 true (i.e. as assertible, or as true, as each of its conjuncts), then the explanation of this fact is that almost all speakers hear this sentence as Rp ∧ Op; if the fact is that significant numbers of speakers go each way, then the explanation of this fact is that significant numbers of speakers hear this sentence each way; and so on. Thus there is no prospect here of a problem for the fuzzy view.

⁶⁵ The actual sentence that Williamson 1994, 136 considers is ‘He is awake and he is asleep’, said of someone drifting off to sleep. It is not clear whether Williamson means ‘unassertible’ or ‘false’ by ‘incorrect’; but, as we have seen, the challenge can be put either way.
⁶⁶ Forbes 1983, 244 has similar intuitions.

In order for this response to be convincing, we need to be sure that each of the proposed readings is plausible (in the contexts in question).
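The two readings of (8a) can be computed directly. The following is a minimal sketch, assuming that fuzzy conjunction is the minimum of the conjuncts' degrees (an assumption that agrees with the values stated above) and modelling the predicate ‘is true to degree 1’ as a function on truth values rather than on names of wfs, which is a simplification made only for illustration. The function names are hypothetical.

```python
def conj(x: float, y: float) -> float:
    """Fuzzy conjunction, assumed here to be min."""
    return min(x, y)


def t1(x: float) -> float:
    """Models 'is true to degree 1': 1 if the wf is 1 true, 0 otherwise."""
    return 1.0 if x == 1.0 else 0.0


Rp = 0.5  # 'p is red', with p borderline red/orange
Op = 0.5  # 'p is orange'

surface_reading = conj(Rp, Op)           # the surface reading of (8a)
definite_reading = conj(t1(Rp), t1(Op))  # the 'true to degree 1' reading

print(surface_reading, definite_reading)
```

With both conjuncts at 0.5, the surface reading comes out 0.5 true and the second reading comes out 0 true, which matches the two assertibility verdicts the text leaves open.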
⁶⁷ More formally: On the syntactic side, where A is a well-formed formula, ⟨A⟩ is a term, and where x is not a well-formed formula, ⟨x⟩ is not a symbol of the language. T₁ is a one-place predicate. On the semantic side, there are no new requirements on interpretations (i.e. the new vocabulary is not new logical vocabulary). An interpretation must assign each term of the form ⟨A⟩ a referent: it is not required that the referent of this term be the wf A—however, in general, interpretations which lack this feature will not be intended. Similarly, an interpretation must assign T₁ an extension: it is not required that the extension of this predicate on an interpretation M assign 1 to those things in the domain which are wfs that are assigned the truth value 1 on M and 0 to all other things in the domain—however, in general, interpretations which lack this feature will not be intended. For further discussion—including of how the semantic paradoxes may be handled within a framework of this kind—see Smith 2006.
⁶⁸ I here apply an idea of Keefe 2000, 164. The idea is taken up and developed in great detail by Weatherson 2004 (cf. also Weatherson 2005, §5), who attributes the idea to Fine 1997 [1975]. Reading Weatherson’s paper was a great help to my thinking about the truth-functionality objection to the fuzzy view. Weatherson presents his discussion in terms of a one-place sentential operator ‘determinately’, rather than in terms of a predicate ‘is definitely true’. My own discussion could also be framed in such terms—or in terms of predicates Pₙ which are 1 true of objects of which P is n true, and 0 true of all other objects; on this approach the second reading of (8a) is R₁p ∧ O₁p.

For example, faced with someone who is adamant that ‘‘p is red and p is orange’’ is 0 true in some context, it would not (on the face of it) be plausible to ‘explain’ this fact by saying that she hears this sentence as saying that snow is green. Now the first reading—the surface reading—is obviously a plausible potential reading of ‘‘p is red and p is orange’’. What about the second? For a start, we need to ward off a potential confusion. In floating an idea along these lines (in the context of a defence of supervaluationism), Keefe (2000, 164) writes of ‘either a is red or not’ and ‘either a is definitely red or definitely not-red’ being confused. That way of putting the point might encourage a mistaken objection. An opponent who thinks that I am saying that speakers might confuse ‘‘p is red and p is orange’’ and ‘‘ ‘p is red’ is definitely true and ‘p is orange’ is definitely true’’—or that they might hear the former as the latter—could well respond that speakers cannot be thought to be so easily confused: surely speakers know when they are hearing a sentence with the word ‘definitely’ in it and when they are hearing one without this word in it? I quite agree. The point should not be put in terms of hearing one sentence as another sentence: the point concerns hearing one sentence in terms of one or another content.⁶⁹

Now, is it plausible to say that someone might, given the right emphasis, tone of voice, or other contextual features, hear ‘‘p is red and p is orange’’ as T₁⟨Rp⟩ ∧ T₁⟨Op⟩—that is (talking in contextualist terms for the sake of simplicity) as making a claim which is true to degree 1 just in case both Rp (the content of ‘‘p is red’’) and Op (the content of ‘‘p is orange’’) are true to degree 1? I think so. After all, this is exactly how a proponent of classical semantics will say that speakers hear ‘‘p is red and p is orange’’. Consider Williamson, who thinks that this case provides an objection to the fuzzy view. He hears ‘‘p is red and p is orange’’ as false (i.e. full-on false: in classical semantics there are no grades of truth or falsity).
This will be because he hears it as saying something which is (full-on) true if both its conjuncts (‘‘p is red’’ and ‘‘p is orange’’) are (full-on) true, and is (full-on) false otherwise. But that is precisely the second reading proposed above! So, to the extent that Williamson’s objection is plausible, so is my reply.

⁶⁹ I am not suggesting that Keefe is confused here—only that her way of putting the point could confuse others.

Let us turn to a second alleged problem case for the fuzzy view, sentence (5a): ‘‘p is red or p is orange.’’ According to Fine’s intuitions, (5a) is true (1997 [1975], 123–4). According to my observations, speakers generally hedge over (5a) just as they do over ‘‘p is red’’ and ‘‘p is orange’’, when p is borderline red/orange (recall §2.4). According to a third set of intuitions/observations, whether or not (5a) is assertible depends upon the context: sometimes we hear it as true; other times it seems false (see Weatherson 2004, §4). Given these conflicting views, I think we must accept that proper empirical studies might show that almost all speakers find (5a) to be 1-assertible, or that almost all speakers find (5a) to be 0-assertible, or that almost all speakers find (5a) to be 0.5-assertible, or that significant numbers of speakers go in each of two or three of these ways (in a given context), or that all speakers go the same way in each context—but different ways in different contexts. Yet, whatever happens, there is no prospect of a problem here for the fuzzy view—for there is a plausible reading of (5a) which is 0.5 true, another plausible reading which is 0 true, and a third plausible reading which is 1 true. The first is the surface reading Rp ∨ Op. The second is T₁⟨Rp⟩ ∨ T₁⟨Op⟩, discussion of which would run parallel to the discussion above of T₁⟨Rp⟩ ∧ T₁⟨Op⟩. What about the third reading? Let us consider some actual contexts in which sentences such as ‘‘p is red or p is orange’’ have been claimed to be (full-on) assertible.

First, Tappenden (1993, 565) presents a case in which you have the job of sorting color samples on an assembly line. The samples come along the line in varying shades of red or orange. No other colors are sent rolling out. You are to drop the orange samples into one bin and the red ones into another. Every so often an indeterminate case comes along and you cannot make up your mind about it, so you set it aside.
Each time you do this, the foreman comes along and says, pointing to the sample, ‘‘That is red or that is orange’’.⁷⁰ Eventually, we may suppose, you get the message and classify all the samples one way or the other, rather than setting some aside. Now the truth-functional degree theorist can explain why we might find ‘‘That is red or that is orange’’ 1-assertible in this context as follows. In this context—which is rather special in a number of ways, one being that each sample must be put into a bin, another being that there are only two bins available—it is very natural to hear the foreman’s utterance of ‘‘That is red or that is orange’’ as a reminder of precisely these two points. But of course the truth-functional degree theorist can happily accept that the claim which we would express with the sentence ‘‘Each sample must be put into the ‘red’ bin or the ‘orange’ bin’’ (or ‘‘Each sample must be put into a bin, and there are only two bins available’’, etc.) is 1 true.

⁷⁰ In Tappenden’s example the foreman waits until you have a pile of samples, and then says ‘‘Every one of these samples is either red or orange’’—but we are looking for a case in which the sentence uttered is of the form ‘‘p is red or p is orange’’, so I adapt the example slightly.

Second, moving (slightly) from sentences of form (5a) to ones of form (5c), Keefe (2000, 163) gives the following example: suppose F, G and H are incompatible and a is on the borderline between being F and being G and is definitely not H: it would then be appropriate and informative to say ‘a is either F or G’. (E.g. to the question ‘is it red?’ asked of a borderline blue–green patch, the reply ‘no, it’s blue or green’ is appropriate.)
Now the truth-functional degree theorist can explain why we might find ‘‘It’s blue or green’’ 1-assertible in this context thus: in this context, it is very natural to hear this as saying that the patch is (definitely) not any colour other than blue or green. Indeed, Keefe goes on to say precisely that ‘a is either F or G’ ‘‘can be highly informative by saying something about the properties of a, in particular, implying that a does not have those properties that are incompatible with both F and G (e.g. that it is blue or green and not red)’’ (p. 164).⁷¹ Once again, then, the truth-functional degree theorist can provide a reading of the sentence at issue which is both plausible in the context and has a degree of truth which matches the claimed assertibility status of that sentence. The fact that ‘a is either F or G’ is 1-assertible in this kind of context does not show that the fuzzy theorist is wrong to treat disjunction as a truth function; it shows that ‘a is either F or G’ is not always heard as Fa ∨ Ga (relative to the obvious intended interpretation).

⁷¹ Cf. also Dummett 1997 [1975], 106.

Let us turn to a third alleged problem case for the fuzzy view, sentence (9a): ‘‘It is not the case that p is red and p is not red.’’ Just as I tend to hear ‘‘p is red and p is not red’’ as 0.5-assertible when p is borderline red/orange,⁷² so I tend to hear its negation, that is, (9a), as 0.5-assertible also. But Weatherson (2004, 1) and others claim that many competent speakers find the latter to be (fully) assertible in such situations. The fuzzy theory can easily explain both sets of reactions: those who consider the sentence 0.5-assertible hear it as ¬(Rp ∧ ¬Rp); those who consider it 1-assertible hear it as ¬(T₁⟨Rp⟩ ∧ T₁⟨¬Rp⟩)—or, on the approach mentioned in n. 68, as ¬(R₁p ∧ R₀p)—which is 1 true on the obvious interpretation.

⁷² Forbes (1983, 244) has similar intuitions. Williamson 1994, 136 says that ‘‘Intuitions can be confused by the idiomatic use of contradictions such as ‘He is and he isn’t’ to describe borderline cases.’’ I don’t think I’m being confused by this: I can indeed hear ‘‘He is and he isn’t’’ as saying that he is a borderline case—but then I hear this sentence as fully assertible, whereas I tend to hear ‘‘p is red and p is not red’’ as 0.5 assertible when p is borderline red/orange. In any case, I am not claiming it as an argument for the fuzzy theory that ordinary speakers consider some contradictions to be assertible—my arguments in favour of the fuzzy theory are quite different—and I also accept the possibility that proper empirical work might reveal that ordinary speakers do in fact consider statements such as ‘‘p is red and p is not red’’ to be 0-assertible.

There is an interesting variation on this case which we should consider. Suppose: that the twins Jack and Mack are balding in the same way. Their scalps are in exactly the same state; they are bald to exactly the same degree. However far the process has gone, the claim ‘Jack is bald and Mack isn’t’ . . . is not perfectly balanced between truth and falsity; intuitively, it is false, or at least much closer to falsity than to truth. (Williamson 2003, 693)

A way to respect the thought that ‘Jack is bald and Mack isn’t’ is false would be to say—along now-familiar lines—that this sentence can be read as T₁⟨Bj⟩ ∧ T₁⟨¬Bm⟩. Now consider the negation, ‘It is not the case that Jack is bald and Mack isn’t’. We could explain someone finding this 1-assertible by saying that they hear it as ¬(T₁⟨Bj⟩ ∧ T₁⟨¬Bm⟩). But I think that there is also another way in which one might hear this sentence, which would also lead one to regard it as 1-assertible. Here is a context in which this other reading seems natural. Suppose that for some reason we need to classify each person in a group as either bald or not bald. Jack and Mack are hard cases: we find it difficult to decide which way to classify them. While we are pondering the problem, someone says ‘‘(Well, one thing’s for sure:) It is not the case that Jack is bald and Mack isn’t’’. This sounds fully assertible. The fuzzy theorist can happily accept this, explaining it as follows: in this context we hear this sentence as saying that however we end up classifying them, Jack and Mack must be classified in the same way (because they are identical in respect of baldness)—a claim which, we may suppose, is 1 true.

A variation on the previous case is where we have adjacent objects a and b in the middle of a Sorites series—for example, a is a borderline heap, and b has one less grain than a—and we say ‘‘It is not the case that a is a heap and b is not a heap’’. This is a Sorites premiss, in the sense that one can present the Sorites paradox using negated conjunctions of this form in place of conditionals ‘‘If a is a heap, then b is a heap’’ without affecting how compelling the argument is.⁷³ (The Sorites reasoning then proceeds as follows: a is a heap—as established at the previous stage of the reasoning; so it is not the case that b is not a heap; so b is a heap too.) So in this sort of context, ‘‘It is not the case that a is a heap and b is not a heap’’ compels our assent—i.e. is highly assertible. The truth-functional degree theorist can explain this as follows. We have just seen that ‘‘It is not the case that Jack is bald and Mack isn’t’’ can be heard as saying that, however we end up classifying them, Jack and Mack must be classified in the same way (because they are identical in respect of baldness). Similarly, ‘‘It is not the case that a is a heap and b is not a heap’’ can be heard as saying that a and b must be classified in the same way (because they are so similar in respects relevant to being a heap). That is, the sentence in question can very naturally be heard as an expression of the thought that ‘heap’ conforms to Tolerance. But then my account in §3.5.4 of why the Sorites is compelling can be applied here: we explain why someone is inclined to regard ‘‘It is not the case that a is a heap and b is not a heap’’ as highly assertible in the context of Sorites reasoning by saying that she hears this sentence as saying that a and b must be classified in the same way—and so she is inclined to agree, because her acceptance of Closeness (i.e. her feeling that ‘heap’ is vague) licenses her to accept Tolerance, and hence this application of it.⁷⁴

Let’s consider a fourth kind of case: (10a).
Weatherson (2004, 23) says that the sentence ‘‘It is not the case that Louis is bald, but nor is it the case that he is not bald’’ is ‘‘a legitimate, if slightly long-winded, way to communicate that Louis is a penumbral case of baldness’’. I quite agree that the sentence sounds 1-assertible when heard in this way, but this poses no problem for the fuzzy view: for if we are indeed to regard the sentence as saying that Louis is a borderline case of baldness, we will have to regard it as saying ¬T₁⟨Bl⟩ ∧ ¬T₁⟨¬Bl⟩ (which is 1 true when Bl is 0.5 true), not ¬Bl ∧ ¬¬Bl (which is 0.5 true when Bl is 0.5 true).

⁷³ Indeed, Weatherson 2004, §5.3 thinks the paradox is more compelling in the former form.
⁷⁴ For further discussion, see §5.5.1 below.

The style of my response to the truth-functionality objection has now been well illustrated. Given that a certain sentence is n-assertible, we need to provide a plausible reading of the sentence that is n true. In every case considered we have been able to do this, and while there are other cases that we have not explicitly discussed, I see no prospect of a case that could not be plausibly treated. It should also be noted that we have made no contentious claims about how the results of proper empirical work on ordinary usage will turn out. Our readings of sentences have been of three sorts:

1. The surface reading.
2. Readings involving the predicate ‘is true to degree 1’. These are plausible readings because they simply allow the degree theorist to hold that a given sentence might be heard as having precisely the content (truth conditions) that a classical theorist would regard it as having.⁷⁵
3. Readings invoked in particular sorts of context: for example, hearing a sentence as saying that all samples have to be classified in one of two ways; hearing a sentence as an expression of Tolerance in the context of reasoning about a Sorites series; and so on.

Overall, I have not attempted to paint a systematic picture of how particular features of context influence the content and/or warranted assertibility conditions of sentences with particular syntactic forms. That would be premature: we simply do not have adequate data on which to base theories of this sort. What I have tried to do is dispel the worry that truth-functional degree theories cannot account for ordinary usage of, and/or intuitions about the truth and/or assertibility of, compound sentences about borderline cases. Given the great flexibility of such accounts, illustrated in relation to the cases considered above—in particular, that they are not stuck only with surface readings, as their opponents seem to have assumed—I think we can say that the dark cloud has been dispelled.

Before moving on, I want to consider Edgington’s (1997) criticisms of truth-functionality, which lead to an important point not yet discussed.
⁷⁵ Note here that the classical theorist who thinks that ‘‘p is red or orange’’ says Rp ∨ Op (relative to the obvious classical interpretation) and the fuzzy theorist who thinks that ‘‘p is red or orange’’ says Rp ∨ Op (relative to the obvious fuzzy interpretation) disagree over the content (truth conditions) of this sentence. The fuzzy theorist who wants to agree with the above classical theorist about the content of this sentence will have to claim that it says, say, T₁⟨Rp⟩ ∨ T₁⟨Op⟩ (relative to the obvious fuzzy interpretation). The point to remember is that we are taking contents to be interpreted wfs, not just wfs; so it should not be a surprise that when we move from classical to fuzzy interpretations, we might need to change the wf in order to preserve the content.
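The first two sorts of reading can be tabulated for the disjunction, negated-contradiction, and borderline-case sentences discussed above. This is a sketch only: it assumes that fuzzy disjunction, conjunction, and negation are max, min, and 1 − x (assumptions that agree with every value stated in the text), and it models ‘is true to degree 1’ as a function on truth values. The function and variable names are hypothetical.

```python
def neg(x: float) -> float:
    return 1.0 - x          # assumed fuzzy negation


def conj(x: float, y: float) -> float:
    return min(x, y)        # assumed fuzzy conjunction


def disj(x: float, y: float) -> float:
    return max(x, y)        # assumed fuzzy disjunction


def t1(x: float) -> float:
    """Models 'is true to degree 1'."""
    return 1.0 if x == 1.0 else 0.0


R = 0.5  # 'p is red' (likewise 'Louis is bald'), a borderline case
O = 0.5  # 'p is orange'

readings = {
    # sentence: (surface reading, 'true to degree 1' reading)
    "(5a) p is red or p is orange": (
        disj(R, O),
        disj(t1(R), t1(O)),
    ),
    "(9a) not (p is red and p is not red)": (
        neg(conj(R, neg(R))),
        neg(conj(t1(R), t1(neg(R)))),
    ),
    "(10a) not red, but not not-red either": (
        conj(neg(R), neg(neg(R))),
        conj(neg(t1(R)), neg(t1(neg(R)))),
    ),
}

for sentence, (surface, definite) in readings.items():
    print(f"{sentence}: surface={surface}, degree-1 reading={definite}")
```

The surface reading is 0.5 true in all three cases, while the second reading is 0 true for (5a) and 1 true for (9a) and (10a). The context-specific third sort of reading is not modelled here.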
Edgington advocates a non-truth-functional form of degree theory. In some cases her criticisms of truth-functionality fit the pattern we have seen—a certain sentence is claimed obviously to have some truth value or assertibility status, which is held to be incompatible with the truth value assigned to it by the fuzzy view—but in two cases she offers a further consideration in support of her claims, as follows. Suppose that we have a collection of balls of various sizes and colours, and suppose that the size of a ball is independent of its colour, and vice versa. Let Ra, Rb, and Rc be the statements that balls a, b, and c are red, and Sa, Sb, and Sc be the statements that they are small. Suppose that Ra is 1 true, Sa, Rb, Sb, and Rc are 0.5 true, and Sc is 0 true. According to the fuzzy account, Ra ∧ Sa and Rb ∧ Sb are both 0.5 true. But it is plausible that a is a better case for ‘‘red and small’’ than b: both are borderline in size, and a is clearly red while b is not. ‘‘Bring me a ball which is red and small; if you can’t find a clear case, bring the closest you can find.’’ Would not a—perfectly red and arguably small—be a better choice than b? (Edgington 1997, 304)
Similarly, according to the fuzzy account, Rb ∨ Sb and Rc ∨ Sc are both 0.5 true. But it is plausible that b is a better case of ‘‘red or small’’ than c is. ‘‘Bring me a ball which is either red or small (or whatever comes closest to this specification).’’ Would not b be a better choice than c? Both are borderline on color; b is borderline on size, while c is huge. (Edgington 1997, 304)
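The fuzzy values at issue in Edgington’s two cases can be checked mechanically. What follows is only a minimal sketch, assuming the usual fuzzy clauses on which a conjunction takes the minimum and a disjunction the maximum of the values of its components; the names `conj`, `disj`, `R`, and `S` are illustrative and not part of the formal language of this book.

```python
from fractions import Fraction

# Degrees of truth stipulated in Edgington's example.
half = Fraction(1, 2)
R = {"a": Fraction(1), "b": half, "c": half}   # "ball x is red"
S = {"a": half, "b": half, "c": Fraction(0)}   # "ball x is small"

def conj(x, y):
    """Fuzzy conjunction, assumed here to be the minimum of the two degrees."""
    return min(x, y)

def disj(x, y):
    """Fuzzy disjunction, assumed here to be the maximum of the two degrees."""
    return max(x, y)

# First case: "red and small" for balls a and b.
red_and_small = {ball: conj(R[ball], S[ball]) for ball in ("a", "b")}
# Second case: "red or small" for balls b and c.
red_or_small = {ball: disj(R[ball], S[ball]) for ball in ("b", "c")}

print(red_and_small)
print(red_or_small)
```

On these clauses the two conjunctions receive the same value, as do the two disjunctions, which is exactly the feature of the fuzzy account to which Edgington objects.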
I think Edgington is right that a seems to be a better choice in the first case—but this does not mean that Ra ∧ Sa is more true than Rb ∧ Sb. a is a better choice because one would have to change less about a to make Ra ∧ Sa fully true than one would have to change about b to make Rb ∧ Sb fully true. This does not mean, however, that given a and b as they actually are, Ra ∧ Sa is truer than Rb ∧ Sb. The latter seems quite wrong: if it were the case, then in just the same way it should be the case that if A is true, and B, C, and D are false, then A ∧ B is truer than C ∧ D, which is absurd. We need to distinguish two things: the distance of a sentence from the truth, in the sense of how much would have to change about its subject matter to render the sentence true; and the distance of a sentence from the truth,
in the sense of how far its actual degree of truth is from the maximum truth value. A sentence might be very close to the truth in the first sense, and yet very far from the truth in the second sense.⁷⁶ For example, if Bob is 6 ft 1/64 in., then although we would not have to change Bob much to render the sentence ‘Bob is six feet tall’ true, as things actually stand, the sentence is simply false. Likewise in Edgington’s second case, b is indeed a better choice than c, but this is simply because one would have to change less about b to make Rb ∨ Sb true than one would have to change about c to make Rc ∨ Sc true—and this has no direct implications for the actual truth values of Rb ∨ Sb and Rc ∨ Sc.

5.5.1 The Conditional and the Sorites Paradox

We saw in §5.2 that all classically valid formulae and inferences are fuzzily valid (on my account of fuzzy consequence), and vice versa. It should be noted, however, that while our first-order language is expressively complete with respect to classical ({0, 1}-valued) truth functions, it is not expressively complete with respect to fuzzy ([0, 1]-valued) truth functions. Thus, if we introduced further connectives—ones which cannot be defined in terms of our ∨, ∧, and ¬—we would need to explore the question of consequence afresh. This would only be an issue, however, if further truth functions were needed in order to express our ordinary reasoning with vague concepts—and I do not believe that this is the case. But what about the conditional? Don’t we need the Łukasiewicz conditional (recall p. 68, and see below) in order to represent—and solve—the Sorites paradox within our fuzzy semantic framework? But this conditional is not definable in terms of our ∨, ∧, and ¬. In fact, we do not need the Łukasiewicz conditional. It is ironic that one of the few things that most philosophers have found attractive about the standard fuzzy account is its resolution of the Sorites—when in fact this resolution fails to solve the problem.
I shall now present this resolution—paying attention to its employment of the Łukasiewicz conditional—before showing why the resolution fails. After that I shall indicate the proper way to resolve the Sorites paradox within the fuzzy framework: a way which makes no use of the Łukasiewicz conditional.

⁷⁶ A similar distinction is drawn (independently) by Sainsbury 1986, 97.
Consider a version of the paradox which concerns a series of piles of sand 1 through 10,000, where pile i has i grains of sand, and each pile is of a very similar shape to its neighbour(s). Our Sorites argument looks like this:

1. If pile 10,000 is a heap, then pile 9,999 is a heap.
2. If pile 9,999 is a heap, then pile 9,998 is a heap.
. . .
9,999. If pile 2 is a heap, then pile 1 is a heap.
10,000. Pile 10,000 is a heap.
∴ 10,001. Pile 1 is a heap.

According to the standard fuzzy account, ‘if . . . then . . . ’ here is read as the Łukasiewicz conditional, which has the following truth conditions:

[A → B] = 1 if [A] ≤ [B]
[A → B] = 1 − [A] + [B] otherwise

The argument is valid, according to the standard definition in terms of preservation of designated values, with 1 as the only designated value. So what is wrong with the argument? Well, not all the premisses are 1 true. Premiss 10,000 is 1 true. But look at the conditionals. At first, both antecedent and consequent are 1 true, and so are the conditionals. As we move along the series, we get to a point at which the antecedents are ever so slightly more true than the consequents. In this region, the conditionals are ever so slightly less than 1 true. This continues for a while until both antecedent and consequent are 0 true, and hence the conditionals are 1 true again. The problem with the argument is, then, that not all the premisses are fully true. So why is it compelling? Because all the premisses are very nearly 1 true. We are taken in because the premisses are so nearly true that we think they are fully true. However the minuscule amount of falsity in some of the premisses accumulates as we move along the series: by the time we get to the end of the series, the conclusion is 0 true.⁷⁷

⁷⁷ See e.g. Forbes 1983, 243–4; 1985, 171–2; and Williamson 1994, 123–4.
There is another approach which says that an argument is valid in fuzzy logic just in case there is no interpretation on which the conclusion is less true than the least-true premiss (cf. n. 16 above). On this approach, modus ponens, and the above Sorites argument, come out invalid. See e.g. Machina 1976, 69–75; Williamson 1994, 123–4; and Priest 1998, 332. I side with Williamson, against Machina, in thinking that this approach does not yield a satisfying explanation of why the Sorites argument is compelling (given that, on this approach, it is not valid). There are also other variations, such as the combination in Lakoff 1973 of a Machina-type definition of validity with a non-Łukasiewicz conditional, on which modus ponens comes out valid.
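The standard fuzzy story about the conditional Sorites can be illustrated numerically. The following is a sketch under stated assumptions rather than anything fixed by the account itself: the function `heap_degree`, with its cut-off points at 2,000 and 8,000 grains and its linear rise between them, is a purely hypothetical assignment of degrees of truth to ‘pile i is a heap’, chosen only to make the arithmetic exact.

```python
from fractions import Fraction

N = 10_000  # number of piles in the series

def heap_degree(i):
    """Hypothetical degree of truth of 'pile i is a heap': 0 up to 2,000
    grains, 1 from 8,000 grains, linear in between (an assumption made
    for illustration only)."""
    return max(Fraction(0), min(Fraction(1), Fraction(i - 2000, 6000)))

def lukasiewicz(a, b):
    """Lukasiewicz conditional: 1 if [A] <= [B], otherwise 1 - [A] + [B]."""
    return Fraction(1) if a <= b else 1 - a + b

# The 9,999 conditional premisses 'If pile i is a heap, then pile i - 1
# is a heap', for i running from 10,000 down to 2.
premisses = [lukasiewicz(heap_degree(i), heap_degree(i - 1))
             for i in range(N, 1, -1)]

least_true = min(premisses)                      # degree of the worst premiss
total_shortfall = sum(1 - p for p in premisses)  # accumulated 'falsity'

print(len(premisses), least_true, total_shortfall)
print(heap_degree(N), heap_degree(1))            # premiss 10,000 and the conclusion
```

On this hypothetical assignment every conditional premiss is at least 5999/6000 true, yet the small shortfalls sum to exactly 1, which is the whole distance between the 1 true categorical premiss and the 0 true conclusion.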
That’s a neat story, but it cannot be the correct account of why the Sorites is both compelling and mistaken. For, as Wright points out, on this approach the fuzzy theorist is left without anything to say about the Sorites paradox as formulated in other ways:

Can a degree-theoretic account explain the plausibility of the major premisses? There is no difficulty, of course, with the usual, quantified conditional form of premise. The explanation will claim that each instance, Fa → Fa′, of (∀x)(Fx → Fx′) is almost true: that its consequent enjoys a degree of truth ever so nearly but not quite as great as that of its antecedent. And this claim will then be followed . . . by a stipulation that the degree of truth of any universally quantified statement is the minimum of the degrees of truth enjoyed by its instances. . . . But . . . the major premise doesn’t need to be conditional at all. In the case of the Sorites-series of indiscriminable color patches for instance, we could just as well take it in the form (∀x) −[red(x) & −red(x′)]. All the ways of making the conditional form of major premise seem intuitively plausible would be applicable to this conjunctive form. . . . [the degree theorist] needs to explain . . . with what right such a conjunctive major premise may be regarded as almost true; otherwise he cannot explain its plausibility, or duly acknowledge the force of the arguments which seem to sustain it. (1987, 251–2)
As Wright then goes on to point out, one cannot see how the degree theorist could give an account on which such a conjunctive major premiss is almost true: and in any case, on the standard fuzzy account, such premisses are not almost true. Thus, the standard fuzzy explanation of the plausibility of the conditional formulation of the Sorites paradox does not extend to other formulations—yet clearly the various formulations are just stylistic variants of one fundamental problem. If a solution addresses this problem only when it is formulated in one specific way, then the solution is not really addressing the fundamental problem at all. The apparent merit of the standard fuzzy solution of the Sorites paradox is thus an illusion. Given that I do not want to adopt the standard fuzzy resolution of the Sorites, I have no need for a truth definition for the conditional which renders each premiss ‘If pile i is a heap, then pile i − 1 is a heap’ either totally true or almost totally true. Furthermore, I have a positive reason for rejecting this truth definition. In the spirit of wanting a classical consequence relation, I would also like to retain the equivalence of A → B , ¬A ∨ B ,
and ¬(A ∧ ¬B ), and the usual connection between consequence and the conditional: B is a consequence of A just in case A → B is a tautology. The Łukasiewicz conditional lacks both these properties (relative to the definition of consequence that I proposed above). I therefore propose simply to define the truth conditions for the conditional so that A → B is equivalent to ¬A ∨ B and ¬(A ∧ ¬B ) (the latter two are already equivalent).⁷⁸ It is then easy to show that B is a consequence of A just in case A → B is a tautology.⁷⁹ At this point two tasks remain: (1) to fend off the charge that the Łukasiewicz conditional provides a better reading of the English ‘if . . . then . . . ’ than does the fuzzy material conditional just defined; (2) to explain how we are going to resolve the Sorites paradox, given that we have rejected the standard fuzzy resolution.

(1) It might be thought: doesn’t the standard fuzzy semantics for the conditional provide a better formal rendition of the English ‘if . . . then . . . ’ than the account proposed above, according to which A → B , ¬A ∨ B , and ¬(A ∧ ¬B ) always have the same truth value? For example, consider Bob, a borderline case of ‘bald’, and Bill, who has one less hair than Bob. Let us suppose ‘Bob is bald’ is 0.5 true and ‘Bill is bald’ is 0.51 true. Then ‘If Bob is bald, then Bill is bald’ is 0.51 true, according to my semantics, whereas on the standard fuzzy semantics, it would be 1 true—and isn’t the latter the more intuitive assignment? In fact, this is not at all clear. In saying ‘If Bob is bald, then Bill is bald’ one might mean that if one were to stipulate a sharp boundary for ‘bald’, and it enclosed Bob, then it must enclose Bill also. That is, one might be saying that if Bob is to count as bald (under some precisification of ‘bald’), then Bill is to count as bald too. This claim is certainly something I want to accept—and I can easily accept it.
⁷⁸ Formally, we could do this in two different ways—see p. 28.

⁷⁹ One interesting point of connection between the fuzzy material conditional just defined and the Łukasiewicz conditional is the following. In the standard fuzzy account, the tautology property—the property that a tautology has on every interpretation—is having the value 1, and in the case of the Łukasiewicz conditional, if [A ] ≤ [B ], then [A → B ] = 1. In the account presented here, the tautology property is having a value of at least 0.5, and in the case of the fuzzy material conditional defined above, if [A ] ≤ [B ], then [A → B ] ≥ 0.5 (for if [B ] ≥ 0.5, then [A → B ] = [¬A ∨ B ] ≥ 0.5; while if [B ] < 0.5, then [A ] < 0.5, so [¬A ] > 0.5, so [A → B ] = [¬A ∨ B ] > 0.5).

For when one hears ‘If Bob is bald, then Bill is bald’ as making this claim about boundary stipulation, one is not hearing it in accordance
with its surface reading; i.e. one is not hearing it as saying Bb → Bl (l for ‘Bill’) relative to the obvious intended interpretation. My account assigns this wf the value 0.51 on that interpretation; it assigns the sentence ‘If Bob is bald, then Bill is bald’ the value 0.51 only in so far as this sentence is given its surface reading. So, suppose that I say ‘If Bob is bald, then Bill is bald’, and, as we might put it, I mean just that (I simply mean that if Bob is bald, then Bill is bald). In this case it does not seem that the sentence should definitely be true! Say I am unwrapping my Christmas presents; I get to a longish object and say, ‘If this is a spade, I will use it to dig a vegetable garden’. It turns out to be a two-piece fishing rod, and looking at it, I say again, ‘If this is a spade, I will use it to dig a vegetable garden’. This is an odd thing to say—for we can all clearly see that it is not a spade.⁸⁰ Now suppose you inform me that Bill has one less hair than Bob, and then Bob and Bill are brought in, and I see that Bob is a borderline case for ‘bald’. If I now say ‘If Bob is bald, then Bill is bald’ (and I mean just that), then far from being clearly true, my statement is odd: perhaps not quite as odd as the spade statement—because it is not clearly false that Bob is bald—but quite odd nevertheless. Thus, on this occasion—on which it is plausible to give my sentence its surface reading Bb → Bl —there is no clash between the intuitive assertibility status of the sentence and the truth value assigned to the latter wf by my semantics. I do not think, then, that the Łukasiewicz conditional is a better rendition of ‘if . . . then . . . ’ than the fuzzy material conditional defined above. The sort of example considered above, which is often supposed to show that it is better, does not show this—and other examples show that it is worse. Suppose that ‘Ben is tall’ is 0.51 true. 
Then, using the Łukasiewicz conditional, both ‘If Bob is bald, then Bill is bald’ and ‘If Bob is bald, then Ben is tall’ are true to degree 1. Presumably, however, someone who believes that the former is true to a high degree will not also believe that the latter is true to a high degree. This would indicate that the real source of the intuition that the former is true to a high degree is understanding ‘If Bob is bald, then Bill is bald’ as meaning that if one were to stipulate a sharp boundary for ‘bald’, and it enclosed Bob, then it must enclose Bill also—and this, as noted above, is something I can accommodate. ⁸⁰ Cf. Stalnaker 1999, 94 n. 18: ‘‘an indicative conditional is appropriate only in a context where it is an open question whether the antecedent is true.’’
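The values cited in this discussion can be computed directly. The sketch below assumes that the fuzzy material conditional takes the value of ¬A ∨ B, that is, the maximum of 1 − [A] and [B]; the function and variable names are illustrative only, and the grid check at the end is a finite spot check of the observation made in n. 79, not a proof of it.

```python
from fractions import Fraction

def material(a, b):
    """Fuzzy material conditional: the value of (not-A or B),
    computed as max(1 - [A], [B])."""
    return max(1 - a, b)

def lukasiewicz(a, b):
    """Lukasiewicz conditional: 1 if [A] <= [B], otherwise 1 - [A] + [B]."""
    return Fraction(1) if a <= b else 1 - a + b

# Degrees of truth supposed in the text.
bob_bald = Fraction(50, 100)
bill_bald = Fraction(51, 100)
ben_tall = Fraction(51, 100)

for label, consequent in (("Bill is bald", bill_bald), ("Ben is tall", ben_tall)):
    print("If Bob is bald, then", label, "|",
          "material:", material(bob_bald, consequent),
          "| Lukasiewicz:", lukasiewicz(bob_bald, consequent))

# Spot check: whenever [A] <= [B], the material conditional is at least 0.5 true.
grid = [Fraction(k, 100) for k in range(101)]
at_least_half = all(material(a, b) >= Fraction(1, 2)
                    for a in grid for b in grid if a <= b)
print(at_least_half)
```

Both conditionals are blind to the difference in subject matter between the two consequents: each assigns ‘If Bob is bald, then Ben is tall’ the same value that it assigns ‘If Bob is bald, then Bill is bald’. The difference is that the Łukasiewicz conditional makes both fully true.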
(2) What are we going to say about the Sorites paradox? Recall that I have already laid out a recipe for approaching the Sorites paradox, given a semantics which accommodates predicates satisfying Closeness. I argued in §3.5.4 that if we suppose that a predicate conforms to Closeness, we can see both why a Sorites paradox for this predicate is compelling and also how the paradox is mistaken. The Sorites argument is compelling because when the major premiss (or premisses) expresses Tolerance, the argument is valid, and also the major premiss is plausible (to one who accepts Closeness but not Tolerance) because accepting Closeness licenses one to accept Tolerance as a useful approximation of what one believes, in ordinary situations. The argument is flawed, however, because it leads us to see that situations involving Sorites series are ones in which we cannot happily use Tolerance as an approximation of Closeness, and must work with Closeness itself—and when reformulated so that the major premiss expresses Closeness (but not Tolerance), the argument is invalid.

Applying this strategy to the Sorites argument under consideration in the present section yields the following account. We find the paradox compelling because, first, we naturally hear the conditional premisses ‘If pile i is a heap, then pile i − 1 is a heap’ as expressing (consequences of) the claim that ‘heap’ is Tolerant, and, second, the argument is valid on this reading. The argument is ultimately unconvincing because the argument itself shows us that we are in one of those special situations where we cannot safely employ Tolerance as an approximation of Closeness, but must work with Closeness itself—and when we read the conditional premisses ‘If pile i is a heap, then pile i − 1 is a heap’ as expressing (consequences of) the claim that ‘heap’ conforms to Closeness (but not Tolerance), the argument is not valid.
It will be helpful to state precisely the Tolerance and Closeness readings of the conditional premisses. In order to do so, we need to add some symbols to our formal language. Let us use [A ] as a name of the degree of truth of A .⁸¹

⁸¹ On the syntactic side, where A is a closed wf, [A ] is a term, and where x is not a closed wf, [x] is not a symbol of the language. On the semantic side, an interpretation must assign each term of the form [A ] a referent: it is not required that the referent of this term on an interpretation M be the degree of truth assigned to A on M—however, in general, interpretations which lack this feature will not be intended. Cf. n. 67 above.

Let us also add an identity predicate = and treat it in a
purely classical manner.⁸² Finally, let ≈T denote the relation which holds between truth values that are very close in respect of truth.⁸³ Now, when I say that we naturally hear the conditional premiss ‘If pile i is a heap, then pile i − 1 is a heap’ as expressing (a consequence of) the claim that ‘heap’ is Tolerant, I mean that we hear it as [Hpi ] = [Hpi−1 ]. On this reading, the full argument is:

1. [Hp10,000 ] = [Hp9,999 ]
2. [Hp9,999 ] = [Hp9,998 ]
. . .
9,999. [Hp2 ] = [Hp1 ]
10,000. [Hp10,000 ] = 1
∴ 10,001. [Hp1 ] = 1

This argument is classically valid, and hence valid in the version of fuzzy logic presented in this chapter. When I say that we retreat to reading the conditional premiss ‘If pile i is a heap, then pile i − 1 is a heap’ as expressing (a consequence of) the claim that ‘heap’ conforms to Closeness (but not Tolerance), I mean that we read it as [Hpi ] ≈T [Hpi−1 ]. On this reading, the full argument is:

1. [Hp10,000 ] ≈T [Hp9,999 ]
2. [Hp9,999 ] ≈T [Hp9,998 ]
. . .
9,999. [Hp2 ] ≈T [Hp1 ]
10,000. [Hp10,000 ] = 1
∴ 10,001. [Hp1 ] = 1

This argument is not classically valid, and hence not valid in the version of fuzzy logic presented in this chapter. (We could make a valid argument by adding a premiss saying that ≈T is transitive—but this premiss would be clearly false on interpretations on which ≈T has its intended extension.)

⁸² On the syntactic side, = is a binary predicate. On the semantic side, a = b is 1 true if the referent of a is identical to the referent of b, and 0 true otherwise. This is a new requirement on interpretations—i.e. we treat identity as an item of logical vocabulary. However, the result about our fuzzy consequence relation being identical to the classical consequence relation carries over when we enrich our first-order language with identity in this way.
Note that there do exist fuzzy treatments of identity which (try to) render it a vague relation; for my reasons for thinking that identity must always be treated as precise—even in the context of a fuzzy semantics which allows other relations to be vague—see Smith 2008. ⁸³ Recall §3.4.2. ≈T is treated as a non-logical predicate.
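The contrast between the two readings can be made concrete on a toy model. The sketch below rests on two assumptions that are not part of the account itself: `heap_degree` is a hypothetical assignment of degrees of truth to ‘pile i is a heap’, and `very_close` stands in for ≈T by counting two truth values as very close when they differ by at most a hypothetical threshold `EPSILON`.

```python
from fractions import Fraction

N = 10_000
EPSILON = Fraction(1, 100)  # hypothetical threshold for 'very close in respect of truth'

def heap_degree(i):
    """Hypothetical degree of truth of 'pile i is a heap' (illustration only)."""
    return max(Fraction(0), min(Fraction(1), Fraction(i - 2000, 6000)))

def very_close(x, y):
    """Stand-in for the relation written with a subscripted T in the text:
    truth values differing by at most EPSILON (an assumption of this sketch)."""
    return abs(x - y) <= EPSILON

# Pairs ([Hp_i], [Hp_(i-1)]) for i from 10,000 down to 2.
pairs = [(heap_degree(i), heap_degree(i - 1)) for i in range(N, 1, -1)]

# Tolerance reading: every premiss says the two degrees are identical.
tolerance_premisses_all_true = all(x == y for x, y in pairs)
# Closeness reading: every premiss says the two degrees are very close.
closeness_premisses_all_true = all(very_close(x, y) for x, y in pairs)
# The conclusion on either reading: [Hp_1] = 1.
conclusion_true = heap_degree(1) == 1

# The stand-in relation is not transitive on this model.
x, y, z = heap_degree(5000), heap_degree(5060), heap_degree(5120)
transitivity_fails = very_close(x, y) and very_close(y, z) and not very_close(x, z)

print(tolerance_premisses_all_true, closeness_premisses_all_true,
      conclusion_true, transitivity_fails)
```

On this model the Closeness premisses are all true while the conclusion is not, so the model is a counterexample to the second argument; the Tolerance premisses are not all true, so the first argument, though valid, is unsound here.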
Two potential misunderstandings need to be warded off. First, I say we find the conditional premisses ‘If pile i is a heap, then pile i − 1 is a heap’ compelling in so far as we read them as [Hpi ] = [Hpi−1 ]. But in the given situation, not all the latter wfs are true (e.g. where Hpk is 1 true and Hpk−1 is less than 1 true, [Hpk ] = [Hpk−1 ] is 0 true)—so why exactly do we find them compelling? Well, the claim is certainly not that we find them compelling because they are all true. The idea is that we are inclined to treat statements of Tolerance as true, and so we do so here—initially. But the whole point is that they cannot all be true—for then the absurd conclusion would be true. The Sorites reasoning itself makes us see this—and makes us retreat to a reading of the conditionals as expressing (consequences of) the claim that ‘heap’ conforms to Closeness. Note that, read as [Hpi ] ≈T [Hpi−1 ]—that is, as expressions of Closeness—all the conditional premisses are 1 true in the given situation. However, on this second reading, the argument is not valid.

Second, I am not arguing that speakers find the Sorites paradox as formulated at the beginning of this section compelling because they find a second argument (the one just presented involving =) valid, and that they find the Sorites paradox as formulated at the beginning of this section unconvincing because they find a third argument (the one just presented involving ≈T ) invalid. We want to know why speakers find the original argument compelling and yet unconvincing—and talking about their reactions to other arguments would not (without further argument) shed light on this. I am considering only one argument—the one made up of English sentences that is presented at the beginning of this section. I then consider two different views that speakers might have as to the content of some of the sentences in this argument—that is, two different views they might take on what the argument is saying.
I represent these two different views as two different arguments in our formal language (one involving = and one involving ≈T ), in keeping with my general approach of regarding contents as interpreted wfs. So I am not changing the argument to which I am imagining speakers to be reacting: all along I am supposing them to be reacting to the one argument—sequence of English sentences—originally presented. It is, however, part of my explanation of why speakers have conflicting reactions to this and other
Sorites arguments—finding them on the one hand compelling, but on the other hand unconvincing—that they have available to them two views as to the content of the sentences that make up these arguments. On one of these two readings (Tolerance), the contents of the premisses are not all true—but if they were, the content of the conclusion would be true—while on the other reading (Closeness), the contents of the premisses are all true—but the content of the conclusion is not. In sum, when we have a system of semantics that accommodates Closeness (without Tolerance), we can take the solution to the Sorites set out in §3.5.4 and use it straight off the shelf. We do not need, for example, any proprietary truth definitions for conditionals. The standard fuzzy response to the Sorites—employing the Łukasiewicz conditional—is thus a red herring.

It has been noted, and found to be in need of explanation, that logically equivalent formulations of the Sorites paradox are not always equally compelling.⁸⁴ I said earlier (p. 262) that versions of the paradox with premisses of the form ‘‘It is not the case that Fa and not Fa′’’ are generally as compelling as versions with premisses of the form ‘‘If Fa, then Fa′’’. But versions with premisses of the form ‘‘Either it is not the case that Fa or it is the case that Fa′’’ seem not to be compelling. Is this not a problem for me, given that on my semantics, ¬(Fa ∧ ¬Fa′), Fa → Fa′, and ¬Fa ∨ Fa′ are all equivalent? It would be a problem if I had attempted to explain why Sorites premisses of the former two kinds are compelling by saying that their surface readings are (close to) 1 true—for then I would have to accept that the surface readings of Sorites premisses of the third kind are (close to) 1 true, and then I would face the task of explaining why Sorites premisses of the third kind are nevertheless not compelling. But my explanation of why Sorites premisses of the former two kinds are compelling did not take this form.
I said that we find them compelling because, in the right type of context, we naturally hear them as expressions of Tolerance. Sorites premisses of the third sort are not compelling because—for whatever reason—we simply do not (without a deal of difficulty) hear them as expressions of Tolerance.

⁸⁴ See e.g. Weatherson 2004, §5.3.
5.6 Denying Bivalence

It has been argued that bivalence cannot coherently be denied, for any denial of bivalence implies a contradiction (T is a truth predicate):

1. ¬(TA ∨ T¬A )              the denial of bivalence
2. TA ↔ A and T¬A ↔ ¬A      two instances of Tarski’s T-schema
3. ¬(A ∨ ¬A )                from (1) and (2)
4. ¬A ∧ ¬¬A                  from (3).⁸⁵
As Williamson (1997 [1992], 266–7 n. 4) notes, in order for this argument to carry weight, step 2 must be validated, and step 4 must be absurd. What is the truth predicate T which features in the argument? Consider the classical semantic framework for a moment. Suppose that you say ‘‘Bob is bald’’, and I say ‘‘That’s true’’. There are two things I might be doing. I might be saying, in effect, ‘‘Ditto’’, that is, asserting what you asserted. In that case my statement should have the same truth value as yours. On the other hand, I might be making an assertion about your statement: namely, the assertion that your statement has the property of being true. In this case, if your statement is indeed true, then mine should be true, and if your statement is false, then my statement should be false. But that is just to say that my statement should have the same truth value as yours. So, in the classical framework, the difference between using a truth predicate to reassert what you say and using a truth predicate to say, of your statement, that it has the property of being true, does not come clearly into view. It can emerge clearly, however, when we deny bivalence. In particular, in a degree-theoretic framework, we may distinguish two sorts of truth predicate. First, there is the ‘‘ditto’’ predicate, T. If you say ‘‘Bob is bald’’, and I say, in the ‘‘ditto’’ sense, ‘‘That’s true’’, then my statement should have the same degree of truth as yours. Second, there are truth predicates Tx : one for each degree of truth x in [0, 1]. If you say ‘‘Bob is bald’’, and I make the claim, concerning your statement, that it is true to degree x—that is, Tx —then if your statement has degree of truth x, my statement should have degree of truth 1, while if your statement has a truth value other than x, then my statement should have degree of truth 0. ⁸⁵ Williamson 1997 [1992], 265–6; cf. Williamson 1994, §7.2. Horwich 1998, 76–7 presents a closely related argument. 
I use angle brackets here in the way introduced in §5.5.
In considering how Williamson’s argument applies to a degree-theoretic framework, there are thus two different truth predicates which we should consider in place of T: T1 (the ‘is true to degree 1’ predicate) and T (the ‘ditto’ predicate). In the case of T1 , step 2 is not validated; that is, T1 does not satisfy the T-schema.⁸⁶ In the case of T, step 2 is validated: T does satisfy the T-schema; i.e. TA ↔ A is a tautology.⁸⁷ The problem with T, however, is that the version of step 1 involving T is not a denial of bivalence, and is not something to which the proponent of many-valued semantics is committed. If Bob is a borderline case of baldness, and so we want to deny that ‘Bob is bald’ is true or false—that is, to deny bivalence—we should go about this by stating ¬(T1 Bb ∨ T1 ¬Bb), not by stating ¬(TBb ∨ T¬Bb). So the proponent of many-valued semantics is safe from Williamson’s argument, which requires that the truth predicate used to deny bivalence satisfies the T-schema.
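The behaviour of the two truth predicates can be checked by direct computation. The following is a sketch under assumptions: the biconditional is unpacked as the conjunction of two fuzzy material conditionals, negation is 1 minus the value, conjunction is minimum, and disjunction is maximum; the function names, including `ditto` and `true_to_degree_1`, are illustrative stand-ins for T and T1.

```python
from fractions import Fraction

def neg(a):
    return 1 - a

def conj(a, b):
    return min(a, b)

def disj(a, b):
    return max(a, b)

def biconditional(a, b):
    """A <-> B unpacked as (not-A or B) and (not-B or A)."""
    return conj(disj(neg(a), b), disj(neg(b), a))

def ditto(a):
    """The 'ditto' truth predicate: same degree of truth as the statement."""
    return a

def true_to_degree_1(a):
    """The 'is true to degree 1' predicate: 1 true if the statement is
    1 true, and 0 true otherwise."""
    return Fraction(1) if a == 1 else Fraction(0)

grid = [Fraction(k, 100) for k in range(101)]

# T-schema instance for 'is true to degree 1' when A is 0.99 true.
a = Fraction(99, 100)
t1_schema_value = biconditional(true_to_degree_1(a), a)

# T-schema instances for the 'ditto' predicate never fall below 0.5 (spot check).
ditto_schema_ok = all(biconditional(ditto(v), v) >= Fraction(1, 2) for v in grid)

# Two candidate denials of bivalence for a statement that is 0.5 true.
b = Fraction(1, 2)
denial_with_t1 = neg(disj(true_to_degree_1(b), true_to_degree_1(neg(b))))
denial_with_ditto = neg(disj(ditto(b), ditto(neg(b))))

print(t1_schema_value, ditto_schema_ok, denial_with_t1, denial_with_ditto)
```

The denial of bivalence stated with the ‘is true to degree 1’ predicate comes out 1 true for the borderline case, whereas the corresponding statement with the ‘ditto’ predicate is only 0.5 true, which fits the claim that only the former is the proper way to deny bivalence.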
5.7 Different Senses of ‘Fuzzy Logic’

In accordance with reasonably standard terminology, I have described [0, 1]-valued logic as fuzzy logic. Zadeh (1975, 409–10) uses a different terminology, according to which [0, 1]-valued logic is non-fuzzy, and the term ‘fuzzy logic’ is reserved for the generalization that Zadeh introduces:

A fuzzy logic, FL, may be viewed, in part, as a fuzzy extension of a nonfuzzy multi-valued logic which constitutes a base logic for FL. For our purposes, it will be convenient to use as a base logic for FL the standard Łukasiewicz logic L1 (abbreviated from LAleph1 ) in which the truth-values are real numbers in the interval [0, 1].
The truth value set of FL is a countable set of the form {true, false, not true, very true, not very true, more or less true, rather true, not very true and not very false, . . . }, where each element of this set represents a fuzzy subset of the truth value set of L1 , that is, of [0, 1].⁸⁸

Now consider the following passage from Haack:

Zadeh offers us not only a radically non-standard logic, but also a radically nonstandard conception of the nature of logic. It would scarcely be an exaggeration to say that fuzzy logic lacks every feature that the pioneers of modern logic wanted logic for . . . it is not just a logic of vagueness, it is—what from Frege’s point of view would have been a contradiction in terms—a vague logic. (Haack 1979, 441)⁸⁹

Haack notes explicitly that she is concerned with fuzzy logic in the elaborate sense—not with the view discussed in this book. I think that what Haack says in these passages is right—but none of it applies to the fuzzy view as discussed here.

⁸⁶ Given my account of the conditional, T1 A ↔ A has the same degree of truth as (¬T1 A ∨ A ) ∧ (¬A ∨ T1 A ), which is not a tautology—indeed, it can have an arbitrarily low positive degree of truth (e.g. it is 0.01 true when A is 0.99 true).

⁸⁷ Given my account of the conditional, TA ↔ A has the same degree of truth as (¬TA ∨ A ) ∧ (¬A ∨ TA ). For this to be less than 0.5 true, one of its conjuncts must be less than 0.5 true, hence both disjuncts of this conjunct must be less than 0.5 true—but that cannot happen if A and TA have the same degree of truth, for then one disjunct is the negation of something which has the same value as the other disjunct.

⁸⁸ Here we see that Zadeh’s terminology is somewhat unfortunate: he means by ‘fuzzy subset of [0, 1]’ a function from [0, 1] to [0, 1], and yet he withholds the term ‘fuzzy’ from [0, 1]-valued logic. For further details of Zadeh’s more elaborate view, see Smith 2004, §14. On the terminological point: many writers on fuzzy logic employ a distinction between narrow and broad senses of ‘fuzzy logic’, but usage of this terminology is not consistent throughout the literature (cf. e.g. Hájek 1998, 2, and Novák 1998, 75).

⁸⁹ Cf. Haack 1978, 167: ‘‘Fuzzy logic, in brief, is not just a logic for handling arguments in which vague terms occur essentially; it is itself imprecise. It is for this reason that I said that Zadeh’s proposal is much more radical than anything previously discussed; for it challenges deeply entrenched ideas about the characteristic objectives and methods of logic. For the pioneers of formal logic a large part of the point of formalisation was that only thus could one hope to have precise canons of valid reasoning. Zadeh proposes that logic compromise with vagueness.’’
6 Worldly Vagueness and Semantic Indeterminacy

In this chapter, I continue the examination, begun in the previous chapter, of objections to the fuzzy view of vagueness in particular, and to degree-theoretic treatments of vagueness in general. This chapter covers the major remaining objections to the fuzzy view: the problems of artificial precision and sharp boundaries. In response, I propose a new version of the fuzzy view, which I call fuzzy plurivaluationism. This view combines fuzzy models with semantic indeterminacy of the sort involved in plurivaluationism. The conclusion will be that fuzzy plurivaluationism is the correct theory of vagueness, on the grounds that, first, it is a degree theory—and so satisfies our positive requirement on a theory of vagueness—and, second, it withstands all known objections to degree theories.
6.1 Artificial Precision

We come now to an objection to the fuzzy view that crops up again and again in the literature. Consider the following three passages:

[Fuzzy logic] imposes artificial precision. . . . [T]hough one is not obliged to require that a predicate either definitely applies or definitely does not apply, one is obliged to require that a predicate definitely applies to such-and-such, rather than to such-and-such other, degree (e.g. that a man 5 ft 10 in tall belongs to tall to degree 0.6 rather than 0.5). (Haack 1979, 443)

[T]he degree theorist’s assignments impose precision in a form that is just as unacceptable as a classical true/false assignment. In so far as a degree theory avoids determinacy over whether a is F, the objection here is that it does so by enforcing determinacy over the degree to which a is F. All predications of ‘‘is red’’ will receive a unique, exact value, but it seems inappropriate to associate our vague predicate ‘‘red’’ with any particular exact function from objects to degrees of truth. For a start, what could determine which is the correct function, settling that my coat is red to degree 0.322 rather than 0.321? (Keefe 1998b, 571)

One immediate objection which presents itself to [the fuzzy] line of approach is the extremely artificial nature of the attaching of precise numerical values to sentences like ‘73 is a large number’ or ‘Picasso’s Guernica is beautiful’. In fact, it seems plausible to say that the nature of vague predicates precludes attaching precise numerical values just as much as it precludes attaching precise classical truth values. (Urquhart 1986, 108)
All three writers voice the same basic objection, which is that the fuzzy account involves artificial precision: it is highly implausible to suppose that any given vague sentence is assigned some particular unique real number between 0 and 1 as its degree of truth, rather than some other nearby number. Haack simply makes the claim that the assignment of unique fuzzy values is artificial and implausible, whereas Keefe and Urquhart both back up this claim with reasons. Keefe’s reason concerns the question of what could determine the unique correct assignment, while Urquhart’s reason has to do with the nature of vague predicates. I endorse the basic claim, and I endorse Keefe’s reason for making it, but I wholeheartedly reject Urquhart’s reason. Consider the latter point first. Urquhart claims that ‘‘it seems plausible to say that the nature of vague predicates precludes attaching precise numerical values just as much as it precludes attaching precise classical truth values’’ (my emphasis). This is simply false. As we have seen in Chapters 3 and 4, the nature of vague predicates—as captured by the Closeness definition—precludes classical semantics, because in order to accommodate predicates which satisfy Closeness, we need to countenance degrees of truth. The nature of vague predicates most certainly does not preclude the attaching of precise numerical truth values, however: the fuzzy semantic framework is a prime example of a framework which can accommodate predicates which satisfy Closeness. This illustrates the importance of including in our investigation a careful analysis of what vagueness is. If we do not—if we jump straight into the question of the correct theory of vagueness—then when we find an unattractive feature in
some theory, we might think it is unattractive because it is at odds with the nature of vagueness, when in fact the problem with the theory is of a quite different sort. Let us then be very clear about the nature of the present problem with the fuzzy view. There is indeed something implausible about assigning a unique fuzzy degree of truth to a vague sentence—but what is implausible about this is not that it offends against the very nature of vagueness. It does not so offend. If a vague discourse was assigned a unique intended fuzzy interpretation of the sort discussed in §2.2.1, this would not in any way compromise the vagueness of that discourse. Yet the idea that each vague discourse is assigned a unique intended fuzzy interpretation does offend intuition. Intuitively, it is not correct to say that there is one unique element of [0, 1] that correctly represents the degree of truth of ‘Bob is bald’, with all other choices being incorrect. There might be better and worse choices, but none is uniquely correct. The fact that this objection has been voiced frequently by both the opponents and the advocates of the fuzzy view indicates that the intuitions on which it is based are genuine and widespread.¹ So we have an affront to intuition, but the source of it is not conflict with the very idea of what it is to be vague—rather, the source of the affront must lie elsewhere. I think Keefe is exactly right about the source of the offence. We cannot see what could possibly determine that the degree of truth of ‘Bob is bald’ is 0.61 rather than 0.62 or 0.6—or even perhaps 0.7 or 0.55—and in the absence of facts determining the precise degree of truth of ‘Bob is bald’, we are inclined to deny that there is any fact of the matter as to this sentence’s precise degree of truth. In other words, we are inclined to move from underdetermination of the degree-of-truth facts to indeterminacy of such facts. 
In order to get clear about this, let’s examine the general issues in this area, and set up a framework for thinking about this sort of problem, before applying the framework to the particular case of the fuzzy account of vagueness.
¹ Apart from the three passages quoted above, other sources for this objection to the fuzzy view include Copeland 1997, 521–2; Goguen 1968–9, 332; 1979, 54; Lakoff 1973, 462, 481; Machina 1976, 61; Rolf 1984, 223–4; Schwartz 1990, 46; Tye 1995, 11; and Williamson 1994, 127–8.
6.1.1 The Problem of the Intended Interpretation
First of all, note that the problem is not how we could possibly determine that (say) Bob’s degree of baldness is 0.7. Bob’s baldness is not within our control: it is determined by genetics and perhaps various other things, but not by how we have used the word ‘bald’ in the past, or any other choices we have made. Rather, the problem is most clearly set up as follows. On the one hand we have our uninterpreted language, and on the other we have its fuzzy interpretations—all of them. Consider four interpretations M1 . . . M4: they all have the same domain, and assign Bob as the denotation of the name ‘Bob’; but they assign four different functions f1 . . . f4, respectively, to the predicate ‘is bald’.² Here are some sample values assigned by these four functions:³
f1: Bob → 0.7, Telly Savalis → 1, Fabio → 0
f2: Bob → 0.7, Telly Savalis → 0, Fabio → 1
f3: Bob → 0.701, Telly Savalis → 1, Fabio → 0
f4: Bob → 0.699, Telly Savalis → 1, Fabio → 0
Now consider the sentence ‘Bob is bald’. It is 0.7 true on M1, 0.7 true on M2, 0.701 true on M3, and 0.699 true on M4. How true is it simpliciter? That depends upon which interpretation is the intended one. And here we have our problem. Some interpretations are clearly incorrect: ones which assign the number 3 as the denotation of ‘Bob’; ones whose domain contains only the real numbers; ones which assign to ‘is bald’ a function which maps all prime numbers to 1 and everything else in the domain to 0.3; and closer to home, M2 above (which assigns to ‘is bald’ a function which maps someone who is clearly bald to 0, and which maps someone who is clearly not bald to 1). But what about M1, M3, and M4? What could single out one of these as the intended one, and render the others incorrect? It seems that nothing could: and hence it seems that the picture on which sentences have a unique degree of truth—their degree of truth on the intended fuzzy interpretation—is incorrect.
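The setup just described can be made concrete in a short sketch. The Python below is illustrative only: the dictionary representation and the names `interpretations` and `degree_of_truth` are assumptions introduced here, not part of the formal apparatus of §2.2.1. It records the sample values listed above and shows that ‘Bob is bald’ has a degree of truth only relative to an interpretation.

```python
# Each candidate extension of 'is bald' is modelled (by assumption) as a
# mapping from members of the domain to degrees of truth in [0, 1].
interpretations = {
    "M1": {"Bob": 0.7, "Telly Savalis": 1.0, "Fabio": 0.0},    # f1
    "M2": {"Bob": 0.7, "Telly Savalis": 0.0, "Fabio": 1.0},    # f2
    "M3": {"Bob": 0.701, "Telly Savalis": 1.0, "Fabio": 0.0},  # f3
    "M4": {"Bob": 0.699, "Telly Savalis": 1.0, "Fabio": 0.0},  # f4
}


def degree_of_truth(extension, name):
    """Degree of truth of '<name> is bald' on a single interpretation."""
    return extension[name]


# The sentence receives a value on each interpretation taken separately;
# nothing in the table itself says which interpretation is the intended one.
for label, extension in interpretations.items():
    print(label, degree_of_truth(extension, "Bob"))

# Collect the distinct values the sentence takes across the four candidates.
distinct_values = {degree_of_truth(ext, "Bob") for ext in interpretations.values()}
```

The question "how true is it simpliciter?" corresponds to asking which key of `interpretations` to look up, and that is exactly what the sample values leave open.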
Once we have set up the problem in this way, we can see that it is, at a certain level of abstraction, the same problem as Quine’s problem of the inscrutability of reference, Kripkenstein’s sceptical problem, and Putnam’s model-theoretic argument against Metaphysical Realism.⁴ I call the abstract problem—of which these three, and our problem for the fuzzy
² Recall that the extension of a predicate is a fuzzy subset of the domain, which is a function from members of the domain to the set [0, 1] of fuzzy truth values.
³ Telly Savalis has not a hair on his head; Fabio has a thick, flowing mane.
⁴ See Quine 1960, ch. 2; 1969; 1970; Kripke 1982; and Putnam 1983 [1977]; 1978; 1981, ch. 2.
account, are instances—the problem of the intended interpretation. We can approach the abstract problem via some of its instances. Consider Quine’s argument. The basic setup involves a field linguist attempting to construct a translation manual between English and Jungle. Now instead of talking about translation manuals, we can equivalently talk about the linguist trying to construct (a description in English of) the intended interpretation of Jungle. The data available to the field linguist are as follows. She can identify Jungle sentences, and she can determine when Jungle speakers assent to a sentence, and when they dissent from one. In the first stage of the argument, Quine argues that these data underdetermine a choice of translation manual. For example, the data do not determine whether ‘gavagai’ (considered as a term, as opposed to an entire sentence) should be translated as ‘rabbit’, ‘rabbit stage’, or ‘undetached rabbit part’—or, in other words, whether the extension of ‘gavagai’ on the intended interpretation is the set of rabbits, the set of rabbit stages, or the set of undetached rabbit parts.⁵ At this stage, we are merely at the point where our data do not determine a unique translation/interpretation. We have no reason to conclude that there is no unique correct translation/interpretation: maybe there is, and we are simply unable to find out what it is. This is presumably precisely what is going on in familiar cases of underdetermination of theory by evidence in science and forensics: the total available evidence may not determine whether the butler or the gardener did it, but there is a fact of the matter nevertheless. Quine, of course, wants to draw a stronger conclusion: not underdetermination of translation (we cannot know what the correct translation/interpretation is, given the evidence) but indeterminacy of translation (there is no fact of the matter at all concerning what is the correct translation/interpretation). 
Quine draws this further conclusion on the basis of what I call the publicity premiss. The publicity premiss says that language and meaning are essentially public, and thus the facts about meaning cannot transcend the publicly available evidence about meaning: viz. what people say under what circumstances: ‘‘There is nothing in linguistic meaning beyond what is to be gleaned from overt behaviour in observable
⁵ Quine in fact argues that it is, furthermore, underdetermined whether ‘gavagai’ should be translated as a predicate at all, as opposed to a singular term denoting the fusion of all rabbits or the universal rabbithood.
circumstances’’ (Quine 1992, 38).⁶ From here the indeterminacy conclusion is easily reached. The linguist has access to all the publicly available evidence concerning what people say in what circumstances. This evidence does not determine a unique translation/interpretation. No other sorts of facts, beyond this publicly accessible evidence, are relevant to determining the correct translation/interpretation (the publicity premiss). Hence there is no fact of the matter concerning what is the correct translation/interpretation. We can now distil the abstract form of this sort of argument:
1. Facts of type T do not determine a unique intended interpretation of discourse D.
2. No facts of any type other than T are relevant to determining the intended interpretation of D.
3. From (1) and (2): All the facts together do not determine a unique intended interpretation of D.
4. It cannot be a primitive fact—i.e. a fact not determined by other facts—that some interpretation M is the unique intended interpretation of D.
5. From (3) and (4): It is not a fact at all that D has a unique intended interpretation.
In Quine’s case, type T includes all and only the publicly accessible facts concerning what people say in what circumstances, and step 2 is then the publicity premiss. Steps 1 and 4 require brief comment. Step 4 is generally supported only by the strongly held intuitions of most philosophers who have thought about these matters, rather than by argument.⁷ I would not rule out the possibility that the overall best theory of (some) language might involve denying 4: it might turn out that we get the theory with the best combination of attractive and unattractive elements (obviously it would not have no
⁶ Quine does not merely assert this: he has at least three arguments for the publicity premiss—but they need not concern us here.
⁷ For a famous statement against primitive semantic facts see Fodor 1987, 97. Cf. also Kripke 1982, 51–3.
Not everyone accepts step 4: Boghossian 1989, §VI seriously considers primitive meaning facts (although concerning mental content, rather than meanings of public language expressions) in response to Kripkenstein’s sceptical problem. Van Cleve 1992, 358 refers to primitivism about the reference of words as the position of Plato’s Cratylus, and attributes primitivism about the reference of thoughts to Chisholm and Brentano. See also Hohwy 2001, 2 n. 2 for references to some other possible examples of endorsement of primitivism about meaning.
unattractive elements, for denying 4 would be one of them) by supposing that some discourse has a unique intended interpretation, even though this fact could not be determined by facts of any other sort. However, for now I want to accept 4, on the basis of its intuitive merits, and see where we are led. The result will not be at all unpleasant, and this, it seems—in light of what has just been said—will be defence enough of premiss 4. As for step 1, why do we accept it? Certainly not on the basis of actually assembling all the type-T facts and then exhaustively verifying that they really are insufficient to determine a unique intended interpretation for D. Rather, we think carefully about the matter for a while, and certain facts of type T come to mind. It appears to us that these facts are not sufficient to determine a unique intended interpretation of discourse D. We conclude that step 1 is true.⁸ I do not mean here to criticize this procedure—only to draw attention to it.⁹ In light of the foregoing, consider Kripkenstein’s sceptical argument. Kripkenstein challenges me (you) to cite some fact about myself that determines that I meant addition and not quaddition by ‘plus’ when I used this term in the past.¹⁰ In other words, the challenge is to cite facts which establish that the intended interpretation of my past discourse assigns the function symbol ‘plus’ the binary function on the domain which assigns to ⁸ Quine, of course, argues for the indeterminacy of translation. However, if one looks closely at his argument, one sees that ultimately the characterization that I have just given does apply to Quine. See in particular Quine 1960, 72, where the argument climaxes not in a demonstration that different sets of analytical hypotheses—or in our terminology, different interpretations—can conform equally well with the data, but in claims to the effect that this cannot be doubted: ‘‘Both analytical hypotheses may be presumed possible. 
Both could doubtless be accommodated by compensatory variations in analytical hypotheses concerning other locutions, so as to conform equally to . . . all speech dispositions of all speakers concerned. . . . There can be no doubt that rival systems of analytical hypotheses can fit the totality of speech behavior to perfection, and can fit the totality of dispositions to speech behavior as well, and still specify mutually incompatible translations of countless sentences insusceptible of independent control.’’
⁹ Soames 1997 in effect criticizes the procedure. Recall Ch. 2, n. 10. Soames argues that even though we cannot see how to derive meaning facts from facts about usage and the environment, it does not follow that meaning facts do not supervene on such facts—and indeed this would not follow even if we had assembled all the type-T facts, and could not see how to derive the meaning facts from them. So, when ‘determines’ is understood in terms of supervenience, or metaphysical determination, Soames rejects 1 (or rather, the instance of 1 involved in Kripkenstein’s sceptical argument—see below).
¹⁰ Quaddition, symbolised ⊕, is defined as follows (where + is the ordinary addition function, and k is some number greater than any of the numbers which I have previously added):
\[ x \oplus y = \begin{cases} x + y & \text{if } x, y < k \\ 5 & \text{otherwise} \end{cases} \]
each pair of natural numbers the sum of those two numbers, rather than the binary function on the domain which assigns to each pair of natural numbers the quum of those two numbers. Unlike Quine, Kripkenstein does not restrict us to publicly observable facts. Thus the argument strikes at someone who rejects premiss 2 of Quine’s argument—for example, someone who thinks that mental facts accessible only to the speaker herself are relevant to determining the intended interpretation of a speaker’s words. In other words, class T in Kripkenstein’s argument is wider than it is in Quine’s. Nevertheless, the argument is that facts in the new, wider class T still do not suffice to determine a unique intended interpretation of our discourse, and thus the stronger conclusion of indeterminacy—not underdetermination—of meaning is reached.¹¹ Let us now return to the problem for the fuzzy view of vagueness with which we began. Our problem can readily be seen to be a specific instance of the abstract argument isolated above—and furthermore, an instance that is virtually irresistible. This will lead to the conclusion that there is no fact of the matter concerning which fuzzy interpretation of a given vague discourse is the unique intended one—that is, to the conclusion that the discourse does not in fact have a unique intended fuzzy interpretation. Hence, contra the fuzzy picture as presented above, there will not be a fact as to the precise degree of truth of a statement such as ‘Bob is bald’ (i.e. its degree of truth simpliciter, as opposed to its degree of truth on this or that interpretation). Let discourse D be any discourse involving vague predicates. As for type T, there is widespread agreement concerning the sorts of facts it should contain:
• All the facts as to what speakers of D actually say and write, including the circumstances in which these things are said and written, and any causal relations obtaining between speakers and their environment.
• All the facts as to what speakers of D are disposed to say and write in all kinds of possible circumstances.
• All the facts concerning the eligibility as referents of objects and sets.¹²
¹¹ More subtly, the conclusion is that we must accept indeterminacy if we retain the truth-conditional picture of meaning embodied in the project of assigning interpretations—of the model-theoretic sort we have been considering—to languages. Kripkenstein thinks that (sufficient) determinacy can be regained if we reject this picture altogether and adopt a conception of meaning based on assertion conditions rather than truth conditions.
¹² See Merrill 1980 and Lewis 1999 [1983].
I would also add:
• All the facts concerning the simplicity or complexity of the candidate interpretations.
This exhausts T. Thus premiss 2 says that if anything determines that some interpretation is the intended interpretation of discourse D, it is facts about the (actual and counterfactual/dispositional) usage of speakers of D, together with facts about the intrinsic eligibility as referents of the objects and sets assigned as referents in that interpretation, together with the facts about the intrinsic simplicity of the interpretation. If these things do not determine the meanings of parts of D uniquely, then nothing does. Premiss 1 says that these things do not determine the meanings of parts of D uniquely. Note that the interpretations in question here are fuzzy interpretations of the sort introduced in §2.2.1. Premiss 1 is then highly plausible: the sum total of actual and counterfactual usage facts does not determine that ‘Bob is bald’ should be 0.7 true rather than 0.699 or 0.701 true, and none of f1 , f3 , or f4 is any more eligible than the others. Hence the facts of type T do not determine that any of M1 , M3 , or M4 is the unique intended interpretation of our vague language, nor that any of them is incorrect—hence these facts do not determine that our discourse has a unique correct interpretation. I say that premiss 1 is highly plausible; it is not conclusively established (recall the point I made earlier, that we establish premiss 1 on the basis of what seems plausible after some careful thought about the matter, not on the basis of exhaustively assembling all the facts of type T and verifying that they really do not suffice to determine a unique intended interpretation of our vague language). In particular, I think that the role of simplicity of interpretations has not been explored sufficiently.¹³ So it may just be that premiss 1 is false—and if it is false, then the present objection to the fuzzy view (that it posits a fact as to the precise degree of truth of a statement such as ‘Bob is bald’ where there can be no such fact) is mistaken. 
However, if we were to hang our hats on this possibility, we
¹³ Elsewhere I argue for a solution to the problem of reference according to which the intended interpretation of a discourse is the simplest one which satisfies the constraints imposed by usage—where ‘simplest’ is cashed out precisely, in terms of Kolmogorov complexity. I argue that this approach promises a solution to the sceptical arguments of Kripkenstein and others. I am not convinced that extending this approach to the current situation would lead to the conclusion that there is a unique intended fuzzy interpretation of vague discourse—hence the need to explore what happens if we deny this conclusion (see below).
would be in the same unsatisfying position as the epistemicist, who tells us merely that something is the case, when we want to know how it could possibly be the case. So we need to explore the alternative, that premiss 1 is true. We will see that we can get a very appealing view by accepting the conclusion of the above argument: i.e. that there simply is no unique intended fuzzy interpretation of our vague discourse.
6.1.2 Fuzzy Plurivaluationism
Let us then accept that the type-T facts—facts about our actual usage, our causal relations to our environment, our usage dispositions, referential eligibility, and simplicity of interpretations—do not determine a unique intended fuzzy interpretation of our vague language. We therefore abandon the idea that each discourse has a unique intended interpretation, replacing it with the idea that each discourse has some acceptable interpretations (maybe many, maybe in some cases only one). An acceptable interpretation of a discourse is simply one that is not ruled out as incorrect by the type-T facts, or in other words, one that meets all the constraints on correct interpretations imposed by the type-T facts. These constraints include:
• Paradigm/Foil constraints: If speakers would all unhesitatingly apply the predicate P to the object a in normal conditions, then any candidate correct interpretation must assign P a function which maps a to 1. If speakers would all unhesitatingly withhold the predicate P from the object b in normal conditions, then any candidate correct interpretation must assign P a function which maps b to 0.
• Ordering constraints: If person a and person b are of the same sex and roughly the same age, and a’s height is greater than b’s, then any candidate correct interpretation must assign to the predicate ‘is tall’ a function which maps a to a value greater than or equal to the value to which it maps b. If pile a and pile b are composed of the same sort of particles and are roughly the same shape, and a’s number of particles is greater than b’s, then any candidate correct interpretation must assign to the predicate ‘is a heap’ a function which maps a to a value greater than or equal to the value to which it maps b.
• Exclusion constraints: Any candidate correct interpretation that assigns the predicate ‘is red’ a function which maps a to a value near 1 must assign the predicate ‘is yellow’ a function which maps a to a
value near (or equal to) 0. Any candidate correct interpretation that assigns the predicate ‘is tall’ a function which maps a to a value near 1 must assign the predicate ‘is short’ a function which maps a to a value near 0.
This hardly scratches the surface: there is a vast number of constraints of this sort. Some of them are general: for example, if F is (intuitively) a vague predicate, then any candidate correct interpretation must be such that if a and b (in the domain of that interpretation) are very close in F-relevant respects, then Fa and Fb are very close in respect of truth (on that interpretation). Some are specific to a single predicate: for example, any candidate correct interpretation must assign to ‘is bald’ a function which maps Telly Savalis to 1. Consider all the constraints of this sort, generated by all the type-T facts. If an interpretation is not ruled out as incorrect by these constraints, then it will be acceptable. The upshot of our earlier discussion is that many fuzzy interpretations of our ordinary vague discourse are acceptable. This gives us fuzzy plurivaluationism—a view which stands to fuzzy models in precisely the way plurivaluationism (discussed in §2.5) stands to classical models. (It does not give us fuzzy supervaluationism, which was discussed in §2.4.1.) Fuzzy plurivaluationism has no further semantic machinery, besides the fuzzy interpretations. There is no further semantic story, beyond that told by the many acceptable interpretations. There are many candidate correct fuzzy interpretations, any of which would do perfectly well (i.e. meet all the constraints imposed by all the type-T facts), and there is nothing to decide between them. Our vague language has many acceptable fuzzy interpretations simultaneously, and as far as semantics is concerned, that is the end of the story. So we have indeterminacy, or plurality, of meaning.
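The way constraints carve out the acceptable interpretations can be sketched as a simple filter. The Python below is a minimal illustration under stated assumptions: the names `candidates`, `PARADIGMS`, `FOILS`, and `meets_paradigm_foil` are hypothetical, only the Paradigm/Foil constraint for ‘is bald’ is encoded, and the candidate values are the sample values for M1 to M4 from §6.1.1. The real stock of constraints is, as noted above, vastly larger.

```python
# Candidate extensions of 'is bald' (sample values from section 6.1.1).
candidates = {
    "M1": {"Bob": 0.7, "Telly Savalis": 1.0, "Fabio": 0.0},
    "M2": {"Bob": 0.7, "Telly Savalis": 0.0, "Fabio": 1.0},
    "M3": {"Bob": 0.701, "Telly Savalis": 1.0, "Fabio": 0.0},
    "M4": {"Bob": 0.699, "Telly Savalis": 1.0, "Fabio": 0.0},
}

# Assumed usage facts: speakers unhesitatingly apply 'bald' to the paradigm
# case and unhesitatingly withhold it from the foil.
PARADIGMS = ["Telly Savalis"]
FOILS = ["Fabio"]


def meets_paradigm_foil(extension):
    """Paradigm cases must be mapped to 1 and foils to 0."""
    return (all(extension[a] == 1.0 for a in PARADIGMS)
            and all(extension[b] == 0.0 for b in FOILS))


# An interpretation is acceptable if no encoded constraint rules it out.
acceptable = {label: ext for label, ext in candidates.items()
              if meets_paradigm_foil(ext)}

# The constraint eliminates M2 but does not choose among the survivors,
# which still disagree about the value of 'Bob is bald'.
bob_values = sorted(ext["Bob"] for ext in acceptable.values())
print(sorted(acceptable))
print(bob_values)
```

Adding further constraints to the filter would shrink `acceptable`, but on the view under discussion nothing in the type-T facts is expected to shrink it to a single member.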
Unlike in the supervaluationist picture, the language is not in a unique (higher-order) semantic state. Semantic states are individuated by interpretations, and there are many of them. This multiplicity of fuzzy interpretations is the full and final semantic story. That said, if a sentence has a certain degree of truth, say 0.3, on every acceptable interpretation, then we can talk as if there is just one intended interpretation, on which the sentence is 0.3 true. However this is just talk. We are not doing something analogous to the supervaluationist: distilling a
unique super-interpretation, where the degree of truth of a sentence on this interpretation is, say, the infimum of its degrees of truth on the acceptable interpretations. We are simply coining a simple description (‘‘the sentence is 0.3 true’’) of a complex semantic state of affairs (viz. there are many acceptable interpretations, on each of which the sentence is 0.3 true). In general, when all the acceptable interpretations share a certain feature, we can talk as if there is only one intended interpretation, which has that feature. So, for example, if there are many acceptable interpretations of my claim ‘Bob is bald’, but on each one this sentence is greater than 0.6 true, then we can talk as if I said something (one unique, determinate thing) which is greater than 0.6 true. Again, suppose I say ‘‘Bob is bald; if someone is bald, then he is tall; therefore Bob is tall’’, and we wish to assess my argument for soundness. Let us say that an argument is sound relative to an interpretation if it is valid, and its premisses are all strictly greater than 0.5 true on that interpretation. So an argument is sound simpliciter if it is sound relative to the intended interpretation. Now that we have many acceptable interpretations to consider, we must ask whether the argument I uttered is sound relative to each of them. If it is, then we can talk as if there is just one intended interpretation, and say that the argument I uttered is sound. But again, this is just a simple shorthand description of a complex semantic situation. And of course, when it is not the case that all the acceptable interpretations share a certain feature, then the shorthand is not available: we can say only that my utterance is 0.3 true on this acceptable interpretation, 0.4 true on that one, and so on, or that my argument is sound on this acceptable interpretation, unsound on that one, and so on. Recall our discussion of assertibility in §§5.4 and 5.5.
We said that (norms of assertion other than the truth norm to one side) a sentence is assertible to the degree that its content is true. By ‘content’ we mean a wf plus an interpretation—i.e. the interpretation that is the intended one, relative to the context of utterance. When we have many acceptable interpretations, rather than one intended interpretation, we still utter only one sentence, and it expresses only one wf, but this wf has many equally acceptable interpretations. Thus our sentence expresses multiple contents. So, how confident should our utterance of the sentence be? Well, if the wf expressed has the same degree of truth—say 0.4—on every acceptable interpretation, then the answer is clear: we should utter the sentence with
degree of confidence 0.4. But suppose the wf is true to a low degree on one acceptable interpretation and a high degree on another. Then there is no degree of confidence such that an assertion of that degree would be appropriate. Note that there is no semantic or logical problem about a sentence which has multiple acceptable interpretations and is true to (very) different degrees on different interpretations—the issue is purely pragmatic: we cannot acceptably assert such sentences, with any degree of confidence. The move from a unique intended interpretation to many acceptable interpretations does not affect our definition of validity at all. Think for a moment about classical logic. Whether an argument is valid is a purely formal matter. It does not depend upon what an argument says—that is, what it says on the intended interpretation. It depends upon whether there is any interpretation at all on which the premisses are true and the conclusion false. We can determine whether an argument is valid without having any idea which out of all its interpretations is the intended one—that is, the one which reflects the actual content of the argument. Similarly in fuzzy logic: whether an argument is valid, as defined in §5.2, is a matter of the relationship between the degrees of truth of its premisses and conclusion on every interpretation whatsoever. Validity is a matter of form, not content, and as in the classical case, we can determine an argument’s validity without any reference to what its constituent sentences say—that is, without any reference to which interpretations of the argument are correct or acceptable. 6.1.3 Worldly Vagueness and Semantic Indeterminacy We have now added semantic indeterminacy—that is, the lack of a uniquely correct interpretation—to the worldly vagueness inherent in the fuzzy view. We have not taken away that worldly vagueness: it is still there in the fuzzy models, which remain entirely unchanged. 
So we are not now saying—with the classical plurivaluationists—that vagueness inheres solely in the relationship between language and the world. Indeed, we are not even saying that vagueness inheres partly in the relationship between language and the world. For while we now have a view in which this relationship is indeterminate, this indeterminacy has nothing to do with vagueness. If we made enough stipulations on the meanings of our words to narrow down just one unique intended fuzzy interpretation, our
language would still be vague. Vagueness is accounted for in the nature of the interpretations (they admit degrees of truth). The fact that our practice does not single out just one of these is then an interesting further fact which I think we have to accept—it is not part and parcel of dealing with vagueness. If we had better memories, powers of discrimination, and calculating abilities, we could fix, for each of our vague predicates, a unique function from objects to fuzzy truth values as its extension. This would in no way remove the vagueness of these predicates—as long as the function was such that Closeness was still respected (as it would be, if the function came from one of the original acceptable interpretations). My view—fuzzy plurivaluationism—thus has two separate parts. The fact that the interpretations it posits are fuzzy is a response to the need to accommodate vagueness (understood in terms of Closeness). The fact that there is indeterminacy as to the correct (fuzzy) interpretation is not a response to vagueness: vagueness has already been accommodated by positing fuzzy interpretations. Rather, it is a response to the inherent limitations of our meaning-fixing practices: limitations which have nothing in particular to do with vagueness, as the discussion of Quine and Kripkenstein above makes clear. The fact that the two parts of my view are thus (un)related explains why I have spoken of the ‘artificial precision’ problem for the fuzzy view rather than the problem of ‘higher-order vagueness’, as this problem is also sometimes known in the literature.¹⁴ The latter is a misnomer: the artificial precision problem for the fuzzy account is not a second-order, iterated version of the original problem of accommodating vagueness. The first problem (i.e. the problem of accommodating vagueness) is completely solved by adopting fuzzy interpretations, and the new problem (i.e. the artificial precision problem) has an entirely different source.
¹⁴ See e.g. Williamson 1994, 127; Copeland 1997, 522; and Priest 1998, 331–2.

Consider the following passage from Weatherson (2003b, 484):

some properties and objects are more natural than others, and when our verbal dispositions do not discriminate between different possible contents, naturalness steps in to finish the job, and the more natural property or object becomes the content. Well, that is what happens when things go well. Vagueness happens when things do not go well. Sometimes our verbal dispositions do not discriminate between several different contents, and no one of these is more natural than all the rest. In these cases there will be many unnatural contents not eliminated by our dispositions which naturalness does manage to eliminate, but there will still be many contents left uneliminated. For example, as far as our dispositions to usage go, ‘tall woman’ might denote any one of the following properties: woman taller than 1680 mm, woman taller than 1681 mm, woman taller than 1680.719 mm, etc. And it does not seem that any of these properties is more natural than any other. Hence there is no precise fact about what the phrase denotes. Hence it is vague. In sum, our dispositions are never enough to settle the content of a term. In some cases, such as ‘water’, ‘rabbit’, ‘plus’, ‘brain’ and ‘vat’, nature is kind enough to finish the job more or less. In others it is not, and vagueness is the result.
This is an extremely clear statement of a view that I do not wish to advocate. I agree with Weatherson about the mechanisms of meaning fixation—i.e. the nature of the type-T facts—and I agree that in standard cases of vague discourse they fail to single out a unique intended interpretation. But contra Weatherson, this is not what vagueness consists in. Semantic indeterminacy and vagueness are distinct phenomena.¹⁵ We can have vagueness without semantic indeterminacy—or at least, we could have, if we made the effort to fix a unique fuzzy interpretation for some vague discourse—and we can have semantic indeterminacy without vagueness. The latter is, as argued in Chapter 4, what classical plurivaluationism gives us.

Consider also Burgess (1990, 418), who says that what generates the Sorites paradox in the first place is that we have no answer to the question as to “what, if anything, determines the boundaries of vague concepts”.¹⁶ I disagree. What generates the paradox is an issue to do with the nature of the boundaries drawn by vague predicates: these predicates conform to Closeness, and hence (as discussed in §3.5.2) draw blurred boundaries. What solves the paradox is adopting a model theory with degrees of truth, which can accommodate such boundaries. Now there is a quite distinct question: what determines these boundaries? Considering this question, we see that we cannot maintain that there is a unique intended interpretation, but instead need to posit semantic indeterminacy. But the question of boundary determination (in my terms, the location problem) is distinct from the question of the nature of the boundaries drawn (in my terms, the jolt problem), and the indeterminacy which we posit in response to the boundary-determination question is separate from the fuzziness which we posit in response to the nature-of-the-boundary question.

¹⁵ See §3.2 above.
¹⁶ Emphases removed. Cf. also McGee 1997, 142, 157; and Field 1974, 205 n. 5; 2000.

Classical plurivaluationism posits only semantic indeterminacy. As we have seen in Chapters 2 and 4, it solves the location problem faced by epistemicists—but it runs headlong into the jolt problem. (In the terms of the previous paragraph, it solves the boundary-determination problem, but not the nature-of-the-boundary problem.) No classical model is an acceptable interpretation of vague language (although it might be an admissible precisification of an intended non-classical interpretation). Adding more classical interpretations—that is, positing many acceptable ones, rather than one intended one—solves the problem of how our practice could associate our language with a unique interpretation (by denying that there is a unique intended interpretation), but it goes no way at all to solving the problem that the account cannot accommodate vagueness. The fuzzy account (with a unique intended fuzzy interpretation), on the other hand, solves the jolt or nature-of-the-boundary problem—it accommodates predicates with blurred boundaries, it accommodates vagueness (understood in terms of Closeness)—but it runs into the location or boundary-determination problem. Fuzzy models are acceptable as interpretations of vague language. However, there is still the problem of how our practice could associate our language with a unique (fuzzy) interpretation. By combining both semantic indeterminacy (in the absence of a unique intended interpretation) and worldly vagueness (in the fuzzy interpretations themselves), fuzzy plurivaluationism solves both the location problem and the jolt problem.

Let us take stock. I have described a modification of the fuzzy position, in response to the problem of artificial precision.
The problem was that it is highly implausible to suppose that any given vague sentence is assigned some particular unique real number between 0 and 1 as its degree of truth, rather than some other nearby number. It just seems ridiculous to suppose that ‘This cup is red’ is true to degree 0.678, rather than 0.677 or 0.679, or even perhaps 0.6 or 0.7. I have done two things. First, I have discussed the source of this intuition. The problem is not that assigning a unique fuzzy truth value violates the nature of vagueness. Rather, the problem is that we cannot see what could possibly
determine just one value as correct, as opposed to other nearby values. This determination problem is of the same general kind as other problems to do with semantic indeterminacy, such as Quine’s problem of the indeterminacy of translation and Kripkenstein’s sceptical puzzle. Second, I have proposed a modification of the fuzzy picture in order to deal with this problem. The basic fuzzy semantic machinery remains entirely untouched: I retain the notion of a fuzzy interpretation as introduced in §2.2.1 and associated definitions such as that of fuzzy validity introduced in §5.2. I make one core change: instead of supposing that each vague discourse is associated with a unique intended interpretation, I suppose that each discourse is associated with some acceptable interpretations. This change flows directly from my diagnosis of the problem: the meaning-fixing facts impose some constraints on correct interpretations, but not enough to single out a unique such interpretation; the acceptable interpretations are simply the ones that are not ruled out as incorrect by these constraints. The upshot of this change is that a vague sentence is not now assigned some particular unique real number between 0 and 1 as its degree of truth. It is assigned many different degrees of truth—one on each acceptable interpretation—and none of these is more correct than any of the others. This was exactly what our intuitions said should be the case: that this was not the case on the original fuzzy view was precisely the problem with the original version of the fuzzy account. Hence the modification which I have made to the fuzzy account solves the problem of artificial precision.

6.1.4 Linear Ordering

An objection to the fuzzy account which is closely related to the artificial precision objection just discussed is the linear ordering objection.
According to the standard fuzzy account—where there is a unique intended interpretation, and degree of truth simpliciter is degree of truth on the intended interpretation—for any two sentences whatsoever, either they are precisely as true as one another, or one is strictly more true than the other: it cannot be that two sentences are incomparable in respect of truth. Now to many, this linear ordering of sentences with respect to truth seems counter-intuitive. Given any two vague statements A and B, surely—they say—it need not be the case that one is strictly more true than the other, or else that they have exactly the same degree of truth? This objection to the fuzzy view
crops up amongst both friends and foes of the fuzzy approach. Keefe (1998b, 570) writes:

The comparatives corresponding to multi-dimensional predicates typically have indeterminate instances. For example, there are pairs of people about whom there is no fact of the matter who is nicer (more intelligent), nor that they are equally nice (intelligent): the vagueness of “nice” (“intelligent”) is such as to leave the question unsettled.
The point does not apply only to multi-dimensional predicates: Keefe also discusses the example of “a is tall” vis-à-vis “b is red”. Her conclusion is that “we cannot assume that there is always a fact of the matter about which of two borderline sentences is more true” (Keefe 1998b, 570). Williamson (1994, 128) makes the same objection:

Consider . . . a purely comparative statement: (#2) ‘It is wet’ is truer than ‘It is cold’. (#2) is vague. . . . In many contexts it is neither clearly true nor clearly false, attempts to decide it can founder in just the way characteristic of attempts to decide ordinary vague statements in borderline cases, and so on. What needs to be acknowledged is the vagueness of . . . (#2).
Goguen (1967, 1968–9)—a friend of the fuzzy approach—also places a lot of weight on this point, and indeed advocates a generalization of fuzzy set theory in which the set of truth values is only partially ordered. As in the case of the artificial precision objection, we have here both a feeling (that there is something intuitively wrong with the idea that sentences are linearly ordered in respect of truth), and—in Keefe’s case—a diagnosis (that it is the vagueness of certain predicates that generates cases in which certain sentences are incomparable in respect of truth). I disagree with Keefe’s diagnosis, for reasons that will now be familiar. It is not that a linear ordering of sentences would violate the nature of vagueness. As I have stressed to the contrary, if a vague discourse had a unique intended fuzzy interpretation, that would not in itself threaten its vagueness. What, then, about the intuitive reservation itself? Speaking for myself, this reservation has some intuitive force, but not nearly as much as the artificial precision objection. But the beauty of the view that I have advocated is that whether the objection is a good one or not, fuzzy plurivaluationism gets things right. Let me explain.
Suppose that our practice does not determine that the sentence ‘Bob is bald’ should be 0.3 true rather than 0.4 true; nor does it determine that the sentence ‘Harry is nice’ should be 0.3 true rather than 0.4 true. There is still an open question: what, if anything, does our practice determine about the relative degrees of truth of ‘Bob is bald’ and ‘Harry is nice’? Consider, first, some uncontroversial cases. Bill has some hair, but not much, and is a little over six feet in height. Ben’s hair is just starting to thin a bit, and he is a little under six feet in height. Our practice does not determine that ‘Bill is tall’ is 0.95 true rather than 0.97 true, or that ‘Bill is bald’ is 0.4 true rather than 0.45 true, or that ‘Ben is tall’ is 0.93 true rather than 0.95 true, or that ‘Ben is bald’ is 0.05 true rather than 0.1 true. It does, however, determine that ‘Bill is tall’ is truer than ‘Ben is tall’, that ‘Bill is bald’ is truer than ‘Ben is bald’, and that ‘Bill is tall’ is truer than ‘Ben is bald’. The last case is interesting. Bill’s tallness has nothing whatsoever to do with Ben’s baldness; yet it is intuitively obvious that Bill is closer to being definitely tall than Ben is to being definitely bald. So our practice determines not only (some) ordering facts between sentences about the same subject matter—as in the first two cases just mentioned—but also (some) ordering facts between sentences about unrelated subjects (e.g. Bill’s tallness and Ben’s baldness). Now the question is, does our practice determine a relative ordering (in respect of truth) of all sentences—that is, does it determine, for any two sentences whatsoever, either that they are exactly the same in respect of truth, or that one is strictly more true than the other? The objection currently under consideration says No. Suppose this is correct. In particular, suppose that our practice does not determine anything about the relative truth of ‘Bob is bald’ and ‘Harry is nice’. 
In that case, some acceptable interpretations will assign these sentences the same degree of truth, some will assign the former a higher degree of truth than the latter, and some will assign the latter a higher degree of truth than the former. As none of these interpretations is any more or less correct than the others, it will then turn out that we cannot say that ‘Bob is bald’ and ‘Harry is nice’ are exactly as true as one another, or that one is truer than the other. In the circumstances, this is the desired result. On the other hand, suppose that in fact our practice does determine something about the relative truth of ‘Bob is bald’ and ‘Harry is nice’—say, that the former is truer than the latter. Then, as a matter of fact, on every acceptable interpretation, ‘Bob is bald’ is truer than ‘Harry is nice’. (On some interpretations they are
0.31 and 0.3 true, respectively, on others 0.33 and 0.31 true, on others 0.4 and 0.3 true, etc.) Then, while neither sentence has a unique degree of truth, we can say that one sentence is truer than the other (because this holds on every acceptable interpretation). If this is the case, then the linear ordering objection to the original fuzzy account is mistaken—but in this case, my fuzzy plurivaluationist account agrees with the original fuzzy account on precisely the point which the objector has (ex hypothesi) got wrong.

6.1.5 Measuring Truth?

It will be useful to discuss the relationship between fuzzy plurivaluationism and a theory which it might have brought to mind: the theory of measurement. In measuring the temperature of objects, we assign numbers to them. For example, the ice cube in my glass is assigned the number 0 . . . or the number 32 . . . depending upon whether we are measuring temperature in degrees Celsius or Fahrenheit. Similarly, in measuring the lengths of objects, we assign numbers to them. For example, the rat that lives behind the restaurant is assigned the number 30 . . . or the number 12 . . . depending upon whether we are measuring length in centimetres or inches. As these examples show, the numbers we assign in measuring the temperatures, lengths, and other properties of objects are not in general unique. Is this analogous to what is going on in our case, where we have assignments of degrees of truth to sentences which are not unique? That is, are we in essence measuring the degree of truth of sentences?

Here’s how the analogy might be pursued. Suppose we are measuring length. (For simplicity, suppose that the only things we want to measure are straight sticks.) We start with two primitive operations: comparison (we lay two sticks side by side and see whether they are the same length or one is longer) and concatenation (we lay some sticks end to end). We can combine these two operations: we can compare a-concatenated-with-b against c.
This gives us an objective basis from which to start our measurement of length—that is, our assignment of numbers to sticks to represent their lengths. We say that any such assignment of numbers to objects is acceptable as long as it meshes with our operations in certain basic ways. If a is (by comparison) longer than b, then the number assigned to a must be greater than that assigned to b. If a-concatenated-with-b
is (by comparison) the same length as c, then the number assigned to a plus the number assigned to b must equal the number assigned to c; and so on.

In the case of assigning truth values to sentences, someone might think that an analogous thing is going on. Our primitive operations are as follows. We can consider two sentences and decide which seems truer to us (cf. comparing two sticks), and we can combine sentences using the logical operations of negation, disjunction, and conjunction (cf. concatenating two sticks—in the truth case, we have many logical operations, whereas in the length case we have just the one operation of concatenation). These can be combined: we can ask whether we think p ∧ q is truer than r, and so on. Now an acceptable assignment of truth values to sentences will be one which meshes with these operations in certain basic ways. If p is deemed as true as a = a, then p is assigned the value 1. If p is deemed as true as a ≠ a, then p is assigned the value 0. The value assigned to ¬p must be equal to 1 minus the value assigned to p; and so on. Is this the right way of looking at what we are doing here? Are we measuring truth? Goguen (1968–9, 331–2) seems to think so:¹⁷

We certainly do not want to claim there is some absolute [fuzzy] set representing ‘short’. We expect variation with user and context. . . . It appears that many arguments about fuzzy sets do not depend on particular values of functions. . . . This raises the problem of measuring fuzzy sets. . . . Probably we should not expect particular numerical values of shortness to be meaningful (except 0 and 1), but rather their ordering . . . degree of membership may be measured by an ordinal scale.¹⁸
In fact, however, while there are some analogies between what we are doing and the process of measurement, the overall idea is quite different. When we measure the length of a piece of wood, say, we can assign this length the number 30 (centimetres) or the number 12 (inches) or the number 3 (hands), but there is nothing indeterminate about the length of the wood in itself. The piece of wood has one, unique length, and we are simply naming this length in different ways, using different systems of length-names.¹⁹ In the present case, on the other hand, where we are assigning vague sentences degrees of truth, and these assignments are not unique, the whole point is precisely that there is indeterminacy as to how true the sentence is, not simply as to how to name its amount of truth, this being an entirely determinate matter in itself.

¹⁷ But see n. 22 below.
¹⁸ Cf. Sanford 1975, 29, and Machina 1976, 61.
¹⁹ This is why I stressed in §5.1 that we get a clearer picture of measurement if we countenance lengths and so on, rather than just considering mappings directly from objects to real numbers. NB: All the systems of length-names use the same names—the real numbers—but they assign these names according to different rules.

If we were to apply the measurement model to the present case, we would end up with a quite different picture from the one I have presented. The idea would be that the different degrees of truth which we assign to sentences on the different acceptable interpretations are different names for the one unique, unvarying amount of truth possessed by each sentence. This would suggest a picture in which our truth values are not the elements of [0, 1] at all. Rather, the real truth values are these unvarying amounts of truth—and they are simply named by real numbers. But this is not at all the picture we want. Our different acceptable interpretations are not different acceptable descriptions of one unique, underlying semantic reality—they are, to put it dramatically, different semantic realities, each equally real. We have genuine semantic indeterminacy here, not a choice as to how to describe one determinate situation.

Conversely, if we were to apply our own picture to the case of measurement, we would end up with something quite different from the standard story about measurement. Instead of a stick which can be assigned 30 (centimetres) or 12 (inches), we would have a stick whose length was inherently indeterminate, which could be assigned 29 or 30 or 31 centimetres. That is, the different assignments would not represent different systems for assigning lengths, but different lengths within the same system. Now suppose we were trying to measure such inherently indeterminate sticks. We would need some additional machinery to handle the non-uniqueness of assignments within one measurement system, as well as the machinery (from standard measurement theory) for handling the non-uniqueness of acceptable measurement systems. Now there is no reason to think that this extra machinery would simply be the same as the original machinery, applied again. Indeed there are reasons, which will emerge below, why this will not work. But this means that our case—where the indeterminacy of assignments of truth value is analogous not to a multiplicity of acceptable measuring systems, but to indeterminacy of the correct length assignment within one such system—is not best handled by the machinery of standard measurement theory.

This point will come out more clearly if we examine the issue of different sorts of measurement scale. Consider transformations of the real numbers (functions from the reals to the reals). These fall into various types, for example:

• A function φ is a similarity transformation if there exists a positive real number α such that for every x, φ(x) = αx.
• A function φ is the identity transformation if for every x, φ(x) = x.
• A function φ is a monotone transformation if whenever x < y, φ(x) < φ(y).
• A function φ is a linear transformation if there exists a positive real number α and a real number β such that for every x, φ(x) = αx + β.
Now, suppose that we are measuring a particular attribute A. We have a mapping f from objects to the real numbers, with the number assigned to an object representing its quantity of A. Suppose that there are other mappings f₁, f₂, f₃, . . . from objects to the real numbers which would serve just as well to measure the attribute A. We now determine the relationships between these mappings. There will be a type of transformation T such that if fᵢ is an admissible mapping, then there is a T transformation φ with fᵢ = φ ◦ f. According to the type of transformation T involved, we say that the attribute is measured on this or that type of scale. For example:

• Where T is a similarity transformation, we say that A is measured on a ratio scale.
• Where T is the identity transformation, we say that A is measured on an absolute scale.
• Where T is a monotone transformation, we say that A is measured on an ordinal scale.
• Where T is a linear transformation, we say that A is measured on an interval scale.
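The transformation types and the scale types just listed can be illustrated with a minimal Python sketch. The function names (`similarity`, `linear`, `is_monotone_on`) and the sample unit conversions are assumptions introduced purely for illustration; the monotonicity check inspects only a finite sample, so it is a spot check rather than a proof.

```python
from typing import Callable, Iterable

Transform = Callable[[float], float]


def similarity(alpha: float) -> Transform:
    """phi(x) = alpha * x with alpha > 0: relates admissible ratio-scale mappings."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return lambda x: alpha * x


def linear(alpha: float, beta: float) -> Transform:
    """phi(x) = alpha * x + beta with alpha > 0: relates admissible interval-scale mappings."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return lambda x: alpha * x + beta


def is_monotone_on(phi: Transform, samples: Iterable[float]) -> bool:
    """Check that x < y implies phi(x) < phi(y) on a finite sample only."""
    xs = sorted(set(samples))
    return all(phi(a) < phi(b) for a, b in zip(xs, xs[1:]))


# Illustrative unit conversions (the kilogram-to-pound factor is approximate).
kg_to_lb = similarity(2.20462)
celsius_to_fahrenheit = linear(9 / 5, 32)

# Similarity and linear transformations with alpha > 0 are also monotone.
samples = [-40.0, 0.0, 10.0, 20.0, 100.0]
print(is_monotone_on(kg_to_lb, samples))
print(is_monotone_on(celsius_to_fahrenheit, samples))

# Ratios survive a similarity transformation but not a general linear one.
print(kg_to_lb(4.0) / kg_to_lb(2.0))
print(celsius_to_fahrenheit(20.0) / celsius_to_fahrenheit(10.0))
```

The last two lines show one reason the scale type matters: "twice as much" is preserved by every admissible re-scaling on a ratio scale, but not on an interval scale.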
Thus mass is measured on a ratio scale: mass as measured in kilograms (say) may be converted to mass as measured in pounds (say) by multiplying by a constant. Counting is an example of an absolute scale: when we want to measure how many objects there are in a given collection—i.e. the cardinality of the collection—there is only one admissible way of assigning
a number to that collection. Temperature is measured on an interval scale: temperature as measured in degrees Celsius (say) may be converted to temperature as measured in degrees Fahrenheit (say) by multiplying by a constant and then adding a constant.²⁰

We might now ask, analogously, whether there is a type T of transformation on the reals in the interval [0, 1] such that, for a given discourse, if 𝔐₁ = (M₁, I₁) and 𝔐₂ = (M₂, I₂) are acceptable interpretations of that discourse, then (M₁ = M₂ and) there is a T transformation φ with I₂ = I₁^φ, where I₁^φ is defined as follows:

• where I₁ assigns a sentence letter the value x, I₁^φ assigns it φ(x)
• where I₁ assigns a predicate a function f from (n-tuples of members of) the domain to [0, 1], I₁^φ assigns that predicate the function φ ◦ f
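The construction of the transformed interpretation from I₁ and φ can be sketched in Python as follows. This is only an illustrative model: the `Interpretation` class, the `transform` function, the toy degrees for ‘bald’, and the choice of squaring as φ are all assumptions made for the example, and nothing here settles whether such a φ belongs to a suitable type T.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Degree = float
Phi = Callable[[Degree], Degree]
Extension = Callable[..., Degree]  # maps (n-tuples of) domain members to [0, 1]


@dataclass
class Interpretation:
    letters: Dict[str, Degree]        # sentence letter -> degree of truth
    predicates: Dict[str, Extension]  # predicate -> fuzzy extension


def transform(interp: Interpretation, phi: Phi) -> Interpretation:
    """Build the phi-transformed interpretation.

    Sentence-letter values x become phi(x); each predicate extension f
    becomes the composition phi after f.
    """
    new_letters = {s: phi(x) for s, x in interp.letters.items()}
    new_predicates = {
        # f=f binds the current extension, avoiding late-binding surprises.
        name: (lambda *args, f=f: phi(f(*args)))
        for name, f in interp.predicates.items()
    }
    return Interpretation(new_letters, new_predicates)


# Hypothetical toy data, chosen only to show the mechanics.
bald_degrees = {"Bob": 0.3, "Telly": 1.0}
i1 = Interpretation(
    letters={"p": 0.5},
    predicates={"bald": lambda x: bald_degrees[x]},
)


def square(x: Degree) -> Degree:
    """An arbitrary map from [0, 1] to [0, 1], used as a stand-in for phi."""
    return x * x


i2 = transform(i1, square)
print(i2.letters["p"], i2.predicates["bald"]("Bob"))
```

The domain is left implicit here, which matches the requirement that the two interpretations share a domain and differ only in the values they assign.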
One reasonably natural suggestion would be to place the following necessary and sufficient conditions for functions φ : [0, 1] → [0, 1] to be of the relevant type T:

1. φ(0) = 0 and φ(1) = 1
2. if x < y then φ(x) < φ(y)
3. φ(1 − x) = 1 − φ(x)

This definition settles some things that the fuzzy plurivaluationist account presented above leaves open. For example, as discussed above, the basic account does not impose an overall linear ordering of sentences with respect to truth, but the above conditions do determine such an ordering. One worry about this particular proposal regarding T is that it seems to leave no room for our central notion of two sentences being very close (in an absolute sense) in respect of truth.²¹ But furthermore, it is not even clear that the acceptable interpretations are related by any sort of transformation of truth values, however complicated. For it seems quite possible that there might be predicates P and Q, and objects a and b, such that 0.3 is an acceptable degree of truth for both Pa and Qb, but 0.4 is an acceptable degree of truth for Pa but not for Qb. I cannot see anything which would rule out this possibility a priori. Yet, if this is possible, then it simply cannot be the case that any two acceptable interpretations are related by a T transformation in the way described above, no matter what the conditions on T.

²⁰ For more details on measurement theory see Coombs et al. 1954; Suppes and Zinnes 1963; and Krantz et al. 1971.
²¹ Recall the discussion in §3.4.1.

This brings us back to the point made earlier, that at a fundamental conceptual level, what we are doing is different from measurement. We are not (to repeat) assigning different numbers to a sentence as different ways of naming its fixed amount of truth: we are assigning it different amounts of truth. The analogous case in the field of (say) length measurement is not (again to repeat) where we assign a piece of wood both the numbers 12 (inches) and 30 (centimetres), but where we have a piece of wood whose actual length is indeterminate, and where we might, say, assign the wood any number between 29 and 31 centimetres. Now the fact that we may admissibly assign this piece of wood any such number of centimetres does not mean that if this other piece of metal—also inherently indeterminate in length—is admissibly assigned the length 30 centimetres, then automatically it is also admissibly assigned any length between 29 and 31 centimetres. The length of the metal might be less indeterminate than that of the wood: it might ‘fluctuate’ only between 29.99 and 30.01 centimetres! This is the sort of thing we have in the case where we assign acceptable degrees of truth to sentences. Perhaps two sentences can both acceptably be assigned the value 0.5. This does not mean that if one is also acceptably assigned 0.4, so is the other. That does not seem to me to be something that we want to build into the very fabric of our theory. Thus I reject the idea that the acceptable interpretations must be related by some sort of T transformation on [0, 1] in the way indicated above.²²

6.1.6 Are There Any Acceptable Fuzzy Interpretations?

The consideration of measurement raises a question which is worth asking of the fuzzy plurivaluationist view that I have presented. In measurement theory, there are two parts to showing that an attribute (length, mass, heat, etc.) can be measured by real numbers.
We must prove a uniqueness theorem, which tells us up to what sort of transformation our assignments of numerical values are unique. That’s the part we looked at in the previous section. We must also prove a representation theorem, which shows that the reals are a suitable structure for measuring the quantity in question in the first place. This is not automatic. For example, certain preference orderings amongst options cannot be represented by assignments of real-number utility values to options.²³ Thus we have a quantity—how much one likes or values something—which cannot (in certain situations) be measured using real numbers at all (as opposed to being able to be measured by them, but not in just one unique way—as is the case with the quantities of heat, mass, and so on).

²² It might still be the case that for a particular predicate, the different acceptable extensions of this predicate are related by some sort of transformation of [0, 1]—with different transformations being appropriate for different predicates (e.g. ‘short’ is measured on one kind of scale, while ‘beautiful’ is measured on another kind of scale). This may have been what Goguen had in mind in the passage quoted above; cf. also Norwich and Turksen 1982.

In the present case, I have argued that there is not (in general) just one correct assignment of truth values to sentences—in other words, there is not just one intended interpretation of a discourse. This corresponds to the uniqueness part of the measurement project. What about the representation part? The worry might be that no fuzzy interpretation is acceptable. Consider, for example, the linear ordering worry. Someone gripped by this worry might think that my answer was not good enough. Recall my answer: on each acceptable interpretation, all sentences are linearly ordered in respect of truth (each acceptable interpretation is a standard fuzzy interpretation, and the linear ordering worry arises precisely because such interpretations linearly order sentences in respect of truth); however, it might be that different acceptable interpretations order two given sentences differently—one makes p truer than q, one makes them exactly the same in respect of truth, and one makes q truer than p—so that, overall, no one ordering can be said to be correct. The objector might think, however, that any interpretation which linearly orders all sentences in respect of truth is thereby unacceptable.
For nothing about our practice mandates that p should be truer than q, or the same as q in respect of truth, or less true—so are not all three sorts of interpretations mentioned above (ones which make p truer than q, the same in respect of truth, or less true than q) therefore unacceptable? If this were correct, there would be no acceptable fuzzy interpretations of any vague discourse, which would mean, on my approach, that all vague statements mean nothing.²⁴

²³ See e.g. Debreu 1954 and Hansson 1968, §IV.
²⁴ This is an analogue of the criticism of classical plurivaluationism—that no classical model is acceptable as an interpretation of vague discourse—made in Ch. 4.

As I stressed earlier, my view is that the true semantic story regarding vague sentences is that (in general) they have many acceptable fuzzy interpretations. This plurivaluationist view contrasts with the supervaluationist view that the true semantic story regarding vague sentences is distilled from their many admissible precisifications, where none of these precisifications is a correct interpretation of a vague sentence as it stands. So in response to the present worry, I would not want to concede that my fuzzy models are not acceptable interpretations of vague sentences as they stand. But I do not have to: the solution lies elsewhere. Distinguish what is acceptable in an interpretation from what is mandatory. Our practice (together with the facts about eligibility, simplicity, and so on) imposes certain constraints on correct interpretations of our discourse—for example, that they must not assign to ‘is bald’ a function which maps Telly Savalas to a value other than 1; that they must not assign functions to ‘is red’ and ‘is orange’ which both map some object in the domain to 1; and so on. Anything directly required by these constraints is mandatory; anything not ruled out by the constraints is acceptable. Now what I want to say is that nothing about our practice requires a particular ordering of ‘Bob is bald’ and ‘Harry is nice’ in respect of truth (it does not require that the former is truer than the latter, or that the latter is truer than the former, or that the two are equally true), but that any particular ordering is acceptable—that is, nor does our practice require that they should not be ordered in any particular way (it does not require that it is not the case that the former is truer than the latter, it does not require that it is not the case that the latter is truer than the former, and it does not require that it is not the case that the two are equally true). Intuitively, this seems right: we have not mandated that these sentences should be ordered in any particular way, but neither have we mandated that they must be incomparable.
When we first consider the linear ordering worry for the standard fuzzy view—where that view tells us that on the unique intended interpretation, ‘Bob is bald’ is (say) strictly truer than ‘Harry is nice’—we think: we did not mandate this ordering; what we have fixed leaves it open which sentence is truer; what is wrong with this other interpretation, on which ‘Harry is nice’ is strictly truer than ‘Bob is bald’? This other interpretation seems just as good as the one which the fuzzy theory deems the unique intended interpretation—it does not seem just as bad (because it linearly orders sentences in respect of truth). On the contrary, the whole point was that all these interpretations are equally compatible with our constraints on correct interpretations, so the fuzzy theory has no business saying that only one of them is correct.
If someone still wishes to assert that we have positively mandated that some sentences be incomparable, we need only ask them the question which we asked of the original fuzzy account: what is it about our practice that places this incomparability constraint on correct interpretations? I think it is overwhelmingly plausible that nothing does. The relevant point about our practice is simply that nothing about it mandates anything about the ordering of ‘Harry is nice’ and ‘Bob is bald’: not that they should be ordered this or that way, or that they should not be ordered at all (i.e. should be incomparable). Our practice is simply silent on this matter. It was precisely this sort of silence that set the artificial precision and linear ordering problems for the fuzzy view in the first place.
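The distinction between what is mandatory and what is merely acceptable can be illustrated with a small sketch. The three interpretations and their degree values below are invented for illustration; nothing in the text fixes them. An ordering is treated here as mandatory if it holds on every acceptable interpretation, and as acceptable if it holds on at least one.

```python
# Hypothetical acceptable interpretations: each maps a sentence to a degree
# of truth in [0, 1]. The specific numbers are assumptions for illustration.
acceptable = [
    {"Bob is bald": 0.7, "Harry is nice": 0.4},
    {"Bob is bald": 0.5, "Harry is nice": 0.5},
    {"Bob is bald": 0.6, "Harry is nice": 0.8},
]

def ordering(interp, p, q):
    """Return how p compares with q in respect of truth on one interpretation."""
    if interp[p] > interp[q]:
        return "truer"
    if interp[p] < interp[q]:
        return "less true"
    return "equally true"

def orderings(interps, p, q):
    """Collect every ordering of p and q that some interpretation delivers."""
    return {ordering(m, p, q) for m in interps}

def mandatory(interps, p, q):
    """An ordering is mandatory only if all interpretations agree on it."""
    found = orderings(interps, p, q)
    return next(iter(found)) if len(found) == 1 else None

found = orderings(acceptable, "Bob is bald", "Harry is nice")
mandated = mandatory(acceptable, "Bob is bald", "Harry is nice")
print(sorted(found))
print(mandated)
```

In this toy setup every interpretation orders the two sentences, so none makes them incomparable, yet no single ordering is shared by all of them. That is one way of picturing the sense in which practice is silent about the ordering.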
6.2 Sharp Boundaries
The following objection has often been made against the fuzzy approach. If a Sorites series begins with a full-fledged F object and ends with a full-fledged non-F object, then according to the fuzzy account, there will be a last full-fledged F object in the series—for example, a last bald man. In this connection one often hears talk of sharp boundaries. The claim is made that just as the classical view involves a sharp boundary between the bald things and the non-bald things, so the fuzzy view involves a sharp boundary between the 1-bald things and the rest—and hence is no better than the classical view.²⁵
²⁵ See e.g. Wright 1987, 254–6; Tye 1996, 219; and Keefe 2000, 113. Cf. also Wright 1975, 349–50; Sainsbury 1991, 169; and Williamson 2003, 695, who refers to what I am calling the ‘sharp boundaries problem’ as the ‘problem of higher-order vagueness’.
At first sight, it may seem that this objection does not apply to fuzzy plurivaluationism, as opposed to the standard fuzzy view in which there is a unique intended interpretation. For on the fuzzy plurivaluationist view, there is no overall last bald man. On any acceptable interpretation, there is a last man in the series who is mapped to 1 by the function assigned to ‘is bald’ on that interpretation. In general, however, we may expect this to be a different man on different interpretations. Thus, overall, there is no particular man of whom it may be said that he is the last bald man.
The problem quickly reasserts itself, however. For there is a last man m in the series such that the sentence ‘m is bald’ is 1 true on every acceptable
interpretation. So, while there is no man such that we can talk as though he is the last bald man, there is a last man such that we can talk as though he is bald. So in this sense, there is still a last bald man, according to fuzzy plurivaluationism.
One obvious way of trying to avoid this outcome is by saying that the notion of an acceptable interpretation is vague. In §6.2.1, I discuss and reject this approach. I then argue, in §6.2.2, that the ‘problem’ of the last bald man is not a problem at all. There are two sorts of reason why one might object to the existence of a last bald man: reasons based on the nature of vagueness and reasons based on the nature of the facts which determine meaning. Neither counts against the existence of a last bald man. So we may happily accept that there is a last bald man in our Sorites series.
6.2.1 Vagueness in the Metalanguage
One idea that has been mentioned frequently in the literature—most often in connection with the artificial precision problem for the standard fuzzy view—is that of having a degree-theoretic account of vagueness presented in a vague metalanguage. Following on from the passage quoted on p. 294 above, Williamson writes:
Why should the vagueness of . . . (#2) be hard to acknowledge? If a vague language requires a continuum-valued semantics, that should apply in particular to a vague meta-language. The vague meta-language will in turn have a vague meta-metalanguage, with a continuum-valued semantics, and so on all the way up the hierarchy of meta-languages. (Williamson 1994, 128)²⁶
The way in which a story along these lines would help with the last bald man problem is as follows. If the notion of an acceptable interpretation is a vague one, then there is some vagueness as to who is the last man who is 1-bald on every acceptable interpretation. This means that we do not have to accept a precise cut-off between the last man who is 1-bald on every acceptable interpretation and the rest of the men in the series.
²⁶ For discussions or mentions of similar or related views, see Cook 2002; Edgington 1997, 297, 310; Field 1974, 227; Horgan 1994, 160; Keefe 2000, 117–21; McGee and McLaughlin 1995, 238; Rolf 1984, 222; Sainsbury 1997 [1990], 260; Tye 1990, 551; 1997 [1994], 287; 1995, 16; 1996, 219–20; Varzi 2001, 58–9; and Williamson 1994, 130; 2003, 695.
Of course, there is a suspicion that the problem will re-emerge at a higher level—but my objection to the present proposal is more fundamental. I do not clearly understand the proposal, and I think that this is a problem with the proposal, not with my powers of comprehension. Goguen (1968–9, 327) writes (my emphasis):
Our models are typical purely exact constructions, and we use ordinary exact logic and set theory freely in their development. . . . It is hard to see how we can study our subject at all rigorously without such assumptions.
I agree entirely. I understood fuzzy model theory in the first place because I took it to be a piece of standard mathematics. I understood the definition of a fuzzy set as a function from a background set to [0, 1], the definition of a model as a set of objects together with a function from items of our standard first-order language to elements of the domain (etc.), the recursive definition of the truth value of a closed formula on an interpretation, and so on, to have their place alongside the definitions and constructions that one finds in standard mathematics textbooks. These definitions and constructions are all presented in a precise language, governed by classical logic and semantics, of which every competent mathematician has a working understanding. Now if you turn around at the end of your presentation of fuzzy model theory and tell me that the language in which you made your presentation was governed by the very semantics that you just presented, then I have to say that I did not understand your presentation at all. I am back at square one. I thought you were presenting a piece of normal mathematics, and I know how to understand that sort of thing. If you were not, then I do not know what you were doing, and I do not know how to understand it. Of course, what you say does conjure up some sort of picture for me—it is not as if you uttered complete nonsense—but it is not the sort of perfectly clear picture I get when I work through and understand a piece of normal mathematics. And this is not because I have not worked hard enough: it is because—by your lights as well as mine—what you are doing is not presenting a piece of normal mathematics, in normal mathematical language.²⁷ ²⁷ Cf. the argument against vague identity in Smith 2008.
We come to the study of vagueness already able to understand vague statements such as ‘Bob is bald’ in one sense, but not in another sense. We understand them in the sense that we can use them, and respond appropriately to their use by others, in a wide range of situations. We do not understand them, in the sense that we lack a clear theoretical understanding of what is involved in their semantics—of what is going on between a person and the world when she says ‘Bob is bald’. That we lack such an understanding is precisely what drives the study of vagueness and (given the ubiquity of vagueness) makes it so interesting. Now there is a widely accepted standard of what counts as giving a clear theoretical understanding of the semantics of a certain type of discourse: we give a system of model theory for that discourse, of the standard set-theoretic sort. That is the standard I have adopted in this book. Now if we present our semantics, but then say at the end: “Understand what I have just said in just the way you understand vague language. That is, treat my explanation as having been given not in a precise language, but in a language that is itself vague in the very same way as the language we are investigating,” then we simply have not lived up to this standard. Of course, this does not mean that our account is gibberish: we already understand vague language in the first sense indicated above, and so we can also understand the newly presented semantics for vagueness in this sense. What it does mean, however, is that we have abandoned the project of attaining a clear theoretical understanding of the semantics of vague discourse (given our standard of clarity). But that is much too high a price to pay. It defeats our purpose entirely, given that our driving goal was to find such a clear understanding of the semantics of vague discourse. At this point I should ward off a potential confusion. My goal is not to eliminate vagueness.
I do not ask, for any vague statement, for a non-vague statement which has the same content. The content of a claim about what is going on, between me and the world, when I say that Bob is bald, is quite different from the content of the (vague) claim that Bob is bald. I seek a clear, precise claim with the former content. Williamson (1994, 191) writes: Formal semantic treatments of vague languages—many-valued logics, supervaluations and the like—are characteristically framed in a meta-language that is conceived as precise. Thus one cannot say in the precise meta-language what utterances in the vague object-language say, for to do so one must speak vaguely;
one can only make precise remarks about those vague utterances. Since the expressive limitations of such a meta-language render it incapable of giving the meanings of object-language utterances, it can hardly be regarded as adequate for a genuine semantic treatment of the object-language. . . . the formality of the semantics [comes] at the cost of giving up the central task of genuine semantics: saying what utterances of the object-language mean.
“Saying what ‘Bob is bald’ means” might mean making a (different) claim which has the same content as ‘Bob is bald’, or it might mean giving an account of the semantic relations between parts of the sentence ‘Bob is bald’ and parts of the world, and of how these combine to determine the truth status of the whole sentence—i.e. it might mean (as I put it earlier) saying what is going on between a person and the world when she says ‘Bob is bald’. It is only in the latter sense of ‘saying what utterances mean’ that it is the central task of genuine semantics to say what utterances of the object language mean. But fulfilling this task involves, precisely, making remarks about utterances of the object language (as Williamson puts it), rather than making remarks which have the same content as utterances of the object language. Semantics is not translation. So even if vagueness is not eliminable—if no claim in a precise language has the same content as some vague claim—this does not threaten the capacity of a precise metalanguage to say what vague utterances mean (in the second sense indicated above), and hence to carry out the central task of genuine semantics. Indeed, quite the contrary, as I have argued. We do not understand the relationship between vague language and the world, and we want an explanation of it. The proponent of fuzzy semantics presented in a vague metalanguage appears to give us such an explanation—that is, fuzzy model theory—but then she adds at the end that we must understand that the relationship between what she has said and the relationship between vague language and the world is the same as the relationship between vague language and the world. But what is this relationship? Our problem was precisely that we do not have a clear theoretical understanding of this relationship, and the fuzzy metalinguist has not provided such an understanding.
6.2.2 The Last Bald Man
We have rejected one way of trying to avoid the last bald man problem.
Let us look now at the problem itself, and see whether it is really something that we need to avoid at all. Recall the original objection to the standard
fuzzy position. If a Sorites series begins with a full-fledged F object and ends with a full-fledged non-F object, then according to the fuzzy account, there will be a last full-fledged F object—for example, a last bald man. The point of the objection is supposed to be that the fuzzy account is therefore no more intuitively acceptable than the classical account. Why is the supposition of a last (full-fledged) bald man supposed to be counter-intuitive? If there was no last bald man, then all the men in the series would be bald—even the man at the end with a full head of hair. This is obviously wrong, and thus it seems that it is in fact the denial of the claim that there is a last bald man that is counter-intuitive. But now, if there is nothing wrong with the claim that there is a last bald man in the series, what is the objection to a classical semantics for vague language? Well, in the classical picture, the step from the last bald man to the next man, while a very small step in baldness-relevant respects, corresponds to a large step in the truth value of the claim ‘This man is bald’. In the fuzzy framework, on the other hand, the step from the last bald man to the next man corresponds to a very small step in the truth value of the claim ‘This man is bald’: the step from the value 1 to some nearby value such as 0.999. The problem with the classical picture is not its positing of a last bald man per se: it is the fact that in the classical framework, the step from this man to the next must involve a violation of Closeness, and hence conflicts with the vagueness of ‘is bald’. In the fuzzy framework, on the other hand, the positing of a last bald man does not involve a violation of Closeness. In this context, the usual talk of sharp boundaries is potentially misleading, because it hides a very important distinction. 
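A rough numerical sketch of this contrast follows. The series length, the cut-off, and the linear membership function are illustrative assumptions, not part of the account; the code only reports the largest jump in truth value between adjacent men in the series.

```python
# Sorites series: man i has i hairs. All numbers are illustrative assumptions.
N = 1001          # hypothetical series: 0 to 1000 hairs
CUTOFF = 400      # hypothetical position of the last (full-fledged) bald man

def classical_bald(hairs):
    """Classical value: 1.0 (True) up to the cut-off, 0.0 (False) after it."""
    return 1.0 if hairs <= CUTOFF else 0.0

def fuzzy_bald(hairs, last_full=CUTOFF, first_zero=1000):
    """Fuzzy value: 1 up to last_full, then a linear decline to 0."""
    if hairs <= last_full:
        return 1.0
    if hairs >= first_zero:
        return 0.0
    return (first_zero - hairs) / (first_zero - last_full)

def largest_step(valuation):
    """Largest difference in truth value between adjacent men in the series."""
    values = [valuation(h) for h in range(N)]
    return max(abs(a - b) for a, b in zip(values, values[1:]))

classical_jump = largest_step(classical_bald)
fuzzy_jump = largest_step(fuzzy_bald)
print(classical_jump, fuzzy_jump)
```

Both valuations in this sketch have a last man who receives the value 1 (the man with 400 hairs). They differ only in what happens at the next step: the classical valuation drops all the way to 0, while the fuzzy one moves to a nearby value.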
In one sense, a ‘sharp boundary’ is one that involves a violation of Closeness: we have two objects a and b that are very close in F-relevant respects; but there is a sharp boundary which cuts between a and b, resulting in Fa and Fb not being very close in respect of truth. Now the classical boundary between the bald things and the non-bald things is sharp in this first sense: the boundary is such that Bob (the last bald man) and Bill (the next man) differ by just a hair, and yet ‘Bob is bald’ is True while ‘Bill is bald’ is False. On the other hand, the fuzzy boundary between the 1-bald things and the rest is not sharp in this first, Closeness-violating sense: Bob and Bill differ by just a hair, ‘Bob is bald’ is 1 true, and ‘Bill is bald’ is 0.999 true—hence these two claims are very close in respect of truth. In the second sense, a theory involves a ‘sharp boundary’ if it assigns a different semantic status to Fa and
Fb, even though a and b are very close in F-relevant respects. In this sense, both the classical and the fuzzy views clearly involve sharp boundaries. But that’s no problem: any non-vague theory of vagueness will involve sharp boundaries in this sense (unless it assigns the same semantic status to Fx for every object x, in which case it is a hopeless theory), and, as discussed in the previous section, we want a non-vague theory of vagueness.
Letting B(n) be ‘A man with exactly n hairs on his head is bald’, and Val(S) be the degree of truth of the sentence S, Schwartz (1990, 46) writes:
Consider the set {B(10^8), B(10^8 − 1), . . . , B(0)} . . . since the set . . . is finite and the values are well-ordered, there must be a first n such that Val(B(n)) > 0. In other words there will be a precise, to the hair, dividing line between definitely non-bald and borderline non-bald. This means that there is an absolute precision implied by the degrees of truth approach that is inconsistent with the vagueness of ‘bald’. A precise and unknown dividing line between definitely non-bald and borderline is just as contrary to the vagueness of ‘bald’, and just as unbelievable, as would be a precise dividing line between non-bald and bald. The same kind of argument could presumably be repeated for any vague term.
This is precisely what I deny. The dividing line in the fuzzy picture between definitely non-bald and borderline is not just as contrary to the vagueness of ‘bald’ as would be a precise dividing line between non-bald and bald—for, unlike the latter sort of dividing line, the former sort does not involve a violation of Closeness.
A variant of the last bald man objection says that the fuzzy logician is committed to a threefold division between the sentences true to degree 1, those true to degree 0, and the remainder—and by thus positing sharp lines between the clear cases and the borderline cases, and the clear countercases and the borderline cases, falls foul of the original problem of higher-order vagueness (as described in §3.5.5).²⁸ My response to this objection is the same as before. The force of the original problem of higher-order vagueness against theories which approach vagueness in terms of three categories (e.g. true, false, and undefined) stems from the fact that if we have only three categories, we cannot accommodate Closeness: these three categories bring with them boundaries which are sharp in the first sense outlined above. The fuzzy theory does not face this problem: for, while we may distinguish borderline cases of ‘bald’ (say) from objects assigned 0 or 1 by the fuzzy set of bald things, we do not get boundaries between the clear cases and the borderline cases, and the clear countercases and the borderline cases, which are sharp in the objectionable, Closeness-violating sense. Thus we do not face the problem of having to blur these boundaries by moving to a higher level of borderline cases, i.e. the original problem of higher-order vagueness.
²⁸ See Sainsbury 1997 [1990], 256, and Keefe 2000, 131.
Considerations to do with the nature of vagueness do not, then, count against the existence of a last bald man in the original fuzzy view. Nor do such considerations count against the existence, in my fuzzy plurivaluationist view, of a last man who is 1-bald on every acceptable interpretation. On my view, ‘is bald’ satisfies Closeness on every acceptable interpretation, and hence we can say overall that it satisfies Closeness. Thus, the existence of a last bald man does not threaten the vagueness of ‘is bald’ on my view.
What about considerations to do with the determination of meaning? Someone might respond to my fuzzy plurivaluationist view that not only does our practice not suffice to fix a unique intended interpretation of our discourse: it does not even suffice to fix a unique set of acceptable interpretations. For after all, what is it about our practice that could make an interpretation which assigns ‘is bald’ a function which maps Bill to 0.99 acceptable, but an interpretation which assigns ‘is bald’ a function which maps Ben—who differs from Bill by just a hair—to 0.99 unacceptable?²⁹
Let’s take stock of the situation. On the one hand, we have a vague discourse. On the other hand, we have all the fuzzy interpretations of it. The question is, what does the practice of speakers of the discourse determine concerning which of those interpretations are acceptable and which incorrect? First, an undeniable fact is that our practice does not determine nothing.
It is not in doubt that some interpretations are incorrect: for example, any interpretation which assigns ‘is bald’ a function which maps Telly Savalas to 0. Second, the upshot of §6.1 was that our practice does not determine a unique intended interpretation. From here, we have several options.
²⁹ Cf. Field 1974, 226–7.
One option is to accept that there are no acceptable fuzzy interpretations. This means that either vague language is meaningless, or its correct interpretation is not a fuzzy interpretation at all. The former option is a non-starter. The latter option does not help: if we replace fuzzy interpretations with some other sort of interpretations, we will still face the problem of how our practice determines a unique one, or set, of these as correct.
A second option is to posit a fuzzy set of acceptable interpretations. But this does not help—indeed, it makes the problem worse. Suppose we admit degrees of correctness for interpretations. There would still be a last bald man: the last man who is 1-bald on every interpretation which is 1-correct. All we have done is make the determination problem harder: now our practice has to determine, for each interpretation, not simply whether it is acceptable, but to what degree it is acceptable! The same comments would apply if we opted for some sort of ramified or iterated fuzzy set of acceptable interpretations: this move would, if anything, make the determination problem harder.
A third option would be to say that there is something vague about the set of acceptable interpretations, but refuse to specify exactly what one means by this (e.g., one does not cash this out as meaning that the set of acceptable interpretations is a fuzzy set). This proposal is to be avoided, for the sorts of reason given in §6.2.1: it sacrifices the clarity and perspicuity that we get with a non-vague account of the semantics of vagueness.
The fourth and final option is to say that our practice determines a classical/crisp/precise set of acceptable interpretations. Now in opposition to this view, we have the following instance of the argument structure from §6.1:
1. Facts of type T do not determine a unique set of acceptable interpretations of discourse D.
2. No facts of any type other than T are relevant to determining the acceptable interpretations of D.
3. From (1) and (2): All the facts together do not determine a unique set of acceptable interpretations of D.
4. It cannot be a primitive fact—i.e. a fact not determined by other facts—that some interpretation M is an acceptable interpretation of D.
5. From (3) and (4): It is not a fact at all that D has a unique set of acceptable interpretations.
I do not accept the conclusion, and I think that step 1 is the false premiss. Suppose we conduct a survey in which we ask speakers to rate the degree of baldness of certain men. The survey runs through a Sorites series of
men, from clearly bald to clearly non-bald. It is to be expected that the results of the survey bear some important relation to the facts as to the acceptable interpretations of ‘is bald’—assuming that facts about usage (partly) determine meaning. It is not to be expected that we will be able to extract from the survey results a unique correct extension for ‘is bald’. Undoubtedly, not every participant will agree as to who is the last 1-bald man: some will say the man with zero hairs, some the man with 30 hairs, and so on. Any choice amongst these seems entirely arbitrary. Thus, it seems utterly implausible to suppose that our practice determines a unique intended interpretation. However, we do not face the same problems in trying to extract from the survey results a unique, crisp set of acceptable interpretations. For it is simply not the case that for anything you might imagine saying about the degrees of baldness of the men in the series, some speaker will say this thing. Thus, some things just will not be said by any participant. It is entirely non-arbitrary to say that interpretations which deem true these things which nobody will say are incorrect, and that the rest—those which match what at least one participant says—are acceptable. Thus we can non-arbitrarily extract a unique set of acceptable interpretations from the survey, while we cannot so extract a unique intended interpretation. Now return to our original question: what is it about our practice that makes an interpretation which assigns ‘is bald’ a function which maps Bill to 0.99 acceptable, but an interpretation which assigns ‘is bald’ a function which maps Ben—who differs from Bill by just a hair—to 0.99 unacceptable? It may simply be that saying makes it so: that a given interpretation is acceptable because it makes true what some speaker says. Or it may be that the relationship between meaning and use is more complex than this. 
Either way, we can be quite confident that something about our (collective) practice does determine such precise distinctions—for this reveals itself in the survey results. It is simply a fact that there is a last man in the series whom every participant will deem 100 per cent bald. My suggestion is that either these surface usage facts themselves, or some deeper facts which underlie them and are indirectly manifested in the survey, determine that an interpretation which assigns ‘is bald’ a function which maps Bill to 0.99 is acceptable, but an interpretation which assigns ‘is bald’ a function which maps Ben to 0.99 is unacceptable—where Ben is the last man in the series whom every participant will deem 100 per cent bald.
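The survey idea can be sketched in code. The ratings below are invented, and the acceptance rule used (an assignment of a degree to a man is acceptable just in case some participant made it) is only the simplest reading of ‘saying makes it so’; the text leaves open that the relationship between meaning and use may be more complex than this.

```python
# Hypothetical survey: each participant rates the baldness (0 to 1) of the
# men in a short Sorites series, ordered from fewest hairs to most.
ratings = {
    "participant_a": [1.0, 1.0, 1.0, 0.9, 0.5, 0.1, 0.0],
    "participant_b": [1.0, 1.0, 0.9, 0.7, 0.4, 0.0, 0.0],
    "participant_c": [1.0, 1.0, 1.0, 1.0, 0.6, 0.2, 0.0],
}

def said_values(ratings, man):
    """Degrees that at least one participant assigned to the given man."""
    return {row[man] for row in ratings.values()}

def is_acceptable(ratings, man, degree):
    """Simplifying assumption: an assignment is acceptable iff someone made it."""
    return degree in said_values(ratings, man)

def last_unanimously_bald(ratings):
    """Index of the last man whom every participant rates 1.0, or None."""
    n = len(next(iter(ratings.values())))
    last = None
    for man in range(n):
        if all(row[man] == 1.0 for row in ratings.values()):
            last = man
        else:
            break
    return last

last = last_unanimously_bald(ratings)
print(last)
print(is_acceptable(ratings, 2, 0.9), is_acceptable(ratings, 1, 0.9))
```

In this toy data the man at index 1 plays the role of Ben (the last man everyone deems fully bald) and the man at index 2 plays the role of Bill. Mapping the latter to 0.9 matches something a participant said, while mapping the former to 0.9 matches nothing anyone said, so a crisp distinction between neighbouring men falls out of the survey results.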
One might think that a problem remains. Given a set of speakers, a crisp set of acceptable interpretations emerges, in the way just discussed. But if we were to survey different speakers, or the same speakers tomorrow, we would probably get a different set of acceptable interpretations. But then it seems that we cannot, once and for all, fix on a unique set of acceptable interpretations—and hence cannot, once and for all, fix on a last man in a given Sorites series who is bald according to every acceptable interpretation. My response to this is that it is a true observation, but by no means an objection: to think so would be to misunderstand our goal. The challenge was to show how our practice could fix a unique set of acceptable interpretations of our words. There is a parameter here: us (‘our practice’), that is, a group of speakers. My goal was not to eliminate this parameter, but to show how—given a set of speakers—it is plausible to think that their practice could determine a unique set of acceptable interpretations of their words. The goal was not to show that a unique set of acceptable interpretations can be fixed on ‘once and for all’—that is, even in the absence of a designated group of speakers—and so the observation that this has not been achieved is no objection. Note that it is not simply that my goal happened not to be to eliminate the ‘group of speakers’ parameter, and so this is not an objection to me—but furthermore, that would in fact be a bad choice of goal. The guiding idea behind all our discussions of problems of the determination of meaning (beginning in §2.1.1) has been that language is a human artefact, and that the meanings of terms depend essentially on the ways in which speakers use them. From this perspective, the idea of some words having some meanings in abstraction from a set of speakers whose practice confers upon them those meanings is a non-starter.
Rather, talk of the meanings of some terms must always be relative to a group of speakers, whose dispositions regarding the usage of those terms play an essential part in fixing those meanings.³⁰
Thus, neither considerations to do with the nature of vagueness, nor considerations to do with the determination of meaning, count against the existence of a last man who is 1-bald on every acceptable interpretation.
³⁰ In the case of vague terms such as ‘tall’ and ‘bald’, I have assumed that the extensions of our terms are fixed by our own dispositions regarding their usage. However, it need not always be the case that when a speaker uses a word (at some time t), her dispositions at t regarding the use of that word are among the determinants of its meaning—i.e. are among the determinants of the intended interpretation(s) of her speech at t. We should not want this to be the case, because it would rule out the possibility that what one said might be false—i.e. false on (all) the intended interpretation(s). It is, then, part and parcel of my view that when we speak, and mean something by our words—i.e. speak relative to an intended interpretation(s)—there must be a ‘reference class’ of speakers whose dispositions regarding the use of the words one uses serve to determine (at least partly—there may be other factors involved) which interpretation(s) are intended, i.e. which interpretation(s) one is speaking relative to, i.e. what one means by what one is saying. However, it is not part and parcel of the view that one is oneself a member of this reference class. For example, one might speak relative to an interpretation picked out by one’s teachers’ usage dispositions, or by some experts’ usage dispositions, or by the usage dispositions one had at some earlier time (e.g. when one was less tired).
Conclusion
We have covered a lot of ground, and I will not recapitulate every step. It might, however, be useful to conclude by retracing one key line of thought, and showing how it leads to the positive view that I have presented and defended: fuzzy plurivaluationism.
The line of thought begins with the observation that there are two problems with the epistemicist view of vagueness—specifically, with its positing of a particular change point in a Sorites series for P at which the claim ‘this object is P’ goes from being true to being false. The first problem is what I call the jolt problem. The problem is the nature of the semantic shift posited by the epistemicist, and in particular the dramatic difference between the semantic statuses of the claims ‘this object is P’ (which is true) and ‘that object is P’ (which is false), where these claims concern the objects on either side of the change point—objects which are very similar in all respects relevant to the application of the predicate P. In short, the epistemicist thinks that as we go down the Sorites series saying ‘this is P’ of each object in turn, there is at some point a sudden jolt, as our claims crash all the way from true to false in one step—and this has been thought to be implausible.
The second problem is what I call the location problem. The problem concerns the fixing of the location of the change point, and in particular the fact that we cannot see how our own usage of language (together with facts about our causal connections to our environment, referential eligibility, simplicity, and so on) could fix it to be at any particular point in the series. The epistemicist thinks that there is a number n such that the nth object in the series is the last one to which P truly applies—but why, it has been objected, not n − 1 or n + 1? What is it about our usage of P that gives it a meaning which singles out a unique n in this way?
In the literature, these two sorts of problem have tended to be run together under the heading ‘higher-order vagueness’. For example, suppose someone wants to object to (say) a truth gap view of vagueness, on the grounds that, while it avoids a point in the Sorites series at which the claim ‘this is P’ crashes from true to false, it does not avoid dramatic semantic shifts altogether. The thought is that the view posits a dramatic difference between the semantic statuses of the claims Pa and Pb, where a and b straddle the boundary between the positive (negative) cases and the borderline cases, in that Pa is true (false) while Pb lacks a truth value, even though a and b are very similar in all respects relevant to the application of P. She typically makes this objection by saying that the gappy view falls foul of the problem of higher-order vagueness. Likewise, suppose someone wants to object to (say) a supervaluationist view, on the grounds that, while it avoids positing a unique number n such that the nth object in the series is the last one to which P applies, it does not avoid problems of meaning-determination altogether. The thought is that the view posits a precise set of admissible precisifications of vague language, when it is extremely hard to see how our practice could determine a unique such set. He typically makes this objection by saying that the supervaluationist view falls foul of the problem of higher-order vagueness. But the two objections should not be run together under one name, for the first is an instance of the jolt problem, and the second is an instance of the location problem, and these are fundamentally different problems.

The jolt problem is intimately connected to vagueness. I have argued that it is of the essence of vagueness that dramatic semantic shifts do not occur in Sorites series: for if P is vague, and a and b are very similar in P-relevant respects (and adjacent items in a Sorites series for P always are very similar in P-relevant respects), then Pa and Pb are very similar in respect of truth. The location problem, on the other hand, is not intimately connected to vagueness. I have argued that it is a manifestation of more general worries about what fixes the meaning of our language: worries which manifest in areas having nothing to do with vagueness, such as Quine’s problem of the indeterminacy of translation and Kripkenstein’s sceptical problem.
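The contrast between a jolt and its absence can be pictured with a small sketch. Everything in it is an assumption made for illustration and is not drawn from the text: the predicate ‘tall’, the heights in millimetres, the change point at 1800 mm, and the linear shape of the fuzzy assignment. It measures the largest difference in truth value between adjacent items of a Sorites series, first under a two-valued assignment with a change point and then under an assignment of degrees of truth.

```python
# Illustrative sketch only. The predicate, the numbers, and the linear
# membership shape are assumptions, not part of the original argument.

def classical_truth(height_mm, cutoff_mm=1800):
    """Two-valued assignment with a change point: 1.0 (true) or 0.0 (false)."""
    return 1.0 if height_mm >= cutoff_mm else 0.0

def fuzzy_truth(height_mm, low_mm=1600, high_mm=2000):
    """Degree of truth rising linearly from 0 at low_mm to 1 at high_mm."""
    if height_mm <= low_mm:
        return 0.0
    if height_mm >= high_mm:
        return 1.0
    return (height_mm - low_mm) / (high_mm - low_mm)

def largest_jolt(truth, series):
    """Largest difference in truth value between adjacent items of the series."""
    values = [truth(x) for x in series]
    return max(abs(a - b) for a, b in zip(values, values[1:]))

# A Sorites series for 'tall': 2000 mm down to 1600 mm in steps of 1 mm.
series = range(2000, 1599, -1)

classical_jolt = largest_jolt(classical_truth, series)
fuzzy_jolt = largest_jolt(fuzzy_truth, series)

print(classical_jolt)  # 1.0: one adjacent pair goes from true to false
print(fuzzy_jolt)      # approximately 0.0025
```

On these assumed numbers, adjacent items always differ by 1 mm, yet the two-valued assignment contains a pair whose truth values differ by the maximum possible amount, while under the fuzzy assignment no adjacent pair differs by more than a small fraction.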
So we have two quite different problems to solve, and this is the origin of the two key features of fuzzy plurivaluationism. First, its positing of degrees of truth is a response to the jolt problem. I argued that no theory can allow for the existence of predicates which both have associated Sorites series and satisfy Closeness (that is, no theory can avoid the jolt problem) unless it countenances degrees of truth. Second, its positing of semantic indeterminacy (a lack of a unique intended interpretation of vague discourse) is a response to the location problem. I argued that the vagueness of a discourse would not be impugned simply because it had a unique intended fuzzy interpretation, but that as a matter of fact it seems that our meaning-fixing practices do not in general suffice to determine a unique intended interpretation of vague language, and so we must countenance semantic indeterminacy.

Other theories of vagueness can solve one of these two problems. For example, classical plurivaluationism can solve the location problem, and the original fuzzy theory, which posits degrees of truth without semantic indeterminacy, can solve the jolt problem. However, only fuzzy plurivaluationism, which posits both degrees of truth and semantic indeterminacy, can solve both problems.
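The combination of the two features can be pictured with a toy model. It is a minimal sketch under assumptions that are not in the text: the predicate ‘tall’, heights in millimetres, a linear membership shape, and an arbitrarily chosen family of three interpretations standing in for the acceptable ones. Each interpretation assigns degrees of truth and so avoids a jolt, while nothing in the model singles out one of them as the unique intended interpretation.

```python
# Illustrative sketch only. The family of interpretations below is
# hypothetical; the text does not specify any particular interpretations.

def make_interpretation(low_mm, high_mm):
    """Return one fuzzy interpretation of 'tall' as a function from heights to degrees."""
    def truth(height_mm):
        if height_mm <= low_mm:
            return 0.0
        if height_mm >= high_mm:
            return 1.0
        return (height_mm - low_mm) / (high_mm - low_mm)
    return truth

# Three interpretations, each treated here as equally acceptable.
acceptable = [make_interpretation(low, low + 400) for low in (1590, 1600, 1610)]

def degrees(height_mm):
    """Degree of truth of 'this person is tall' on each acceptable interpretation."""
    return [truth(height_mm) for truth in acceptable]

def largest_jolt(truth, series):
    """Largest difference in truth value between adjacent items of the series."""
    values = [truth(x) for x in series]
    return max(abs(a - b) for a, b in zip(values, values[1:]))

# A Sorites series running from 2100 mm down to 1500 mm in steps of 1 mm.
series = range(2100, 1499, -1)

print(degrees(2100))  # all interpretations agree: degree 1.0
print(degrees(1800))  # the interpretations assign different degrees
print(max(largest_jolt(t, series) for t in acceptable))  # small on every interpretation
```

In this toy model there is no fact about which of the three assignments is the right one for a height of 1800 mm, which mirrors the response to the location problem; and no interpretation contains a dramatic shift between adjacent items, which mirrors the response to the jolt problem.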
Index

Abbott, B. 32, 47 n. 21 action 230–2, 235, 239–41, 243–5 additivity 234, 236 n. 36 adjunction 95 adjustment, rules of 116–17, 119 Akiba, K. 89 n. 66, 94 n. 68 algebra 22 Boolean 23–4, 26, 28, 55, 62, 213, 225 Kleene 24, 62, 63, 213, 225 Lindenbaum 27 n. 6 De Morgan 24 of subsets 28, 63 of truth values 25–6, 31, 50–5, 60–2, 66, 71, 74, 119 n. 93, 213, 225 van Altena, J. vi analytical hypothesis 283 n. 8 antiextension 73 n. antirealism, semantic 47–8, 123–4 antirepresentationalism 47, 123 argument (of a function) 20 arity 29 n. Armstrong, D. vi arrows 21 artificial precision problem 277–9, 290, 292–4, 304 assent 37, 41, 281 assertibility 180, 226, 248–63, 269, 288–9; see also closeness, JA-; WAM condition 182–3, 256, 284 n. 11 assertion 226–7, 229, 246–7; see also degree of assertion grade (statement) 223–4, 250 n. 57 norm of 250 n. 56, 254–5, 288 justified, see assertibility warranted, see assertibility Barnett, D. 244 basis (for a topology) 152–3 Beall, JC 94 n. 68 Beaney, M. 2 n. 4, 136 n. 14 behaviour 281, 283 n. 8 belief 228–9, 235, 244 degree of, see degree of belief partial, see degree of belief
Benacerraf, P. vi, 213 n. Bennett, B. 79 n. bet 241–4 betting quotient 243–4 bijection, see correspondence bivalence 160 n. 55, 175–6, 183, 225, 240, 248, 274–5 Black, M. 60 n., 135, 136 n. 15 Boghossian, P. 282 n. 7 Bonini, N. 252 n. 59 borderline case 1, 119, 133–6, 166, 172–4, 251–3, 260 n. 72, 294 boundary: blurred 1–2, 136, 137, 165–6, 181–3, 203–5, 291 ω-order 194–5 boundarylessness 174 n. 65, 182 Braddon-Mitchell, D. vi Bradley, W. 114 Brady, R. 66 n. Brandom, R. vi Braun, D. 111 n., 137 Brentano, F. 282 n. 7 Brynner, Y. 37 Bueno, O. 131 n. 2, 134, 136 n. 17, 140 n. 22 Building-Block theory 48 Burgess, J. A. 40 n., 41, 86 n. 59, 174 n. 66, 291 Burgess, J. P. vi Burns, L. 112 n. 79, 173 n. calculus, propositional 27 Campbell, K. vi Campbell, R. 34 n. 2 cardinality 89 n. 65, 299 Cargile, J. 34 n. 2 Cartesian product 19 certainty 233–4, 237 Chalmers, D. 245 Chambers, T. 123 n. character 225 Chisholm, R. 282 n. 7 circumstance 225
van Cleve, J. 32, 282 n. 7 classical theory of vagueness 4, 5, 15; see also epistemicism; logic, classical; model theory, classical; semantics, classical; set theory, classical closeness 139–43, 179–80, 198–9, 202, 214–15, 270–3, 290, 292, 309–11, 318 absolute 144–5, 153–5, 166 definition of vagueness 145–6, 151–9, 165–75, 180–1, 205 JA- 180–5, 203–5 relative 143–4, 152–3 in respect of truth 147–52, 188–9, 192, 196–8, 201, 271, 300, 309 TA- 203–5 and tolerance 8, 159–65, 250 codomain 20 Colyvan, M. vi, 94 n. 68, 131 n. 2, 134, 136 n. 17, 140 n. 22, 184 n. community 59, 83, 99, 107 comparison 296–7 complement (of a set) 18, 25; see also lattice, complemented complete (logic) 45; see also lattice, complete complexity 285; see also Kolmogorov complexity concatenation 296–7 concept 71 conditional 27, 68–9, 265–9, 273 conditionalization 247–8 conjunction 25, 26, 61, 68, 69, 85 n., 95 connective 26, 265, 297 consequence 10, 82, 94–6, 209, 220–4, 265, 267–8, 271 n. 82 multiple-conclusion 95–6 constant: propositional 26, 62, 64 individual 29, 64 content 101, 182, 199, 225, 254, 258, 263, 272, 288–91, 307–8 context 113–14, 165, 225, 226–7, 254, 258, 263, 288 non-defective 227, 247 set 226–7, 246–7 contextualism 113–22, 200–6, 255–6; see also Sorites paradox, contextualist approach continuity, see function, continuous continuum 154, 177–8, 189
contradiction 88–9, 160, 260 n. 72 convention 112 Cook, R. 305 n. Coombs, C. 300 n. 20 Copeland, J. 279 n., 290 n. correspondence 20 cost/benefit analysis 129–31 counterexample 84 counterfactual 225 credence 239–40 Cusbert, J. vi, 155 n. 45, 169 n. cut-off 57, 79, 83–4, 107–8; see also vagueness, higher-order Davidson, D. 32, 47, 48 Debreu, G. 302 n. 23 definitely operator 174 n. 66, 195–7 definition 127–9, 131–2; see also vagueness, definition of degree: of assertion 246–51, 288–9 of belief 227–46 of membership 61, 63 of truth 10, 61, 147–51, 175, 188–91, 201, 209–20, 224–9, 233–48, 249–50, 270, 278, 291, 293, 296, 318–19; see also supervaluationism, degree form De Morgan: algebra, see algebra, De Morgan laws 24, 69, 236 n. 35 denotation, partial 100 DeRose, K. 255 n. desire 235 n. determinately operator 233 n., 257 n. 68 determination 279, 283–4, 291–3; see also meaning, determination of diachronic account of vagueness 113, 120 disagreement 131–2 discrete: domain 154–5 topology 153 series 178, 189 discursive logic 94–5 disjoint 19 disjunction 25, 26, 61, 68, 69, 85 n., 95 disposition 44, 136 n. 15, 139, 283 n. 8, 284–5, 290–1, 314–15 dissent 37, 41, 281 Divers, J. 46 n. 18
domain: of a function 20 of an interpretation 29, 64 dual 52, 69 duality, see involution Dummett, M. 79 n., 82 n. 55, 140 n. 22, 164 n., 251 n., 260 n. 71 Dutch book 241–3 Edgington, D. 174 n. 66, 222 n. 16, 241 n., 263–5, 305 n. Egan, A. 243 n. Eklund, M. 134 n. 8, 138 n. 20, 139–40 element 17 eligibility (referential) 284–5, 303, 317 epistemic state 233–4, 240, 245–6 epistemicism 34–45, 119, 175–86, 197, 223, 286, 292, 317; see also Sorites paradox, epistemicist approach equivalence, logical 27, 62 error theory 185–6, 205–6 ersatzism 46 n. 18 essence 174 evidence 232, 281 expectation 234–6; see also utility, expected of truth 239–41 expected value, see expectation explosion 94 n. 69 express (a content) 254 expressive completeness 265 extension: of a predicate 2 n. 4, 5–7, 30, 34, 37–40, 43–4, 49, 58–9, 65, 70, 73 n., 91, 96, 99, 166, 280 n. 2, 301 n., 313 of an interpretation 77, 79–81, 83, 88 n. 61, 89, 96–7, 101–3, 116, 192–7, 199 F (a predicate): -connected set 156–7 -diverse set 156–7, 172 n., 179 -relevant respects 140, 146–7, 151–2, 157, 167, 192, 214 -uniform set 156 Fabio 37, 280 Fara, D. 36 n. 5, 115 n. 83, 134, 173, 211 feature, constitutive 135, 136, 137, 171, 244
Field, H. 32, 100 n. 75, 137, 233 n., 291 n. 16, 305 n., 311 n. Fine, K. vi, 79, 82 n. 55, 85, 101 n. 75, 133 n. 7, 135, 137 n., 155 n. 46, 173, 194–7, 258 Fodor, J. 199, 200 n., 282 n. 7 Forbes, G. 71, 160 n. 56, 215 n., 256 n. 66, 260 n. 72, 266 n. form 289 formula, well-formed, see wf van Fraassen, B. 79, 100 Freedom 115, 118 n. 92, 121 Frege, G. 2 n. 4, 3, 136 n. 14, 165, 183, 212–13, 276 function 20–1, 212 characteristic 28 n., 71–3, 151–2, 166, 177 composite 21–2 constant 155, 178 continuous 152–5, 162 n., 178, 190 indicator, see function, characteristic into, see function, one-one inverse 22 measurable 236 n-place 22 one-one 21 onto 21 partial 21, 71–3 total 21 future contingents 51 n. 25 fuzzy plurivaluationism 277, 286–96, 300–1, 304–5, 311, 317–19 fuzzy theory of vagueness 60–1, 188–91, 319; see also logic, fuzzy; model theory, fuzzy; set theory, fuzzy; Sorites paradox, fuzzy approach gappy theory of vagueness 72–3, 96–7, 317–18; see also property, gappy; set, gappy; truth gap recursive 74–6 non-recursive, see supervaluationism gavagai 137, 281 Givant, S. 27 n. 6 Gödel, K. 67, 69 Goguen, J. 60 n., 68 n. 45, 69, 119 n. 93, 149 n., 152 n. 38, 279 n., 294, 297, 301 n., 306 governing view 159–61 Graff, D., see Fara, D.
Greenough, P. 131 n. 2, 132 n., 137–8, 157 n. 50, 182–3, 204 Gregory, D. 46 n. 18 Grice, P. 226 Haack, S. 11, 210 n. 2, 276, 277 Hájek, A. vi Hájek, P. 32, 60 n., 67 n., 68, 69 n. 48, 149 n., 276 n. 88 Halmos, P. 27 n. 6 Hansson, B. 302 n. 23 Hardin, C. 140 n. 23 Hart, W. 142 n. 26 Heck, R. 174 n. 66 hedging response 1, 37–8, 41, 58, 86 Hesperus 245 Hohwy, J. 282 n. 7 Horgan, T. 118, 305 n. Horwich, P. 34 n. 2, 274 n. Humberstone, L. 86 n. 59 Hyde, D. 82 n. 54, 94–5, 135–6 n. 11 identity 71, 270–1 ignorance 42–4, 134, 138, 243 incoherence 242–3 inconsistency 94 n. 69, 139, 163 indeterminacy 279, 281–2, 284, 298–9; see also reference, indeterminacy of; semantic indeterminacy indeterminism 41 indexical 225 indiscriminability, see indistinguishability indistinguishability 164–5 induction, mathematical 140 n. 22 inference grade (statement) 223–4 infimum 23 injection, see function, one-one instrumentalism 47 intended interpretation 5, 32, 33, 45, 50, 56, 66, 70, 80–1, 92, 96–9, 101–3, 113, 115–16, 119–22, 192, 199, 200, 254, 257 n. 67, 270 n., 286–9, 291–4, 302–3, 311, 313 problem of 279–86 interpretation 4, 26, 29, 56, 62, 64, 112; see also wf, interpreted acceptable 99, 107–8, 120, 122, 198–200, 286–9, 292–3, 295–6, 300, 302–3, 304–5, 311–15
admissible 81, 96, 98; see also extension of an interpretation correct, see intended interpretation; interpretation, acceptable intended, see intended interpretation partial 71–2, 76–7, 81, 89 n. 64, 96–7, 101–3, 115–16, 192 intersection 19, 25 involution 24 Islam, A. vi, 32, 46 n. 18, 101 n. 75, 213 n., 219 n. 13 Ismael, J. vi Jaśkowski, S. 94 Jeffrey, R. vi Johnston, M. vi, 136 n. 15 jolt problem 58, 59 n. 34, 117, 172–4, 192–8, 292, 317–19 judgement-dependence 115 n. 85 Julius Caesar 1 jump point 177, 187–8, 190; see also jolt problem Kamp, H. 79 n., 89 n. 64, 115 n. 83 Kaplan, D. 225 Keefe, R. 15, 16, 40 n., 79 n., 95, 113 n., 134, 173 n., 193–4, 199 n. 20, 214–16, 219, 257 n. 68, 258, 260, 277–9, 294, 304 n., 305 n., 310 n. Kleene, S. 53 n., 54 n., 74, 75 Kleene truth tables: strong 54 n., 75, 78 weak 53 n., 74 Klir, G. 60 n., 67, 68, 69 n. 47 knowledge 138 Kolmogorov complexity 285 n. 13 Krantz, D. 300 n. 20 Kripke, S. 38 n., 73 n., 245 n. 50, 280 n. 4, 282 n. 7 Kripkenstein 39 n. 10, 280, 282 n. 7, 283–4, 285 n. 13, 290, 293, 318 Lakoff, G. 60 n., 266 n., 279 n. lattice 22–3, 55 bounded 23 complemented 23 complete 23, 64, 66 n. distributive 23 identities 23
Lepore, E. 32, 199, 200 n. Lewis, D. 32, 36 n. 7, 89 n. 64, 106, 111–13, 137 n., 225, 239, 245, 284 n. 12 liar paradox, see semantic paradoxes location problem 36–41, 58–9, 99, 117 n. 89, 120, 172, 197–8, 291–2, 317–19 logic: classical 3, 26–8, 29–31, 82, 119–20, 220–4 four-valued 59 fuzzy 60, 62–70, 275–6 intuitionist 122–4 many-valued 50, 59–60, 64 n., 87–8, 119, 188–91, 209–10, 220–1, 251, 275 modal 195–6 non-classical 3 paraconsistent 94 three-valued 50–6, 63, 89 n. 64, 186–7 Luce, R. 300 n. 20 Łukasiewicz, J. 54 n., 59 n. 35, 60 n., 63, 68, 69, 275 MacFarlane, J. 230 Machina, K. 60 n., 86 n. 58, 266 n., 279 n., 297 n. 18 Malinowski, G. 59 n. 35, 60 n., 67 n., 69 n. 46, 149 n. map, see function Mares, E. vi margin for error 41, 42–4 mark, see symptom matching, see indistinguishability mathematics, language of 2–4, 15, 60, 101 n., 121, 211, 306 McGee, V. 100, 199, 200 n., 291 n. 16, 305 n. McLaughlin, B. 199 n. 19, 200 n., 305 n. meaning 97, 124, 281; see also content; interpretation; semantics determination of 36–41, 43–4, 51, 58–9, 72, 99, 107, 283 n. 9, 284–6, 290–1, 302–4, 311–15, 317, 319; see also intended interpretation, problem of; location problem multiplicity of, see meaning, plurality of plurality of 101, 105, 287; see also semantic indeterminacy measure 88–90; see also probability measure
measurement 214–20, 296–302 Mehlberg, H. 79 n. member, see element Menger, K. 68 n. 43 Menzel, C. 32 Merrill, G. 32, 284 n. 12 metalanguage, vague 194, 305–8 metric 143–4, 147–50, 154 n. 43 space 68, 145 n. 31 Milne, P. 242 n., 244 n. missing explanation argument 136, 138 Močkoř, J. 60 n., 62 n. model, see interpretation model-theoretic argument 280 model theory 4, 45–50, 123–4, 210, 254, 284 n. 11 classical 5, 29–31, 122, 200, 211 fuzzy 64–6, 211, 277, 280, 292–3 modus ponens 266 n. Momtchiloff, P. vi MU principle 37 natural kind 38–9 naturalness 290–1; see also eligibility (referential) negation 25, 26, 61, 67, 69, 123 Nguyen, H. 60 n., 63, 67 n., 68 n. 45 normal conditions 138 Norwich, A. 301 n. Novák, V. 60 n., 62 n., 70 n., 152 n. 38, 276 n. 88 n-tuple, ordered 19 object 71, 158 odds 243 open texture 115 n. 84 operation, see function operation on truth values 25, 27 n. 5, 59, 61, 67–70, 225, 265; see also algebra of truth values order: linear 22, 293–6, 302–4 partial 22 Osherson, D. 252 n. 59 pair, ordered 19 parasitic strategy 40–1 Parsons, J. 243 n. Peirce, C. 1 n. 3, 135, 136 n. 15
penumbral connection 85–7, 196; see also truth-functionality Perfilieva, I. 60 n., 62 n. Perry, J. 245 Perszyk, K. vi Phosphorus 245 Pinkal, M. 79 n. Plantinga, A. 46 n. 18 Plato 282 n. 7 plurivaluationism 82 n. 55, 96, 98–113, 120, 287, 289, 291, 292, 302 n. 24, 319; see also fuzzy plurivaluationism; Sorites paradox, plurivaluationist approach poset, see order, partial possible world 226, 233–4, 245–6; see also semantics, modal Post, E. 60 n., 149 n. Power 115, 118 n. 92, 121 practice 47, 112–13, 197–9, 290, 292, 295, 302–4, 311–14, 318–19 pragmatics, conversational 226–7, 229, 246–8 pragmatism 111–13 precisification 79–81, 101, 107, 193–6, 199–200, 268, 292, 303, 318; see also extension of an interpretation precomplement 69 predicate 1 n. 1, 29, 64, 157, 281 n. modifier 158–9 multi-dimensional 294 observational 164–5 one-dimensional 157 n. 50 preference 235–6, 302 presupposition 226–7, 246–7 Price, H. vi Priest, G. vi, 94 n. 69, 222 n. 16, 266 n., 290 n. primitivism, semantic 282–3 probability 229–30, 232–4, 237–40, 241–3 axioms 234, 236 n. 36, 244 imprecise 245–6 measure 233–4, 236, 238–40, 245–8 proof theory 123–4, 222 property 71, 157–8, 211, 213 fundamental 157 gappy 96 proposition 27, 63, 226, 234, 254; see also constant, propositional independent 230 mutually exclusive 238
Przełęcki, M. 32, 100 n. 75 publicity premiss 281–2, 284 Putnam, H. 32, 38 n., 47 n. 20, 122–3, 280 quaddition 283–4 quandary 138 n. 20 quantifier 29, 65, 83–4 Quine, W. O. 47 n. 20, 137, 280–4, 290, 293, 318 quotation marks 256–7 Rabinowicz, W. 236 n. 33 Raffman, D. 115 n. 83, 165 Raiffa, H. 300 n. 20 Ramsey, F. 210 n. 2, 230–1, 232 n. 27 random variable 234, 236 Rayo, A. 248 n. 53 Rea, G. 123 n. Read, S. 123 realism: metaphysical 280 semantic 46–50, 123–4 reference 46–7 indeterminacy of 47, 49 inscrutability of 280 problem of 285 n. referent 34, 46, 49, 56, 70, 91, 99 relation 20, 71, 157–8 antisymmetric 20, 144 associative 23 connected 20, 144 equivalence 20 idempotent 23 reflexive 20, 144, 195–6 symmetric 20, 196 transitive 20, 144, 195–6 vague 271 n. 82 representation theorem 301–2 Rescher, N. 69 n. 46 residuation 69 residuum 69 Restall, G. 66 n. Robertson, T. 118 n. 92 Rolf, B. 85 n., 279 n., 305 n. Rosen, G. vi, 157 n. 51, 210 n. 3 Routley, R. 94 n. 69 rule 159 Russell, B. 142 n. 25, 164 n.
Sainsbury, M. 134 n. 8, 135, 172–3, 174 n. 65, 210 n. 2, 219 n. 12, 227 n. 24, 265 n., 304 n., 305 n., 310 n. Sanford, D. 85 n., 89 n. 64, 200 n., 297 n. 18 say (a content) 254 scale 219 n. 14, 299–300 absolute 299 interval 299 ordinal 297 ratio 299 Schiffer, S. 36, 135 n. 11, 227–8, 229–31 schort 133, 135, 136 Schwartz, S. 123 n., 279 n., 310 semantic indecision 106; see also semantic indeterminacy semantic indeterminacy 4–6, 44–5, 50, 70–1, 76, 96–8, 100–1 n. 75, 102, 105–7, 121–2, 137, 277, 289–93, 298, 318–19 semantic paradoxes 51 n. 25, 257 n. 67 semantics: applied 46 n. 18 classical 24–33, 44–5, 121, 258, 274, 278 depraved 46 n. 18 frame 195, 225 Kripke 46 n. 18 modal 45–6, 102–4, 195–6 model-theoretic 45–50, 123–4, 254, 284 n. 11, 307 pure 46 n. 18 sentence 254, 258, 272, 281, 288–9 set 17 empty 18 fuzzy, see set theory, fuzzy gappy 72–3 measurable 236 null, see set, empty open 152–3 partial, see set, gappy power 18 unit 18 set theory: classical 28–9 fuzzy 60, 63–4, 66, 217–18, 220 three-valued 55–6 Shapiro, S. 36 n. 7, 115, 131 n. 2, 134 n. 8, 135, 171 n., 200 n. sharp boundary 172, 182, 193, 204–6, 268–9, 277, 304, 309–10; see also
cut-off; location problem; vagueness, higher-order Sider, T. 111 n., 137 σ-field 236, 246 similarity, see closeness simplicity 285–6, 303, 317; see also complexity singleton, see set, unit Skyrms, B. 239, 242 Smith, S. v Smith, N. vi, 158, 254 n. 61, 257 n. 67, 271 n. 82, 276 n. 88, 306 n. Smith, P. 134 Smith, V. v Soames, S. vi, 39 n. 10, 73 n., 113, 115 n. 83, 116 n. 86, 118, 133 n. 7, 171, 204, 218 n., 283 n. 9 Sorensen, R. 34 n. 2, 139 n., 241 n. Sorites: conditional 56–7 dynamic 118, 205 forced march, see Sorites, dynamic paradox, see separate entry below series 2, 145, 160, 166, 167, 171–2, 175, 177, 186 susceptibility 2, 136, 137, 171–2, 183–5, 205 Sorites paradox 56–7, 145, 154, 167, 171–2, 180, 223–4, 261–2, 265–7, 270–3, 291 and closeness 167–72, 184 n. contextualist approach 117–18 epistemicist approach 35–6 fuzzy approach 265–7, 270–3 intuitionist approach 123 plurivaluationist approach 107–9 supervaluationist approach 83–4 three-valued approach 56–7, 117 soundness 45, 98, 222, 288 SPB (standard partial belief ) 230–1 speaker 32 n. 10, 36–7, 83, 86, 101 n., 107, 118, 137–8, 159, 183, 226–7, 254, 256–61, 272, 283–6, 311–15 competent 40, 51, 58–9, 115–16, 119, 121, 139, 143–4 ordinary 170–1, 183, 204, 253 speech act 248–9 stake 242 n. Stalnaker, R. 36 n. 5, 225, 226–7, 229, 246–8, 269 n.
state (a content) 254 structure 22, 212 structuralism 212–13 subjunction 95 subset 17–18 proper 18 subvaluationism 93–6, 110–11, 191 Sugeno, M. 67 supervaluation 78 supervaluationism 54, 75, 76–87, 94–103, 106–11, 118–19, 191–7, 199–200, 223, 258, 302, 318; see also Sorites paradox, supervaluationist approach degree form 87–93, 110–11, 112 n. 79, 197, 287 supervenience 39, 40, 44 n., 214, 283 n. 9; see also determination Suppes, P. 129 n., 300 n. 20 supremum 22, 23 surjection, see function, onto survey 252, 254, 312–14 symptom 135, 136, 137, 171 synchronic account of vagueness 113, 120 Tappenden, J. 85 n., 86 n. 59, 113, 115 n. 83, 133 n. 7, 135, 259 Tarski, A. 3, 59 n. 35, 60 n., 274 tautology 78, 82, 89, 221, 238 t-conorm 68 template 210 tendency (to act), see action term 29, 270 n., 281 singular 146 n. 32, 257, 281 n. Thrall, R. 300 n. 20 three-valued theory of vagueness 50–1, 56–9, 71–2, 97; see also logic, three-valued; set theory, three-valued; Sorites paradox, three-valued approach Throop, W. 123 n. t-norm 68 tolerance 7, 139–40, 146, 147, 159–65, 167–71, 182, 250–1, 270–3; see also closeness and tolerance epistemic 137, 182 topology 152–4 transformation 299–301 identity 299 linear 299 monotone 299 similarity 299
translation 281–2 indeterminacy of 281–2, 293, 318 manual 281 truth 31–2, 46, 66, 90–3, 98–9, 147–51, 186, 224–5, 226 condition 263, 284 n. 11 degree of, see degree of truth distance from 264–5 and falsity, symmetry of 40–1 -functionality 54, 57, 75, 78–9, 85–7, 110, 251–65 gap 71–6, 97, 115, 186–7; see also interpretation, partial incomparability in respect of 293–4, 303–4 ordering 25, 54–5, 61, 147, 293–6, 300, 302–4 predicate 197 n., 257, 274–5 simpliciter 31–2, 34, 56, 66, 81–2, 98–9, 280, 293 table 26–7, 74; see also Kleene truth tables value 25, 62, 70, 137, 151, 211–13, 226, 275–6, 298; see also algebra of truth values; operation on truth values; value, designated T-schema 274–5 Turksen, I. 301 n. Tversky, A. 300 n. 20 Tye, M. 136 n. 13, 200 n., 279 n., 304 n., 305 n. UFS (uncertainty-free situation) 237–8, 247 uncertainty 229–32, 238, 243–4, 247 underdetermination 279, 281–2, 284 Unger, P. 168 n., 171 union 19, 25 uniqueness theorem 301–2 Urquhart, A. 85, 278 usage 10, 37–41, 43–4, 58–9, 81, 83, 99, 107, 117 n. 89, 120, 172, 209, 251–3, 263, 283 n. 9, 285–6, 291, 313–15, 317 use 36–41, 44, 51, 58–9, 72, 81, 160, 162–3, 307, 313–15 utility 235 n. 32, 302 expected 235 utter (a sentence) 254 utterance 246, 288–9
vagueness: definition of 1–2, 8–9, 127–40, 152–5, 180–5, 206, 278–9, 291; see also closeness definition of vagueness higher-order 57–8, 117 n. 89, 134, 172–4, 192–3, 197, 290, 304 n., 310–11, 317–18; see also jolt problem; location problem linguistic, see semantic indeterminacy metaphysical 71, 157–8; see also object; worldly vagueness partial versus total 155–7 theory of 3, 15–16, 127, 129–32 validity 10, 35, 60, 82, 95, 98, 123, 130, 168–70, 212, 220–4, 265–6, 270–2, 288–9, 293; see also consequence global and local 82 n. 55 value: designated 220–2 of a function 20 variable 29 free 30 n. Varzi, A. 100, 136 n. 13, 305 n. vector space 145 n. 31 VFS (vagueness-free situation) 237–8, 241–4 Viale, R. 252 n. 59 Vidler, C. vi vocabulary 26, 29 logical 257 n. 67, 271 VPB (vagueness-related partial belief) 230–1 Walker, E. 60 n., 63, 67 n., 68 n. 45 Walker, K. vi
Wallace, J. 47 n. 20 WAM (warranted assertibility manoeuvre) 255–6 Weatherson, B. 41, 137 n., 149 n., 150 n., 155 n. 46, 158, 257 n. 68, 259, 260, 262, 273 n., 290–1 Weiner, J. 32 wf 26, 29, 254 atomic 30, 65, 72, 78 closed 30 n. interpreted 112 n. 80, 254, 263 n., 272, 288–9 Williams, R. 41 n. 13, 148 n. Williamson, T. 15, 16, 34 n. 2, 35 n., 36 n. 7, 39–44, 82 n. 55, 85, 123 n., 131 n. 2, 134 n. 9, 173, 181, 183 n., 185 n., 194 n., 224 n. 19, 250 n. 56, 251, 252 n. 59, 256, 258, 260 n. 72, 261, 266 n., 274–5, 279 n., 290 n., 294, 304 n., 305, 307–8 witness 83 problem of missing 83–4, 87, 95–6, 109–10 worldly vagueness 4–6, 44–5, 50, 70–1, 76, 96–8, 121–2, 289, 292 Wright, C. 7, 36 n. 6, 123, 138 n. 20, 139, 159–65, 174 n. 66, 250–1, 267, 304 n. Yager, R. 67, 68 Yuan, B. 60 n., 67, 68, 69 n. 47 Zadeh, L. 60 n., 275–6 Zimmerman, D. 46 n. 18 Zinnes, J. 300 n. 20