Synthese (2011) 183:127–142 DOI 10.1007/s11229-010-9755-x
Why safety doesn’t save closure

Marc Alspector-Kelly
Received: 26 February 2010 / Accepted: 2 June 2010 / Published online: 18 June 2010 © Springer Science+Business Media B.V. 2010
Abstract
Knowledge closure is, roughly, the following claim:
For every agent S and propositions P and Q, if S knows P, knows that P implies Q, and believes Q because it is so implied, then S knows Q. Almost every epistemologist believes that closure is true. Indeed, they often believe that it is so obviously true that any theory implying its denial is thereby refuted. Some prominent epistemologists have nevertheless denied it, most famously Fred Dretske and Robert Nozick. There are closure advocates who see other virtues in those accounts, however, and so who introduce revisions of one sort or another in order to preserve closure while maintaining their spirit. One popular approach is to replace the “sensitivity” constraint at the heart of both of those accounts with a “safety” constraint, as advocated by Timothy Williamson, Duncan Pritchard, Ernest Sosa, Stephen Luper, and others. The purpose of this essay is to show that this approach does not succeed: safety does not save closure. And neither does a popular variation on the safety theme, the safe-basis or safe-indicator account.

Keywords Safety · Sensitivity · Safe · Sensitive · Safe-basis · Safe-indicator · Closed · Closure · Knowledge
1 Sensitivity and closure

Knowledge closure is, roughly, the following claim:
M. Alspector-Kelly (B) Department of Philosophy, Western Michigan University, Kalamazoo, MI 49008-5328, USA e-mail:
[email protected]
For every agent S and propositions P and Q, if S knows P, knows that P implies Q, and believes Q because it is so implied, then S knows Q.1 Almost every epistemologist believes that closure is true. Indeed, they often believe that it is so obviously true that any theory implying its denial is thereby refuted. Some prominent epistemologists have nevertheless denied it, most famously Dretske (1970, 2005a,b) and Nozick (1981). Having discerned other virtues in Dretske’s and/or Nozick’s accounts, some closure advocates have introduced revisions of one sort or another in order to preserve closure while maintaining their spirit.2 One popular approach is to replace the “sensitivity” constraint at the heart of both of those accounts with a “safety” constraint, as advocated by Timothy Williamson, Duncan Pritchard, Ernest Sosa, Stephen Luper, and others.3 The purpose of this essay is to show that this approach does not succeed: safety does not save closure. And neither does a popular variation on the safety theme, the safe-basis or safe-indicator account. Nozick and, supposedly, Dretske—I will explain the “supposedly” below4—require that in order for S to know that P her belief in P must be sensitive, where a belief is sensitive iff were P to be false S would not believe it. This modal conditional is
1 I say “roughly” because it is widely recognized that it is in fact difficult to formulate closure in a manner that is not susceptible to obvious counterexamples (obvious, that is, to both closure advocates and opponents). The closure advocate nevertheless inevitably claims that there is a formulation within near conceptual proximity of the stated but admittedly false version, albeit one the advocate rarely attempts to formulate, but which is obvious and trivial (or, presumably, would be obvious and trivial upon explicit presentation). The history of such claims—that the standard expression of a principle is clearly false but that a conceptual cousin is nearby which just needs a bit of work to formulate, and which is both somehow obvious and can do serious philosophical work—might reasonably give one pause. Consider, for example, the history of the analytic/synthetic distinction. Hawthorne’s (2005) is a notable exception to this breezy attitude toward closure’s formulation on the part of its advocates. His version—modeled on that of Williamson (2002)—is now perhaps the standard formulation rather than that offered above. According to it, if you know P and competently deduce Q from P while knowing P throughout, and come to believe Q as a result, then you know Q. These formulations will be equivalent if a belief’s being based on a competent deduction is equivalent to its being based on knowledge of the inference relation (and I am uncertain which is better if they are not equivalent). At any rate, the reader is welcome to substitute the Hawthorne/Williamson formulation for that presented here. It won’t matter in what follows, since both apply to the cases discussed below.
2 Aside from the safety view that is the subject of this paper, other good examples are the contextualisms
of Stine (1976) and DeRose (1995) and the updated version of Nozick’s view advocated in Roush (2005). Roush’s account is a particularly direct version of this approach: her view essentially involves appending knowledge-by-closure to (probabilistic variations of) Nozick’s tracking conditions, so that one can know either by tracking or by inferring from what one tracks even though one might well not track the latter. Aside from the bare fact that it preserves closure by fiat, as it were, it is hard to see why one attracted to a Nozickian tracking account would endorse a view according to which we are in a position to know a lot that we don’t track. (Given the prevailing sentiment in favor of closure, however, perhaps that’s enough.)
3 Williamson (2002), Pritchard (2005, 2008, 2009), Sosa (1999a,b), Luper (2006). One of the earliest statements of such a view—perhaps the earliest—is Luper (1984), in which a version of what is referred to here as the safe-basis account is defended. I should note that significant variety exists within this family of views (even beyond that which I will mark between safe-belief and safe-basis views). Luper (1984), for example, refers to a sequence of causes, each of which is safe vis-à-vis the proposition known, terminating in the belief; Pritchard (2008, 2009) distinguishes between nearby worlds that are very close and those that are nearby but not very close (see note 14). I pass over these differences because, so far as I can discern, the points made here concerning the relation between safety and closure apply to all such variations.
4 See Sect. 4.
almost universally interpreted in accordance with the Lewis/Stalnaker account of such conditionals: in the nearest world in which P is false S does not believe that P.5 As Nozick and Dretske both concede, sensitivity implies closure failure: it is possible for an agent to have a sensitive belief that P, to infer Q from it, and still have an insensitive belief that Q. For example: you believe that the Broncos beat Buffalo yesterday; you just read that result in today’s Kalamazoo Gazette. That belief is sensitive: if the Broncos had lost, the newspaper would have reported that they had lost. (In the nearest world in which the Broncos lost, the Gazette would report that fact and you would believe that instead.) Since the Broncos did in fact win, the Gazette didn’t erroneously report that they won as a result of an accident while typesetting. Suppose that you believe that the erroneous report did not in fact happen. (It follows, after all, from what you believe, namely, that the Broncos beat Buffalo, and you believe in closure.) That belief is not sensitive: if the paper had erroneously reported the score, you would still believe that it hadn’t. So if knowledge requires sensitivity, you know that the Broncos won. But you don’t know what follows from this, that the Gazette didn’t erroneously report the score. I happen to think that this is the right result. My purpose here is not, however, to attack closure, but only to argue that safety does not preserve it. Notwithstanding such examples, the safety theorist considers closure to be worth preserving. But she still likes the idea that knowledge requires some sort of ruling-out of counterfactual worlds in which you believe P when P is false. So she offers safety instead.
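The Gazette case can be put into a toy Lewis/Stalnaker model. The following Python sketch is my own formalization, not anything in the paper: the worlds, distances, and the `sensitive` check are all illustrative assumptions. It verifies that the belief in P (the Broncos won) is sensitive while the belief in the entailed Q (no erroneous report of a win) is not.

```python
# Toy model (mine, not the paper's) of the Gazette case.
# P = "the Broncos won"; Q = "the Gazette did not erroneously report a win".
# P entails Q: any erroneous-report world is a world in which the Broncos lost.
from dataclasses import dataclass

@dataclass
class World:
    distance: float    # similarity distance from the actual world (assumed)
    p_true: bool       # the Broncos won
    q_true: bool       # no erroneous report of a win
    believes_p: bool
    believes_q: bool

def sensitive(worlds, truth, belief):
    """Sensitivity: S lacks the belief at the nearest world where it is false."""
    false_worlds = [w for w in worlds if not getattr(w, truth)]
    nearest = min(false_worlds, key=lambda w: w.distance)
    return not getattr(nearest, belief)

worlds = [
    # Nearest not-P world: the Broncos lost, the Gazette reports the loss,
    # so S does not believe P (and Q is true: no erroneous report).
    World(1.0, p_true=False, q_true=True, believes_p=False, believes_q=True),
    # Nearest not-Q world (more remote): the Broncos lost but a typesetting
    # accident printed a win; S reads the paper and believes both P and Q.
    World(2.0, p_true=False, q_true=False, believes_p=True, believes_q=True),
]

assert sensitive(worlds, "p_true", "believes_p")      # belief in P is sensitive
assert not sensitive(worlds, "q_true", "believes_q")  # belief in Q is not
```

The point is structural: sensitivity looks only at the nearest world where the relevant proposition is false, and at the nearest not-Q world (the typesetting error) the believer's evidence is unchanged.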
2 Characterizing safety

There are a variety of expressions of safety—were S to believe P, P would be true, S’s belief that P is not easily wrong, and so on—but they come down to the same thing when characterized in terms of possible worlds: a belief is safe iff, within the space of nearby possible worlds—worlds similar to the actual according to some measure—it is true in every world in which the agent believes it; equivalently, there is no such world in which the belief is false but the agent still believes it. I am aware of no reasonably precise characterization of the boundary between worlds nearby and remote (call this the “nearness boundary”) in the relevant literature. But this much is clear: that boundary does not vary with the content of P.6 There are,
5 Lewis (1973), Stalnaker (1968). Nozick does not insist on this account, and in fact tentatively proposes
instead that Q is true throughout the nearest “P neighborhood” of the actual world. See Nozick (1981, p. 680, fn. 8). Dretske’s “official” view—his information-theoretic account formulated in Dretske (1981)—is not expressed in counterfactual terms at all, but is instead expressed probabilistically: S knows that P iff S’s belief that P is caused (or causally sustained) by (a state which manifests) the information that P, where a state manifests the information that P when the probability of P given the existence of that state is 1. Notwithstanding this fact, and the fact that probability and possible-world remoteness are not equivalent (see footnote 18), Dretske has, to my knowledge, demonstrated no resistance to the subjunctive formulation of his view (modulo the point in Sect. 4 referred to in the previous footnote), and I here continue the tradition of doing so. (The relationship between the two formulations, however, deserves exploration.)
6 This contrasts with sensitivity: the boundary between the sphere centered on the actual world in which there are no not-P worlds and the greater sphere (doughnut?) beyond—set by the distance to the nearest possible not-P world(s)—varies with the content of (not-)P.
as a result, two very different ways in which a belief can be safe. If there are worlds in which it is false which lie within that boundary, then safety requires that the agent not believe it in any such world. Call these “near-safe” beliefs. In these cases the safety requirement is more stringent than sensitivity and implies it. If a belief is near-safe then there are nearby worlds in which P is false (by definition of “near-safe”). Therefore the nearest world in which P is false—one of those worlds—is itself nearby. Since the belief is near-safe, the agent believes P in none of those nearby not-P worlds. Therefore, she does not believe it in that nearest world. Therefore, her belief in P is sensitive. Sensitivity does not, however, imply near-safety. Suppose that the nearest not-P world is nearby. Sensitivity requires that she not believe it in that world; but that is consistent with her believing it in more remote but still nearby not-P worlds, thereby violating near-safety. A belief’s being near-safe is not assured by the content of the proposition believed: one person can believe that proposition and know it, while another can believe it and not know it. Something—some feature of the way things actually are with the agent and/or her environment—must therefore rule out the existence of such worlds.7 However, if there are no nearby worlds in which the proposition believed is false—if the nearest such world lies beyond the nearness boundary—then the belief is automatically safe: it is not easily wrong simply because the proposition believed is not easily false. Call these “far-safe” beliefs. A far-safe belief is therefore known as soon as it is believed: it is impossible for one person to believe it and know it and another to believe it and not know it.8 That knowledge therefore imposes no conditions on the agent or her environment whatsoever (beyond, of course, those facts which ensure that the proposition’s negation is remote). 
It is hard to imagine two more disparate knowledge-conditions. One constrains the agent and/or her environment in order to ensure that certain modal conditions hold, conditions more stringent than those imposed by sensitivity. The other imposes no such conditions whatsoever, beyond mere believing itself: with far-safe beliefs, to believe is to know, no matter why you believe it.
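The near-safe/far-safe distinction can be made vivid with a toy model. This Python sketch is my own illustration, not anything from the safety literature; the numeric `NEARNESS` boundary is a stipulation of exactly the sort the text notes is never precisely specified.

```python
# A toy possible-worlds model (mine, not the paper's) of the Sect. 2 definitions.
from dataclasses import dataclass

@dataclass
class World:
    distance: float   # similarity distance from the actual world (assumed)
    p_true: bool      # is P true at this world?
    believes_p: bool  # does S believe P at this world?

NEARNESS = 1.0  # hypothetical nearness boundary

def safe(worlds):
    """Safe: no nearby world where S believes P and P is false."""
    return not any(w.distance <= NEARNESS and w.believes_p and not w.p_true
                   for w in worlds)

def sensitive(worlds):
    """Sensitive: S does not believe P at the nearest not-P world."""
    not_p = [w for w in worlds if not w.p_true]
    if not not_p:
        return True  # no not-P world at all
    nearest = min(not_p, key=lambda w: w.distance)
    return not nearest.believes_p

def far_safe(worlds):
    """Far-safe: every not-P world lies beyond the nearness boundary."""
    return all(w.distance > NEARNESS for w in worlds if not w.p_true)

# Sensitive but not near-safe: S disbelieves P at the nearest not-P world
# (distance 0.3) yet believes P at a more remote but still nearby not-P
# world (distance 0.8).
worlds = [World(0.3, False, False), World(0.8, False, True)]
assert sensitive(worlds) and not safe(worlds)

# Far-safe: the nearest not-P world is remote, so the belief is safe
# however S comes to believe P; it can even be insensitive.
worlds = [World(5.0, False, True)]
assert far_safe(worlds) and safe(worlds) and not sensitive(worlds)
```

The two assertions track the two claims in the text: near-safety implies sensitivity but not conversely, and a far-safe belief is safe regardless of why the agent holds it.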
3 Two concerns in passing

While not directly concerned with the subject-matter at hand—the relation between safety and closure—I cannot resist presenting two brief concerns about the safety account just characterized. We will return to the main program in Sect. 4.
7 The mythology of possible worlds tends to encourage the image that the truth-maker of modal claims is
settled independently of the facts on the ground of the actual world. The actual world is, after all, only one world among many and enjoys no causal contact with them; how could the nature and distribution of all those infinitely many other worlds depend on how things happen to be on the actual world? But the structure of possible-worlds space is a function of the actual facts: it is because of the actual molecular structure of glass that it is fragile, because Bob is actually standing right next to the cliff’s edge that the world in which he falls is unsettlingly nearby, and so on. Modal/dispositional properties have categorical/actual bases. 8 This assumes that a belief’s being safe is sufficient for its being known. We will examine that assumption
more closely in Sect. 8.
First, the fact that mere believing can suffice for knowledge of far-safe beliefs is a very worrisome aspect of the view. Safety theorists respond to this worry by pointing out that we do seem somehow committed to believing their paradigmatic far-safe belief—that I am not a brain in a vat—notwithstanding the apparent utter lack of evidence for it (by design, of course).9 Perhaps such an in-principle-undetectable possibility would have to be known by courtesy, as it were, assuming it to be known at all. But in other cases such knowledge-by-courtesy is simply untenable. For example, in 1999a Sosa presents the famous garbage-chute case against the sensitivity account. I drop a bag of garbage down the garbage chute in my apartment and it arrives as expected in the basement. I believe, and (Sosa insists) I know, that the bag is in the basement. But if the bag were not in the basement I would still believe that it is, since it would have failed to arrive in virtue of having been caught on a protruding nail (or by having suffered some fate along those lines), in which case I would still think it made it all the way down.10 Sosa (and others, including Pritchard 2008) claim that the safety theorist is better positioned to deal with this case, since the belief that it is in the basement is, they claim, safe. It is not, however, near-safe. If it were, then “the bag is in the basement” would be easily false: there would be a nearby world in which the bag is not in the basement, presumably as a result of the same protruding nail that brought the sensitivity account to grief. But then I would still believe it was in the basement, rendering my belief unsafe.11 So if my belief is safe at all, it is far-safe. But far-safe beliefs are known merely by being believed. So my reason for believing that the bag is in the basement—that I dropped it down the chute—is irrelevant; I know it whatever my reason might be, so long as I believe it. Now vary the example a bit.
Suppose that the custodian, having noticed my putting the garbage bag in the hallway closet instead of down the chute, later (and out of my sight) begrudgingly removed it from the closet and dropped it down the chute. I believe that my bag is in the basement because I bizarrely (and unreasonably) think that I teleported it there by placing it in the closet. My belief remains far-safe: if there was no nearby world in which the bag failed to arrive in the basement in the original scenario, nothing in these alterations changes that. But the suggestion that I know this proposition—that my bag is in the basement—merely in virtue of my believing it, despite my being utterly unaware that it went down the chute—is rather hard to swallow. 9 Sosa, for example, takes the sensitivity theorist to task for failing to make sense of your (unknown, according to her) belief that you are not a brain in a vat in Sosa (1999a). 10 There are issues here concerning the status of “backtracking” conditionals, the relation between subjunctive conditionals and their Lewis/Stalnaker interpretations, and related matters that I will not pause to investigate, granting instead for the sake of argument that sensitivity fails in this case. 11 Presumably “were P true then S would believe P” admits of no more backtracking than does “were P
false S would not believe P”. In neither case can we alter the past fact that the bag was dropped into the chute: we can look only to worlds in which the bag was dropped when evaluating either counterfactual. So if there is a nearby world in which the bag is nevertheless not in the basement, this must be because something happened to it on the way down. Consistency, at least, appears to require this of the safety theorist who wields this example against sensitivity.
Nor is this an isolated concern. For similar reasons, the safety theorist must treat my knowing that there is not a mule cleverly disguised to look like a zebra in the zebra paddock (as per Dretske’s famous example in his 1970) as far-safe. (If there were a nearby world in which it is a disguised mule I would still think that it isn’t, rendering the belief unsafe.) But then even if I don’t visit the zoo at all—even if I know absolutely nothing about zoos or their inclinations vis-à-vis the disguising of animals—so long as I nevertheless believe, for whatever bizarre reason, that there isn’t a disguised mule in the zebra paddock of that particular zoo, I will know it. And similarly for any number of parallel, easily constructible cases. The present concern assumes that safety is offered as sufficient for knowledge. The safety theorist might suggest instead that while safety is necessary for knowledge, it is not sufficient.12 She might, for example, suggest that knowledge requires both that the belief be safe and that the agent possess an adequate reason for that belief. Doing so will allay the concern being canvassed. Such a view might, however, strike one as ad hoc. A natural understanding of the purpose of requiring that the agent possess an adequate reason is to eliminate a belief’s being merely luckily true; that is, after all, a traditional motivation for including a justification component in the analysis of knowledge. But insofar as safety is understood—as it is by Pritchard, for example—to ensure that the belief is not true merely by luck, such an understanding is inapplicable vis-à-vis far-safe beliefs, which are guaranteed to be non-lucky no matter what the agent’s reasons might be.
On the other hand, if the adequate-reason requirement is not imposed in order to eliminate luck, then it is unclear why the safety theorist would impose it (except merely as a stopgap by way of resolving this problem).13 My second concern in passing is that, given the dramatic shift in the requirements for knowledge when the boundary between worlds nearby and far is crossed, one would expect a clear, non-arbitrary specification of the line separating them. But, as indicated earlier, I am aware of no such specification. This would perhaps be a more acceptable state of affairs if the boundary is vague (as indeed it intuitively is). But there is no obvious corresponding continuum in the shift in the requirements for knowledge on the safety account: if the negation of the putatively known proposition is far, mere belief is enough, whereas if it is near, much more is required. Knowledge itself might be a vague property, but if so that does not translate into vagueness in the applicability of these very different requirements for knowledge when it exists.14
12 I will be considering this view in more detail in Sect. 8.
13 Safety theorists are well aware of this problem. Logically necessary propositions, for example, are
inevitably far-safe because there can be no nearby worlds in which the proposition is false (simply because there are no worlds, near or far, in which the proposition is false at all). As a result, some safety theorists—Pritchard, for example—restrict the equivalence of knowledge and safe belief to “fully” contingent propositions capable of being false in nearby possible worlds (see, e.g., Pritchard 2007). That restriction will not however resolve the present problem, which concerns clearly contingent propositions. 14 Pritchard’s version of safety might be employed to respond to this concern. In Pritchard (2008) he
suggests that safety perhaps requires only that S not believe P in the nearest of the nearby worlds, although S must not believe P in most of the nearby worlds overall. Perhaps one could generalize this into a continuum by saying that the further out one travels in possible-worlds space away from the actual world, the greater the proportion of not-P worlds in which S believes P to those in which S does not believe P can be, consistently with S’s knowing P (until eventually one reaches a distance at which any such proportion is so consistent).
Sosa (1999a) and others have complained that advocates of relevant alternatives theory (RAT) rarely offer a precise characterization of the distinction between relevant and irrelevant alternatives. But it would appear that the same complaint can be lodged against Sosa’s own safety account.15 This is especially so given that the separation between worlds near and far plays a similar role in the safety account to that played by the relevant/irrelevant alternatives distinction in RAT theories: both are intended to prevent distant sceptical scenarios from having an adverse impact on everyday knowledge, notwithstanding our apparent lack of evidence against them.16

4 Closure-failure and safety

Put those concerns aside, however, and consider the question with which we are directly concerned: does safety preserve closure? Assume that S knows P and knows that P implies Q, believing Q as a result. Assuming also that knowledge requires safety, S’s belief that P is safe (as is S’s belief in the inference). If closure fails, then S’s resulting belief in Q must be unsafe. If it is unsafe, the nearest not-Q world cannot lie outside the nearby-worlds boundary; that would ensure its far-safety. So S’s belief in Q can only be unsafe because there are nearby worlds in which not-Q and S believes Q. That P implies Q requires that every not-Q world is a not-P world. Therefore if there is a nearby not-Q world in which S believes Q, it is also a not-P world. Since S’s belief in P is safe, S does not believe P in such a world. The question whether safety implies closure turns, then, on whether the fact that S does not believe P in every nearby world in which Q is false implies that S also does not believe Q in every such world. If it does, then safety implies closure; for there would be no nearby world in which S believes Q and Q is false, as would be required for a counterexample to closure.
One might argue for this—that S’s not believing that P in nearby not-Q worlds means that S does not believe Q in those worlds—as follows. (This reasoning is my best guess as to why it is often believed that safety implies closure.) We are, after all, talking about closure, wherein S believes Q because she infers it from P. But if S believes Q because she infers it from P, then S believes P. So the counterexample must describe a nearby world in which S infers Q from P, and therefore believes P, and in which P is safe, but where S’s belief in Q is not. But that’s impossible. Again, if S’s belief in Q is unsafe, then the world demonstrating that this is so is a nearby world in which Q is false but S believes it true. But, again, if Q is false then so is P in that 15 It should be noted that Sosa no longer imposes a safety requirement on knowledge, insisting instead
that the belief be “apt”. See Sosa (2007). In this paper I am concerned with Sosa-as-safety-advocate. 16 It is, moreover, not clear that the complaint succeeds against, at least, Dretske’s version of RAT. On
his view, the alternatives to P that are irrelevant for S’s knowledge of P are simply those that don’t show up in the nearest possible world in which P is false. “The thermometer is broken while reading 101 when my temperature is 98.6” is not a relevant alternative to “my temperature is 101” (as indicated by the intact thermometer) because the thermometer is not broken in the nearest world in which my temperature is not 101. In that world—the nearest world in which my temperature is not 101—the different temperature is reflected in the thermometer’s reading. So although “my temperature is not 101” is relevant—on Dretske’s view the negation of P itself is always relevant—the thermometer’s reading puts me in a position to rule it out.
(nearby) world. If S’s belief that P is safe, then S must not believe P in that world. But then she can’t have inferred Q from P, which requires that she does believe P. But we’re assuming that she did infer Q from P. So there is no such world. So S’s belief in Q must be safe if her belief in P is. Q.E.D. But this reasoning is flawed. It is true that S did in fact infer Q from P. But it does not follow from that actual fact that she also inferred Q from P in every counterfactual nearby world in which she believes Q. She could, consistently with her actually inferring Q from P, believe Q for other reasons entirely in those worlds. So she could believe Q for other reasons entirely in a nearby world in which Q is false while in no nearby world—not that one nor any other—does she believe a false P. So her belief in P could be safe while her belief in Q is unsafe; safety does not preserve closure. For example: my Lamborghini is in parking lot space 3B and, being paranoid, I am continuously monitoring a closed-circuit monitor trained on it. It is not stolen, and I do not believe that it is. However, one car, chosen randomly, is stolen by the parking lot attendant each night (but they are never stolen by anyone else; aside from these inside jobs it’s a very secure lot).17 I have no idea that this is the case and trust the attendant implicitly, so that if I see him drive the car away on the monitor I will assume that he is just moving it to another space. A world in which the attendant steals my car is a nearby world: the world in which it is stolen is just like the actual except that the attendant randomly chose my car rather than someone else’s.18 In every nearby world in which my car is not in 3B (because the attendant stole it) I will believe that it is not in 3B, since I will see the attendant drive it away on the monitor. So my belief that it is in 3B is safe. That belief also implies that it is not stolen, which is why I actually believe that it isn’t stolen. 
However, in a world in which the attendant does steal it I will still believe that it was not stolen (thinking that he only moved it to another space). Therefore there is a nearby world in which my car is stolen and I believe that it is not stolen; my true belief that my car is not stolen is not safe. Closure fails. As it happens, closure also fails in this case on the sensitivity view. My belief that my car is in space 3B is sensitive because the nearest world in which it is not in that space is one in which I don’t believe that it is (thanks to the operation of the monitor). However, in the nearest world in which the proposition that it isn’t stolen is false (that is, in which it is stolen), which is the world in which the attendant steals it, I still believe it is not stolen.

5 Safe-basis views and closure

Although surely a disappointment, this result might not be that much of a surprise to some advocates of safety. Indeed, Sosa concedes that safety does not preserve closure.
17 Nor is there a nearby world in which the car is stolen while remaining in 3B (by some sort of illicit transfer of ownership, say).
18 Improbability and world-remoteness are sometimes confused in the literature: that a proposition is improbable is sometimes incorrectly cited as demonstrating its remoteness. But they are not the same. Advocates of safety, moreover, need to keep them separate. For I don’t know that I have won the lottery before the draw, notwithstanding the improbability of my doing so. That it is improbable had therefore better not make my winning remote, since that would make my belief that I will lose safe. (See Pritchard 2008 for discussion).
However, he thinks that a “safe-basis” or “safe-indicator” account does do so, as do Luper and Pritchard (Sosa 1999a,b; Luper 1984, 2006; Pritchard 2008). I will refer to this version as the “safe-basis” account, in contrast to the “safe-belief” account we have been examining. According to the safe-basis account the modal property is predicated, not of the belief itself, but of the basis (or grounds, source, or reason) for the belief: if basis R of a belief that P held, then P would be true. Equivalently: the belief that P has a basis R that it would have only if true. Expressed in terms of possible worlds: There is no nearby world in which S believes that P on the basis of R and in which P is false. It is compatible with this that there nevertheless exist nearby possible worlds in which S believes that P on a different basis than R, and further that P is false in some such world. So that a belief has a safe basis does not imply that the belief itself is safe.19 It is in this way that the safe-basis account will avoid running afoul of, for example, Nozick’s grandmother case (famously also a counterexample to sensitivity). In that case a grandmother, seeing her grandson walking about hale and hearty in front of her, knows as a result that he is not ill, although if he were ill (and so out of her view in a hospital bed) her relatives would still convince her that he was well (to avoid the shock that the bad news would produce).20 Assume that her grandson’s profligate lifestyle is such that illness is a proximate threat, and her belief that her grandson is well is unsafe: there is a nearby world in which he is ill and she believes him well.
However, the basis of her belief—her view of him walking about—is safe: there is no nearby world in which he is ill but she enjoys the same view.21 Similarly, the basis of the grandmother’s belief is sensitive: in the nearest world in which he is ill, she will not believe him well on the basis of her view of him walking about (although she will believe him well). So a sensitive-basis account also avoids Nozick’s grandmother case (without Nozick’s infamous resort to method-relativity). This is in fact Dretske’s account. Notwithstanding his being frequently characterized as requiring that a known belief be sensitive, he requires instead that a belief that P be caused by a condition that is sensitive to P’s holding (which is compatible with the insensitivity of the belief itself). The grandmother case (and others often cited against Nozick, such as the red barn case) demonstrates the significance of the difference.22 The earlier Lamborghini case also provides an example of a belief that is unsafe but with a safe basis. My belief that my car is not stolen is, we saw, unsafe, since there is a nearby world in which the attendant stole it while I watch it all on the monitor and assume that he is only moving it to another space as a result. But notice that the actual basis for my belief that it is not stolen is different than it is in this nearby world: it is instead that the car sits in 3B, as indicated on the monitor, from which I infer that it
19 However, that a belief has an unsafe basis does imply that the belief itself is unsafe. For in a nearby
world demonstrating that a belief has an unsafe basis the belief is false but has the same basis as the actual, which requires that it be falsely believed. 20 Nozick (1981, p. 179). 21 This example demonstrates that the safety theorist cannot require that a belief be both itself safe and
that it have a safe basis in order to be known (which would preserve closure); the grandmother intuitively knows that her grandson is well, notwithstanding the fact that her belief to that effect is unsafe. 22 Dretske distinguishes his view from Nozick’s with respect to this case in Dretske (2005a).
123
136
Synthese (2011) 183:127–142
is not stolen. Since there is no nearby world in which I believe that it is in 3B when it isn’t—thanks to the monitor—there is also no nearby world in which I infer from it’s being in 3B that it is not stolen when it is stolen. So although my belief that it is not stolen is unsafe, it nevertheless has a safe basis. As a result, this case does not constitute a counterexample to closure on the safe-basis account. Sosa (1999b) and Luper (2006) claim, moreover, that the safe-basis account preserves closure in general, since R safely indicates P only if R safely indicates Q, where Q is anything entailed by P. That is correct; for if R safely indicates P, then in every nearby world in which R exists P is true. But in those same worlds Q is true as well. Therefore in every nearby world in which R exists Q is true, and so R is a safe basis for belief in Q. Therefore, so long as the bases of P and Q are the same, closure is preserved. As a demonstration that the safe-basis account preserves closure, however, this reasoning is also fallacious. Assume that the agent infers Q from P; the basis of the agent’s belief in Q is then that it follows from P. But the basis of S’s belief in P is not that it follows from P. If it is an inferential belief, it is inferred from some other proposition; if not, it is based on perception, or memory, or the like. So the actual bases of P and Q are distinct. So the fact that the safe basis relation preserves closure in the sense that if R is a safe basis for P then R is a safe basis for anything that follows from P (including Q) is irrelevant.23 There is no reason why there could not be a nearby world in which my belief in P has a different basis than it has in the actual world, and in which world P is false. (Recall that such a world is allowed by P’s having a safe basis, since it’s having such a basis is compatible with the belief itself being unsafe, and so with a nearby world in which it is false but I believe it anyway.) 
So the actual basis of my belief in P could be safe, compatibly with the existence of a nearby world in which P is false but I still believe it, albeit on a different basis than in the actual world. Further, P's being false in that world allows Q to be false in that world. And since, as we just noted, the bases of my beliefs in P and Q are different in the actual world, the basis of P could differ in that world from its basis in the actual world while the basis of Q remains the same as in the actual world. So Q could be false in that world while it has the same basis as in the actual world. Such a world would demonstrate that Q has an unsafe basis while P has a safe basis, and so that closure fails.

23 Luper presents the argument of the previous paragraph in his 2006. In that same paper, he takes Dretske to task for suggesting that the fact that perception is not closed (that one perceives that p and that p implies q does not require that one perceives that q) is evidence that knowledge is similarly not closed. As Luper points out, the principle "If S knows p noninferentially via perception, and S believes q because S knows that q is entailed by p, then S knows q noninferentially via perception" is false, but only trivially so; one obviously can't know something non-inferentially by inference. That truism, he rightly points out, hardly counts as evidence against knowledge-closure. But by the same token S's knowledge that q does not derive from the same indicator as S's knowledge that p: q is known by inference, while p is known by perception. So the fact that a safe indicator of p is also a safe indicator of q does nothing to demonstrate that the safe-indicator account preserves closure.

Here is an example based on the Lamborghini case. In the actual world I believe that my car is in 3B because I am watching it on the closed-circuit monitor, as before. And, as before, there is a nearby world in which the attendant steals it. However, in that world the attendant will first cut the wires to the monitor. I will then call him in a panic to find out what's going on, and he will reassure me (from his cell phone while joyriding in my car) that the power has blown in the parking lot but the car is still in 3B. And I believe him, since I trust him implicitly. I then infer from my car's being in 3B that it is not stolen, just as I do in the actual world. My belief that my car is in 3B is based on my perception of the image of it sitting in 3B on the closed-circuit monitor. That basis is safe: there is no nearby possible world in which the monitor produces that image and my car is not in 3B.24 That it is sitting in 3B implies that it is not stolen. But the basis of my belief that it is not stolen is not safe: there is a world in which I believe it on the same basis, having inferred it from its sitting in 3B, but in which it is false, since the attendant is falsely assuring me that it is in 3B by phone while joyriding in my car. So closure is not preserved on the safe-basis account either.

And as before, it so happens that closure also fails on the corresponding version of the sensitivity account, the sensitive-basis account. On that account, S's knowledge that P requires that the basis R of S's belief that P be sensitive, that is, that in the nearest possible world in which P is false, R does not exist. The basis of my belief that my car is in 3B is sensitive: in the nearest world in which my car is not in 3B—being the world in which the attendant steals the car and cuts the wires to the monitor—I do not have the reason I actually have—namely, the image on the monitor—for believing that it is in 3B. However, the basis of my belief that my car is not stolen is not sensitive, so long as we assume that the nearest world in which my car is stolen is the one in which the attendant stole it. For then my reason for believing that my car is not stolen—namely, inferring it from my belief that the car is in 3B—remains.
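The logic of this counterexample can be made explicit with a toy possible-worlds encoding (my own sketch, not the author's formalism; the world and basis labels are invented for illustration). A basis is safe just in case no nearby world combines that same basis with the falsity of the proposition believed on it:

```python
# Toy reconstruction of the Lamborghini counterexample (illustrative labels only).
# Each world records which propositions are true there and the basis on which
# each belief is held there. P = "the car is in 3B"; Q = "the car is not stolen".
worlds = {
    "actual": {"P": True, "Q": True,
               "basis": {"P": "monitor-image", "Q": "inference-from-P"}},
    # Nearby world: the attendant cuts the monitor wires, steals the car, and
    # reassures me by phone; I still infer Q from P, on the same basis as before.
    "nearby": {"P": False, "Q": False,
               "basis": {"P": "phone-assurance", "Q": "inference-from-P"}},
}

def basis_is_safe(prop):
    """A basis is safe iff no nearby world pairs the belief's actual basis
    with the falsity of the proposition."""
    actual_basis = worlds["actual"]["basis"][prop]
    return not any(w["basis"][prop] == actual_basis and not w[prop]
                   for name, w in worlds.items() if name != "actual")

print(basis_is_safe("P"))  # True: no nearby world has the monitor basis with P false
print(basis_is_safe("Q"))  # False: the nearby world holds Q on the same basis, yet Q is false
```

The check makes the closure failure vivid: P's basis is safe while Q's basis, inference from P, is not, precisely because Q's basis survives into the nearby world where P's does not.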
24 Assume that the security is such that only the attendant has a chance of stealing the car, and he wouldn't dream of attempting to monkey with the monitor to play a misleading loop of tape when he can just cut its wires and reassure me by phone.

6 Basis-individuation

The advocate of the safe-basis account might counter that the basis of the belief that the car is not stolen, in the world in which the attendant steals the car after having cut the wires, is not in fact the same as in the actual world. For in the actual world the basis is inference-from-the-car's-being-in-3B-as-established-by-viewing-the-monitor, not merely inference-from-the-car's-being-in-3B. Obviously that is not the same basis as in the counterfactual world described above (wherein I believe the car is in 3B because the attendant tells me so while joyriding). In general the basis, so specified, ensures that in the nearby worlds in which the belief in Q has that basis, the belief in P has the same basis as in the actual world (simply because the basis of P is incorporated into that of Q). P will therefore be true in those worlds (since that basis is safe), which ensures that Q is true as well (since P implies Q). So the basis of the belief in Q, so specified, is guaranteed to be safe, and closure is preserved.

It is difficult to say anything decisive here about which is the correct description of the basis, for we face a version of the generality problem: the basis could be individuated more or less finely, and so tailored to suit the purposes of the theorist. We could construe the basis of the belief in Q more broadly as "by inference", which would bring the earlier counterexample back in (since in the relevant counterfactual world the agent did infer Q), and so closure would fail. Or we could construe the basis as "inference-from-the-car's-being-in-3B-as-established-by-viewing-the-monitor-in-the-actual-world", so that there can be no possible worlds at all other than the actual in which the belief in Q has the same basis, which ensures—surely too easily—that there are no such worlds in which Q is false.

Without attempting to resolve that issue in general, there are nevertheless reasons to resist the safe-basis theorist's proposed specification of the basis of the belief in Q. First, it is rather ad hoc to select a characterization of that basis which encompasses the basis of the belief in P. Why should Q's basis, which, it is agreed by all, is different from P's, nevertheless necessarily encompass P's basis? One would, prima facie, expect their bases to be characterized independently. Second, the proposal does not line up with the relevant speech behavior. If asked why I believe that my car isn't stolen, I will say "because it is still in 3B (and it can't be in 3B if it's stolen)." My interlocutor might then ask why I believe that it is in 3B, and my answer to that question will be different in the actual and counterfactual worlds. But intuitively those are (different) answers to the question why I believe P—that the car is in 3B—not answers to why I believe Q—that it isn't stolen. Intuitively, the topic has simply changed from a question concerning the basis of one belief to the basis of another.

The point can be made more generally. I believe that the current U.S. president's last name begins with a letter in the second half of the alphabet, and so (now, at least) do you.
And you and I believe this for what is intuitively the same reason, namely, that "Obama" begins with "O", which is in the second half of the alphabet. Our reasons would differ if you inferred this instead from your belief that the president's name is Josaphat Xerxes. But it remains the same reason if I learned that the president's name is "Obama" by reading it on the net while you saw it on TV. Indeed, if we insist that the basis of a belief always includes the basis of every belief it is inferred from, then even if you and I both infer Q from P and P from R, so long as we believe R on different non-inferential grounds, our beliefs in Q still have different bases. The overall result is that it will only very rarely be correct to say that two people believe P for the same reason. In general, we appear to individuate the basis or reason of a belief arrived at by inference by reference to the mode of inference and the proposition(s) from which the belief is inferred, but not also by reference to the sources of those beliefs themselves. If the safe-basis advocate who wishes to preserve closure wants to depart from this in her characterization of just those cases that cause her difficulty, she needs a principled reason for doing so, one that does not obviously derive from that wish.
7 Good news, bad news

However much all of these additional counterexamples to closure might disturb the safety or safe-basis advocate, the sensitivity theorist might be pleased with the
presentation of yet more grist for her anti-closure mill. But such complacency would be misplaced. For in both versions of the Lamborghini case we have discussed, I intuitively know both P and Q. In each, I know that the car is in 3B in virtue of seeing it there on the monitor, and infer from its presence there that it has not been stolen. That is, intuitively, enough for knowledge that the car has not in fact been stolen, whatever the car’s fortunes in non-actual possible worlds might be. So these are not just counterexamples to closure on the four views (safety, safe-basis, sensitivity, and sensitive-basis) we have discussed. They are also counterexamples to each of those views as theories of knowledge, whether or not those views affirm or deny closure. Nevertheless, these cases do demonstrate that no advantage vis-a-vis closure is to be gained by shifting from sensitivity (or sensitive-basis) views to safety (or safe-basis) views. That supposed advantage is, I suspect, the most influential consideration among those cited in favor of safety over sensitivity. It is, in particular, a crucial component in the safety theorist’s response to scepticism. According to the safety theorist we know not only that we have hands and similar mundane facts, but also that (for example) we are not brains in vats (BIVs), which hypothesis does not therefore threaten our mundane knowledge. The disadvantage of this approach is simply that it is unintuitive that we do know that we are not BIVs— which hypothesis is, after all, expressly designed to neutralize all possible empirical evidence we have available. The benefit, however, is that we need not resort to the denial of closure in order to protect our mundane knowledge from sceptical attack in the manner of Dretske and Nozick (according to whom we know that we have hands but don’t know that we are not BIVs, closure failing from the former to the latter). 
The fact that closure fails on the safety account as well, however, eliminates that supposed advantage of the safety theorist's response to scepticism. This is not the place to explore these different responses to scepticism in detail. And it is certainly true that Dretske's and Nozick's views have problems of their own (most famously, perhaps, the "abominable conjunction" problem).25 But since the widely advertised advantage of the safety account—that it can respond to scepticism without violating closure—is accompanied by violations of closure elsewhere on that account, it is rather less clear that safety represents a significant advance over its sensitivity predecessor.

25 The problem, named as such by Keith DeRose in his 1995, is the infelicity of asserting (for example) "I know I have hands but I don't know that I'm not a handless BIV".
26 Thanks go to two referees whose comments prompted this section.

8 Safety never tried to save closure

The safety advocate might at this point simply deny that the safety condition (of either the belief itself or its basis) is intended to preserve closure, insisting that some other condition of knowledge has that job instead.26 There are two versions of this response.

On the first, safety is necessary for knowledge, but not sufficient. This is likely the most common version; Williamson, for example, explicitly disavows the project of specifying sufficient conditions for knowledge.27 On this version, closure is preserved because P in the counterexamples is not known, notwithstanding its being safe, since it fails to satisfy an additional closure-preserving condition. This version, however, violates the quite robust intuition noted above to the effect that both P and Q in the counterexamples are known. Indeed, the intuition is, I suspect, especially strong that I know P in particular: surely continuous scrutiny of the closed-circuit monitor image provides me with knowledge that my car is in 3B.

Indeed, the strategy of preserving closure by denying knowledge of P ultimately leads to scepticism. The counterexamples can easily be revised so that my knowledge of P is based on unaided perception rather than on the closed-circuit monitor. In the first Lamborghini case, for example, suppose my (extreme) paranoia leaves me unsatisfied with the closed-circuit monitor: I now sit in the parking lot itself with the car in full view, watching it continuously. (However, when I watch the attendant drive it away in order to steal it, while assuring me he is moving it to another space, I still trust him as before.)28 Denial of knowledge of P now requires denying knowledge of my car's presence in front of my watchful eyes. Generalized, the result is (at least) external-world scepticism. So no response to these examples that involves denying knowledge of P will be tenable. But the attempt by the safety theorist to preserve closure by appeal to safety as a necessary but insufficient condition can only proceed by denying knowledge of P.29 So closure cannot be recovered by construing safety as a necessary but insufficient condition of knowledge.30

The second version of the response under consideration avoids this difficulty. On this version, safety is sufficient for knowledge, but not necessary.
While likely to be less popular among safety advocates—it is indeed unclear whether such a position would count as within the family of safety accounts at all—such a position could be modeled on Roush's modified tracking account.31 By way of reconciling closure with Nozick's tracking account, Roush simply counts as known both those beliefs that track the truth and those that are known consequents of beliefs that track the truth. So while tracking is not a necessary condition of knowledge, the only exceptions to its being required are precisely those which ensure that closure succeeds. The safety theorist could attempt a similar move: known beliefs are those that are safe or the known consequents of such beliefs, whether or not those consequents are safe. On this version, closure is preserved in the examples above because both P and Q are known notwithstanding Q's not being safe, since Q is the known consequent of safe beliefs.

27 Indeed, he disavows provision of any analysis of knowledge whatsoever. Notwithstanding that stance, he does nevertheless appear committed to safety as at least a necessary condition. See Williamson (2002).
28 In the second version of the case, imagine that instead of cutting the cables to the monitor before stealing the car, the attendant cuts the lights in the parking lot, and proceeds to provide me with verbal assurance by cell phone that the car remains in 3B. (Assume also that I am deaf and so cannot hear the car's departure.)
29 Such a view cannot instead affirm knowledge of (both P and) Q, since Q fails to satisfy the (on this view) necessary condition of being safe.
30 It might also be worth noting that the additional condition contemplated in Sect. 3—that the agent possess an adequate reason for believing the proposition, in addition to its being safe—will not help preserve closure. For in the counterexamples the agent does possess what appears to be such a reason. In the first Lamborghini case, for example, I have what appears to be a perfectly good reason to believe that the car is in 3B, namely, that I (appear to) see it on the monitor.
31 See footnote 2.

I leave it to the safety advocate to decide whether construing safety as an unnecessary condition for knowledge recovers enough of the safety account to be worth endorsing. At any rate the attempt to, as it were, shift the burden of sustaining closure onto another characteristic of knowledge amounts to conceding the point I aim to establish here: safety itself does not preserve closure. It cannot therefore be cited as an advantage of a safety account (over sensitivity accounts, in particular) that its adoption preserves closure where other views do not. Moreover, even if the resulting view also provides a response to scepticism, that advantage also cannot be ascribed to the safety condition. For if closure is preserved in the account by means independent of the safety condition, that fact alone grounds a "Moorean" response to scepticism. "I am not a (handless) BIV", for example, follows from my (presumably safe) belief that I have hands. So whether or not the former anti-sceptical belief is safe, the view under consideration implies that it is known, since it is a known consequence of a known belief. The invocation of safety is therefore superfluous: the independent closure-satisfying condition grounds a Moorean response that renders the "neo-Moorean" response invoking safety otiose. (Compare Roush's account, which provides precisely this response to scepticism, while grounded in a Nozickian sensitivity account.32) So while the safety condition could be a sufficient but unnecessary component of an overall theory of knowledge that preserves closure, it is no longer clear that this component makes any contribution worth citing over a corresponding view incorporating a sensitivity component.
And if the safety advocate insists that safety be construed as at least a necessary condition of knowledge, if not a necessary and sufficient condition, and she is also not a sceptic (as indeed none are), then closure failure is a consequence of both the safe-belief and safe-basis accounts as much as it is of the corresponding sensitivity accounts. So it remains the case that no obvious advantage is to be gained by endorsing a safety-based account over its sensitivity-based predecessors, at least with respect to the status of closure.33
32 See Roush (2005, pp. 51–57).
33 Much thanks to two anonymous referees for helpful comments.

References

DeRose, K. (1995). Solving the skeptical problem. Philosophical Review, 104, 1–52.
Dretske, F. (1970). Epistemic operators. Journal of Philosophy, 67, 1007–1023.
Dretske, F. (1981). Knowledge and the flow of information. Cambridge: MIT Press.
Dretske, F. (2005a). The case against closure. In M. Steup & E. Sosa (Eds.), Contemporary debates in epistemology (pp. 13–26). Oxford: Blackwell.
Dretske, F. (2005b). Reply to Hawthorne. In M. Steup & E. Sosa (Eds.), Contemporary debates in epistemology (pp. 43–46). Oxford: Blackwell.
Hawthorne, J. (2005). The case for closure. In M. Steup & E. Sosa (Eds.), Contemporary debates in epistemology (pp. 26–43). Oxford: Blackwell.
Lewis, D. (1973). Counterfactuals. Cambridge: Cambridge University Press.
Luper, S. (1984). The epistemic predicament: Knowledge, Nozickian tracking, and scepticism. Australasian Journal of Philosophy, 62(1), 26–49.
Luper, S. (2006). Dretske on knowledge closure. Australasian Journal of Philosophy, 84(3), 379–394.
Nozick, R. (1981). Philosophical explanations. Oxford: Oxford University Press.
Pritchard, D. (2005). Epistemic luck. Oxford: Oxford University Press.
Pritchard, D. (2007). Anti-luck epistemology. Synthese, 158, 277–297.
Pritchard, D. (2008). Sensitivity, safety, and anti-luck epistemology. In J. Greco (Ed.), The Oxford handbook of scepticism (pp. 437–455). Oxford: Oxford University Press.
Pritchard, D. (2009). Safety-based epistemology: Whither now? Journal of Philosophical Research, 34, 33–45.
Roush, S. (2005). Tracking truth: Knowledge, evidence, and science. Oxford: Clarendon Press.
Sosa, E. (1999a). How to defeat opposition to Moore. Philosophical Perspectives, 13, 141–154.
Sosa, E. (1999b). How must knowledge be modally related to what is known? Philosophical Topics, 26, 373–384.
Sosa, E. (2007). A virtue epistemology: Apt belief and reflective knowledge (Vol. 1). Oxford: Oxford University Press.
Stalnaker, R. (1968). A theory of conditionals. American Philosophical Quarterly, monograph no. 2, 98–112.
Stine, G. (1976). Skepticism, relevant alternatives, and deductive closure. Philosophical Studies, 29, 249–261.
Williamson, T. (2002). Knowledge and its limits. Oxford: Oxford University Press.
Synthese (2011) 183:143–160 DOI 10.1007/s11229-010-9756-9
Indexicals, context-sensitivity and the failure of implication Gillian Russell
Received: 23 February 2010 / Accepted: 3 June 2010 / Published online: 24 June 2010 © Springer Science+Business Media B.V. 2010
Abstract This paper investigates, formulates and proves an indexical barrier theorem, according to which sets of non-indexical sentences do not entail (except under specified special circumstances) indexical sentences. It surveys the usual difficulties for this kind of project, as well as some that are specific to the case of indexicals, and adapts the strategy of Restall and Russell's "Barriers to Implication" to overcome these. At the end of the paper a reverse barrier theorem is also proved, according to which an indexical sentence will not, except under specified circumstances, entail a non-indexical one.

Keywords Indexical · Context-sensitive · Language · Logic · Implication barrier theorem

G. Russell (B) Washington University in St. Louis, St. Louis, MO, USA
e-mail: [email protected]

1 Introduction

By an implication barrier thesis I shall mean a claim which says that no set containing only sentences of one kind entails a sentence of another kind: for example, the claim that no set containing only descriptive sentences entails a normative sentence, or the claim that no set containing only particular sentences entails a universal one. The aim of the present paper is to formulate and prove an indexical barrier theorem, according to which (extremely roughly) no set containing only non-indexical sentences entails an indexical sentence. Though a number of obstacles to the proof of such a theorem exist, the thought that there is some non-trivial theorem to be discovered is motivated by well-known thought experiments from the philosophies of language and mind, such as those of Hector-Neri Castaneda, John Perry and David Lewis (Castaneda 1968; Lewis 1979; Perry 1988). The work presented here might be thought to belong to the domain of logic, but it is hoped that its most interesting applications will be in philosophy more generally: for example, in providing an underlying explanation for the phenomena noted by Perry, Lewis et al., or in providing further data-points in disputes over whether certain philosophically interesting expressions—such as vague expressions, the truth-predicate, and knowledge attributions—are genuinely indexical. Perhaps it might also be used to explain the non-derivability of the A-series from the B-series in the philosophy of time. In the first section of the paper I present some well-known general obstacles to the formulation of barrier theses. Section 3 explains how the model-theoretic approach employed in Restall and Russell (2010) can be used to overcome these obstacles, and Sect. 4 then applies this same strategy to the indexical case, addressing some new problems that arise, before formulating and proving the indexical barrier theorem.

2 Barriers to implication

The main obstacle to the establishment of implication barrier theses is the existence of putative counterexamples. Since an implication barrier thesis holds that sets of sentences of one kind never entail a sentence of another, such counterexamples take the form of valid arguments from premises of the first kind to conclusions of the second. Many of the counterexamples proposed in the literature were first intended as objections to the controversial thesis known as Hume's Law, which says that no set of descriptive sentences entails a normative one (e.g. Prior 1960; Searle 1964; Jackson 1971; see also Russell 2010 for discussion of more).
However, many of those arguments are easily transformed into putative counterexamples to the less controversial theses—including an indexical barrier thesis. For example, A. N. Prior takes the following to be a counterexample to Hume's Law:

Tea-drinking is common in England.
Therefore, tea-drinking is common in England, or all New Zealanders ought to be shot.

Aware that some readers will be tempted to respond that the conclusion of this argument is not normative, he suggests that if it is not, then we take it as a premise in the following argument:

Tea-drinking is common in England, or all New Zealanders ought to be shot.
Tea-drinking is not common in England.
Therefore, all New Zealanders ought to be shot.

The force of Prior's point comes from the pressure exerted by both arguments together: if it were not for the second, we might happily call the disjunction descriptive, and thus dismiss the first counterexample; and if it were not for the first, we might happily call the disjunction normative and maintain Hume's Law that way. But taking both together, neither way out looks particularly attractive.
If we use D as a schematic letter replaceable by any descriptive sentence, and N as a schematic letter replaceable by any normative sentence, then we may represent the forms of Prior's arguments more succinctly as:

D
∴ D ∨ N

D ∨ N
¬D
∴ N
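Classical validity of both argument forms can be confirmed mechanically. The following brute-force truth-table check (an illustrative sketch of my own, with D and N treated as atomic sentences) is one way to do so:

```python
from itertools import product

def valid(premises, conclusion, atoms):
    """An argument is classically valid iff no valuation of the atoms makes
    every premise true while making the conclusion false."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

# Prior's first argument: D, therefore D or N (disjunction introduction).
intro = valid([lambda v: v["D"]],
              lambda v: v["D"] or v["N"], ["D", "N"])

# Prior's second argument: D or N, not-D, therefore N (disjunctive syllogism).
syllogism = valid([lambda v: v["D"] or v["N"], lambda v: not v["D"]],
                  lambda v: v["N"], ["D", "N"])

print(intro, syllogism)  # prints: True True
```

Both checks return True, which is just Prior's starting point: the barrier theorist cannot dispute the classical validity of the schemata, only their classification as crossing the barrier.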
It is clear that this argument pair is easily turned against other barrier theses. For example, if one is considering whether it is possible to derive a general claim G from a particular one P, or alternatively an indexical claim I from a constant claim C, then the following classically valid schemata pose putative counterexamples:

P
∴ P ∨ G

P ∨ G
¬P
∴ G

C
∴ C ∨ I

C ∨ I
¬C
∴ I

More potential counterexamples are to be found in the fact that within classical logic anything follows from a contradiction, and a theorem follows from anything, giving us such arguments as:

C
¬C
∴ I

C
∴ I ∨ ¬I

One might think that the only thing for the responsible philosopher to do in response to the counterexamples to Hume's Law is to give up the claim as a misleadingly intuitive, but ultimately mistaken, thought. Yet this response in the controversial case looks much less attractive once it is realised that the same objections seem to apply to philosophical platitudes such as 'you can't get general claims from particular ones' or 'you can't deduce claims about the future from claims about the past.' Surely there is something right about these ideas, and it is very tempting to think that what the counterexamples really suggest is that our straightforward formulation of the claims as 'claims of type B are not entailed by premises of type A' was overly simplistic, and that we need to do some philosophical work to come up with a more sophisticated version of these barrier theses which avoids (hopefully in some non-ad hoc way) the counterexamples whilst still capturing something that is plausibly the intuitive content of the thesis. One strategy is to become more careful about what we mean by premise- and conclusion-class types like 'particular' and 'universal' or 'constant' and 'indexical.' We can try to define these classes in such a way that none of the arguments above counts as an argument from a set of premises of the relevant premise-class to a sentence of the relevant conclusion-class.
A version of this strategy, and its success in simple cases, is presented below in Sect. 3; in Sect. 4 we then adapt it for the more complex case of indexicals.

3 The barrier construction theorem

The particular/general barrier is the simplest case. Instead of thinking of general sentences syntactically, as those which contain a universal quantifier, and
particular sentences as those which do not, we can characterise our kinds of sentence model-theoretically. Suppose that a sentence like Fa is true in some model. Then one thing we can say is that such a claim seems to be a local one: it is made true by some particular part of the model, and as a result, if we extend that model by adding extra elements to the domain, that will not make Fa false (Fig. 1).

Fig. 1 True particular claims stay true when the model is extended

Universal claims—like ∀x Fx—are not local but global; they make claims about the entire model (Fig. 2). As a result, they are such that whenever one of them is true in a model, it can be made false by extending the model, in this case by adding an element which is not F.

Fig. 2 True universal claims may become false when the model is extended

A little more formally, we say:

Definition 1 (Extension, a binary relation on FO-models) A model M′ is an extension of a model M (M′ ⊇ M) just in case M′ can be obtained from M by adding more objects to the domain and extending the interpretation of the predicates to cover the cases of the new objects. (If F is an n-place predicate and α an assignment of variables to values in the domain of M (avoiding the extra objects in M′), then M, α ⊨ F x₁, …, xₙ if and only if M′, α ⊨ F x₁, …, xₙ.)

The intuitive idea is that one model extends another if you can get it from the first by adding elements and extending the interpretation function in some appropriate manner. We then use this relation over models to define our two classes of sentences.

Definition 2 (Genuine Particularity) A sentence A is genuinely particular iff for each M, M′: if M ⊨ A and M′ ⊇ M, then M′ ⊨ A.

Sentences which are genuinely particular on this definition include Fa, Fa ∧ Fb, ¬Fa, ∃x Fx and ¬∀x Fx.

Definition 3 (Genuine Universality) A sentence A is genuinely universal iff for each M where M ⊨ A, there is some M′ ⊇ M where M′ ⊭ A.
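The definitions can be illustrated with a toy computational sketch (my own, using invented finite models rather than anything in the paper): a one-predicate model, an extension of it, and checks showing that Fa and ∃x Fx survive extension while ∀x Fx does not.

```python
# A toy FO-model for a language with one predicate F and one constant a:
# a domain of objects, the set of objects satisfying F, and a's referent.
class Model:
    def __init__(self, domain, F, a):
        self.domain, self.F, self.a = set(domain), set(F), a

M  = Model(domain={1}, F={1}, a=1)      # here Fa, ∃x Fx and ∀x Fx are all true
M2 = Model(domain={1, 2}, F={1}, a=1)   # an extension of M: adds 2, which is not F

def Fa(m):     return m.a in m.F
def some_F(m): return any(x in m.F for x in m.domain)
def all_F(m):  return all(x in m.F for x in m.domain)

# Fa and ∃x Fx behave as genuinely particular sentences:
# true in M, they remain true in the extension M2.
print(Fa(M), Fa(M2))          # True True
print(some_F(M), some_F(M2))  # True True

# ∀x Fx behaves as genuinely universal: true in M, it is made
# false by the extension M2, which adds a non-F element.
print(all_F(M), all_F(M2))    # True False
```

One extension falsifying a true universal sentence is exactly what Definition 3 demands; a full check of genuine particularity would of course have to quantify over all models and extensions, which this finite sketch only samples.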
Sentences which are genuinely universal on this definition include ∀x Fx, ∀x(Fx ∧ Gx), ∀x(Fx ⊃ Gx), ¬∃x Fx, ∀x∃y Fxy and ∃y∀x Fxy, as well as (standard translations of) some sentences that might be thought to involve 'hidden' universality, such as ∃x(Fx ∧ Gx ∧ ∀y(Fy ⊃ x = y)) (the only F is G) and ∃x∃y(Fx ∧ Fy ∧ x ≠ y ∧ ∀z(Fz ⊃ (z = x ∨ z = y))) (there are exactly two Fs).

Some further consequences of the definitions are worth observing. First, one feature of such a model-theoretic characterisation is that any sentence that is logically equivalent to a genuinely particular sentence is itself genuinely particular. For example, since Fa is genuinely particular, so are Fa ∨ Fa and ∀x(Fx ∨ ¬Fx) ⊃ Fa. Moreover, any sentence which is equivalent to a genuinely universal sentence is itself genuinely universal. For example, since ∀x Fx is genuinely universal, so is ¬∃x¬Fx. This would seem to be just as it should be. Second, it is worth noting that by the Łoś–Tarski theorem (Hodges 1997, pp. 143–146) the set of genuinely particular sentences characterised here will be the set of ∃₁ sentences, i.e. sentences which in prenex normal form consist of a string of existential quantifiers followed by a quantifier-free formula. Third, the two classes of sentences defined are not exhaustive of the set of sentences; there are some sentences which count neither as genuinely particular nor as genuinely universal. One example is Fa ∨ ∀x Gx. Whether or not a model which makes this sentence true can be extended to one which makes it false depends on how the sentence is made true in the first place. If the model makes Fa true, then the disjunction will be true in all extensions of the model. But if the model makes ∀x Gx true without making Fa true, then there will be extensions of that model which leave Fa false and make ∀x Gx false as well, making the entire disjunction false.
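The case of Fa ∨ ∀x Gx can be checked the same way. In this Python sketch (the encoding and all names are mine), a model making the disjunction true via Fa keeps it true in every extension, while a model making it true only via ∀x Gx has a falsifying extension.

```python
from itertools import product

def models(max_n):
    """Models over {0..n-1} with two unary predicates F and G."""
    for n in range(1, max_n + 1):
        dom = frozenset(range(n))
        for bits in product(range(4), repeat=n):
            yield (dom,
                   frozenset(x for x in dom if bits[x] & 1),   # F
                   frozenset(x for x in dom if bits[x] & 2))   # G

def extends(m1, m0):
    (d1, f1, g1), (d0, f0, g0) = m1, m0
    return d0 <= d1 and f1 & d0 == f0 and g1 & d0 == g0

disj = lambda dom, F, G: 0 in F or G == dom   # Fa ∨ ∀x Gx, 'a' naming object 0

# Made true via Fa: the disjunction stays true in every extension.
m_fa = (frozenset({0}), frozenset({0}), frozenset())
assert all(disj(*m1) for m1 in models(3) if extends(m1, m_fa))

# Made true via ∀x Gx alone: some extension falsifies the disjunction.
m_g = (frozenset({0}), frozenset(), frozenset({0}))
assert any(not disj(*m1) for m1 in models(3) if extends(m1, m_g))
```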
This is the heart of the response to the Prior-style counterexamples: since such disjunctions are neither genuinely universal nor genuinely particular, neither of Prior's arguments is one from purely particular premises to a general conclusion. Since many mixed conditionals, such as Fa ⊃ ∀x Fx, are equivalent to such disjunctions, such conditionals will be classified as neither too. Fourth, and perhaps more surprisingly, the two classes of sentences are not exclusive either, since contradictions trivially satisfy both definitions. I take contradictions to be degenerate cases of these definitions. Fifth, and least happily, I note that all theorems of the logic count as particular, even though some of them are uncannily universal-looking, such as ∀x(Fx ∨ ¬Fx). Yet this is consistent with our motivating idea that universal sentences are those that restrict the entire model in some way; theorems cannot restrict our models, since they are true in all of them. With our two classes of sentences in hand, we can formulate our particular/universal barrier thesis in those terms:

Theorem 1 (Particular/General Barrier Theorem) No satisfiable set of genuinely particular sentences entails a genuinely universal sentence.

One advantage of this version of the thesis is that it is provable, as we will see before the end of this section. The strategy just illustrated can be applied in other cases, such as the past/future barrier thesis. What we will require to proceed in each new case is:
1. a formal language appropriate to the kinds of sentences we are interested in (such as a modal logic, if we are interested in the merely-actual/necessity barrier thesis, or a deontic logic, if we are interested in Hume's Law), and
2. a model theory for that language, i.e. a set of structures with respect to which the sentences of the language are true or false.

This gives us the resources for defining our two classes of sentences, and provides a sufficiently precise notion of logical truth and logical consequence for arguments expressed in the formal language. For example, in the case of the barrier thesis which says that no set of sentences just about the present or past entails any sentence about the future, we get an appropriate language by adding these unary tense operators to the language of a simple, truth-functional logic: P ('at some time in the past it is the case that') and F ('at some time in the future it is the case that'). The structures (T, p, ≤, I) for this language consist of a set of points (or times) t ∈ T, of which one, p (the present moment), is special and used for defining truth in that structure. The elements of T are ordered by the relation ≤, and finally an interpretation function I maps each atomic sentence and time to a truth-value. We extend the interpretation function to cover the rest of the language with the usual recursive clauses, to which we add these clauses for our new operators:

Pq is true at t iff there exists some t′ such that t′ ≤ t and q is true at t′.
Fq is true at t iff there exists some t′ such that t < t′ and q is true at t′.

A sentence is true in a structure (T, p, ≤, I) iff it is true at p. A sentence is a logical truth iff it is true in all structures, and a sentence B is a logical consequence of a set of sentences S iff whenever every member of S is true in some structure, B is true in that structure as well. Now we can apply our strategy for formulating the barrier thesis.
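The two tense clauses can be prototyped directly. In the following Python sketch (the encoding is mine; I read P as reflexive, matching the gloss of ¬P¬q as 'at all earlier or present times q', and F as strict), sentences are nested tuples and a valuation assigns the atom q a truth-value at each of four times, with time 1 as the present.

```python
T, PRESENT = 4, 1  # times 0..3; time 1 is the present moment p

def holds(val, s, t=PRESENT):
    """val: tuple of truth-values for the atom 'q', one per time.
    Sentences: 'q', ('not', s), ('P', s), ('F', s)."""
    if s == 'q':
        return val[t]
    op, body = s
    if op == 'not':
        return not holds(val, body, t)
    if op == 'P':  # at some earlier-or-present time
        return any(holds(val, body, u) for u in range(T) if u <= t)
    if op == 'F':  # at some strictly later time
        return any(holds(val, body, u) for u in range(T) if t < u)

val = (True, False, False, True)  # q held in the past and will hold again
assert holds(val, ('P', 'q'))
assert holds(val, ('F', 'q'))
assert not holds(val, ('not', ('P', 'q')))
```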
We need a binary relation defined on our structures, analogous to the extension relation that we used in the particular/general case. We use the relation of future-switching (∼). Intuitively, one structure stands in the future-switching relation to another whenever the atomic sentences get the same truth-values up until and including the present time, and may or may not get the same values for any point after that. Sentences which are genuinely about the present or the past are such that if they are true in a structure, they will be true in every future-switched structure. Sentences which are genuinely about the future, by contrast, will be such that if they are true in a structure, future-switching will always be able to make them false. More carefully:

Definition 4 (Future-Switching (a binary relation on tense-logical models)) A model M′ = (T, p, ≤, I′) is a future-switch of a model M = (T, p, ≤, I) (M′ ∼ M) just in case for all atomic sentences φ in the language and all times t, if t ≤ p, then I′(φ, t) = I(φ, t).

Intuitively, in a future-switched model the world is the same at the times up to and including the present moment, and after the present moment it may diverge. Then we suggest that truths which survive such a change are genuinely historical, and truths which do not are genuinely future-directed.

Definition 5 (Genuinely Historical (or Present) Sentences) A sentence A is genuinely historical iff for each M, M′ ∈ U, if M ⊨ A and M′ ∼ M then M′ ⊨ A.
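Definitions 4 and 5 (and Definition 6 below) can likewise be checked by brute force: with finitely many times and a single atom, we can enumerate every valuation and every future-switch. The Python sketch below (the encoding and bounds are mine) confirms that Pq survives every future-switch while Fq can always be switched false.

```python
from itertools import product

T, PRESENT = 4, 1  # times 0..3; time 1 is the present moment

def holds(val, s, t=PRESENT):
    if s == 'q':
        return val[t]
    op, body = s
    if op == 'not':
        return not holds(val, body, t)
    if op == 'P':
        return any(holds(val, body, u) for u in range(T) if u <= t)
    if op == 'F':
        return any(holds(val, body, u) for u in range(T) if t < u)

def switches(val):
    """Definition 4: agree at every time up to and including the present."""
    for tail in product((True, False), repeat=T - PRESENT - 1):
        yield val[:PRESENT + 1] + tail

VALS = list(product((True, False), repeat=T))

def historical(s):
    """Definition 5: truth survives every future-switch."""
    return all(holds(v2, s) for v in VALS if holds(v, s) for v2 in switches(v))

def future_directed(s):
    """Definition 6: wherever the sentence is true, some future-switch falsifies it."""
    return all(any(not holds(v2, s) for v2 in switches(v))
               for v in VALS if holds(v, s))

assert historical(('P', 'q')) and not future_directed(('P', 'q'))
assert future_directed(('F', 'q')) and not historical(('F', 'q'))
```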
Sentences which will count as genuinely historical (or present) on this definition include q, Pq, P¬q, and ¬P¬q (at all earlier or present times, q).

Definition 6 (Genuinely Future-Directed Sentences) A sentence A is genuinely future-directed iff for each M ∈ U where M ⊨ A, there is some M′ ∼ M where M′ ⊭ A.

Sentences which will count as genuinely future-directed on this definition include Fq, FFq, and ¬F¬q (at all future times, q). One interesting case which counts as neither is FPq, since this can sometimes be made false by changing the future and sometimes by changing the past. We use our new definitions to formulate the barrier thesis:

Theorem 2 (Past/Future Barrier Theorem) No satisfiable set of genuinely historical (or present) sentences entails a genuinely future-directed sentence.

The general pattern should hopefully be clear from these two cases. We take a logic (a formal language with a model theory) appropriate to characterising the implication relations between the two kinds of sentence. Then we define an appropriate relation R over the set of models, and use it to characterise two kinds of sentences:

Definition 7 (R-Preservation) A sentence A is R-preserved with respect to a class of models 𝓜 if and only if for all models M, M′ ∈ 𝓜, if M satisfies A and M R M′ then M′ satisfies A.

Definition 8 (R-Fragility) A sentence A is R-fragile with respect to a class of models 𝓜 if and only if for all models M ∈ 𝓜, if M satisfies A then there is some M′ ∈ 𝓜 such that M R M′ and M′ does not satisfy A.

Then we use these sentences to construct an implication barrier thesis, which will then be a consequence of the following theorem:

Theorem 3 (Barrier Construction Theorem) Given a class 𝓜 of models and a collection X ∪ {A} of formulas, if (a) X is satisfied by some model in 𝓜; (b) A is R-fragile; and (c) each element of X is R-preserved, then X ⊭𝓜 A.

Proof Since X is satisfied by some model (a), choose one such model, M. If M ⊭ A, then X ⊭𝓜 A and we are finished.
On the other hand, if M ⊨ A, then since A is R-fragile (b), there is some M′ where M R M′ and M′ ⊭ A. Now, since each element of X is R-preserved (c), M′ satisfies each element of X, and M′ is our counterexample to the validity of the argument from X to A: X ⊭𝓜 A. (Restall and Russell 2010)

4 The indexical case

4.1 Motivations

Following Kaplan (1989b) we generally think of indexicals as expressions whose content varies with context of utterance. Paradigm cases include 'I', 'now', 'here' and 'today', as well as demonstratives such as 'he' and 'that tall man'. We do not
normally speak of whole sentences as indexical, but since the content of a sentence is determined by the content of its parts, any sentence containing an indexical will inherit indexicality itself. One reason to pursue my project is the hope that it might provide an underlying logical explanation for the phenomena we find in some famous thought experiments. In "The Problem of the Essential Indexical" Perry writes:

I once followed a trail of sugar on a supermarket floor, pushing my cart down the aisle on one side of a tall counter and back the aisle on the other, seeking the shopper with the torn sack to tell him he was making a mess. With each trip around the counter, the trail became thicker. But I seemed unable to catch up. Finally it dawned on me. I was the shopper I was trying to catch. I believed at the outset that the shopper with a torn sack was making a mess. And I was right. But I didn't believe that I was making a mess. That seems to be something I came to believe (Perry 1979, p. 3).

And in "Frege on Demonstratives" we find another famous case:

An amnesiac, Rudolf Lingens, is lost in the Stanford library. He reads a number of things in the library, including a biography of himself, and a detailed account of the library in which he is lost . . . He still won't know who he is, and where he is, no matter how much knowledge he piles up, until that moment when he is ready to say, "This place is aisle five, floor six, of Main Library, Stanford. I am Rudolf Lingens" (Perry 1977).

Perry puts his examples to a purpose other than mine, but it does not seem too great a stretch to hold that both the messy shopper and the amnesiac in the library must finally come to their realisations through non-deductive inference, such as inference to the best explanation.
Perry holds that we have a good explanation of the change in the agent's behaviour (stopping his pursuit and checking his own shopping cart, or suddenly throwing up his hands with the realisation "I am Rudolf Lingens") only when we attribute a belief using the indexical 'I'. The indexical is said to be essential to the explanation, and one explanation for that fact might be that it isn't derivable from claims that the agent already accepts. Responding to Perry's example, Lewis devises a case of his own:

We can imagine a more difficult predicament. Consider the case of the two gods. They inhabit a certain world . . . They are not exactly alike. One lives on top of the tallest mountain and throws down manna; the other lives on top of the coldest mountain and throws down thunderbolts (Lewis 1979, pp. 520–521).

I adjust his case slightly here to suit my needs. Each god is omniscient, but in a certain very special sense: he knows the truth-value of every constant sentence, e.g. the truth-values of sentences like there are two mountains and the god who lives on the tallest mountain throws thunderbolts. Being gods, they do not have normal animal perception, but simply intuit the truth-values of these sentences. Our question is: is either god able to deduce the truth-value of any indexical sentence, such as I am the
god who lives on the tallest mountain or the tallest mountain is located here? Lewis supposes that "[n]either one knows whether he lives on the tallest mountain or on the coldest mountain, nor whether he throws manna or thunderbolts" (Lewis 1979, p. 521). One might think that one could translate these suppositions into the present framework by saying that neither god knows the truth-value of the sentence 'I live on the tallest mountain' or 'I live on the coldest mountain' (relative to the relevant context of utterance), nor whether 'I am the god who throws thunderbolts' or 'I am the god who throws manna' is true. A further reason to seek an indexical barrier theorem is that our present strategy looks particularly well suited to this case. Shouldn't we expect indexical sentences to be fragile with respect to change in context? And constant sentences to be preserved with respect to change in context? It looks as if all we have to do is find a model theory suitable for addressing entailment questions between sentences containing indexicals, and the rest should be easy. In fact it will not be quite that easy, as we will see.

4.2 A logic for indexicals

Kaplan's logic LD provides a framework for considering implication relations between sentences containing indexicals. The logic is extremely complex and is presented in full in Kaplan (1989a), and so I confine myself here to its most salient features. We begin by helping ourselves to a first-order language (including the quantifiers ∃ and ∀) with two kinds of variables (making it a two-sorted logic in the sense of Burgess (2005)). One kind of variable, vᵢ, is to be thought of as ranging over individuals, as is normal, and the other kind, pᵢ, over places. The arity of predicates and functors is given by a pair of numbers, the first member of which tells us the number of individual-variables, and the second the number of place-variables, required to form a formula.
For example, Kaplan's logical predicates Exists and Located have arities of (1,0) and (1,1) respectively. Names may be thought of as zero-place functors. We also introduce the expressions □, ♦, F and P, the former two to be our modal operators, and the latter two our tense operators. Finally, we introduce some more unusual expressions which are intended to be context-sensitive: the singular terms I and Here and the operators A ('actually') and N ('now'). The structures which we will use for our model theory contain two domains of quantification, a set of individuals U and a set of places P. They will also include a set of possible worlds, a set of times, a set of contexts, and an interpretation function which assigns appropriate extensions to the non-logical expressions in the language. The upshot is that a structure for LD is an ordered sextuple ⟨C, W, U, P, T, I⟩. Within the formal system, contexts are taken to be ordered quadruples of an agent a (taken from the set of individuals), a place p (taken from the set of places), a possible world w and a time t. The interpretation function assigns extensions to the non-logical expressions relative to a time and a possible world. Such pairs of a time and a possible world play the intuitive role of circumstances of evaluation. The expressions I and Here, on the other hand, are assigned extensions only relative to a context. The relevant recursive clauses for assigning denotations to them are:
|I|_{c,f,t,w} = a_c (i.e. the denotation of I relative to a context c, assignment f, time t, and possible world w is the agent of the context c).¹
|Here|_{c,f,t,w} = p_c (i.e. the denotation of Here relative to a context c, assignment f, time t, and possible world w is the place of the context c).
The relevant clauses for A and N are:

⊨_{c,f,t,w} Nφ iff ⊨_{c,f,c_T,w} φ (this entails that Nφ is true relative to a context c only if φ is true at the time of the context).
⊨_{c,f,t,w} Aφ iff ⊨_{c,f,t,c_W} φ (this entails that Aφ is true relative to a context c only if φ is true at the world of the context).

One final restriction on structures is necessary in order that translations of sentences like the infamous 'I am here now' come out as theorems: we require that agents are located at the place of the context, in the world of the context, at the time of the context. If R is a non-logical predicate of arity (1,0), then the sentence R(I) will be true with respect to a context c and a circumstance of evaluation (w, t) in a structure M just in case a_c ∈ I_R(w, t). Though we normally think of indexicals as having extension with respect to contexts of utterance and circumstances of evaluation, when it comes to evaluating the truth of a sentence containing one we need only mention the context, since the relevant time and world will be provided by the context. Hence we can abbreviate the above slightly to: if R is a non-logical predicate of arity (1,0), then the sentence R(I) will be true with respect to a context c in a structure M just in case a_c ∈ I_R(w_c, t_c). Logical truth in LD is then defined as truth in all contexts in all structures. Kaplan does not define logical consequence explicitly, but I will take it to be defined as follows:

Definition 9 (Logical Consequence in LD) A closed formula A is a logical consequence of a set of sentences Γ iff whenever every member of Γ is true with respect to a context c in a structure M, A is also true with respect to c in M.

4.3 Defining indexicality and constancy

We wish to exploit the idea that context-sensitive sentences are fragile with respect to change in context, whilst constant sentences are preserved with respect to the same changes. Suppose our binary relation on structure-context pairs is that of context-shift.
Definition 10 (Context-Shift) A structure-context pair (M′, c′) is a context-shift of a structure-context pair (M, c) iff M′ = M.

This definition, by insisting that M′ = M, requires (i) that the interpretation of all the non-logical expressions remains the same, (ii) that the domains of places and individuals remain the same size (hence a sentence like ∀v Fv, where F is a non-logical

¹ Since it is only sentences, i.e., closed formulae, in which we are interested, and since such formulae are either true on all assignments or true on none, we will generally suppress mention of the assignment when discussing truth in a context.
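A minimal numerical illustration of Definition 10 (the structure, interpretation and numbers are invented for the example): fixing a single structure and letting only the context vary, the truth-value of F I can change from context to context; and since the world is itself an element of the context, even Fa can change across context-shifts.

```python
from itertools import product

AGENTS, PLACES, TIMES, WORLDS = (0, 1), ('top', 'cold'), (0, 1), (0, 1)
CONTEXTS = list(product(AGENTS, PLACES, TIMES, WORLDS))  # quadruples (a, p, t, w)

def satisfies_F(agent, w, t):
    """Toy interpretation of a (1,0)-ary predicate F: agent w is F in world w."""
    return agent == w

def FI(c):
    """Truth of F(I) relative to a context: the agent of c is F at (w_c, t_c)."""
    a, p, t, w = c
    return satisfies_F(a, w, t)

def Fa(c):
    """Truth of Fa relative to a context, where 'a' names agent 0."""
    a, p, t, w = c
    return satisfies_F(0, w, t)

# Within a single structure, shifting the context can flip F I ...
assert any(FI(c) for c in CONTEXTS) and any(not FI(c) for c in CONTEXTS)
# ... and, because the world w is an element of the context, a
# context-shift can flip even Fa.
assert any(Fa(c) for c in CONTEXTS) and any(not Fa(c) for c in CONTEXTS)
```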
unary predicate, will not be affected this way), and (iii) that the interpretation of the tense and modal operators is unaffected (we are not suddenly adding new possible worlds, for example, which might affect the truth of □Fa). However, since it does allow the context to shift, at least prima facie one might think we can use it to define constant and context-sensitive sentences as follows:

Definition 11 (Constant Sentence (1st attempt)) A sentence A is constant iff A is context-shift-preserved.

Definition 12 (Context-Sensitive Sentence (1st attempt)) A sentence A is indexical iff A is context-shift-fragile.

Theorem 4 (Context-Sensitivity Barrier Theorem (1st attempt)) If (a) X is satisfied in some context c in a structure M; (b) A is context-shift-fragile; and (c) each element of X is context-shift-preserved, then X ⊭ A.

Here we hit three serious problems.

The problem with universally satisfied predicates

Our thesis is provable, but when we formulate barrier theses according to this model-theoretic method the main danger is not so much that the thesis will be untrue, but that it will not say what we want it to say. We need to be sure that the sentences which are defined as context-sensitive or constant above are the kind of sentences that we might have some intuitive inclination to call context-sensitive, or constant. One problem with the definitions is that many intuitively indexical sentences, including, for example, F I, do not fall under the strict definition of an indexical sentence. We can see this by considering what happens in a structure where F is satisfied by every element in the domain of individuals, at every time and in every possible world. In such a structure the sentence F I will be true regardless of the context, making it fail to count as indexical on Definition 12. Perhaps even more seriously, thinking about universally satisfied predicates presents a problem for the intuitive indexical barrier thesis which we are attempting to formalise.
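The point can be made concrete (the domain and names are mine): when F is universally satisfied, F I is true at every context of the structure, so no context-shift falsifies it there; yet the agent of any context is drawn from the domain, so ∀v Fv never holds at a context without F I holding too.

```python
DOM = frozenset({0, 1, 2})   # the domain of individuals

def FI(F, agent):
    """F I is true at a context just in case the agent of that context is F."""
    return agent in F

def all_F(F, agent):
    """∀v Fv does not depend on the agent at all."""
    return F == DOM

# Where F is universally satisfied, F I is true at every context, so no
# context-shift within the structure can falsify it: fragility fails.
assert all(FI(DOM, a) for a in DOM)

# Yet the argument from ∀v Fv to F I is valid: the agent of any context
# is drawn from the domain of individuals.
subsets = [frozenset(), frozenset({0}), frozenset({0, 1}), DOM]
assert all(FI(F, a) for F in subsets for a in DOM if all_F(F, a))
```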
Suppose that one of the gods in the Lewis thought experiment knows that the universal sentence 'all gods are powerful' is true. Presumably he would not be able to deduce that the sentence 'I am powerful' is true (relative to his context), since this would require knowledge that 'I am a god' is true. But if he knew the truth of a sentence that predicated something of every individual, perhaps 'everything is extended', it seems clear that he could deduce 'I am extended'. In our formal language the argument would be:

∀v Fv
∴ F I

This argument is both intuitively valid and valid in LD; whenever ∀v Fv is true relative to a context, F I will be true relative to that context too, because the agent of the context is always taken from the domain of individuals. Together these problems suggest that universally satisfied predicates present us with a dilemma: either we keep the definition we have, and maintain that F I should
not be classified as context-sensitive, or we modify the definition so that it is so classified. But if we did the former, what claim would we have to have captured anything like the intuition behind the informal indexical barrier thesis? And if the latter, the arguments above suggest that the barrier thesis so formulated would be false.

The contingency problem

A second problem with the definitions above is that some sentences which we might intuitively have expected to be paradigm cases of constant sentences are not preserved over context-shifts. These include all contingent sentences, such as Fa (in which a is a 0-place non-logical functor, i.e. a name). This happens because in Kaplan's formal system the time and the possible world are elements of the context. If we are allowed to change the possible world every time we change the context, then we will be able to change the truth-value of contingent sentences simply by context-shifting. Similarly, since sentences can take different truth-values with respect to different times in LD, and time is also an element of the context, a sentence whose truth-value changes over time will also fail to count as constant. One (not terribly attractive) option here might simply be to bite the bullet and hold that such sentences are not really constant. There is some prestigious precedent for this, such as in the writings of David Lewis:

"When truth-in-English depends on matters of fact, that is called contingency. When it depends on features of the context, that is called indexicality. But need we distinguish? Some contingent facts are facts about context, but are there any that are not? Every context is located not only in physical space, but in logical space. It is at some particular possible world—our world if it is an actual context, another world if it is a merely possible context. . . . it is a feature of any context, actual or otherwise, that its world is one where matters of contingent fact are a certain way.
Just as truth-in-English may depend on the time of the context, or the speaker, or the standards of precision, or the salience relations, so likewise may it depend on the world of the context. Contingency is a kind of indexicality" (Lewis 1980, pp. 24–35).

John MacFarlane has also recently distinguished two kinds of context-sensitivity, which he calls "non-indexical context-sensitivity" and "indexical context-sensitivity". Using MacFarlane's terminology, an expression is non-indexically context-sensitive if its extension varies with context, and indexically context-sensitive if its content varies with context² (MacFarlane 2009). On MacFarlane's broader, non-indexical definition, every contingent sentence exhibits a kind of context-sensitivity; perhaps this might motivate our removing it from the class of sentences we call 'constant'. It seems that the notion of context-sensitivity that we have defined above is closer to being a version of the broader, non-indexical kind of context-sensitivity; after all,

² Note that this is not the same as the distinction between the utterance-sensitivity and the assessment-sensitivity of MacFarlane-style relativism: both relativism and the more traditional kinds of context-sensitivity can be subdivided into indexical/non-indexical varieties.
it is variation of truth-value (the extension of the sentence) which our definition tracks. But prestigious precedent or not, this bullet should not be bitten if we are interested in the philosophical implications of an indexical barrier theorem. It is clear that this is not the barrier thesis we had in mind when we considered thought experiments like the two gods case. We said the gods know the truth-values of sentences like the vengeful god lives on the taller mountain, counting such sentences as constant, and so it seems clear that it was a narrower sense of 'context-sensitivity' that we had in mind: something closer to MacFarlane's indexical context-sensitivity. Moreover, this is not a barrier thesis that will give us any substantial help in classifying controversial sentences such as 'Mary knows that the bank is open' and 'The card is blue.' No-one involved in these debates would consider the contingency of these sentences to be proof of their context-sensitivity.

The problem with Actually and Now

One more problem arises from the fact that possible worlds and times are elements of the context of utterance. Intuitively, we would want to test for the narrower, indexical kind of context-sensitivity by varying the context of utterance without varying what the world is like (that is, without varying the circumstance of evaluation). If we could change the truth-value of a sentence that way, then the cause must surely be indexicality; after all, if the content expressed were the same, and the circumstance of evaluation the same, the truth-value would be the same as well. Hence if the truth-value has changed when we varied the context, we know that the content must have changed, i.e. that the sentence is genuinely indexical.
The problem is that since the circumstance of evaluation is determined by the context (that is, we evaluate the truth of the sentence in the context of utterance relative to the possible world and the time of the context), when we change these elements of the context we automatically change the circumstance of evaluation. This does not simply lead to a bad characterisation of indexicality; it is also the source of new counterexamples to the intuitive thesis:

Fa ⊨ N Fa    Fa ⊨ AFa    Fa ⊨ AN Fa

Returning to the two gods thought experiment, we are forced to admit that if one of the gods knows that the sentence the vengeful god lives on the taller mountain is true, he will have no trouble deducing actually, the vengeful god lives on the taller mountain now. But this also hints at a possible non-ad hoc solution to both this problem and the problem of contingency: the barrier only holds for a restricted kind of indexicality, indexicals whose content varies with an element of the context of utterance that is not also a part of the circumstance of evaluation. In Kaplan's formal model, 'I' and 'Here' are of the relevant restricted kind, whereas 'N' and 'A' are not.³

³ In a model in which the truth of propositions varied only with possible world, 'now' might be a singular term referring to the time of the context, instead of an operator, and in that case it would be of the relevant restricted kind. On the other hand, if we took the truth of a proposition to be something that could vary with location, then place might become part of the circumstance of evaluation, and 'Here' could become
There are two ways we might use this insight to modify our definition of context-shift. First, we might allow sentences to be evaluated with respect to circumstances of evaluation which are not determined by the context. On an intuitive level, that might mean determining what proposition is expressed by a sentence relative to one context of utterance, and then considering whether or not that proposition is true relative to the circumstance of evaluation determined by another. At a formal level, it would mean returning to the idea that a sentence is true relative to a structure, a context, and a time-world pair, since the latter would no longer be given by the context. I shy away from this approach altogether on the grounds that if we discovered that a sentence could be false relative to such a thing, that need not have any bearing on the validity of arguments involving it: logical consequence is defined in terms of structures and contexts (sotto voce: and the circumstances of evaluation they determine), not in terms of structures and contexts and unrelated circumstances of evaluation. A more promising approach is to employ a relation of partial context-shift in our definitions, which allows one to vary only those elements of the context which are not also elements of the circumstance of evaluation. Within Kaplan's formal system that means we are allowed to vary the agent and place, but not the world or time. Both the argument from sentences containing universally satisfied predicates and the arguments which exploit 'actually' and 'now' are arguments from a constant sentence to an indexical one, and they are both intuitively valid and valid in LD. I hold that we should accept these as counterexamples to an unrestricted statement of the barrier thesis, and restrict our thesis accordingly. This we will now do.
4.4 The restricted indexical barrier theorem

Definition 13 (Partial Context-Shift) A context c* = ⟨a*, p*, t*, w*⟩ in a structure M* stands in the partial context-shift relation to a context c = ⟨a, p, t, w⟩ in a structure M iff M* = M, t* = t and w* = w (i.e. the structures remain identical and contexts are allowed to shift only in their agent and place elements).

Definition 14 (Constant Sentences) A sentence A is constant iff whenever (M, c) ⊨ A, and (M*, c*) is a partial context-shift of (M, c), (M*, c*) ⊨ A.

The idea here is that a constant sentence is one such that changing the agent and the place will never affect the truth-value.

Definition 15 (Indexical Sentences) A sentence A is indexical iff there is some structure-context pair (M, c) and some partial context-shift (M*, c*) of (M, c) such that (M, c) ⊨ A but (M*, c*) ⊭ A.

Footnote 3 continued: an operator. And if we took the even more radical view that propositions got their truth-values relative to speakers, we would be able to treat 'I' (or, more plausibly, 'for me') as an operator and include the agent in the circumstances of evaluation, and in such a logic there are two gods could entail there are two gods for me.
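Definitions 13–15 can be checked mechanically over a toy structure. In the Python sketch below (the structure and interpretation are invented for the example), partial context-shifts vary only the agent and place; Fa and N Fa come out constant, and F I comes out indexical.

```python
from itertools import product

AGENTS, PLACES, TIMES, WORLDS = (0, 1), ('m1', 'm2'), (0, 1), (0, 1)
CONTEXTS = list(product(AGENTS, PLACES, TIMES, WORLDS))  # quadruples (a, p, t, w)

# Toy interpretation of a (1,0)-ary predicate F: at (w, t), agent w is F.
I_F = {(w, t): frozenset({w}) for w in WORLDS for t in TIMES}

def Fa(c):  a, p, t, w = c; return 0 in I_F[(w, t)]   # 'a' names agent 0
def FI(c):  a, p, t, w = c; return a in I_F[(w, t)]
def NFa(c): a, p, t, w = c; return Fa((a, p, t, w))   # N sends evaluation to t_c = t

def partial_shifts(c):
    """Definition 13: vary agent and place only; keep time and world fixed."""
    a, p, t, w = c
    return [(a2, p2, t, w) for a2 in AGENTS for p2 in PLACES]

def constant(s):
    """Definition 14: truth survives every partial context-shift."""
    return all(s(c2) for c in CONTEXTS if s(c) for c2 in partial_shifts(c))

def indexical(s):
    """Definition 15: some partial context-shift changes the truth-value."""
    return any(s(c) and not s(c2) for c in CONTEXTS for c2 in partial_shifts(c))

assert constant(Fa) and not indexical(Fa)
assert indexical(FI) and not constant(FI)
assert constant(NFa) and not indexical(NFa)
```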
The idea here is that an indexical sentence is one such that changing the context alone is sometimes sufficient to change the truth-value. Now let A(v/α) be the result of replacing all occurrences of the indexical α in A with the variable v.

Definition 16 (Complete Indexical Generalisation) An indexical generalisation of a sentence A with respect to an indexical term α is a sentence ∀ξ(A(ξ/α)), where ξ does not already occur in A. For example, ∀v(Fv ∧ G Here) is an indexical generalisation of F I ∧ G Here with respect to 'I'. A complete indexical generalisation of A is the result of repeating this process until there are no more indexicals in the sentence; e.g. ∀p∀v(Fv ∧ Gp) is a complete indexical generalisation of F I ∧ G Here.⁴

We can now formulate and prove our indexical barrier theorem:

Theorem 5 (Restricted Indexical Barrier Theorem) No consistent set of constant sentences X entails an indexical sentence A unless X also entails all of A's complete indexical generalisations.

Proof Suppose X ⊨ A and let A′ be a complete indexical generalisation of A. We show that X ⊨ A′. Number the indexicals in A from left to right: α₁, . . . , αₙ. Note that A′ will be (or will be equivalent to) the last in a finite sequence of formulas A, A₁, . . . , Aₙ such that Aⱼ is ∀ξⱼ Aⱼ₋₁(ξⱼ/αⱼ).

Induction hypothesis: for all m < j, X ⊨ Aₘ.

Induction step: we show that X ⊨ Aⱼ. Let M, c be an arbitrary structure-context pair, and suppose M, c makes every member of X true. Each member of X is constant, and hence for every c′ ∈ C (where C is the set of contexts in the structure M), M, c′ makes every member of X true as well. Since by the induction hypothesis X ⊨ Aⱼ₋₁, it follows that for all c′ ∈ C, Aⱼ₋₁ is true at M, c′. Now suppose there were some assignment f of objects to variables with respect to which Aⱼ₋₁(ξⱼ/αⱼ) is false at some M, c*.
Then with respect to a context c* which has f(ξj) as its first member (second member, if αj is a p-term instead of an i-term), Aj−1 would be false at (M, c*). But this contradicts what we have already found. Hence there is no assignment which makes the open formula false. It follows that ∀ξj(Aj−1(ξj/αj)), that is, Aj, is true with respect to (M, c). Hence X ⊨ Aj. It follows by complete induction that if X ⊨ A, then X ⊨ A′.

4.5 Some remarks on the theorem

Remark 1 The Restricted Indexical Barrier Theorem differs slightly from the other barrier theorems in its approach to the definitions of the premise and conclusion classes. Instead of looking at whether the truth of a sentence is always preserved over changes, the definitions of constant and indexical sentences look at whether its truth-value is preserved over changes. To better understand the consequences of this difference, we can look at some examples of sentences that are classified as either constant or indexical

4 Given the formation rules for LD, specifying that ∀ξ(A(ξ/α)) has to be a sentence ensures that ξ is of the correct term-type (i.e. position or individual) for that argument place in the predicate.
by the definitions, and then some arguments which are or are not ruled out by the theorem. Some sentences which are classified as indexical include: FI, GHere, RI ∧ Fa, FI ∨ HI. Sentences which are classified as constant include: Fa, ¬Fa, Fa ∧ Fb, Fa ∨ Fb, Fa → Fb, ∃vFv and ∀vFv, ∃pSp, Ra → Rb and Ra ∨ Rb, as well as some sentences containing indexical operators which operate on aspects of the context which are also aspects of the circumstances of evaluation, such as AFa and NFa. Theorems of the logic, including those containing indexical expressions, are also classified as constant, with the consequence that the (obvious translation of the) famous sentence I am here now is classified as constant. For readers to whom this is a concern, let me make three points. First, as we saw in Sect. 3, it is an artefact of this method of establishing barrier theses that theorems of the logic are counted in the premise class. Second, this is entirely in keeping with the intuitive motivations for the theorem: can either of the gods establish I am here now from his premise class? Of course: they both can. Theorems are not among the sentences which they cannot deduce. And third, I note that things simply have to be this way so long as we are dealing in standard logics: theorems are consequences of any set of premises you like, and hence any attempt to reclassify them as falling in the conclusion class would simultaneously break the barrier theorem. Unlike in the particular/universal cases, however, the disjunction of a constant sentence with an indexical one is not classified as neither constant nor indexical, but rather as indexical. Consider Fa ∨ FI. It is sometimes possible to change the truth-value of this sentence by changing the context (i.e. by switching between contexts where the agent is F and those where the agent is not F). Hence the disjunction counts as indexical.
Moreover, unsatisfiable sentences are no longer classified as both indexical and constant. Since such a sentence never changes its truth-value, the definitions classify it as constant.

Remark 2 Since the disjunction of an indexical sentence with a constant sentence is now being classified as indexical, we should return to Prior's argument to see what has become of it. The most obvious worry is that Prior has (per impossibile) a counterexample to our theorem, with the argument:

Fa ∴ Fa ∨ FI

This is indeed a valid argument, and our definitions classify the premise as constant and the conclusion as indexical. However, it is not a counterexample to the restricted indexical barrier theorem, because the premise entails the complete indexical generalisation of the conclusion sentence:

Fa ∴ ∀v(Fa ∨ Fv)

Remark 3 Some barriers appear to 'go both ways', i.e. just as it is impossible to derive claims genuinely about the future from claims genuinely about the past, so it is impossible to derive claims genuinely about the past from claims genuinely about the future. Other barriers are uni-directional; one cannot deduce universal claims from particular claims, but one can certainly deduce particular claims from universal ones. With the original barrier theses, whether or not the barrier was uni-directional seemed to depend on features of the binary 'R' relation used to define preservation and fragility. If that R-relation was symmetric (as in the future-switching case), then the barrier established would go both ways. If it was not symmetric (as in the extension case) the barrier would be uni-directional. On our new approach to the indexical barrier theorem, the relation remains symmetric, but the style of our definitions has changed and the statement of the theorem has been complicated with the clause 'unless X also entails all of A's complete indexical generalisations'. So one might wonder: is there a barrier to deriving constant claims from indexical ones? There are certainly valid arguments from indexical premises to constant conclusions, as in:

FI ∴ ∃vFv
FI ∴ ∃v(Fv ∨ Ha)
Located(I, Here) ∴ ∃v∃p Located(v, p)
Hence any such thesis could not be completely unrestricted. However, we can establish at least the following Reverse Restricted Indexical Barrier Theorem:

Theorem 6 (Reverse Restricted Indexical Barrier Theorem) No indexical sentence B entails a constant sentence A unless A is entailed by each of B's complete indexical existential generalisations.

Proof Let B be an indexical sentence and A a constant one. Any complete indexical existential generalisation of B will be (or at least be equivalent to) the last in a finite sequence of sentences B, B1, …, Bn formed by, at each stage, replacing the next indexical αj in the sentence with a variable ξj, and prefixing the sentence with ∃ξj (where this takes the entire formula as its scope).

Induction Hypothesis: for all Bi where i < j, Bi ⊨ A.

Induction Step: We show that Bj ⊨ A. Suppose Bj is true at some structure-context pair (M, c). If Bj is B, we already know that Bj ⊨ A. Otherwise Bj is ∃ξj(Bj−1(ξj/αj)). Since this is true, we know there is some assignment f of objects to variables such that Bj−1(ξj/αj) is true relative to (M, c). Let c* be a context which takes f(ξj) as its first member (or second member, if αj was a p-term). Then Bj−1 is true with respect to (M, c*). By the induction hypothesis Bj−1 ⊨ A, so A is true at (M, c*) as well. But A is constant, meaning that if it is true at (M, c*) then it is also true at (M, c). Hence Bj ⊨ A. It follows by complete induction that Bn ⊨ A.

Remark 4 Finally, I note that the restricted indexical barrier theorem is not quite as simple and elegant as the intuitive thesis originally entertained, namely that no indexical sentence can be derived from a non-indexical sentence. But I would urge two points in its favour: it is true, and we have a proof.
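Both Remark 2 and the Theorem 6 direction can be spot-checked in a toy semantics (our own illustrative sketch, not the paper's formal system: it quantifies over all extensions of a single predicate F in place of full structures of LD, and identifies contexts with agents):

```python
from itertools import chain, combinations

agents = ["a", "god1", "god2"]   # 'a' doubles as the referent of the name a
# Every possible extension of F plays the role of a structure.
structures = [set(s) for s in chain.from_iterable(
    combinations(agents, r) for r in range(len(agents) + 1))]

# Sentences as functions of a structure (extension of F) and a context.
Fa     = lambda F, c: "a" in F
FI     = lambda F, c: c in F
disj   = lambda F, c: Fa(F, c) or FI(F, c)                     # Fa ∨ FI
gen    = lambda F, c: all(Fa(F, c) or v in F for v in agents)  # ∀v(Fa ∨ Fv)
exists = lambda F, c: any(v in F for v in agents)              # ∃vFv

def entails(p, q):
    # Entailment: q is true at every structure-context pair where p is.
    return all(q(F, c) for F in structures for c in agents if p(F, c))

# Remark 2: Prior's argument is valid, and the premise also entails the
# complete indexical generalisation of the conclusion.
assert entails(Fa, disj) and entails(Fa, gen)
# Theorem 6 direction: the indexical FI entails the constant ∃vFv, which
# is FI's complete indexical existential generalisation.
assert entails(FI, exists)
# Sanity check that entailment is not trivial in this toy model:
assert not entails(disj, Fa)
```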
Acknowledgements I would like to thank the following people for helpful discussion of this material: Alexei Angelides, David Braun, Lara Buchak, Alexis Burgess, Fabrizio Cariani, Mark Crimmins, Paul Dekker, Catarina Dutilh, Kevin Edwards, Mylan Engel, Branden Fitelson, André Gallois, Mark Heller, Claire Horisk, Paul Hovda, Deke Gould, Jeroen Groenendijk, Stephan Hartmann, Tomis Kapitan, Krista Lawlor, John MacFarlane, Matt McGrath, Andrew Melnyk, Reinhard Muskens, Stephen Neale, Eric Pacuit, Sarah Paul, Geoff Pynn, Greg Restall, Phillip Robbins, Debra Satz, Ori Simchen, Jan Sprenger,
Martin Stokhof, and Frank Veltman, as well as the other members of the audiences of talks given at UC Berkeley, Stanford University, Syracuse University, the University of Missouri at Columbia, Northern Illinois University, The University of Amsterdam, TiLPS and INPC13. My thanks also to two anonymous referees who provided valuable comments and suggestions. This research was supported by a fellowship from Tilburg Center for Logic and Philosophy of Science (TiLPS). All diagrams were produced using XY-pic.
References

Burgess, J. (2005). Fixing Frege. Princeton, NJ: Princeton University Press.
Castañeda, H.-N. (1968). On the logic of attributions of self-knowledge to others. Journal of Philosophy, 65(15).
Hodges, W. (1997). A shorter model theory. Cambridge: Cambridge University Press.
Jackson, F. (1971). Defining the autonomy of ethics. The Philosophical Review, 83, 88–96.
Kaplan, D. (1989a). Afterthoughts. In J. Almog, J. Perry, & H. Wettstein (Eds.), Themes from Kaplan. New York: Oxford University Press.
Kaplan, D. (1989b). Demonstratives: An essay on the semantics, logic, metaphysics, and epistemology of demonstratives. In J. Almog, J. Perry, & H. Wettstein (Eds.), Themes from Kaplan. New York: Oxford University Press.
Lewis, D. (1979). Attitudes de dicto and de se. The Philosophical Review, 88, 513–543.
Lewis, D. (1997/1980). Index, context and content. In Papers in philosophical logic, Cambridge studies in philosophy (Chap. 2, pp. 21–44). Cambridge: Cambridge University Press.
MacFarlane, J. (2009). Non-indexical contextualism. Synthese, 166, 231–250.
Perry, J. (1977). Frege on demonstratives. The Philosophical Review, 86(4), 474–497.
Perry, J. (1979). The problem of the essential indexical. Noûs, 13.
Perry, J. (1988). The problem of the essential indexical. In N. Salmon & S. Soames (Eds.), Propositions and attitudes, Oxford readings in philosophy. Oxford: Oxford University Press.
Prior, A. N. (1960). The autonomy of ethics. The Australasian Journal of Philosophy, 38, 199–206.
Russell, G. (2010). In defence of Hume's law. In C. Pigden (Ed.), Hume, is and ought: New essays. Hampshire: Palgrave Macmillan.
Restall, G., & Russell, G. (2010). Barriers to inference. In C. Pigden (Ed.), Hume, is and ought: New essays. Hampshire: Palgrave Macmillan.
Searle, J. R. (1964). How to derive 'ought' from 'is'. The Philosophical Review, 73, 43–58.
Synthese (2011) 183:161–174 DOI 10.1007/s11229-010-9757-8
Robustness and idealization in models of cognitive labor Ryan Muldoon · Michael Weisberg
Received: 24 April 2009 / Accepted: 17 June 2010 / Published online: 23 July 2010 © Springer Science+Business Media B.V. 2010
Abstract Scientific research is almost always conducted by communities of scientists of varying size and complexity. Such communities are effective, in part, because they divide their cognitive labor: not every scientist works on the same project. Philip Kitcher and Michael Strevens have pioneered efforts to understand this division of cognitive labor by proposing models of how scientists make decisions about which project to work on. For such models to be useful, they must be simple enough for us to understand their dynamics, but faithful enough to reality that we can use them to analyze real scientific communities. To satisfy the first requirement, we must employ idealizations to simplify the model. The second requirement demands that these idealizations not be so extreme that we lose the ability to describe real-world phenomena. This paper investigates the status of the assumptions that Kitcher and Strevens make in their models, by first inquiring whether they are reasonable representations of reality, and then by checking the models' robustness against weakenings of these assumptions. To do this, we first argue against the reality of the assumptions, and then develop a series of agent-based simulations to systematically test their effects on model outcomes. We find that the models are not robust against weakenings of these idealizations. In fact, we find that under certain conditions, this can lead to the model predicting outcomes that are qualitatively opposite of the original model outcomes.
R. Muldoon (B) Joseph L. Rotman Institute of Science and Values, University of Western Ontario, London, ON N6G 2V4, Canada e-mail:
[email protected] M. Weisberg Department of Philosophy, University of Pennsylvania, Philadelphia, PA 19104-6304, USA e-mail:
[email protected]
Keywords Robustness analysis · Agent-based simulation · Division of cognitive labor · Models · Idealization · Constrained maximization

1 Introduction

Scientific research is almost always conducted by communities of scientists of varying size and complexity. Such communities are effective, in part, because they divide their cognitive labor: not every scientist works on the same project. Scientists manage to do this without a central authority allocating them to different projects. Thanks largely to the pioneering studies of Philip Kitcher1 and Michael Strevens,2 understanding this self-organization has become an important area of research in the philosophy of science. One fruitful way to study how scientists divide their cognitive labor is by constructing and analyzing mathematical models of the social structure of science. Such analyses can help us understand what an optimal distribution of cognitive labor would be and how scientists can organize themselves to promote such a distribution. In order to derive these benefits, models of cognitive labor must be simple enough for us to understand their dynamics, but faithful enough to reality that we can use them to analyze real scientific communities. To satisfy the first requirement, we must rely on idealizations to reduce the complexity found in real-world scientific communities. However, the second requirement demands that these idealizations not be so extreme that we lose the ability to explain the actual benefit of divided cognitive labor. These two requirements are not impossible to satisfy simultaneously, but they do impose significant restrictions on model choice. In particular, they suggest that when idealizing assumptions are made for the sake of simplicity, it is incumbent upon the modeler to show that these assumptions are either approximately true or that they themselves do not drive the major results of the model.
Kitcher and Strevens attempt to balance realism and tractability by constructing representative agent models of cognitive labor that employ what we call the marginal contribution/reward (MCR) approach. This approach has been fruitfully employed in economics and seems like a reasonable place to begin studying cognitive labor. However, we will argue that at least two assumptions of the MCR approach are neither approximately true of the scientific community nor are models containing them robust against perturbations of these assumptions. On this basis, we conclude that the MCR approach as developed by Kitcher and Strevens is not fully sufficient for analyzing the division of cognitive labor.

2 The MCR approach and its assumptions

According to Kitcher and Strevens, the problem of optimally distributing cognitive labor is equivalent to a resource allocation problem. There is a certain good (cognitive labor) that can be allocated among different consumers (scientific projects). Each project has a marginal utility curve, called a return function by Kitcher and a success function by Strevens, which represents the ability of the project to productively utilize the cognitive resources of scientists and turn those resources into the possibility of a successful outcome. Such functions take the number of contributing scientists as an input, and output the probability of the project being successfully completed. When the success functions are considered simultaneously and the number of agents is known, the optimal distribution of scientists to projects can be calculated. In economics, the procedure for solving such a problem is called constrained maximization. One of Kitcher's and Strevens's main arguments is that classic epistemic norms for individuals are likely to cause scientists to misallocate themselves across projects. Consider the following very simple example: There are two possible approaches to synthesizing some new chemotherapeutic molecule. The first approach has a high probability of success when a reasonable number of scientists are working on the project. The second approach has a relatively low probability of success, but this probability of success can be realized if a small number of scientists work on the project. If every individual scientist followed a classical epistemic norm such as "take the approach most likely to lead to the truth," no scientist would choose the second approach. Yet the scientific community would be better off if a small but significant number of scientists chose the second project because of the possibility that the first approach would not be successful no matter how many scientists worked on it. The optimum distribution of cognitive labor thus deviates from what classical epistemic agents would choose.

1 See Chapter 8 of Kitcher (1993) and Kitcher (1990).
2 Strevens (2003).
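The two-approach example can be made concrete with a small brute-force constrained maximization sketch. The logistic success-function form, the communal objective (maximize the chance that at least one approach succeeds), and all parameter values here are our own illustrative assumptions, not Kitcher's or Strevens's:

```python
import math

def P(n, K, r):
    # Logistic success function of the kind used later in the paper:
    # ceiling K, 'easiness' r, n scientists at work.
    return K / (1.0 + math.exp(-r * n))

def optimal_split(N, proj1, proj2):
    """Brute-force constrained maximization: allocate N scientists so as
    to maximize the probability that at least one approach succeeds."""
    def communal(n1):
        return 1 - (1 - P(n1, *proj1)) * (1 - P(N - n1, *proj2))
    return max(range(N + 1), key=communal)

# Approach 1 is promising (high ceiling); approach 2 is a low-ceiling
# long shot whose probability a handful of scientists can realize.
n1 = optimal_split(20, (0.9, 0.5), (0.2, 1.0))
print(n1, 20 - n1)  # the optimum assigns a minority to the long shot
```

Under these assumed parameters, the communal optimum hedges its bets by keeping a few scientists on the second approach, whereas the classical norm "take the approach most likely to lead to the truth" would send all twenty to the first.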
Kitcher and Strevens further argue that the optimal division of cognitive labor might be achieved when scientists act according to self-interest instead of epistemic norms. This is accomplished by means of community-level reward schemes that assign money or credit to scientists working on whichever program happens to be successful in the end. For example, in the scheme Strevens calls Marge, scientists are rewarded according to the marginal contribution that they make to their project’s probability of success. If the community followed this reward scheme, individual scientists determine their expected reward by calculating their marginal contribution to the probability of the project’s success and multiplying this by the total reward. When scientists maximize their chance at reward, as opposed to their chance of working on the successful project, then they will more closely approximate the optimal community distribution of workers to projects. Strevens further argues that the Priority reward scheme, where the first scientist to successfully complete the project gets the whole reward, pushes the community of scientists to a nearly optimal distribution. At the core of MCR models is a procedure by which scientists calculate their expected rewards. Individually, each scientist calculates his or her marginal contribution to the probability of a project’s success and then uses this information to calculate his or her expected reward. MCR models can differ in their reward schemes, their success functions, the maximum probabilities of projects’ success, and so forth, but they all embody this core assumption. To instantiate the MCR approach in specific models, a number of further idealizing assumptions are made by Kitcher and Strevens. They first assume that scientists are
utility-maximizers, responding rationally to the established incentive system. Second, they assume that the division of cognitive labor can be represented as a choice amongst a number of pre-defined projects. Third, they assume that every scientist knows the distribution of cognitive labor before she chooses what project to work on.3 We call this the distribution assumption. Finally, each project has a success function, which takes as input units of cognitive labor (work from scientists) and outputs objective probabilities of success. The success function assumption is the assumption that these functions are known by all of the scientists in the model. In the remainder of this paper we will show that when either the distribution assumption or the success function assumption is relaxed, MCR models misallocate scientists to projects.4
3 Methodology of evaluation

Evaluating the distribution and success function assumptions involves constructing more flexible MCR models than the ones offered by Kitcher and Strevens. In their own versions of MCR, Kitcher and Strevens use a representative-agent approach, where the calculations are done from the point of view of a single agent who is assumed to be in exactly the same situation as all the others. In order to evaluate the status of the distribution and success function assumptions, we could not make this assumption because we wanted to study what happens when scientists have different information at their disposal. We therefore adopted an agent-based approach,5 where every scientist is represented explicitly.6 While this approach allows us to relax key assumptions, it comes at the cost of requiring us to study our models by computer simulation rather than symbolic manipulation of closed-form mathematics. Since our models are agent-based, we were required to make a number of assumptions about the nature of individuals and projects that could simply be left abstract in Kitcher's and Strevens' models. The first assumption we had to make was about the exact form of the success functions. Since this term canceled out in Strevens' analysis, he left the form of the function abstract in order to gain greater generality. However, our models require that each scientist-agent actually calculate their marginal contribution to their project's success using a specific function. Following Kitcher, we used the logistic growth equation7 for the success function. Let K be the maximum probability of success, N the number of scientists working on the project, and r the

3 Strevens (2003, pp. 64–65); Kitcher (1990, pp. 10–12, 14).
4 Following Kitcher, Bill Brock and Steve Durlauf developed a similar economic model of the distribution of cognitive labor in (1999). While they abandon "success functions," they do retain the distribution assumption. As a result, our first critique, though not our second, applies to their work as well.
5 A standard discussion of the advantages of agent-based modeling can be found in Thomas Schelling's
classic book Micromotives and Macrobehavior (1978). More recent discussion of these issues can be found in Miller and Page (2007).
6 All of the simulations discussed in this paper were developed in NetLogo 3.1.4. Code for the models, including the parameter sets used in this paper, is available from the authors upon request.
7 This deviates slightly from Strevens' requirement that success functions should have decreasing marginal returns, but nothing hinges on this difference—our conclusions also hold if we only consider the portions of the success functions that have decreasing marginal returns.
easiness of the project. Here "easiness" determines both how much cognitive labor is required to realize the maximum probability of success and the marginal probability that each new agent contributes. Easier projects require fewer cognitive resources. The probability that a project will be successful is calculated with the following function:

P = K / (1 + e^(−rN))
The second assumption that we made concerns the order in which agents make their decisions about which project to work on. Insofar as one can interpret MCR models as saying something about individuals, the procedure assumes that there is only one individual left to make a choice, and that everyone else has already allocated their labor appropriately. In our models, however, every agent has to decide which project to work on. This happens sequentially, but in a randomized order each time the simulation is run. We made these first two assumptions in order to reinterpret MCR models into an agent-based framework, but a third assumption of our models introduces structure not found in Kitcher's and Strevens' models. In our models, agents are distributed in random locations on a torus of 35 × 35 units. This spatial dimension corresponds to communication distance, not physical proximity. We further defined a radius of vision within which agents can "see" the project choices of other agents. Agents inside the radius of vision of one another are within communication distance. When the radius of vision is greater than or equal to √578 units, the spatial structure is superfluous and all agents can see and communicate with one another, making our model essentially equivalent to MCR. Adjusting the radius of vision is the main tool by which we can relax the distribution assumption. Our models are initialized with a specified number of agents randomly assigned to a project. In random order, each agent first determines what the other agents in its radius of vision are working on. It then calculates its marginal contribution to its current project as well as its potential marginal contribution to the other projects.
On the basis of the payoff function, each agent then calculates its expected reward for each project and then chooses the project that maximizes its expected reward.8 Since each agent follows this procedure in sequence, we repeat the procedure ten times to ensure that the community of agents finds the equilibrium distribution.9 When the radius is maximized and every agent has the same success function, we were able to replicate Kitcher and Strevens's results. For example, we examined how our model compares to Strevens' model when the agents chose between two projects that differed only in degree of difficulty (values for r). Using the Marge payoff function, we observed the following: With small numbers of agents, all chose to work on the easier project. As the number of agents was increased, an incentive was created for a minority of scientists to work on the harder project. When the number of agents was increased further, scientists allocated themselves to both projects, and eventually the number of scientists working on the harder project overtook the number working on the easier project. These are qualitatively10 the same results one gets with Strevens' constrained maximization model, suggesting that we have successfully interpreted it without distortion. We next report what happens when the distribution and success function assumptions are relaxed.

8 The expected payoff of working on a given project J with a total payoff π is EU_J = Pr(J(n + 1)) × π(J(n + 1)). An agent chooses which project to work on by taking the maximum over all EU_J. For any social utility calculations, we assume that the payoffs combine additively.
9 This is an artifact of applying MCR to a concrete population of individuals: ordering effects that are not relevant or possible at a high level of abstraction become possible with discrete decisions in a finite population. MCR models consider the decisions of the marginal agent, so the rest of the population has already made decisions. Several rounds are needed to ensure that each agent can be treated as the marginal agent. Without each agent performing this calculation, there is no way to see what the overall distributional effects of a given incentive system would be.
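The per-agent choice rule can be sketched as follows (a hedged reconstruction: the original NetLogo code is available from the authors, and the function names and parameter values here are our own):

```python
import math

def P(n, K, r):
    # Logistic success function from Sect. 3.
    return K / (1.0 + math.exp(-r * n))

def marge_reward(n_others, K, r, payoff):
    # Marge scheme: an agent's expected reward is its marginal
    # contribution to the project's probability of success times the
    # total payoff (cf. footnote 8).
    return (P(n_others + 1, K, r) - P(n_others, K, r)) * payoff

def choose_project(visible_counts, params, payoff=1.0):
    # Pick the project with the highest expected reward, given the
    # counts of scientists the agent can see on each project.
    return max(visible_counts,
               key=lambda j: marge_reward(visible_counts[j], *params[j], payoff))

# Two projects with equal ceilings and payoffs; project 2 is harder.
params = {1: (0.9, 0.5), 2: (0.9, 0.05)}
print(choose_project({1: 50, 2: 5}, params))  # -> 2: the crowded easy
                                              # project is near its ceiling,
                                              # so its marginal reward ~ 0
```

Iterating this rule over every agent, in random order and for several rounds, is what drives the simulation toward the equilibrium distribution described above.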
4 The distribution assumption

The distribution assumption is the assumption that all the agents know the distribution of cognitive labor at all times. Because it seems extremely implausible that every scientist knows the full distribution of cognitive labor across all of science, we interpret this as the claim that all scientists in a particular research domain know what all the other scientists are doing in that domain. The projects in MCR models thus correspond to a single domain. In some very small sub-fields, the distribution assumption might be approximately true, although in general we believe that it is an idealization.11 It is unlikely that scientists know what all of the others in their subfield are working on because there are too many scientists, too many research programs, and too much physical distance between scientists to maintain the level of communication necessary for this knowledge. Scientists are much more likely to be informed of the work being carried out by colleagues in their own laboratories, colleagues in close proximity, and those with whom they have pre-established relationships.12 Prima facie, this creates a problem because accurate information about how many scientists are working on each project is crucial for calculating the payoff of working on a project in the MCR scheme. To assess the extent of the problem, we report on a set of simulations where we systematically reduce the distribution information available to scientists. In order to reduce the information each scientist possesses about the distribution of cognitive labor, we reran the Marge simulations described above, but systematically varied each agent's radius of vision from a full radius of vision (√578, or approximately 24.04 units) down to a radius of one unit, allowing them to "see" only those agents to which they are adjacent. Scientists become unaware of the work of those

10 Qualitative match is the most we could aim for, as Strevens does not offer particular quantitative results to compare against. However, our results match the behavior he describes.
11 That is not to say that there are never instances in which this idealization holds true. While we affirm that Kitcher's motivating case of DNA was an instance of an appropriately small sub-field, and further agree that there are other similarly small and tight-knit research communities, we doubt the generality of these cases.
12 For some empirical support of these claims, Mark Newman has done extensive study of scientific research networks. For example, see Newman (2001).
Fig. 1 Allocation of agents as a function of radius
agents outside of their radius of vision, and base their decision only on what they can see. To keep the analysis simple, we studied a 2-project model and fixed the number of total scientist-agents at 500. We kept the maximum probability of success and the total payoff equal for the two projects, but we made Project 2 harder than Project 1, meaning that more scientists are required to realize Project 2's probability of success. Figure 1 summarizes the results of this study. As the radius of vision becomes smaller, meaning that each agent lacks information about what some of its colleagues are working on, the distribution of cognitive labor begins to deviate from the optimal distribution, with increasing numbers of scientists working on the easier project. These scientists have limited knowledge of the wider population's choices for projects, and are less able to make informed choices themselves. We observe a qualitative shift in the distribution of scientists when the radius is 12 units or smaller. At this point, a majority of scientists begin working on the easier project, which is a severe misallocation of cognitive labor given the total number of scientists that are available. As the radius of vision drops below 7 units, not a single scientist works on the harder project. Subsequent simulations reveal that these qualitative changes occur under many different parameter sets, but the radius at which the qualitative shifts in distribution are observed depends on the population density. As the population of scientists is more densely packed on the torus, small samples become more representative of the whole population and it takes considerably smaller radii to effect qualitative shifts.13 These simulations show that the first epistemic assumption—perfect knowledge of the division of cognitive labor—is crucial for generating the main result of MCR models.
When communication between scientists is reduced and they are not in possession of perfect information about how many scientists are currently working on

13 This is only true so long as the initial distribution of projects to agents is random, with a uniform distribution. In all of our simulations this condition is maintained. If this condition were weakened to any given distribution, the problems for MCR become worse, as localized sampling would be biased. As mentioned earlier, it is unlikely that this idealization holds true in actual populations of scientists, and so the problems would in fact be more severe than presented here.
the projects, self-interested choices do not necessarily lead to an optimum division of cognitive labor. In fact, when this information is limited considerably, all scientists may choose to work on the easier project, or the project with higher payoff. Thus, a major result of MCR models is not robust to changes in the distribution assumption. Before analyzing the mutual knowledge assumption, it is worth considering how MCR models might be altered to accommodate this lack of robustness. One possibility is that each agent could treat the distribution of cognitive labor that it sees as a representative sample of the larger population. This seems like a sensible enhancement of the model because it probably is how scientists actually do try to assess the current division of cognitive labor. Despite moving the model in the direction of greater realism, we do not think this will solve the problem we have discussed above. For one thing, any sample the agent sees is unlikely to be representative because local clusters of scientists such as lab groups and research units are usually composed of people who are working on the same or similar problems. Similarly, scientists tend to talk about research more often with those who have similar interests than with random members of the scientific community. For both of these reasons, any sample the scientist sees is likely to be biased. Perhaps these problems could be overcome and a non-biased sample could be taken, but there is an even more significant problem for the sampling approach. The MCR approach requires that scientists know the actual number of scientists working on each project, not just the proportional distribution. To find this, scientists would have to take their sample and scale it up with reference to the total number of scientists in a research domain. But there seem to be no mechanisms by which an average scientist can determine the size of such a research community.
While membership figures for professional societies or numbers of conference attendees are occasionally reported, these are weak proxies for the actual number of active scientists in a given discipline. So not only is there no reliable way for a scientist to take an unbiased sample of the larger population; even such a sample could not be translated into the actual distribution. Thus, we see little hope for enhancing MCR models by adding more realistic scenarios for the assessment of the current division of cognitive labor.

5 Success function assumption

The second assumption of MCR we will discuss is the success function assumption, which says that every agent knows the true success function for each project. Although Kitcher and Strevens do not explicitly discuss this requirement, it is entailed by the structure of MCR models. Agents make decisions based on calculating the expected utility of joining a given research program, which requires knowing the actual success function for each project.14 Without knowing the precise shape of this function, calculating one’s marginal contribution to the project is impossible. But how plausible is this assumption? Are success functions the kind of things that could be agreed upon by everyone in a community?

14 Strevens (2003, pp. 64–65).
Recall that success functions take the number of scientists working on a project as input, and output the probability that the project will be successful. To some extent, the plausibility of the success function assumption depends on how the probabilities that success functions output are interpreted. We do not believe that these probabilities can be frequencies of success because, presumably, the projects are novel and will only be worked on once. This leaves two possible interpretations of the functions: They tell us the objective intrinsic probabilities of success, or they are subjective assessments about probabilities of success. Much of what Kitcher and Strevens say about success functions suggests they have in mind the first interpretation, where the probabilities reflect the intrinsic chances that the project will succeed.15 A project with a 0.8 probability of success has some intrinsic features that make it very likely to succeed. But we are not sure that one can make sense of this interpretation. Say that the success function tells us that a project’s intrinsic maximum probability of success is 0.25. Clearly this doesn’t mean that if we repeatedly set up scientific communities, one quarter of them would be successful. The project either has faulty assumptions behind it, in which case it will fail, or good assumptions behind it, in which case it will be successful.16 Phlogiston chemists had no chance of ever understanding the real principles of chemical reactivity. The intrinsic probability of finding those principles from phlogiston-based chemistry must be zero, no matter how optimistic chemists were at the time.17 Proponents of DNA as the molecule of heredity were correct, so the intrinsic probability that their research program would succeed must have been one, despite skepticism and alternative views in the scientific community.
Thus, if success functions have probabilities not equal to one or zero, they must be subjective; they are best understood as the bets the scientific community is willing to take. Although we believe that subjective probabilities provide the best way to interpret the outputs of success functions, the use of such probabilities raises a new set of questions. While MCR models assume that success functions are uniform among scientists, if the probabilities really are subjective, this may often not be the case. As the number of scientists in a research community increases, the probability of all their priors agreeing almost certainly will decrease, unless there is some mechanism by which their priors can be coordinated. Since we do not see where such a mechanism could come from, we think a more plausible option is that empirical information learned in the course of scientific research leads to a convergence of posteriors. Empirical information gained from scientific investigation must somehow force scientists to
15 Strevens refers to the “intrinsic potential” of projects repeatedly. See Strevens (2003, pp. 61, 64, 69, 70, 72, 74). Kitcher refers to the objective probabilities of projects, and in his footnote 6 claims that his account is compatible with any common interpretation of probability. See Kitcher (1990, pp. 5, 7, 9–12).

16 Though the case for failure is straightforward, the case for success is a bit more contentious. One could have faulty lab equipment and not know it, for example. But this just points to more difficulties with the original account of intrinsic probabilities: it is not clear whether the probabilities ought to refer to the concept of the experimental design, or an instance of it.

17 Note that the epistemic state of the scientists in the case of phlogiston is quite different from the facts of the matter. The facts dictate that phlogiston had no possibility of success, but their estimates of the probability of success were nonzero.
have nearly identical success functions. So we should ask: Do scientists have access to enough information about the projects they are choosing to work on to cause their posteriors to converge? We doubt that this could happen very often. In the most extreme cases, like Kuhnian revolutionary science, all the concepts and approaches are novel, so there is little information that could be used to construct success functions. In more everyday cases, success functions might be constructed by examining the successes and failures of past research, but it seems to us that this only gives qualitative support for constructing success functions. It is compatible with scientists holding quantitatively different success functions, even while all agreeing, for example, that Project 1 has a higher probability of success than Project 2. It seems clear, then, that the success function assumption is another idealization. But do the main results of the MCR approach depend crucially on this assumption? We set out to test the status of this assumption by systematically weakening it in our models. Once again, we began with a two-project, agent-based version of Strevens’ Marge model. However, instead of giving the agents uniform beliefs about the success functions for the two projects, we let these functions vary among agents. To keep the analysis simple, every agent knew that the success function was logistic and knew the function’s true value for K. However, we allowed the agents to have different beliefs about the value of r, the easiness parameter, but fixed the mean of these beliefs to the “true” values: 0.02 for Project 1 and 0.03 for Project 2. When the simulation initialized, each agent was assigned a value for r that was drawn from a normal distribution whose mean was fixed to the true value. By changing the variance of the distribution, we could increase or decrease the degree of uniformity among agents’ beliefs.
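The belief-assignment step just described can be sketched like this (a minimal reconstruction; the authors publish no code, and the function name and seeding are our choices):

```python
import random

def assign_beliefs(n_agents, true_r, variance, seed=0):
    # Each agent's believed value of the easiness parameter r is drawn
    # from a normal distribution centered on the true value.  Variance 0
    # recovers the original models, in which all beliefs are identical.
    rng = random.Random(seed)
    sd = variance ** 0.5
    return [rng.gauss(true_r, sd) for _ in range(n_agents)]
```

Increasing the variance argument reproduces the progressively less uniform belief distributions tested below.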
In our simulations, we began with populations of 500 agents that drew their values for r out of a normal distribution with zero variance. As predicted, this initial simulation yielded identical results to the ones generated by Kitcher’s and Strevens’ models: Agents correctly allocated themselves to the two projects. We then increased the variance in the distribution of r values and observed misallocation in the direction of the easier project. This was done several additional times with increasing values for the variance. With the smallest variance in r that we tested (2.5E−5), 1% of agents are misallocated when compared to the correct distribution. When the variance is increased to 1.0E−4, this grows to a misallocation of 2.5% of the agents. With the much larger variance of 4E−4, we found that 7% of the agents are misallocated. The percentages of misallocated agents are fairly small, but they show that considerable misallocation occurs even when the distribution of beliefs is symmetric and the success functions are similar, but not identical. A more dramatic result occurs when we increase the relative difficulty of one of the projects. By changing Project 2’s true value of r to 0.06, we make the project considerably easier than it was originally, requiring almost 14% fewer agents to optimally maximize its probability of success. Looking again at our diagnostic variance levels, with variance 2.5E−5 the agents misallocate by more than 2%. At a variance of 1.0E−4, this increases to more than 4%, and at a variance of 4E−4, there is a nearly 12% misallocation of agents to projects. In every case with 500 scientists, the direction of this misallocation is the same: Too many scientists are allocated to the easier project.
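The misallocation percentages reported above can be computed with a simple metric (our own formulation, since the paper does not spell one out): half the summed absolute deviation between the observed and optimal head counts, as a share of the population.

```python
def misallocation_pct(actual, optimal):
    # actual, optimal: dicts mapping project -> number of scientists.
    # Halving the summed absolute deviation counts each misplaced agent once.
    total = sum(actual.values())
    moved = sum(abs(actual.get(p, 0) - optimal.get(p, 0))
                for p in set(actual) | set(optimal)) / 2
    return 100 * moved / total
```

For example, if the optimal split of 500 agents were 295/205 but 300 chose Project 1, five agents (1%) would count as misallocated.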
However, the trend actually reverses with much smaller total population sizes. With only 50 agents to allocate between projects, there is about 17% misallocation from the easier project to the harder project at the smallest variance level. To understand these results, consider how the scientists in the model decide which project to work on. Each scientist must determine the number of scientists working on each project, input that information into the projects’ success functions, and then determine which project would most gain from an additional scientist. By relaxing the success function assumption, each scientist has a slightly different belief about the precise shape of the projects’ success functions. Since agents not only come to have different beliefs, but also act on these beliefs, which in turn can affect the actions of others, these slight deviations in beliefs add up to significant deviations from the correct allocation of cognitive labor. This interactive nature of scientists’ beliefs helps explain why these deviations do not simply average out—beliefs would only have a chance of averaging out if they were independent of each other, but our scientists receive cues from each other, and thus have dependent beliefs. It should be noted that the deviations would be greater if agents disagreed on the success functions’ functional form. This weakening is very conservative: it supposes widely agreed-upon, accurate beliefs about the nature of the investigations at hand, and only allows for a normal distribution of beliefs in the instance where there is any disagreement at all. It is unlikely that even this weakened idealization is ever instantiated in real scientific communities. Instead we ought to expect even greater divergence of beliefs about the shape of projects’ success functions. 
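The decision procedure just described can be sketched as follows (a hedged reconstruction: the papers specify only that the success function is logistic, so the exact parameterization, the midpoint value, and the function names below are our assumptions):

```python
import math

def logistic_success(n, r, max_p, midpoint=50.0):
    # One possible logistic success function: probability that a project
    # succeeds given n scientists, easiness r, and maximum probability
    # max_p.  The midpoint parameter is an assumption of this sketch.
    return max_p / (1 + math.exp(-r * (n - midpoint)))

def choose_project(counts, believed_rs, max_p=0.9):
    # counts[i]: how many scientists the agent believes work on project i.
    # believed_rs[i]: the agent's (possibly idiosyncratic) easiness belief.
    # The agent joins the project whose success probability would gain
    # most from one additional scientist (its marginal contribution).
    gains = [logistic_success(n + 1, r, max_p) - logistic_success(n, r, max_p)
             for n, r in zip(counts, believed_rs)]
    return max(range(len(gains)), key=gains.__getitem__)
```

Because each agent plugs its own believed r into this calculation, and its choice changes the counts that other agents see, small differences in belief propagate through the population rather than averaging out.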
We have now shown that weakening either the distribution or the success function assumption independently can lead to misallocation of scientists, and in dramatic cases, can allocate nearly all of the agents to the easier project. Our final analysis involves weakening both assumptions simultaneously. To do this, we assumed once again that 500 scientists needed to allocate themselves, Project 1 had a true value of 0.02 for r, and Project 2 had a true value of 0.03 for r. Following the procedure described above, we distributed the agents’ r values normally with variance 1E−4, but this time, we additionally decreased the radius of vision in order to restrict information about the current distribution of cognitive labor. As information about the distribution of cognitive labor was restricted, the misallocation of agents to projects increased. When agents had incomplete, but still considerable information about the distribution of labor (radii of vision between 24 and 9), they misallocated themselves to the easier project more severely than when either assumption was relaxed alone. However, when they were given even less information about the current distribution of cognitive labor (radii of vision between 9 and 5), the agents actually improved their allocation and came closer to the expected MCR result. This seemed to be the result of two kinds of distortion canceling each other out. Less information about the distribution of labor (small radii) encourages more scientists to work on the easier project because it seems to them that their marginal contribution to the project’s success is larger than it actually is. But if views about the differential difficulty of the two projects (values of r) vary among agents, some of the agents will think the harder project is easier than it actually is. So this partially mitigates the effect of the small radius size, although not for an epistemically interesting reason.
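One round of the combined weakening, with both limited vision and heterogeneous beliefs, might look like this (our own compact reconstruction; the synchronous update order, the logistic parameterization, and the midpoint are assumptions of this sketch, not details given in the paper):

```python
import math

def step(positions, projects, beliefs, radius,
         max_p=0.9, width=100, height=100, midpoint=50.0):
    # One synchronous update of a 2-project model: every agent re-chooses
    # its project using only locally visible counts (limited radius of
    # vision) and its own believed easiness values (heterogeneous beliefs).
    new_projects = []
    for i, (x0, y0) in enumerate(positions):
        counts = [0, 0]
        for (x, y), p in zip(positions, projects):
            # Toroidal distance to decide visibility.
            dx = min(abs(x - x0), width - abs(x - x0))
            dy = min(abs(y - y0), height - abs(y - y0))
            if math.hypot(dx, dy) <= radius:
                counts[p] += 1
        def success(n, r):
            return max_p / (1 + math.exp(-r * (n - midpoint)))
        gains = [success(counts[p] + 1, beliefs[i][p])
                 - success(counts[p], beliefs[i][p]) for p in (0, 1)]
        new_projects.append(0 if gains[0] >= gains[1] else 1)
    return new_projects
```

Iterating this step and comparing the resulting allocation to the optimum reproduces, in outline, the interaction between the two weakened assumptions discussed above.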
6 Conclusions

Using MCR models, Kitcher and Strevens tried to explain how scientists could manage to more-or-less optimally divide their cognitive labor without a central planner. We have argued that Kitcher’s and Strevens’ explanations are problematic because of the way that their MCR models depend on non-robust idealizations. The distribution and success function assumptions will only rarely obtain in real scientific communities. Despite the rapid flow of information in scientific communities, scientists do not typically know the full distribution of cognitive labor, nor do they have uniform beliefs about which projects are more likely to be successful. Further, we have shown that when the distribution and success function assumptions are weakened, the resulting MCR models can divide cognitive labor far from optimally. So because these assumptions are neither robust nor likely to be instantiated in real communities, models based on them do not possess enough fidelity to explain the self-organized division of cognitive labor in the scientific community. In light of this analysis, we must re-evaluate the broader conclusions that Kitcher and Strevens draw from their models. Like Kitcher, we believe that the scientific community is often right to hedge its bets by distributing cognitive labor, and also believe that self-interested motives of individual scientists can lead the community to better fulfill its epistemic goals. Since Kitcher’s discussion of MCR was largely oriented toward building a basic model for the study of the social structure of science, he did not use the model to make a more specific epistemic or policy recommendation. However, Kitcher claims that the MCR approach he develops can be easily extended to include more realistic assumptions. The main challenge to this, he argues, is mathematical complication.18 We believe that the results presented here challenge this assumption.
Slightly more complex models show qualitatively different behavior, and this suggests that the basic models themselves need to be reconsidered, even if the intuitions behind them are sound. While Kitcher was right to argue that the scientific community is better off when some scientists work on projects that are less likely to succeed, the model that he developed to illustrate that phenomenon has some deficiencies that militate in favor of developing an alternative. Strevens puts MCR models to work in service of a more ambitious claim. He uses such models to argue that the Priority Rule, a payoff function that gives all the credit to whichever scientist successfully completes the project first, is the best incentive structure for promoting the optimal distribution of cognitive labor. The Priority Rule encourages scientists to hedge their bets by distributing cognitive labor, but biases the distribution in favor of the project most likely to succeed. In order to show this, Strevens compares the results of MCR models employing the Marge scheme to those employing the Priority Rule. The latter encourages more agents to work on the project with the higher maximum probability of success, which is closer to the optimal distribution. While we fully agree with Strevens’ argument for MCR models that make the distribution and success function assumptions, we do not believe that his analysis holds up if
18 Kitcher (1990, pp. 7, 18, 22).
either of these assumptions is relaxed. We have shown that relaxing these assumptions misallocates cognitive labor toward the easier project. Similar results are obtained when the projects are equally easy but one has a higher maximum probability of success. In this case, agents are over-allocated to the high-probability project. If we add the Priority Rule on top of this over-allocation, agents will be grossly over-allocated, perhaps resulting in no one working on the low-probability project at all. This is precisely what an incentive system inspired by social epistemology should be able to avoid. The desirability of the Priority Rule in Kitcher’s and Strevens’ model scientific communities is hard to dispute. However, we do not believe that real scientific communities optimally distribute their cognitive labor when this rule is in place. Indeed, the opposite may be true in a wide variety of cases. Like Kitcher, we think that it would be surprising if the current set of incentives in science is anywhere close to optimal.19 These incentives were designed for communities of scientists that much more closely approximated the epistemic assumptions of MCR than does the scientific community of the twenty-first century. As we have demonstrated, once the epistemic assumptions of MCR models are weakened, the direction of the allocation of cognitive resources can change. Adopting the Priority Rule exacerbates this problem, resulting in greater misallocation than when the Marge payoff function is employed. What we found is that the results, which hold true of the original model, have no clear connection to actual scientific communities. While it may be true that the Priority Rule is good for real scientific communities, the models presented by Strevens cannot fully support this conclusion. In closing, we note that it might be possible to develop revised MCR models, either by weakening the idealizations as we have done or by finding ones that are more robust to perturbation.
This work may turn out to be fruitful, although our own experience of modifying MCR models to test for robustness suggests that this approach generates extremely complex models. As an alternative, we propose the development of a new framework for studying the division of cognitive labor that makes fewer epistemic assumptions about scientists from the outset. In such a framework, scientists make decisions about what projects to work on “on-line,” as information about their own and their community’s successes and failures becomes known.20 Such a framework may advance the project of understanding the social nature of scientific epistemology championed by Kitcher and Strevens by providing a more fertile basis for continued research.
References

Brock, W., & Durlauf, S. (1999). A formal model of theory choice in science. Economic Theory, XIV, 113–130.
Kitcher, P. (1990). The division of cognitive labor. Journal of Philosophy, LXXXVII(1), 5–22.
Kitcher, P. (1993). The advancement of science. New York: Oxford University Press.
Miller, J. H., & Page, S. (2007). Complex adaptive systems. Princeton: Princeton University Press.
19 (op. cit., p. 22).

20 In Weisberg and Muldoon (2009), we argue for an alternative modeling approach inspired by ecological models that remains simple while also ensuring that the agents in the model are under similar epistemic constraints as actual scientists.
Newman, M. E. J. (2001). The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences USA, 98, 404–409.
Schelling, T. (1978). Micromotives and macrobehavior. New York: Norton.
Strevens, M. (2003). The role of the priority rule in science. Journal of Philosophy, C(2), 55–79.
Weisberg, M., & Muldoon, R. (2009). Epistemic landscapes and the division of cognitive labor. Philosophy of Science, 76(2), 225–252.
Synthese (2011) 183:175–185 DOI 10.1007/s11229-010-9759-6
An old problem for the new rationalism Yuval Avnur
Received: 29 January 2010 / Accepted: 18 June 2010 / Published online: 29 July 2010 © Springer Science+Business Media B.V. 2010
Abstract A well known skeptical paradox rests on the claim that we lack warrant to believe that we are not brains in a vat (BIVs). The argument for that claim is the apparent impossibility of any evidence or argument that we are not BIVs. Many contemporary philosophers resist this argument by insisting that we have a sort of warrant for believing that we are not BIVs that does not require having any evidence or argument. I call this view ‘New Rationalism’. I argue that New Rationalists are committed to there being some evidence or argument for believing that we are not BIVs anyway. Therefore, New Rationalism, since its appeal is that it purportedly avoids the problematic commitment to such evidence or argument, undermines its own appeal. We cannot avoid the difficult work of coming up with evidence or argument by positing some permissive sort of warrant. Keywords
Epistemology · Skepticism · Warrant
The skeptical paradox concerning epistemic warrant1 can be formulated as follows:

(1) You lack epistemic warrant to believe that you are not a brain in a vat (from here on, ‘BIV’).
(2) If you lack epistemic warrant to believe that you are not a BIV, then you lack epistemic warrant to believe anything on the basis of your senses.
1 I am using ‘epistemic warrant’ in Wright’s (2004) sense rather than Plantinga’s (1993) sense. That is, an epistemic warrant is a positive status relating to the likely truth of a belief (more on this below), but it is not defined as whatever makes a true belief constitute knowledge. The point of using ‘warrant’ rather than ‘justification’ is that, according to some epistemologists, ‘justification’ suggests an evidence-involving status, which in the context of this paper would be a problematic feature to build into the status under discussion.

Y. Avnur (B) Scripps College, Claremont, CA, USA e-mail:
[email protected]
(3) You have epistemic warrant to believe many things on the basis of your senses.

One of (1)–(3) must be false. (3) is sacrosanct to epistemology, and rejecting (2) is regarded by most epistemologists as a “non-starter” strategy since it entails a rejection of a plausible closure principle for warrant. So most epistemologists reject (1). The most straightforward options for rejecting (1) are:

Old Rationalism: There is some warrant-giving, a priori argument that you are not a BIV.2

and

Mooreanism: There is some warrant-giving, empirical evidence that you are not a BIV.3

Both of these options face serious difficulties.4 Old Rationalism faces the difficulty of positing an a priori argument for a deeply contingent truth.5 Mooreanism faces the difficulty of explaining away the intuition that sensory evidence cannot itself support the hypothesis that sensory experiences are not misleading, since such an inference is epistemically circular. I call the problem of facing these difficulties ‘the Old Problem’, since philosophers have grappled with these difficulties for a long time now.6 The Old Problem motivates a third option:

New Rationalism: You have epistemic warrant to believe that you are not a BIV even if there is no evidence or argument that you are not a BIV.

The idea is to posit a permissive sort of warrant that does not require the problematic evidence that raises the Old Problem. It is new in the sense that it appears to avoid the Old Problem and has become increasingly popular in recent times. Some recent proponents are Cohen (2000), Davis (2003), Garrett (2007), Wright (2004), and White (2006). It is a form of rationalism in the sense that the sort of warrant that it posits is not perceptual. Some New Rationalists call this sort of warrant ‘entitlement’, to be

2 The most obvious example is Descartes’ theistic arguments in the Meditations.
Another example is the “paradigm case” argument for the reliability of the senses, which rests on a priori claims about how ordinary predicates (like ‘is red’ and ‘is round’) must get their meaning. See Alston (1993) for more on this approach. Insofar as an appeal to explanatory simplicity is an a priori form of inference, there have also been various abductivist versions of Old Rationalism: Bonjour (2005), Jackson (1977), Russell (1997), and Vogel (1990), to name a few.

3 Some recent examples are Bergmann (2004), Pryor (2004) and Markie (2005). I call them ‘Mooreans’ because their response to the skeptical problem—an appeal to the senses—resembles Moore’s (1939) famous “proof” of the external world.

4 Alston (1993) surveys some of these difficulties. For more recent criticism of Mooreanism, see Cohen (2000), White (2006), and Wright (2007).

5 This is not the only sort of problem that Old Rationalists face. Another problem is that they must hold that our warrant to believe that we are not BIVs rests on some direct, rational insight. They must hold this because they must ultimately appeal to such insight in order to justify the premises of any a priori argument that we are not BIVs. This is problematic not only because it entails a priori evidence for a deeply contingent truth, but also because it requires an account of how rational insight works and why we should regard it as reliable.

6 It is at least as old as Hume’s Enquiry (sect. XII, Part I).
contrasted with a sort of warrant derived from a cognitive achievement (such as sensory experience). But all New Rationalists posit some sort of default warrant that does not rest on any evidence or argument. It is important to note that some philosophers who posit a sort of warrant that does not require the believer to possess any evidence or argument may still require that there is some evidence or argument—which is “in principle” available—that we are not BIVs. For example, Peacocke (2004) does not require subjects to appreciate his “complexity-reduction” argument that we are not BIVs, even though that argument “explains” their warrant to believe the contents of their perceptual states. Since he thinks that there is an argument that we are not BIVs, I regard him as an Old Rationalist, since his claim engages with the Old Problem.7 The appeal of New Rationalism is that it seems to avoid the Old Problem, since it posits no evidence (from here on by ‘evidence’ I mean evidence or argument) that we are not BIVs.8 However, I will argue, New Rationalism raises the Old Problem. For, in order to establish New Rationalism itself, some evidence must be given that we are not BIVs. Because it derives its appeal from its promise to avoid commitment to such evidence, New Rationalism lacks appeal. The more general upshot is that we cannot avoid the difficult work of coming up with evidence by positing some extremely permissive status that requires no evidence. In Sect. 1, I propose a principle about epistemic warrant. In Sect. 2, I argue that this principle undermines the appeal of New Rationalism. In Sect. 3, I address a potential objection.

1 A principle about epistemic warrant

Consider the following principle, where ‘S’ stands for any Subject, ‘A’ stands for any Argument, and ‘P’ stands for any Proposition:

(4) If S judges that A is evidence that S’s own belief that P is epistemically warranted, then S should, in doing so, judge that A is evidence for P.
I will defend (4) with some arguments. But first I need to make three points of clarification. First, (4) is compatible with internalism, externalism, and the idea that epistemic warrant does not always require evidence. Those are all views about being warranted. (4) is a principle about making a judgment about one’s warrant. (More on internalism/externalism below.) Second, the consequent of (4) does not make any claims about gaining any new evidence for P. Consider the argument “I have evidence e for P, and e warrants my belief that P; therefore, my belief that P is warranted.” Arguably, this argument constitutes evidence that my belief that P is warranted, but does not give me any additional evidence for P over and above the evidence provided by e. Yet, as (4) entails, this 7 In some cases it is difficult to tell whether an anti-skeptic endorses New Rationalism. McDowell (2008,
see especially pp. 385–386), for example, seems ambivalent between Mooreanism and New Rationalism.

8 New Rationalism also purports to address the problem noted in footnote 5 by positing an a priori warrant without commitment to any rational insight. New Rationalists would therefore avoid the need to explain how such insight works or why we should regard it as reliable.
argument can also serve as evidence for P, since its first premise supports P (and none of the other premises undermine this). So, (4) is compatible with this scenario. More generally, (4) tells us only that A is evidence, not whether A is merely a reformulation of our original, first-order evidence for P. Third, and finally, S might be confused about the concept of epistemic justification, or S might be equivocating. One might, for various reasons, deny that in such a case S should judge that e is evidence for P. Let us set such cases aside, and stipulate that ‘S’ in (4) stands for any competent subject. Having made these clarifications, let us turn to some arguments for (4). (4) follows from the distinguishing feature of epistemic evaluation. ‘Epistemic’ is used to distinguish an evaluative status as one that bears positively on the relevant belief’s likely truth. In other words, an epistemic warrant is an evaluative status that a belief enjoys in virtue of some property that bears positively on its own likely truth. After clarifying the role of ‘its own’ in this formulation of epistemic warrant, I will show that, unless we understand epistemic warrant in this way, we cannot make good sense of the skeptical paradox. The argument that (4) follows from this formulation of epistemic warrant will then be straightforward. Epistemic warrant marks a positive relation between a belief and its own likely truth. If your belief that P is warranted merely in virtue of how it bears on the likely truth of other beliefs, or your total web of belief in general, then the belief may still fail to enjoy a positive connection with its own truth. For illustration, suppose that you have no opinion about whether the number of stars in the universe is even. God reveals to you that if you believe that the number of stars is even then he will answer truthfully any question you ask him.
Setting aside the question of whether it is within your power to choose to believe that the number of stars is even, we can still ask: if you could now believe, in light of God’s revelation, that the number of stars is even, would that belief fare better with respect to its own truth than it did before God made the promise? Intuitively, the answer is ‘no’. It seems no more likely than before that the number of stars is even (setting aside the thought that God would never tempt you to believe something false). This is so even though believing it would lead to significantly better knowledge and many more warranted beliefs about the world. Whether a belief is epistemically warranted depends on whether that belief fares well with respect to its own truth. So the belief that the number of stars is even is, in this scenario, not epistemically warranted.
So far, I have emphasized that being epistemically warranted is a property relating to a belief’s own likely truth. This element of epistemic warrant can now be seen to be essential to epistemic warrant’s role in the skeptical paradox. Unless (1) is taken to be a challenge to our conviction that our belief that we are not BIVs has some feature that makes it likely to be true (i.e. epistemic warrant), it cannot, along with (2), plausibly be taken to challenge our conviction that our ordinary beliefs have such a feature. But it is precisely the idea that our ordinary beliefs may lack such a truth-related feature that troubles us. Accordingly, the standard anti-skeptical denial of (1), which is a response to the troubling challenge presented by the paradox, must be understood as a claim that our belief that we are not BIVs has a feature that makes it likely to be true. So it must be taken as a claim that our belief is epistemically warranted in the sense of being positively related to its own likely truth. To put this another way: if our reply to
skepticism does not address the worry that our belief that we are not BIVs lacks some feature that relates positively to its likely truth and that therefore our ordinary beliefs also lack such a feature, then our reply does not address a skeptical paradox which does play on precisely that worry. So our solution to the paradox, if it is going to be satisfactorily complete, must address the issue of whether our belief that we are not BIVs has a feature that relates positively to its own likely truth. The custom is to call this feature ‘epistemic warrant’. (4) follows from this, since if you think that something (e.g. a philosophical argument) is evidence that your belief has a feature that makes it likely, then you should think that this something is evidence that your belief is likely. This is just to say that it is evidence for your belief. Thus, (4) follows from any view of epistemic warrant that makes sense of the skeptical paradox by connecting it with the likely truth.
The nature of warrant’s connection with the likely truth is a matter of controversy between externalism and internalism, so it is important to see that they both entail (4). Suppose that epistemic warrant is a matter of having an objective connection with the likely truth, as externalists hold. Then if you judge that your own belief has this positive, objective relation with truth, you should think that your belief is (to that degree) likely to be true. For, obviously, if you judge that the belief is objectively likely, you should think that it is to that extent likely. Now suppose instead that epistemic warrant is a matter of having a positive connection with the likely truth relative to the believer’s perspective, as internalists hold. Then if you judge that your own belief has this positive relation with truth then you should think that it is to that extent likely, because of course you are thinking from your own perspective.
One should never be in a position in which, though one judges that from one’s perspective some proposition is quite likely, one does not judge that the proposition is quite likely!9 Thus, on both externalism and internalism, evidence that your own belief is warranted should be regarded as evidence for your belief, since you must regard the evidence as an indication that your belief is likely. One might object that, since being warranted does not entail being true, evidence that a belief is warranted is not evidence that it is true.10 This objection misconceives the nature of evidence. For example, though the fact that your friend is sick would not entail that she stayed home, evidence that she is sick can be evidence that she stayed home. Likewise, the argument for (4) exploits such probabilistic relations, rather than entailments, between warrant and truth. If you think that A indicates that your belief that P has a status that bears positively on its likely truth, then you should regard A as evidence—not necessarily a guarantee—that P is likely to be true. Otherwise, your belief has no positive relation to its own likely truth. This does not assume or imply that being warranted entails being true, any more than we assumed or implied, in the above example, that being sick entails staying home. I have been arguing that (4) follows from the nature of epistemic warrant. Since there are various ways to make this plausible by pointing out that epistemic warrant must have some positive connection to the likely truth, it may be worth offering the 9 I am of course setting aside cases in which you judge that, before you obtained some new evidence that you are momentarily ignoring, some proposition was likely from your (old) perspective. 10 This objection is due to an anonymous referee from another journal.
following briefer reformulation of the argument. Suppose that (4) is false. Then when I learn that some belief of mine is epistemically warranted, I may not think that I have thereby discovered anything that bears on whether my belief is true. So, given that I am generally interested in figuring out the truth about the world, I have not yet learned whether I should find plausible some entailment of that belief, whether I should appeal to that belief in settling any questions, or whether I should use that belief in my reasoning at all. It seems, then, that I should not care about whether my beliefs are epistemically warranted, and I should not care whether (3) is true. The “paradox” should not puzzle or worry me at all. Clearly, this is a mistake. I should care about epistemic warrant. I am puzzled about the fact that apparently plausible premises entail not-(3). So the supposition in the preceding paragraph must be rejected: (4) must be true.11
Here is a similar argument from White (2006) (who sympathizes with New Rationalism, and whose ‘justified’ is interchangeable with my ‘warranted’):
Justification is a kind of guide to the truth. We seek to form justified beliefs as a means to forming true beliefs. This is why a rational inquirer is sensitive to questions of justificatory status in forming his beliefs. In a serious inquiry as to whether P, we ask ourselves whether we would be justified in believing P. An affirmative answer should serve to boost our conviction in P, while a negative answer should undermine it. Justification can play this role only on the assumption that justified beliefs tend to be true, so that it is not typical to be fully justified in believing something false. If we were very often justified in believing what is false, it would be of little help in our pursuit of truth to believe only what we are justified in believing.
Hence the fact that I will be justified in believing P, counts as a reason to suppose that it is true (unless this reason is undermined, as in the case immediately above where I have reason to believe that this particular future justification will be in something false).” (p. 539) If the fact that I will be justified in believing P counts as evidence for P, then surely some evidence that I am (and will continue to be) justified in believing P should count as evidence for P. Now that (4) is established, let us see how it bears on New Rationalism.
11 An anonymous referee has proposed a counterexample to (4): Call the proposition that there is water ‘W’, and call the proposition that it appears to me that I believe that W ‘A’. I may judge that A is evidence that my belief that W is warranted. But this need not commit me to the judgment that A is evidence for W. It follows that (4) is false. However, if I judge that A is evidence that my own belief that W is epistemically warranted, then I should also think that my beliefs tend to be warranted. For, if I do not think that my beliefs tend to be warranted, then I should not think that appearing to be a belief of mine is itself any indication of being a warranted belief. But if I think that my beliefs tend to be warranted, then I should think that A is also evidence for W, contrary to what the objection claims. For, if my beliefs tend to be warranted—that is, they tend to have some property that bears positively on their own likely truth—then being a belief of mine is a property that tends to bear positively on that belief’s truth. So I should regard any evidence that something is a belief of mine as evidence that that thing is true. Thus, (4) is vindicated.
2 New rationalism and the old problem According to New Rationalism, our belief that we are not BIVs is epistemically warranted. As responsible philosophers, we should accept New Rationalism only if we think that there is some evidence for it, perhaps some good philosophical argument. Therefore, if we accept New Rationalism, we should also judge that there is some evidence that we are epistemically warranted in believing that we are not BIVs. (4) entails that if we judge that something is evidence that we are warranted in believing that we are not BIVs, then we should also judge that it is evidence that we are not BIVs. So: (5) If we accept New Rationalism, then we should also judge that there is evidence that we are not BIVs. (5) is bad news for New Rationalism. For the whole point of New Rationalism is to avoid commitment to the existence of any such evidence! If the evidence featured in (5) is a priori, then New Rationalism has no advantage over Old Rationalism. For if we thought it implausible that we can find some evidence that we are not BIVs by thinking about God (as, say, Descartes might do), why would we think it plausible that we can find such evidence by thinking about warrant, or about epistemology? That is just as bizarre as the view that you can, from the armchair, discover that there is water in your environment, or that the number of stars is even. These are all deeply contingent claims. Meanwhile, if the evidence featured in (5) is not a priori, then New Rationalism has no advantage over Mooreanism. For we must then explain away the persistent intuition that this appeal to sensory evidence in support of the claim that our senses are not deceptive is question-begging. These difficulties constitute the Old Problem that New Rationalism was designed to avoid. And yet New Rationalists must face the Old Problem anyway. 
Some New Rationalists (in particular Wright 2007) appear to offer philosophical arguments in support of a second-order claim: that you have warrant to believe that you have warrant (specifically, a default “entitlement”) to believe that you are not a BIV.12 It is therefore worth pointing out that such second-order claims are as problematic as the first-order claim that you have warrant to believe that you are not a BIV. For, if one regards the New Rationalists’ arguments as evidence for the claim that one has warrant to believe that one has warrant to believe that one is not a BIV, then according to (4) one should regard those arguments as evidence that one has warrant to believe that one is not a BIV. According to (4), as we have seen, one should therefore regard those arguments as evidence that one is not a BIV. So this second-order version of New Rationalism carries the same problematic commitment to (5). The same considerations that support (5) also support (6), which raises another problem for New Rationalism: (6) If we do not judge that some argument is evidence that we are not BIVs, then we should not judge that that argument is evidence for New Rationalism. Clearly, as their authors would surely admit, the essays written in support of New Rationalism (cited above) do not contain evidence that we are not BIVs. The point 12 Thanks to an anonymous referee for pointing this out.
of these essays is to argue for New Rationalism without providing such problematic evidence. But, according to (6), since these essays do not contain evidence that we are not BIVs, they should not be regarded as containing evidence for New Rationalism. They therefore fail to support New Rationalism. In contrast, essays in support of Old Rationalism do purport to give such evidence, in the form of a priori arguments that we are not BIVs. And essays in support of Mooreanism appeal to empirical evidence that we purportedly have for the belief that we are not BIVs. Note that this problem is distinct from the one based solely on (5), namely that New Rationalism carries a problematic commitment. The problem posed by (6) is that New Rationalism has not been argued for in the first place, even setting aside the fact that its commitments are problematic. One might suggest that, just as our warrant for believing that we are not BIVs does not require any evidence that we are not BIVs, so too our warrant for belief in New Rationalism does not require any evidence for New Rationalism. So neither (5) nor (6) is problematic. However, insofar as there is no evidence for New Rationalism, it is not an attractive philosophical view (recall that I am using ‘evidence’ to mean ‘evidence or argument’). A philosophical theory should be argued for, not taken on blind faith. Next, let us consider a similar but perhaps more serious objection. 3 An objection The New Rationalist might object that I have misconceived her argument. Perhaps she argues for her view by eliminating the non-skeptical alternatives, and therefore need not argue that we have warrant. Rather, she need only argue that, given that we have warrant, the warrant we have for believing that we are not BIVs is of the sort that does not require evidence. Thus, according to this objection, the argument for New Rationalism is: (a) (3) is true (i.e. skepticism is false). 
(b) New Rationalism is the most plausible view on which (3) is true. Therefore, (c) New Rationalism is true,13 where (a) is an assumption of the anti-skeptical inquiry, and (b) is established by arguing against the denial of (2) and against Old Rationalism and Mooreanism, presumably by appeal to the Old Problem. In this way, New Rationalism is established without ever arguing for (3), and therefore without explicitly arguing that we have warrant for believing that we are not BIVs (setting aside the fact that one would still need to establish (2) before inferring, from (3), that we have warrant to believe that we are not BIVs). Thus, the New Rationalist may reject both (5) and (6). Although tempting, this objection does not succeed. For, even if we grant that (a) is an assumption of the New Rationalist’s inquiry rather than something for which the New Rationalist argues, the status of the assumption still determines the plausibility and philosophical significance of New Rationalism. To see how consideration of the status of assumption (a) shows that New Rationalism is not off the hook, notice that 13 This is, roughly, the basic argument in Wright (2004) and White (2006).
the New Rationalist faces a dilemma: she either does or does not claim to have some warrant to believe (a). Suppose the New Rationalist claims to have warrant to believe (a). Then (a), together with our presumably a priori warrant for (b), constitutes evidence that we are not BIVs. To see why, notice that the New Rationalist is committed to (7): (7) (a) and (b) constitute evidence for New Rationalism (that our belief that we are not BIVs is warranted even if there is no evidence). According to (4), if the New Rationalist is committed to (7) then she is committed to (8): (8) (a) and (b) constitute evidence that we are not BIVs. This is just as problematic as (5), since it entails (5) and raises the Old Problem. More specifically, if the warrant for (a) is a priori (and assuming the warrant for (b) is also a priori), then the New Rationalist inherits Old Rationalism’s difficulties. And if the warrant for (a) is a posteriori, then the New Rationalist inherits Mooreanism’s difficulties (for in that case there is an argument, one of whose premises is a posteriori, which constitutes evidence that we are not BIVs). If we knew how to address these difficulties, which I have been calling ‘the Old Problem’, we would have no need for New Rationalism in the first place: we would be either Old Rationalists or Mooreans. Moreover, the argument (a)–(c) is self-undermining. To see why, notice that (b) states that views on which there is evidence that we are not BIVs are relatively implausible. Presumably, the support for (b) is something like the claim that the Old Problem cannot be satisfactorily addressed, since there is no such evidence. But, as we have just seen, if (a) and (b) constitute evidence for New Rationalism, then they constitute evidence that we are not BIVs. This contradicts the presumed support for (b), so (b) is undermined. 
Therefore, (a) and (b) cannot constitute good evidence for New Rationalism, since if they do, then (b) is undermined, which entails that they do not constitute good evidence for New Rationalism. Thus, the objection fails if the New Rationalist considers her assumption, (a), as something that she has warrant to believe. Suppose instead that the New Rationalist does not appeal to warrant to believe (a). If so, then the New Rationalist argues only that hers is the best non-skeptical strategy, where such a strategy is understood within the context of an inquiry that is based on the assumption (a). She can succeed at this, it seems, without ever appealing to any warrant to believe (a). Since (b) is the only thing that the New Rationalist needs to argue for, the New Rationalist may reject (5) and (6). Still, the New Rationalist is at a distinct disadvantage compared to the alternative non-skeptical views. Let us call those who are engaged in the aforementioned inquiry, which simply assumes, and does not address or claim warrant for, premise (a), ‘anti-skeptics’. When anti-skeptics are asked whether their project is intellectually worthwhile, their most natural reply is to claim that they have epistemic warrant to believe the assumption on which their entire project is based, namely (a). Old Rationalists, Mooreans, and those who reject (2) (and accept (1)) can answer this question without undermining their own anti-skeptical view. An Old Rationalist can say that
anti-skeptics have warrant to believe (a) because they can see that there is a warrant-giving a priori argument that we are not BIVs, and that therefore our experiences give us warrant for our perceptual beliefs. Mooreans and those who reject (2) can claim that the anti-skeptic can see that our experiences are themselves sufficient to give us some immediate warrant for our perceptual beliefs. (Mooreans go on to say that we can then infer, with warrant, that we are not BIVs, while those who deny (2) and accept (1) deny that we have any warrant to believe that we are not BIVs). In all of these cases, warrant to believe assumption (a) is accounted for, so that these anti-skeptical strategies are internally self-supporting: they can account for, and are consistent with, the anti-skeptic’s warrant for the assumption of anti-skepticism, (a). This is clearly a philosophical advantage of these views. But when a New Rationalist attempts to account for her warrant to believe the assumption on which her inquiry is based, she is immediately embarrassed. If she cannot account for any warrant, then she is embarrassed by her lack of warrant for the assumption on which her inquiry is based. If she does appeal to some warrant, then as I have argued above, she commits herself to some evidence that she is not a BIV via commitments to (7) and (8). She thereby inherits the same commitments as either the Old Rationalists or the Mooreans. But unlike the Old Rationalists and the Mooreans, she cannot embrace the claim that there is such evidence, since that undermines the motivation for her view (specifically, (b)). Thus, New Rationalism is not internally self-supporting: the New Rationalist cannot, while staying consistent with her support for her view, show that her inquiry is intellectually worthwhile. I conclude that the objection that I have been considering in this section does not save New Rationalism.
In arguing that her view is true or that her inquiry is worthwhile, the New Rationalist must undermine the motivation for her view. The upshot is that we cannot avoid the difficult work of coming up with evidence by positing some extremely permissive evaluative status that does not require evidence. The skeptical problem is harder than that, and the only non-skeptical alternative to appealing to evidence that we are not BIVs, short of renouncing our concept of warrant altogether, is to deny (2). Acknowledgements For comments and discussion, I thank Eliza Block, David Barnett, Don Garrett, Paul Horwich, Matt Kotzen, Jim Pryor, Karl Schafer, and Stephen Schiffer.
References

Alston, W. P. (1993). The reliability of sense perception. New York: Cornell University Press.
Bergmann, M. (2004). Epistemic circularity: Malignant and benign. Philosophy and Phenomenological Research, 69, 709–727.
Bonjour, L. (2005). In: E. Sosa & M. Steup (Eds.), Contemporary debates in epistemology (Chapter 5). Cambridge, MA: Blackwell Publishers.
Cohen, S. (2000). Contextualism and skepticism. Philosophical Issues (Skepticism), 10, 94–107.
Davis, M. (2003). The problem of armchair knowledge. In S. Nuccetelli (Ed.), New essays on semantic externalism and self-knowledge (pp. 23–55). Cambridge, MA: MIT Press.
Garrett, D. (2007). Reasons to act and reasons to believe: Naturalism and rational justification in Hume’s philosophical project. Philosophical Studies, 132(1), 1–16.
Jackson, F. (1977). Perception: A representative theory. Cambridge: Cambridge University Press.
Markie, P. (2005). Easy knowledge. Philosophy and Phenomenological Research, 70(2), 406–416.
McDowell, J. (2008). The disjunctive conception of experience as material for a transcendental argument. In A. Haddock & F. Macpherson (Eds.), Disjunctivism: Perception, action, knowledge. Oxford: Oxford University Press.
Moore, G. E. (1939). Proof of an external world. Proceedings of the British Academy, 25, 273–300.
Peacocke, C. (2004). The realm of reason. Oxford: Oxford University Press.
Plantinga, A. (1993). Warrant: The current debate. Oxford: Oxford University Press.
Pryor, J. (2004). What’s wrong with Moore’s argument? Philosophical Issues (Epistemology), 14(1), 349–378.
Russell, B. (1997). The problems of philosophy. Oxford: Oxford University Press.
Vogel, J. (1990). Cartesian skepticism and inference to the best explanation. Journal of Philosophy, 87, 658–666.
White, R. (2006). Problems for dogmatism. Philosophical Studies, 131(3), 525–557.
Wright, C. (2004). Warrant for nothing (and foundations for free)? Aristotelian Society Supplement, 78(1), 167–212.
Wright, C. (2007). The perils of dogmatism. In S. Nuccetelli & G. Seay (Eds.), Themes from G. E. Moore: New essays in epistemology and ethics. Oxford: Oxford University Press.
Synthese (2011) 183:187–210 DOI 10.1007/s11229-010-9758-7
Assertion and grounding: a theory of assertion for constructive type theory Maria van der Schaar
Received: 13 January 2010 / Accepted: 18 June 2010 / Published online: 30 July 2010 © The Author(s) 2010. This article is published with open access at Springerlink.com
Abstract Taking Per Martin-Löf’s constructive type theory as a starting-point, a theory of assertion is developed, which is able to account for the epistemic aspects of the speech act of assertion, and in which it is shown that assertion is not a wide genus. From a constructivist point of view, one is entitled to assert, for example, that a proposition A is true, only if one has constructed a proof object a for A in an act of demonstration. One has thereby grounded the assertion by an act of demonstration, and a grounding account of assertion therefore suits constructive type theory. Because the act of demonstration in which such a proof object is constructed results in knowledge that A is true, the constructivist account of assertion has to ward off some of the criticism directed against knowledge accounts of assertion. It is especially the internal relation between a judgement being grounded and its being known that makes it possible to do so. The grounding account of assertion can be considered as a justification account of assertion, but it also differs from justification accounts recently proposed, namely in the treatment of selfless assertions, that is, assertions which are grounded, but are not accompanied by belief. Keywords
Assertion · Judgement · Constructive type theory
1 Introduction Assertions play a role as premises and conclusions in our reasoning. Assertion is therefore an important topic for philosophers. Recent papers on assertion focus on the question: under what condition is one entitled to make an assertion? A well-known account of assertion is given by Williamson (1996), who defends the thesis that one should assert that S only if one knows that S. Such an account of assertion is called M. van der Schaar (B) Faculty of Philosophy, Leiden University, Leiden, The Netherlands e-mail:
[email protected]
188
Synthese (2011) 183:187–210
a knowledge account of assertion. In this paper I give an account of assertion for Per Martin-Löf’s constructive type theory. From a constructivist point of view, one is entitled to assert that a proposition A is true, only if one has constructed a proof object a for A, and, in general, an assertion may be made only if it has been grounded by means of an act of demonstration. This account of assertion may therefore be called a grounding account of assertion. A theory of assertion should not merely answer the question under what condition one is entitled to assert. It should also give an account of the relation between assertion and judgement, between the speech act of assertion and the declarative sentence, and it should give an account of the fact that assertions may be correct or incorrect. In constructive type theory the notion of judgement plays an important role, because inference rules are understood as applying to judgements. If it is true that assertion and judgement are correlated notions, the constructivist account of judgement will yield a theory of assertion for constructive type theory. By relating the notion of judgement to the notion of assertion, it is possible to give a pragmatic interpretation of judgemental force. In order to give a proper evaluation of a constructivist theory of assertion, the question is raised what every theory of assertion has to account for (Sect. 2). From a constructivist point of view, grounding an assertion of the form A is true amounts to knowing that A is true. It is therefore important to understand to what extent the grounding account of assertion can be understood as a knowledge account of assertion. And, because an assertion that is grounded by an act of demonstration is justified, the grounding account of assertion is also a justification account. The three accounts may all be called epistemic accounts of assertion. In Sect. 
3, I focus on the problems that philosophers have put forward for knowledge accounts of assertion. Does this critique apply to all epistemic accounts of assertion? In Sect. 4, the concepts that are essential to a constructivist theory of assertion will be elucidated; these will include the concepts grounding, proof, proposition, knowledge, judgement, judgemental correctness and propositional truth. A full constructivist theory of assertion is developed in Sect. 5. The final Sect. 6 gives an evaluation of the constructivist account of assertion proposed: to what extent does it differ from both knowledge and justification accounts of assertion as they are commonly understood; and is the grounding account of assertion an improvement on these other accounts in at least some aspects?

2 Elements of a theory of assertion

Every theory of assertion has to explain in what sense:

(1) assertion is a speech act;
(2) assertions are related to judgements;
(3) assertions may be correct or incorrect.

And every theory of assertion has to say what type of speech act assertion is by indicating:

(4) under what condition one is entitled to assert.

(1) Assertion is a speech act. All speech acts have certain characteristics in common. Every speech act results in a product. The act of promising results in a promise made;
and the act of pronouncing a judgement upon the accused results in a verdict. The act of promising and the act of judging exist only for a short time, but the promise made and the verdict are still in force after the act of promising or judging has ended. Like other terms for speech acts, the term ‘assertion’ shows the act/product ambiguity (1a): the term may either stand for the act of assertion or for the assertion made. There is an internal relation between act and product: the act of asserting necessarily results in the assertion made. And there is no assertion made without a corresponding act of assertion, although that act may no longer exist when the assertion made is still in force. In reasoning, our assertions function as premises and conclusion (see 3e). These assertions cannot be acts, for these are gone when the conclusion is reached; premises and conclusion are assertions made, the products of acts of assertion. The act/product distinction is not the same as the act/proposition distinction (1b), which holds for most speech acts. The assertion ‘John is the father of Mary.’ and the question ‘Is John the father of Mary?’ contain the same proposition that John is the father of Mary. The fact that different speech acts may have their proposition in common explains, for example, that the above-mentioned assertion can be considered as an answer to the corresponding question. Propositions are thus essential for relating different speech acts to each other. In all speech acts we are free either to perform the act or not. Assertion is up to us (1c), although we will see below that there is also a sense in which this is not the case. Each type of speech act has a certain quality that distinguishes it from other types. It is the judgemental or assertive force (1d) that makes the act an act of assertion. The assertive force may be indicated by the assertion sign (⊢).
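As a purely illustrative aside (mine, not the paper's own notation), the grounding requirement described in the introduction, that asserting that a proposition A is true presupposes a constructed proof object a for A, can be sketched in Lean, a proof assistant descended from Martin-Löf's type theory:

```lean
-- Propositions are types; a proof object is a term of that type.
-- Asserting "A ∧ B → B ∧ A is true" is grounded by constructing
-- an inhabitant of the corresponding type:
theorem and_swap (A B : Prop) : A ∧ B → B ∧ A :=
  fun h => ⟨h.right, h.left⟩   -- the constructed proof object
```

Here the theorem declaration plays a role analogous to the assertion sign: what the system checks is precisely the judgement that the term on the right is a proof object for the proposition on the left.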
The assertion sign shows that the assertion has been made: the assertion sign therefore precedes one’s premises and conclusions. The explanation of assertive force is an explanation of what kind of speech act assertion is. And what kind of speech act assertion is, becomes clear when the pragmatic rule of assertion is given, that is, when it is shown under what condition one is entitled to assert (see point 4). Since Russell we use the assertion sign as an undivided unity, but Frege intended it to be composed of the horizontal, called ‘content stroke’, and the vertical, called ‘judgement stroke’ (‘Urtheilsstrich,’ Frege 1879, § 2). The latter is a sign that the sentence is used with judgemental or assertive force. The content stroke is a sign that what follows is a judgeable content. According to Tyler Burge: “the vertical judgement stroke represents judgmental force, and the horizontal alone comes to represent a semantical predicate, such as ‘is a fact,’ or ‘is true’” (Burge 1986, 114). The ‘is true’ part is thus not a sign that the content is asserted.1 The ‘is true’ part rather indicates that the sentence is standardly used with assertive force, in contrast, for example, to a part such as ‘Is it true …?’ or ‘May it be true’. A theory of assertion has to make a distinction between the ‘is true’ part and a sign of judgemental force. Further, a theory of assertion should answer the question whether there is only one type of assertive force, as Frege has argued for, or whether one has to acknowledge besides assertoric force a special force of denial. It is to be noted that the phrase ‘I assert that’, like the phrase ‘is true’, cannot function as sign of assertive force, because we may use this phrase in front of a declarative that is used as 1 Frege himself is ambiguous concerning the phrase ‘is a fact’ in the Begriffsschrift, for he seems to imply that the predicate ‘ist eine Tatsache’ may function as sign of judgemental force (Frege 1879, § 3).
Synthese (2011) 183:187–210
antecedent of a conditional sentence. To assert, though, is a performative verb, because the verb may be used to effect what it signifies. Without indications to the contrary, in saying ‘I assert that S,’ I did assert that S (Austin 1962, 122); ‘to assert’ is therefore a performative verb, on Austin’s account.

Every speech act needs language in order to be performed. In order to be well understood, one may use a special type of sentence for making a certain type of speech act. In the case of assertion, it is the utterance of a declarative sentence that is standardly used for making an assertion. A theory of assertion has to account for the special connection between assertion and the declarative sentence (1e). Without indications to the contrary, we understand an utterance of the declarative to make an assertion. It is not to be denied that the occurrence of a declarative sentence may be used for other purposes, too. We often use (the occurrence of) a declarative to ask a question, or to express the antecedent of a conditional. Furthermore, the use of the declarative is not necessary to make an assertion: we sometimes assert by means of the utterance of an interrogative sentence, or by nodding our head. A theory of assertion should also account for the use of the declarative on stage to make a mock assertion (1e′), a “Scheinbehauptung”, as Frege put it. Not all philosophers believe that there is a special relation between the speech act of assertion and the declarative sentence (cf. Davidson 1984). But even these philosophers will agree that we cannot assert by means of an isolated utterance of a simple collection of terms, such as ‘runs, sits, walks’; neither can we use an isolated utterance of a that-clause or a phrase like ‘the death of Caesar’ to make an assertion.
Apart from the special relation between the declarative and the assertion, one may wonder whether there is a special form (1e″), or a special linguistic structure, that is somehow essential or basic to assertion and judgement. Do all assertions and judgements have a subject-copula-predicate (S-is-P) structure, or is it rather the predicative form ‘… is a fact’ that is common to all judgements, as Frege proposed in the Begriffsschrift?

All speech acts are done by an agent. For assertion this means that an assertion is always made from a certain perspective: the perspective of the asserter (1f). Furthermore, all speech acts are social acts. One uses an assertion to convince others, or to defend the truth of a declarative in a dialogue. And an assertion licenses others to reassert; entitlements to assert can be conferred upon others (1g).

Speech acts have special relations with other speech acts. A theory of assertion has to account for the fact that an assertion can always be considered as an answer to a question (1h). Before we use the occurrence of a declarative sentence S to assert that S, we often wonder whether S, a wondering that may be expressed by means of the question ‘Is it true that S?’. If someone makes the assertion that S, we assume that he apprehends the meaning of S (1h′). If someone asserts that S without apprehending the meaning of S, the assertion is somehow unhappy or inappropriate. Some religious thinkers, though, believe that one may properly assert and judge what one does not fully apprehend.2 The fact that S has a meaning may be considered as a presupposition of the assertion that S. If the presupposition is false, the assertion is void to such an

2 According to Cardinal Newman, one can genuinely assent to a proposition that one does not fully understand. One needs to understand the predicate, but not the subject of the proposition (Newman 1870, 16, 17).
extent that no assertion is made. A theory of assertion has to give an account of the notion of presupposition, in order to answer the question whether every assertion has a presupposition (1h″).

(2) Assertions have a special relation to judgements. According to Frege, the act of assertion is the expression or announcement of a judgement [made] (Frege 1918, 62). And according to Dummett, the act of judgement is the interiorization of the external act of assertion (Dummett 1973, 362). Whether one takes the linguistic notion or the mental notion to be prior in the conceptual order, there seems to be an important relation between the two notions. Instead of the term ‘judgement’, the term ‘belief’ is often used in modern philosophy. Although one may decide to use these terms synonymously, there is in general an important distinction between the notions of judgement and belief. Judgement, like assertion, is primarily an act, and the act of judgement is an all or nothing affair: judgemental or assertive force does not have degrees. Belief is primarily a mental state, and is generally understood to have degrees; belief in this sense is conviction, not judgement.

Assertion and judgement, though, need not go together (2′). Although an assertion is generally taken to be an announcement of a judgement made, the person who is making the assertion may not have made the relevant judgement. We are capable of lying (2′a). A lie is an assertion with the intention to mislead the hearer, in such a way that the hearer is to understand that the speaker believes or judges what he asserts, whereas the speaker in fact believes or judges the proposition he asserts to be false. An assertion may also mislead the hearer in another way: by putting forward a declarative sentence with unqualified assertive force, the asserter presents himself as having a ground for his assertion (see 4).
If the asserter does not have such a ground, his assertion is misleading (2′b) (independently of the question whether the asserter had the intention to mislead or not), although we do not call it a lie when the asserter believes what he asserts. When President George W. Bush asserted that there were weapons of mass destruction in Iraq, and used the assertion as a premise in the argument for attacking Iraq, he sincerely believed, I suppose, that there were such weapons; this means that he cannot be accused of lying. Still, the assertion was misleading, because he did not have grounds for his assertion. The only ‘ground’ that Bush had for his assertion, I suppose, was his strong conviction of the truth of what he asserted; that is, his assertion was the expression of a prejudice. Strong convictions do not seem to be the right type of ground for assertions that do not have convictions as their subject. A theory of assertion thus has to say something about what may count as a ground for a given assertion.

(3) One of the most important characteristics of assertions is that they may be correct or incorrect. Many philosophers explain the correctness of an assertion in terms of the truth of the proposition contained in the assertion, and it is therefore common to speak of the truth or falsity of an assertion. There are, though, other ways to explain the correctness of an assertion, as will become clear in this paper. In general, one may say that a theory of assertion has to give an account of the relation between the correctness (or incorrectness) of an assertion and the truth (or falsity) of the proposition asserted to be true in the assertion (3a). Further, it seems that there are other speech acts whose product may be correct or incorrect (3b). One may think of a speech act in which a guess is expressed, or in which a suggestion is made. A theory of assertion has to give
an account of the relation between assertion and these other speech acts: is assertion a genus containing the speech acts just mentioned as species, or is assertion related in a different way to these other speech acts? The characteristic of being correct or incorrect also holds for the mental counterpart of assertion, that is, for judgement. Judgement, though, is not up to us, precisely because of its being correct or incorrect. When I look out of the window, I cannot help but judge that it is raining. I could have decided not to look out of the window, and in that case I, perhaps, would not have made the same judgement. So there is a sense in which we can indirectly influence our judgements. Given, though, that I have been looking out of the window, I am no longer free to judge that it is not raining. In contrast, I am free to assert or not to assert that it is raining (see 1c). We may express the point thus: as a speech act, assertion is up to us; as an expression of our judgement, assertion is not completely up to us. If I may, for the moment, use the metaphor of aiming at truth, introduced by Bernard Williams, one may say that insofar as we aim at truth, assertion is not up to us (3c).

Two final points relating to (3): assertions are related to other assertions, especially if they are made by the same agent. If the asserter realizes that he has asserted that S, and that he has also asserted that it is not the case that S, he has to withdraw at least one of his assertions, on pain of the accusation of being inconsistent or irrational (3d). This is precisely because assertions aim at truth, and because the assertions ‘S’ and ‘it is not the case that S’ cannot both be correct. Further, assertions can be used to offer a reason (3e).
If someone asks ‘How do you know that this bird is a tree-creeper?’, the answer might be: ‘It has the typical movements of a tree-creeper.’ The assertion made that the bird has the typical movements of a tree-creeper may thus function as a reason for the assertion ‘It is a tree-creeper’. Assertions function in our reasoning not only as conclusions, but also as premises on which we base our conclusions.

(4) Under what condition is one entitled to assert? Or, what is the pragmatic rule for assertion? Philosophers have given four different types of answer to this question. One may say that one is entitled to assert that S if and only if (i) one believes that S; (ii) one has a justification for the truth of S; (iii) it is true that S; or (iv) one knows that S. If one defends the thesis that one is entitled to assert that S if and only if one believes that S, one takes belief to be a necessary and sufficient condition for entitlement to assert. Entitlement to assert is to be understood exclusively in a cognitive sense: one may thus be entitled to assert that S, while it is inappropriate to make the assertion because it is contrary to etiquette, or because it is wrong for ethical reasons. If one holds that someone is entitled to assert that S if and only if he knows that S, one defends a knowledge account of assertion, and knowledge may be called the norm for assertion.3

There are some linguistic facts that may help us decide the question under what condition one is entitled to assert. When an assertion is put forward, an interlocutor has a right to ask ‘How do you know?’ (4a). The question presupposes that the asserter knows what he asserts, and the interlocutor seems to ask how the asserter has obtained his knowledge, or what his grounds are for making the assertion. The interlocutor expects that the asserter knows what he asserts, and demands that he give grounds for his assertion.
The Moorean paradoxes (4b) also show something about the condition 3 “Something is a norm, or a concept is normative, if it involves some form of evaluation or appraisal, or some standard of correctness” (Engel 2002, 131).
under which one is entitled to assert. Each of the sentences ‘It rains, but I do not believe it.’, ‘It rains, but I have no evidence for it.’, and ‘It rains, but I do not know it.’ is paradoxical only when uttered with assertive force. This seems to imply that one is entitled to assert only if one believes and knows what one asserts, and if one has evidence for what one asserts. Points (4a) and (4b) give us reason to think that knowledge is the norm for assertion, and quite a few philosophers defend a knowledge account of assertion, although recently important arguments have been raised against such an account. In the introduction I already mentioned that the constructivist account of assertion is a grounding account of assertion. This account has some similarities with knowledge accounts of assertion, and it therefore remains to be seen how the grounding account will answer the critique raised against knowledge accounts in general.

3 Arguments against knowledge accounts of assertion

A knowledge account of assertion has it that one is entitled to assert that S if and only if one knows that S, where entitlement is understood in the exclusively cognitive sense explained above. On a standard account of knowledge this implies that one is entitled to assert that S if and only if (1) one believes that S is true, (2) S is true, where S’s being true need not be accessible to the asserter, and (3) one has a justification for S. Some philosophers, though, think that knowledge is too severe a norm for assertion. A knowledge account of assertion can be criticized from three different points of view: a belief account, a truth account, and a justification account of assertion. Some philosophers say that one is entitled to assert that S if and only if one believes that S.
By making the assertion that S one represents oneself as believing that S, as Davidson says.4 A more developed account of assertion in terms of belief can be found in Bernard Williams’ Truth and Truthfulness (2002, Chap. 4). For Williams, one of the central requirements for assertion is sincerity, which at the most basic level is simply openness: say what you believe.5 Williams gives an account of assertion in terms of the expression of belief. If one understands assertion as the expression of belief, one needs to take special account of lying; not every assertion is an expression of belief. The central question that needs to be answered on such an account is: What is belief? Williams certainly does not neglect that question, and I will come back to his account of belief in the final section, as well as to the question whether sincerity or openness is a specific norm for assertion. Williams’ account is supported by one of the characteristics of assertion: standardly, assertion is understood to be the announcement of a judgement made or a belief. According to Williams, knowledge is too strong a norm for assertion: the asserter should be in a position to apply the norm for assertion effectively, and if the norm is knowledge, he may not be in a position to do so, because he may reasonably think that he knows that S while not knowing that S. The asserter is not to be

4 Davidson leaves open the possibility that there is more at stake than belief: “Someone who makes an assertion represents himself as believing what he says, and perhaps as being justified in his belief” (Davidson 1984, 268).

5 “Sincerity at the most basic level is simply openness, a lack of inhibition” (Williams 2002, 75).
criticized for his assertion, when he reasonably thinks that he knows what he asserts (Williams 2002, 76, 77). It should be noted that for Williams assertion is, perhaps, a broader notion than it is for those who defend a knowledge account of assertion: “merely telling someone that P, without any special entitlement … is surely already assertion,” he says (Williams 2002, 77).

Other philosophers have defended a truth account of assertion. According to John Searle, “the [illocutionary] point or purpose of the members of the assertive class is to commit the speaker (in varying degrees) to something’s being the case, to the truth of the expressed proposition” (Searle 1979, 12). The illocutionary point, according to Searle, is one of the features of illocutionary force, and Frege’s assertion sign marks the illocutionary point of all members of the assertive class, that is, of all assertives. Searle’s assertive class includes speech acts that are not assertions: suggesting and putting forward as a hypothesis. Searle understands assertives also as expressions of belief, where belief may go down to degree zero. ‘Belief’ is thus to be understood as conviction, which comes in degrees. It is difficult to understand why he uses Frege’s assertion sign in front of hypotheses; the acknowledgement of degrees of assertive force is not compatible with Frege’s account of judgement and assertion. A truth account of assertion is also defended by Matt Weiner. According to Weiner, assertion is a genus comprising “species such as reports, predictions, arguments, reminders, and speculations” (Weiner 2005, 229). If ‘assertion’ is understood in such a broad sense, knowledge is definitely too strong a norm for assertion. One of the questions that needs to be raised in this paper is how broadly the concept of assertion is to be understood. How we choose our terminology is not an arbitrary matter.
Somehow, we want to be able to blame a person whose assertions are expressions of mere guesses, that is, a norm for assertion seems to be violated in such cases. According to Weiner, the norm for the different species just mentioned is truth, and not knowledge, which point he elucidates by giving two examples. One example concerns a prediction; the other concerns the detective work of Sherlock Holmes. Suppose that Holmes and Doctor Watson are brought to a crime scene. Holmes looks carefully at the scene, and says: “This is the work of Professor Moriarty! It has the mark of his fiendish genius” (Weiner 2005, 231). According to Weiner, Holmes’ remark is an assertion, and indeed there do not seem to be signs that the remark is not to be understood as an assertion. When the assertion turns out to be true, Weiner says, there is no way that the assertion can be called improper, although Holmes seemingly did not know that it was the work of Moriarty, for he based his assertion not on evidence, but on ‘his sense of what Moriarty’s crimes are like’. When asked ‘How do you know?’, Holmes is not able to give grounds for his assertion. I will give an evaluation of Weiner’s example in the last section. For now, it is important to understand that the truth account has something important to say. If someone tells us the truth without having knowledge of this truth and without being able to give a justification, he is generally not blamed for this, and may even be praised for it. Any knowledge or justification account of assertion has to explain this intuition. Besides, both the belief and the truth account of assertion are supported by the fact that entitlements to assert can be conferred upon others. This may be problematic for those who defend a knowledge or justification account of assertion, because the asserter often does not make explicit the ground for his assertion.
Recently, knowledge accounts of assertion have been criticized from a different angle. Jonathan Kvanvig has defended the claim that it is justification rather than knowledge or truth that is the norm for assertion (Kvanvig 2009). He argues that it is important to make a distinction between two ways in which one may have to take back one’s assertion: the agent may have to take back his assertion, because its content is shown to be false, in which case he is not to blame for his act of asserting, or he may have to take back his assertion because there was something wrong with the very act of asserting. In the latter case, one is blamed for making the speech act of assertion if it turns out that the asserter did not have any ground for his assertion. The norm for the act of assertion, according to Kvanvig, is not knowledge, but epistemic justification. The point of the question ‘How do you know?’ when an agent has made an assertion, Kvanvig says, is to find out what the reasons are for the assertion, and when the reasons are given, the questioner is satisfied, because his question is answered. Such reasons may not be sufficient, though, for knowledge. What counts against both a belief and a knowledge account of assertion, Kvanvig says, is that we sometimes properly assert what we don’t believe, as in the case of the teacher who is required to teach certain material: “we assert things we don’t believe because of a social role we inhabit” (Idem, p. 141). Jennifer Lackey has called such assertions selfless assertions (Lackey 2007, 598 ff; cf. Douven 2006, 461). As a teacher, one may fully assert as biological facts the central theses of evolutionary theory, although one’s personal religion makes one believe otherwise. Essential to selfless assertions is that the subject makes the assertion because it is best supported by the evidence (Lackey 2007, 603), and that he does not believe personally what he asserts “for purely non-epistemic reasons” (Lackey 2007, 599). 
These assertions are not insincere, and the speaker is not lying, it is said. Kvanvig, Lackey, and Douven (2006, 460) deny that belief is required as a norm for assertion. In selfless assertions, the speaker is able to answer the question ‘How do you know?’, because he can give grounds for his assertion, that is, because it is rational for him to believe what he asserts. And this in no way implies that the speaker knows or believes what he asserts. It thus seems that justification is the only norm for assertion. These selfless assertions stand in tension with characteristic (1f) of assertions, which says that an assertion is always made from the perspective of the asserter. The perspective from which the assertion is made seems not to be a personal one, but rather that of a role to which the asserter has committed himself. A defender of a knowledge account of assertion has to explain this phenomenon of selfless assertions. In order to understand in what sense the grounding account of assertion proposed in this paper differs from justification accounts of assertion as proposed by Kvanvig and Lackey, an account of selfless assertions needs to be developed.

Lackey and Douven argue that defenders of a knowledge or a truth account of assertion have to acknowledge two forms of propriety with respect to assertion. Matt Weiner has indeed defended such a distinction (Weiner 2005, 229, following Keith DeRose in this respect). The primary propriety is determined by whether the act actually conforms to the norm, whether truth or knowledge, while secondary propriety is determined by whether the asserter reasonably believes that the act conforms to the norm. It does not seem, though, that such a distinction has any value, because the agent is fully entitled to assert if it is reasonable for him to believe what he asserts
(Lackey 2007, 608; cf. Douven 2006, 476ff). There is no way to blame him for his act of assertion, it is said, although the content of his assertion may be criticised in the way Kvanvig has argued. A grounding account of assertion has to show in what sense it agrees with and in what sense it differs from knowledge and justification accounts of assertion as proposed in recent years. In order to be able to make such a comparison I need to explain what the concepts grounding, proof, proposition, truth, judgement, and knowledge amount to from a constructivist point of view.

4 Grounding, knowledge and truth in constructive type theory

In this paper I propose the thesis that a grounding account of assertion suits constructive type theory. In mathematics, a judgement is grounded by an act of demonstration resulting in a theorem, or a judgement is grounded by an act of immediate insight, that is, an act of understanding, resulting in an axiom, such as ‘0 is a natural number’ (0 : N), ‘A is a proposition’ (A : prop), or ‘propositions form a type’ (prop : type). A judgement is thus grounded by a cognitive act that may result in a theorem or axiom; it is the cognitive act that does the grounding, where a cognitive act is either an act of demonstration or an act of immediate insight.6 In an act of demonstration for a judgement of the form ‘A is true’, a proof object a is constructed for a certain proposition A, resulting in the theorem a : A. In mathematics, the proof can be named and can itself be treated mathematically, and is therefore a proof object (cf. Sundholm 1994, 121). Outside mathematics, one may say, for example, that the (canonical) proof object for the proposition that the ball is red is the red-moment of the ball, which is constructed in an act of perception (the notion of cognitive act thus becomes broader).
On a constructivist account, one is entitled to assert that A is true only if one has constructed a proof object a for the proposition A, and recognizes a to be a proof for A, thereby demonstrating that A is true. The judgement ‘A is true’ is thus shorthand for the judgement a : A. The proof object can be considered as a truth-maker for the proposition A, where A is the truth-bearer (cf. Sundholm 1994, 117ff). What counts as a truth-maker for A is determined by the proposition A, which is understood as a set of proof objects. There is thus an internal relation between the proof object a and A itself. Whether there exists a proof object for A is not determined by the proposition A alone, unless A is a tautology. On a constructivist account, one is entitled to assert that there exists a proof object or truth-maker for A only if one has constructed a proof object a for A. Existence of a truth-maker cannot be identified with the existential quantifier. If one were to take ‘there exists a proof object for the proposition A’ itself to be a proposition, the explanation of the truth of propositions would be circular. ‘There exists a proof object for A’ is a judgement, and ‘existence’ is to be understood as constructibility. A is true precisely means that a proof object a for A can be constructed, that is, that there exists a proof object for A. There is thus also an internal relation between the proof object a for the proposition A and the truth of A.

6 I have developed a theory of the cognitive act in my (2010).
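The judgement a : A, and the internal relation between a proposition and its proof objects, can be made concrete in a modern proof assistant. The following sketch uses Lean, whose kernel is a type theory in the Martin-Löf tradition; the propositions and proof-object names (A, B, ha, hb) are hypothetical, chosen only for illustration.

```lean
-- Propositions are types; a proof object is a term inhabiting that type.
-- Given assumed proof objects ha : A and hb : B, the pair ⟨ha, hb⟩ is
-- constructed as a proof object for the conjunction A ∧ B.
example (A B : Prop) (ha : A) (hb : B) : A ∧ B := ⟨ha, hb⟩

-- The axiomatic judgement ‘0 is a natural number’ (0 : N in the
-- paper's notation), written in Lean's notation:
#check (0 : Nat)
```

What counts as a proof object for A ∧ B is determined by the proposition itself: the type checker accepts the pair only because the proposition, read as a type, specifies that its canonical proofs are pairs of proofs of the conjuncts.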
On a constructivist account, a judging agent has to understand what the proposition A is, in order to recognize an object as a proof object for A. The meaning of a proposition A is given in terms of its canonical proofs, and in order that a certain object may count as a proof for A, that object needs to be either a canonical proof for A or a non-canonical proof, where a non-canonical proof for A consists in a method to obtain a canonical proof for A. As Dummett has put it, “the meaning alone determines whether or not something is a ground for accepting the sentence” (Dummett 1976, 88). The difference between a constructivist and a classical logician consists in the different meaning explanations that are given, for example, for negation. What the classical logician accepts as a condition under which ¬A is true is not the same as what the constructivist accepts as a condition under which ¬A is true. They thus give a different meaning to the proposition ¬A, but this is not to imply that we arbitrarily start with a supposition about the meaning of negation. There is a true dispute about meaning, which shows itself in the fact that the intuitionist denies that the classical logician is entitled to make certain assertions. And this means that what the one accepts as a proof system, the other does not accept as such. A proof system is acceptable to me if its axioms are the result of an act of immediate insight, and the inference(-mode)s are valid, that is, “when a chain of evidence-preserving steps … can be given, which links premises and conclusion” (Sundholm 2004, 455), that is, when each step is justified or grounded by an act of immediate insight. As soon as I understand that the axioms are correct, I understand that the result of the next step has to be correct and justified.
For the intuitionist, a proof system that uses elimination of double negation as a rule of inference does not guarantee that each step is insight preserving; the proof system is therefore not acceptable to him, and can therefore hardly be called a proof system. Because the axioms are the result of acts of immediate insight, the acts of inference departing from these axioms are insight preserving, and are therefore acts of demonstration. This means that what is shown to be correct within a proof system that is acceptable to me is justified and known, period, and not merely known, justified, or correct relative to the proof system. It should be noted, though, that from a constructivist point of view justification and knowledge are justification and knowledge from a first-person perspective, because knowledge is grounded in a first-person cognitive act (see my 2010). It is precisely for this reason that knowledge is fallible (see below).

What precisely does a grounding account of assertion amount to? For common assertions of the form ‘A is true’, one may say that one is entitled to assert that A is true only if one has constructed a proof object a for A in an act of demonstration. The act of demonstration thereby grounds the assertion. A similar account can be given for assertions of the form ‘A is false’ and for conditional assertions (for the condition under which one is entitled to assert that A is false, see Sect. 5). There are also assertions, though, that cannot be captured by one of these forms, such as ‘A is a proposition (that is, a set)’, or ‘sets form a type’. One is entitled to assert that A is a set if and only if one has understood what a set is, and has understood that A is such an object. In these cases, one is entitled to assert iff the assertion is grounded by an act of immediate insight, an act of understanding.
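The disagreement over double negation can be illustrated in Lean (a sketch only; Lean provides classical axioms as an optional extra, so the classically valid step must be invoked explicitly). Constructively, ¬A is explained as A → False, so a proof object for ¬¬A is a function, not yet a proof object for A itself:

```lean
-- Constructively unproblematic: from a proof object ha : A we build a
-- proof object for ¬¬A, i.e. a function taking any hna : ¬A to False.
example (A : Prop) (ha : A) : ¬¬A := fun hna => hna ha

-- The converse, elimination of double negation, has no constructive
-- proof object; in Lean it is only available via a classical axiom.
example (A : Prop) (hnna : ¬¬A) : A := Classical.byContradiction hnna
```

The second step is exactly the one the intuitionist rejects: the appeal to Classical.byContradiction is not grounded by an act of immediate insight, so for the intuitionist a system licensing it is not a proof system in the sense explained above.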
In a general sense, one may say that one is entitled to assert precisely if the assertion is grounded by a cognitive act (or ‘act of knowing’, Martin-Löf 1991, 144, 146, 1996, 1). Such a cognitive act is either an act of demonstration or an act of immediate insight, and one may extend the concept of cognitive
act to non-mathematical acts, such as the act of perception. As we will see below, a cognitive act is precisely what makes an assertion justified, and it is for this reason that a grounding account of assertion is a justification account of assertion. How are the notions of act of demonstration and of cognitive act related to the concept of knowledge? According to Martin-Löf, knowledge can be characterized as justified judgement (Martin-Löf 1998, 110): knowledge is a judgement that is grounded or justified through an act of demonstration, or an act of immediate insight. A judgement is thus grounded by a cognitive act, whether that act is based on other judgements made or not. The distinction between act and product, introduced in Sect. 2, applies both to knowledge and judgement. An act of judgement results in a judgement made; in the same way, an act of knowing results in knowledge as product, the justified judgement in the explanation given above. Knowledge as product, or a piece of knowledge (‘eine Erkenntnis’, Martin-Löf 1996, 20), is an abstract object dependent for its existence on an act of knowing. Before one makes a judgement, one has to understand the (possible) judgement:7 “a judgement is defined by laying down what it is that you must know in order to have the right to make it” (Martin-Löf 1998, 108). This means, in case the judgement has the form ‘A is true’, that one has to understand what kind of proof object has to be constructed for A, in order to be entitled to make the judgement in question. Judgement is thus a notion that is explained in epistemic terms. And assertion can be explained on a similar basis, that is, an assertion is defined by laying down what it is that you must do in order to have the right to make it: one has to perform a cognitive act, such as an act of demonstration for a judgement of the form ‘A is true’, in which a proof object a for the proposition A is constructed, resulting in knowledge that A is true.
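As a sketch of this act/product distinction in Lean: an act of demonstration is carried out by constructing and naming a proof object, and the named theorem (the product) can then be reused to ground further judgements. The theorem name below is of course hypothetical.

```lean
-- An act of demonstration: constructing a proof object for the
-- proposition 2 + 2 = 4. `rfl` is recognized as a (canonical) proof
-- because both sides compute to the same numeral.
theorem two_plus_two : 2 + 2 = 4 := rfl

-- The product of the act, the judgement made, can itself ground
-- further acts of demonstration:
example : 2 + 2 = 4 ∧ 1 + 1 = 2 := ⟨two_plus_two, rfl⟩
```

The type ascription in `theorem two_plus_two : 2 + 2 = 4` plays the role of laying down in advance what kind of proof object must be constructed; the body after `:=` is the construction that entitles one to the judgement.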
It is in this sense that the grounding account of assertion can be understood as a knowledge account of assertion, although, as we will see below, the constructivist concept of knowledge cannot be identified with the concept of knowledge that is in use in standard knowledge accounts of assertion.

It is now possible to explain the notion of judgemental truth or correctness, a notion which will turn out to be relevant to the question of what correctness of an assertion means. That a judgement is correct means that it can be grounded or justified (“demonstrated”, “made evident”, Martin-Löf 1998, 109); that is, a judgement is correct means that it is justifiable (“demonstrable”, “evidenceable”, Idem). Correctness of a judgement is not a primitive notion: it is defined by means of the cognitive term ‘being grounded’. Correctness of a judgement is not to be identified with truth of a proposition, but one may say that if and only if a proof object for A is constructible, that is, iff A is true, then the judgement A is true is justifiable, that is, correct.

Although knowledge implies truth in the sense that a judgement that is known is also correct, knowledge does not imply infallible or real truth, according to Martin-Löf: “[A] demonstration purports to make something evident to us, and it is the best guarantee that we have, but it is not infallible” (Martin-Löf 1998, 110). Our acts of demonstration are fallible in the sense that what now counts as a demonstrated judgement, what now counts as a theorem, may in the future turn out to be inconsistent with

7 One may thus speak of a judgement before it is actually judged. I have called this the assertion candidate in my (2007); see also the next section.
other judgements that count as theorems, in which case we have to withdraw one of our knowledge claims (Martin-Löf 1991, 144). And because an act of demonstration, and the act of knowing in general, is fallible, its product, a piece of knowledge, is fallible, too. If we have withdrawn a knowledge claim, we have to admit that what then seemed to us to be an act of demonstration is no longer considered to be an act of demonstration for the relevant judgement. On a constructivist account, knowledge does not imply infallible truth, and in this sense it differs from standard accounts of knowledge.

The constructivist account of assertion differs from standard knowledge accounts of assertion because the constructivist explains knowledge in a different way. Modern criticism of the explanation of knowledge as justified true belief is directed at the notion of justification, whereas the constructivist precisely keeps the ‘justification’ terminology. On a constructivist account, justification is understood in traditional internalist terms, and there is an internal relation between the notions of justification and knowledge, and between justification and judgemental correctness.

Further, knowledge is explained here in terms of ‘judgement’, not in terms of ‘belief’. ‘Belief’ is an ambiguous term. If it is understood as judgement in the way it is explained above, there is no difference between the constructivist and the standard account of knowledge as far as this term is concerned. Generally, though, ‘belief’ in the explanation of knowledge is understood as a mental state, a certain degree of conviction, whereas the judgemental act is an all or nothing affair (see Sect. 6).
Finally, the notion of infallible truth, that is, truth that transcends the individual judger, is not part of the constructivist account of knowledge: knowledge implies judgemental correctness, but not infallible truth.8

5 A theory of assertion for constructive type theory

(1) The distinction between act of judgement, judgement made, and possible judgement that was proposed in Sect. 4 can also be made for assertion (1a): there is the act of assertion, the assertion made resulting from such an act, and the possible assertion or assertion candidate. The assertion candidate is an epistemic notion, as it is explained in terms of what one must know in order to be entitled to make the assertion, and is not to be confused with the non-epistemic notion of proposition, which is understood as a set: the former has the form of a declarative (it rains, or that it rains is true), whereas the proposition has the form of a that-clause (that it rains), a form that can be used to name an object, in contrast to the declarative sentence (1b). The proposition does not have the right form to be asserted; one needs to add the indicative mood, the is true part, to the that-clause in order to obtain an assertible form. The question to what extent assertion is up to the asserter (1c) I answer together with the question to what extent assertion is not up to us (see point 3c below).

When we use an occurrence of a declarative sentence to make an assertion, the (occurrence of the) declarative has assertive force. The assertion sign in front of

8 A full comparison between the constructivist account of knowledge and the standard account of knowledge I have given in my (2010).
a declarative sentence shows that the declarative has been used with assertive force (1d). In constructive type theory many judgements have the form A is true, but the is true-part of these judgements is not to be understood as a sign of assertive force: judgements of the form A is true may function as antecedent of a hypothetical judgement, which means that there are contexts in which the judgement A is true is not asserted. Here, ‘judgement’ is to be understood as possible judgement, or what I have called the assertion candidate.

Besides, constructive type theory acknowledges judgements of the form A is false, and these judgements may have assertive force, too. One is entitled to utter A is false with assertive force if and only if one has constructed a refutation of A. A proposition A is false if there exists a disproof, or refutation, of A. And a disproof of A is a hypothetical proof of absurdity from A, that is, a function which takes a proof of A into a proof of absurdity. (‘Proof’ is to be understood in the sense of proof object.)

The assertive force that is attached to a judgement of the form A is false when the candidate is used to make an assertion does not differ from the assertive force that is attached to the judgement A is true when the latter is used to make an assertion. Although these are different judgemental forms, there is only one type of assertive force. Each of the judgements A is true and A is false is explained by what one has to know in order to be entitled to make it. The assertive force attached to each of these judgements involves a claim that one is entitled to make the relevant judgement. The assertive force is thus the same for each of these judgements, although the sort of proof object that one claims to possess is a different one.
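The explanation of A is false just given, on which a disproof of A is a function from proofs of A to proofs of absurdity, corresponds to the standard constructive treatment of negation. A minimal Lean sketch, my own illustration rather than the author's notation:

```lean
-- A disproof (refutation) of A is a hypothetical proof of absurdity
-- from A: a function taking any proof object of A to a proof of ⊥.
-- In Lean, ¬A unfolds to exactly this function type, A → False.
def Refutation (A : Prop) : Prop := A → False

-- From a proof object t : A together with a refutation f of A, one
-- would obtain a proof of absurdity; so A cannot have both.
example (A : Prop) (t : A) (f : Refutation A) : False := f t
```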
Ranta (1994, 25) has given a different explanation of the assertion sign in CTT: he identifies the assertion sign with the is true-part of a judgement, which implies that “assertions will also occur as hypotheses”, where a hypothesis functions as antecedent of a hypothetical judgement (Ranta 1994, 26). For Ranta, the assertion sign, which he identifies with the is true-part, is a sign of indicative mood. Because the antecedent of a hypothetical judgement has indicative mood, the assertion sign is present in the hypothesis, according to him. My answer to Ranta’s proposal is, first, that the is true-part does not seem to function as a unique sign of indicative mood: a judgement of the form A is false also has indicative mood, but it lacks the is true-part; so Ranta has to acknowledge different signs of indicative mood. Second, one may use the terms ‘assertion’ and ‘assertive force’ with a new meaning, but then one’s logical system is still in need of the notions that are generally called ‘assertion’ and ‘assertive force’. By making a distinction between assertions and assertion candidates one is able to say that not all declaratives have assertive force, although they might all be in the indicative or declarative mood (see my 2007).

Although many judgements have the form A is true in CTT, this is not a basic form of judgement (1e′). Apart from the fact that there are also judgements of the form A is false, and hypothetical judgements that have the form B is true (A is true), the judgemental form A is true is shorthand for a form that makes the proof object explicit, a : A, which is indeed one of the basic judgemental forms in CTT. The important point for the topic of assertion is that there are different types of judgements or assertions, but that they all have the assertive or judgemental force in common, when they are actually judged or asserted.

In CTT, there is an important relation between the notions declarative sentence and assertion (1e).
The meaning of the declarative sentence is given in terms of its
assertion condition, that is, in terms of what one has to know in order to be entitled to assert the declarative. This is precisely the way that the notion of assertion candidate or possible judgement is explained, and it can therefore be argued that the assertion candidate is the meaning of the declarative sentence (see my 2007). A presupposition of this semantics for the declarative sentence is that the declarative is standardly used for making an assertion, and that one needs special signs when the declarative is used as the antecedent in a conditional sentence or as an example. In these deviant uses, the declarative is not uttered with assertive force, but it does express an assertion candidate.

Does the assertion sign precede ‘assertions’ made on stage (1e′)? Dummett has given an affirmative answer to this question (Dummett 1973, 311). It might seem that without such a sign the public will not be able to understand that Desdemona is making an assertion by the utterance of the declarative ‘I do love thee.’ Othello takes her to be lying, and that presupposes that she is making an assertion. The actress who plays Desdemona is not really asserting, though, that she loves the person addressed. The assertion on stage therefore does not seem to be a real assertion, and the assertive force seems to be absent.

Is it possible to explain the situation on stage by means of the notion of assertion candidate? When we hear the utterance of a declarative, we apprehend the assertion candidate. Because we know that the utterance is standardly used for making an assertion, the utterance on stage can be understood as an assertion in that context, although the assertive force is absent. This can at most be a partial answer to the problem. For an utterance of the declarative can be used as an assertion on stage, but an occurrence of the same declarative can be used to ask a question on stage, or it can be used as the antecedent of a conditional assertion on stage.
The assertion on stage must thus be more than the expression of an assertion candidate. The point, I guess, is rather that everything that happens on stage is bracketed by the stage-setting. The act of killing within a play and the act of asserting on stage are acts modified by the stage setting. The phrase ‘asserting on stage’ is like the phrase ‘painted landscape’. The word ‘painted’ may be used as a modifying term, in which case it modifies the meaning of the term ‘landscape’ in such a way that the term no longer refers to a real landscape in this context. The phrase ‘on stage’ can be understood as a modifying term, too: the term ‘killing’ in ‘killing on stage’ does not refer to the activity of murdering, but to murder-on-stage, a mock murder. In an analysis of assertions-on-stage, these assertions are not preceded by a straightforward assertion sign; the most one can say is that they are preceded by the assertion sign-on-stage. This sign is not a special case of the assertion sign, just as a painted lion is not a special kind of lion.

In what sense can we say that the assertion is made from the perspective of the asserter (1f)? The agent is entitled to assert that S if he has done what the explanation of the assertion candidate demands of him in order to be entitled to make the assertion. In case the judgement has the form ‘A is true’, he needs to have obtained the knowledge that A is true, that is, he needs to have constructed a proof object for the proposition A in an act of demonstration, which means that he considers the constructed object as a proof for A. The act of demonstration is always an act of demonstration from the perspective of the asserter. The act of demonstration or cognitive act purports to make the judgement justified; it is a fallible act, as we have seen in the previous section, and may not be considered an act of demonstration when looked upon from a new perspective. At a later time, one may no longer call the act an ‘act of
demonstration’ or a ‘cognitive act’, and the constructed object is then no longer considered to be a proof for the proposition A. The cognitive act is essentially a cognitive act from a first-person perspective (see my 2010).

Can a constructivist account of assertion explain that entitlements to assert can be conferred upon others (1g)? The distinction between canonical and non-canonical proof objects may be of help here (see Sect. 4). On a constructivist account, a proposition is explained in terms of its canonical proof objects. For example, the proposition A&B is explained in terms of ordered pairs consisting of a proof object of A and a proof object of B. A non-canonical proof object is a method to obtain a canonical proof object. When the disciples told Thomas that they had seen the Lord after Jesus had died, we may say that he was already entitled to judge that Jesus was among them, because he had obtained a non-canonical proof object for the relevant proposition. Doubting Thomas apparently thought that he was entitled to judge that Jesus was among them only if he had obtained what is considered to be a canonical proof for the proposition, by seeing Jesus with the prints of the nails in his hands. If a proof object, whether canonical or non-canonical, can pass from one person to another, entitlements to assert can be conferred upon others.

The question whether apprehension of the meaning of a declarative sentence S by a speaker is assumed when we hear him asserting that S is to be answered affirmatively (1h′). It is true for all assertions that in order to make the assertion that S one is at least expected to have apprehended the assertion candidate S. It is in this sense that all assertions can be considered as answers to a certain question (1h), namely the question ‘Is the assertion candidate true (correct)?’ The question whether every assertion or judgement has a presupposition is to be answered negatively, though (1h″).
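The distinction above between canonical and non-canonical proof objects can be sketched in Lean (my illustration, under the usual propositions-as-types reading): a canonical proof of A ∧ B is a pair, while a non-canonical proof is any method, such as an unevaluated application, that computes to such a pair.

```lean
-- Canonical proof object for A ∧ B: the ordered pair ⟨a, b⟩.
example (A B : Prop) (a : A) (b : B) : A ∧ B := ⟨a, b⟩

-- Non-canonical proof object: a method (here a redex) that evaluates
-- to the canonical pair, but is not itself in canonical form.
example (A B : Prop) (a : A) (b : B) : A ∧ B := (fun p : A ∧ B => p) ⟨a, b⟩
```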
In CTT, the judgement A : proposition has to be correct in order that a meaning explanation can be given for the judgement a : A; that is, the judgement A : proposition is a presupposition for the judgement a : A, and thus for the judgement A is true. Propositions or sets are to be considered as types themselves, which means that a further presupposition is needed for judgements of the form A : proposition, namely that propositions (or sets) form a type, that is, prop : type or set : type. In type theory, the judgement set : type does not have a presupposition. The judgement cannot be made without understanding what a type is, but the definition of what a type is, is not given in a judgement, and therefore cannot be considered as a presupposition.9 To use a phrase introduced by Collingwood, we may call the judgement set : type an absolute presupposition.

(2) On what ground can one defend a judgement/assertion parallel? It is not a sufficient answer that assertion is the announcement of a judgement made, or that judgement is the interiorization of assertion, for we need an explanation of why each of these notions can be understood in terms of the other. One of the presuppositions of the parallel is that judgement, like assertion, is understood as linguistically structured. Both the assertion and the judgement that snow is white standardly use an occurrence of the declarative ‘Snow is white’, either aloud or in silence. Because the meaning of the declarative is explained as what one has to know in order to be entitled to assert the declarative, an utterance of the declarative in silence with judgemental force amounts

9 A type is defined by what it means to be an object of that type (the criterion of application), and by what it means for two objects of that type to be identical (the identity criterion).
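The chain of presuppositions described above, on which a : A presupposes A : prop, which in turn presupposes prop : type, has a direct counterpart in Lean, where Prop is itself a type. This is my own illustration, not the author's formalism:

```lean
-- The basic judgemental form a : A presupposes A : prop, which in
-- turn presupposes prop : type. The corresponding Lean judgements
-- all hold:
example : Prop := 2 + 2 = 4        -- A : prop
example : 2 + 2 = 4 := rfl         -- a : A
#check (Prop : Type)               -- prop : type
```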
to the same act as an assertion, except that it is uttered in silence. As Plato says in the Theaetetus (190a), judgement is an assertion which is not addressed to another person or spoken aloud, but silently addressed to oneself.10 The dialogical aspect of assertion is internalized to a dialogue with oneself. This explains why a silent judgement is in need of a ground as much as the overt assertion is.

It is better to speak of the assertion/judgement parallel than of the assertion/belief parallel: in the case of belief, a ground for one’s belief is not demanded. The most important reason, though, to speak of the assertion/judgement parallel rather than the assertion/belief parallel is that both assertion and judgement are primarily acts: a speech act in the former case, a mental act in the latter. Although the assertion/judgement parallel may at first sight seem to confirm a belief account of assertion, this need not be the case if the notions judgement and belief are held apart, and the (possible) judgement is explained in cognitive terms.

Conceptual space is needed between the notions assertion and judgement because we can lie, and because one can assert without having a justification for one’s assertion (2′). If someone lies, he asserts that S while judging that it is not the case that S, with the intention of misleading the hearer into thinking that the asserter does judge that S (2′a). If one asserts without having a ground for one’s assertion, the asserter need not lie (2′b). If the hearer finds out that the asserter did not have a ground for his assertion, the asserter’s trustworthiness will be diminished, though. The next time the US government puts forward a casus belli in an assertion, we will definitely make use of our right to ask ‘How do you know?’.

(3) What does it mean to say that an assertion is correct? In the last section we have seen that a judgement is correct if and only if it can be grounded or justified.

10 I have used Myles Burnyeat’s revision of Levett’s translation, apart from the fact that I have used ‘assertion’, where the Levett–Burnyeat translation has ‘statement’ (Burnyeat 1990, 323).
From a constructivist point of view, if a judgement or assertion is grounded, the assertion is correct. The distinction between an assertion being correct and its being grounded becomes fruitful when applied to the assertion of a third person, which includes assertions made by oneself in the past. Because I myself have grounded the assertion A is true, I can call the assertion A is true made by another agent correct, independently of whether he has a justification for his assertion. In the same way a declarative sentence, expressing an assertion candidate, can be called correct, although it is not actually used to make an assertion. An assertion is thus correct if and only if it can be justified.

What is the relation between the correctness of an assertion of the form A is true and the truth of the proposition A (3a)? An assertion of the form A is true is correct if and only if it can be justified, that is, if and only if it is possible to construct a proof object for A, which means that there exists a proof of A, that is, that the proposition A is true.

Is it true that there is a group of speech acts whose products are correct or incorrect (3b)? Can we speak of a correct guess and a correct suggestion in the sense of judgemental correctness? Because the declarative by means of which a suggestion or a guess is made expresses an assertion candidate, the suggestion or guess may be called correct insofar as the assertion candidate is correct. There is thus a group of
speech acts that have the characteristic of being correct or incorrect. The crucial question is now: how do these speech acts relate to the speech act of assertion? Is it true that there is a genus of assertives with different species such as assertion, expressing a guess, making a suggestion, and putting forward a hypothesis, as Searle has suggested? If the assertives form a genus they ought to have something in common, most likely a claim to correctness. Against this it may be said that putting forward a hypothesis does not involve a claim to correctness. The members of the class of assertives do have something in common, though: in each case an utterance of the declarative is standardly used to carry out the relevant speech act. Each of these acts thus expresses an assertion candidate, which is explained in terms of what one has to know in order to be entitled to assert it. The assertion candidate is explained in terms of the speech act of assertion. The speech act of assertion is thus a notion prior in the order of explanation to the other speech acts in this group.

The relation between the speech act of assertion and the other speech acts in this group is not a genus–species relation, for these speech acts are not explained in terms of the speech act of assertion, the genus, together with a specific difference. These other speech acts are rather to be understood as etiolations or modifications of the speech act of assertion. When someone expresses a guess, his speech act has some similarities with the speech act of assertion, but the speaker does not claim to have a ground for the assertion candidate expressed by the declarative that is used to make the guess. A sign is needed that the declarative is used in a deviant sense; one may, for example, add ‘I guess’ as a parenthesis.

Insofar as there is a parallel between assertion and judgement, we are not free to assert whatever we like (3c).
In order to be entitled to judge, one needs to have a ground for one’s judgement, and what counts as a ground for the judgement is not up to us. What may count as a ground is determined by the explanation of the judgement (see Sect. 4). The act of assertion is in one respect more free than the act of judgement (1e): we can lie to others in a way we cannot lie to ourselves.

If the asserter has asserted that A is true, and at some later time asserts that A is false, which is equivalent to the assertion ¬A is true, he has to withdraw one of his assertions (3d). Presupposing that A is a proposition, if the judgements A is true and A is false were both knowable, so would be the judgement ⊥ is true, because A is false means that there exists a hypothetical proof of absurdity from A. ⊥, like any proposition, is defined by its introduction rules, of which it has none; that is, for absurdity there is no canonical proof, which means that there cannot be a non-canonical proof for absurdity either. Therefore, absurdity cannot be known to be true (which is one of the laws of knowability, cf. Martin-Löf 1995, 194). And this means that the judgements A is true and A is false cannot both be knowable. Therefore, as soon as an asserter realizes that he has asserted both that A is true and that A is false, he will withdraw at least one of his assertions.

When an assertion is offered as a reason, it has the function of a premise in our reasoning (3e). The assertion made, rather than the act of assertion, functions as a reason, because premises are assertions made; the acts have only temporal existence. This notion of reason or ground is not identical with the notion of justification as it is introduced in Sect. 4. A judgement is grounded or justified if it is grounded by a cognitive act, which may be based upon other judgements made, but it need not be, for the cognitive act may be an act of immediate insight, which results in an axiom.
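The argument that A is true and A is false cannot both be knowable turns on ⊥ having no introduction rules. A small Lean sketch of the underlying logic, again my own illustration rather than the author's notation:

```lean
-- ⊥ is defined by its introduction rules, of which it has none. The
-- inductive type below, a hypothetical clone of Lean's own `False`,
-- declares no constructors, so no canonical (and hence no
-- non-canonical) proof object of it can ever be constructed.
inductive Absurdity : Prop

-- Were both A is true (t : A) and A is false (f : A → ⊥) grounded,
-- ⊥ is true would be grounded too, which is impossible:
example (A : Prop) (t : A) (f : A → False) : False := f t
```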
In a more restricted sense of ground or reason, the judgements made on which an act of demonstration is based may be called the reasons for the conclusion in the sense of 3e. These reasons are not to be confused with the proof objects for propositions.

(4) We have seen in Sect. 4 that within CTT the judgement is explained by what one has to know in order to be entitled to make the judgement in question. A knowledge account of assertion thus suits CTT, although on a constructivist account this means nothing more than that one’s assertion is grounded.11 Because knowledge is explained as justified judgement, it may also be said that one is entitled to make a judgement or assertion if and only if one has justified it.

An interlocutor has a right to ask ‘How do you know?’ when someone has made an assertion (4a). The questioner thus presupposes that the asserter knows what he asserts, and if the asserter gives a ground for his assertion, the ‘How do you know?’-question is answered; nothing further is asked for. This point shows that there is an internal relation between knowledge and justification, and this corresponds to the way knowledge and justification are explained in Sect. 4.

A Moorean paradox arises if the sentence ‘It rains, but I do not know it’ is uttered with assertive force (4b). Instead of saying that one is entitled to assert if and only if one knows what one asserts, one can also say that by making an assertion one claims to know what one asserts. If this knowledge claim is made explicit, one obtains ‘(I know that) it rains, but I do not know that it rains’, and the paradox becomes visible (cf. Sundholm 2004, 460, note 15).
The same holds for an assertive utterance of the sentences ‘It rains, but I do not have evidence for it’ and ‘It rains, but I do not believe it.’ Making the knowledge claim explicit, one obtains in the latter case ‘(I know that) it rains, but I do not believe it.’ This creates a paradox for standard cases of knowledge, where knowledge involves a certain degree of conviction (belief). In standard cases, the justification for one’s judgement results in a certain degree of conviction. There are exceptions, though, for example, when someone asserts ‘I have my driver’s licence; I can’t believe it’, having just obtained the licence after a long struggle.

6 An evaluation of the constructivist account of assertion

How does a grounding account of assertion deal with the problems for a knowledge account of assertion described in Sect. 3? We have seen that, according to Bernard Williams, knowledge is too strong a norm for assertion, because the asserter is not able to determine whether his assertion is made in accordance with the norm. The standard account of knowledge understands knowledge to imply truth, where truth is understood as transcending the individual judger, what has been called infallible or real truth in Sect. 4. If knowledge is the norm for assertion, then on the standard account transcendent or infallible truth is thereby a norm for assertion, too. The problem with

11 In 1988, Göran Sundholm already pointed out that Martin-Löf’s constructive type theory can account for the fact that the announcement of a judgement made by means of an assertion involves a knowledge claim (Sundholm 1988, 17; cf. Sundholm 1999, 2004).
In a letter from 10 September 2007, Martin-Löf wrote to me that “the judgemental force … is defined by the pragmatic rule, To have the right to make a judgement, you must have justified (grounded) it.” The two explanations, one in terms of knowledge, the other in terms of justification, do not differ on a constructivist account, because knowledge is precisely the justified or grounded judgement.
knowledge accounts of assertion as they have recently been proposed is that we are never able to determine whether our judgement is true in this sense. And this criticism thus applies to a truth account of assertion as well, if truth is understood in a non-epistemic, transcendent sense. Within CTT, knowledge is not explained in terms of a transcendent notion of truth; knowledge is understood as justified judgement, where the justification does not transcend the asserter, as we have seen in Sect. 4. The asserter is in a position to determine whether his judgement is justified or grounded, because the cognitive act through which the judgement is justified is an internal, first-person act, and is thus immanent to the asserter.

Williams himself gives an account of assertion in terms of belief: “A utters a sentence “S,” where “S” means that P, in doing which either he expresses his belief that P, or he intends the person addressed to take it that he believes that P” (Williams 2002, 74). Because not all assertions are expressions of belief (there are lies, too), Williams has to take account of assertions that are not expressions of belief in the account of assertion itself. According to Williams, it is an important characteristic of assertions that they can be used insincerely, that is, that they can be used to lie, and this characteristic, he says, should be accounted for in the explanation of assertion. It is perfectly possible, though, to understand what assertion is without understanding that it can be used for lying; lying is a parasitic phenomenon. Giving an explanation in terms of entitlement to assert leaves open the possibility of insincere assertions, without distinguishing between two types of assertion in the account of assertion, as Williams has to do.
To explain assertion in terms of belief is also problematic, because the notion of assertion is clearer than the notion of belief: assertions, in contrast to beliefs, can be directly perceived and are non-dispositional. Besides, belief is a highly ambiguous notion: it may mean a disposition to judge, where judging is an all or nothing affair and a normative or rational notion; conviction, which has degrees; opinion, which is opposed to knowledge; or (religious) faith. (This analysis will turn out to be of importance for the treatment of selfless assertions at the end of the section.) Related to these ambiguities, the notion is sometimes used in a naturalistic context, in which case the central question is how we come to have the beliefs we have (‘belief’ in the sense of conviction). At other times, it is used in a normative context, where truth is the norm for belief (‘belief’ in the sense of disposition to judge), and the central question is what reasons we have for our judgements. Unravelling the ambiguities of the term ‘belief’ will show that Williams is not able to give a naturalist foundation for normative notions such as assertion and for epistemic virtues such as sincerity, but that task must be left for another paper.

Besides, sincerity is not the only demand on assertion. In the case of the casus belli with respect to Iraq, the assertion that there were weapons of mass destruction was, most likely, sincere, but there was definitely something wrong with it. Williams can invoke his notion of accuracy here, but the problem with these demands of sincerity and accuracy is that they are not exclusive to assertion. The norm of sincerity is not typical of the speech act of assertion: one should not make insincere promises, or advise someone insincerely; and the same holds for accuracy.
Searle and Weiner combine a truth account of assertion with the thesis that there is a broad class of assertives, including expressions of guesses and speculations, because truth is also the norm for these acts, they say. According to Weiner, these speech acts are acts of assertion. This implies, according to Weiner, that the asserter may be fully
entitled to make his assertion, because what he asserts is true, although he is not able to answer the question 'How do you know?' If one takes the class of assertives in this general sense, one may still wonder under what condition one is entitled to assert in the more specific sense; such an entitlement seems to demand more than the entitlement for guesses and speculations does. Holmes's remark 'This is the work of Professor Moriarty' is an assertion, according to Weiner. Although Holmes is not able to answer the 'How do you know?' question, he is entitled to make the assertion, Weiner says, because what Holmes says is true. A defender of an epistemic account of assertion may answer Weiner in either of two ways. One may say that Holmes did have a ground for his assertion (we are told that Holmes had knowledge of the modus operandi of Moriarty); the ground may simply be difficult to explain to an interlocutor. Or one may say that Holmes uses the utterance of the declarative 'This is the work of Professor Moriarty' not to make an assertion in the strict sense, but rather to express a proposal for further investigation. Holmes's quality as a detective consists in putting forward the most relevant candidate for assertion. Certainly, we do praise people when they have put forward a hypothesis that turns out to be correct, but we praise them not because of their entitlement to assert, but because their hypothesis is valuable. According to Kvanvig, the norm of assertion is justification, not truth or knowledge. The grounding account of assertion can be considered a justification account of assertion, as we have seen in Sect. 4, and it is therefore important to understand in what sense the constructivist account of assertion differs from the justification accounts of assertion defended by Kvanvig, Douven, and Lackey.
These philosophers have argued, against knowledge and truth accounts of assertion, that such accounts need two forms of propriety: an assertion may have a secondary propriety, because the asserter reasonably believes that he is entitled to assert, while lacking primary propriety, because on such an account he is not really entitled to assert. From the constructivist point of view, no such distinction is needed: entitlements to assert are based upon internal justification, and it is this justification that exclusively determines whether the asserter is entitled to assert. There is thus only one type of propriety, as there is on the justification account defended by Kvanvig and others. There are also some important differences, though, between the constructivist account and justification accounts as generally understood. According to Kvanvig, the norm for the content of assertion is not to be confused with the norm for the act of assertion. Whereas truth and knowledge are norms for the content of the assertion, justification is the norm for the act of assertion, he says. From a constructivist point of view, there is no similar way to separate these norms. That one has obtained the knowledge demanded by the assertion candidate 'A is true', because one has constructed a proof object for A, precisely gives one the entitlement to make the assertion. Further, both Kvanvig and the constructivist can explain that an interlocutor has a right to ask for grounds when an assertion is made. Kvanvig cannot explain, though, the presupposition of the question 'How do you know?', namely that the asserter knows what he asserts. According to Douven, Kvanvig and Lackey, a central argument against both a belief and a knowledge account is the possibility of selfless assertions (Lackey 2007, 598).
A teacher seems to be fully entitled to assert that modern-day Homo sapiens evolved from Homo erectus, although she does not believe the proposition to be true, because
she has “a belief in the truth of creationism, and, accordingly, a belief in the falsity of evolutionary theory” (Idem, 599). “[S]he readily admits that she is not basing her own commitment to creationism on evidence at all but, rather, on the personal faith that she has in an all-powerful Creator. … She regards her duty as a teacher to include presenting material that is best supported by the available evidence, which clearly includes the truth of evolutionary theory” (Idem). If assertions are essentially made from the perspective of the asserter, how can the agent assert selflessly? Does the constructivist defend the thesis that belief that S is a necessary condition for being entitled to assert that S? If ‘belief’ means a disposition to judge resulting from a cognitive act, the answer is ‘yes’. If ‘belief’ means conviction, the answer is not a straightforward ‘yes’, because one needs to acknowledge those cases where one possesses a proof object for A, and recognizes it as such, although one has not obtained the degree of conviction that standardly accompanies such a possession, as we have seen in the case of the driver’s licence. In the example of the biology teacher, ‘belief’ in the sense of faith plays an important role, which is generally accompanied by the syntactic form ‘to believe in’. Insofar as religious faith and judgement do not belong to the same category, the biology teacher may assert and judge that Homo sapiens evolved from Homo erectus on the basis of evidence, and also have faith in the God of the Bible. The case of selfless assertions is not restricted, though, to cases where belief means faith. More cases are imaginable in which we assert things we do not believe because of a social role we inhabit.
The possibility of selfless assertions supports Jonathan Cohen’s thesis that assertions are announcements of what we accept rather than of what we believe, where acceptance is an act of choice based on evidential or prudential reasons that need not be accompanied by belief (Cohen 1992). Acceptance as Cohen explains it is always relative to a certain context: one may thus accept the truths of evolutionary theory in one context, while not accepting them in another. I have argued, in my (2009), that Cohen’s distinction between belief and acceptance is not a satisfying one: the same notion of acceptance takes in both evidential reasons and prudential or practical reasons. I have argued that accepting for evidential reasons is not context relative, and should be replaced by the notion of judgement, whereas acceptance for purely prudential or practical reasons is a notion we are in need of, and is indeed context relative. Constructivism does not say anything about acceptance for prudential or practical reasons, but it can certainly allow for it. This gives a solution to the point that we may accept (‘assert’) things we do not believe because of a social role we inhabit, although I would prefer not to use the term ‘assertion’ here. We have seen that there are many speech acts that bear some resemblance to assertion, such as making a guess, or putting forward a hypothesis, where the declarative is the standard linguistic form to perform the speech act, and acceptance may be included among them. We may accept that S for prudential reasons, while not judging that S for evidential reasons, and vice versa. In accordance with this, I understand the example of the biology teacher in a different way: the teacher judges that evolutionary theory is true, because of the evidence available, and is thus entitled to assert, while she does not accept the evolutionary theory for religious reasons. I thus understand the religious reasons to be prudential reasons.
This solution also allows for cases in which non-religious, purely prudential reasons play a role.
We can make the example of the biology teacher a bit more difficult to explain, though. In a footnote, Lackey gives a quotation from an interview with Marcus R. Ross, a creationist who just finished a Ph.D. in paleontology: “the methods and theories of paleontology are one ‘paradigm’ for studying the past, and Scripture is another … I am separating the different paradigms” (Lackey 2007, 620, note 12). It seems that Ross is thus willing to assert both that evolutionary theory is true (and, for example, that Homo sapiens evolved from Homo erectus) and that creationism is true, that evolutionary theory is false, and that Homo sapiens did not evolve from Homo erectus, although he is not willing to make those assertions within the same paradigm. Those who defend a justification account of assertion may say that Ross is entitled to assert that evolutionary theory is true, because he has some evidence for this assertion, and that he also is entitled to assert that evolutionary theory is false, because he has some evidence for that assertion, too. From a constructivist point of view, one cannot be entitled to assert both that A is true and that A is false, because these judgement candidates are not both knowable (see Sect. 5, point 3d); they cannot both be grounded. For a constructivist, ‘being entitled to assert that A is true with respect to an admissible paradigm’ means that one is entitled to assert that A is true, period. If Ross were to assert both that evolutionary theory is true and that evolutionary theory is false, then either he is inconsistent, and thus open to logical blame if he realizes what he is doing and does not withdraw one of his assertions, or he is a relativist with respect to truth (‘true’, for him, meaning true relative to a paradigm).
7 Conclusion

This paper has developed a full theory of assertion for constructive type theory. It has shown that the theory is able to account for many of the epistemic aspects of the speech act of assertion, and that it is supported by the thesis developed here that assertions do not form a wide genus. The paper has shown that a grounding account of assertion suits constructive type theory. Because grounding the assertion that S by means of an act of demonstration amounts to knowing that S, the question had to be answered to what extent the constructivist account of assertion differs from the knowledge accounts of assertion generally proposed. It is especially the internal relation between the proof for a proposition and its truth, and the internal relation between a judgement’s being grounded and its being known, that distinguishes the constructivist account of assertion from standard knowledge accounts of assertion. Because a constructivist explains knowledge as grounded judgement, it also had to be shown in what sense the constructivist account of assertion differs from recently proposed justification accounts of assertion. It is especially the treatment of selfless assertions that makes the constructivist account of assertion different from these justification accounts.

Acknowledgments I thank Göran Sundholm, Per Martin-Löf, and Igor Douven for comments on an earlier version of the paper. I presented the paper in Paris, and I am grateful to the organizer, Pascal Engel, and the audience.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References

Austin, J. L. (1962). How to do things with words. Oxford, New York: Oxford University Press. 1984.
Burge, T. (1986). Frege on truth. In Truth, thought, reason: Essays on Frege (pp. 83–132). Oxford: Clarendon Press. 2005.
Burnyeat, M. (1990). The Theaetetus of Plato (with a translation of Plato’s Theaetetus). Indianapolis, Cambridge: Hackett Publishing Company.
Cohen, L. J. (1992). An essay on belief and acceptance. Oxford: Clarendon Press.
Davidson, D. (1984). Communication and convention. In Inquiries into truth and interpretation (pp. 265–280). Oxford: Clarendon Press.
Douven, I. (2006). Assertion, knowledge and rational credibility. Philosophical Review, 115, 449–485.
Dummett, M. (1973). Assertion. In Frege: Philosophy of language (Chapter 10, pp. 295–363). London: Duckworth. Second edition 1992.
Dummett, M. (1976). What is a theory of meaning? (II). In The seas of language (pp. 34–93). Oxford: Clarendon Press, 1993. Originally in G. Evans & J. McDowell (Eds.), Truth and meaning. Oxford: Clarendon Press.
Engel, P. (2002). Truth. Chesham: Acumen.
Frege, G. (1879). Begriffsschrift. In I. Angelelli (Ed.), Begriffsschrift und andere Aufsätze. Hildesheim, New York: Georg Olms. 1971.
Frege, G. (1918). Der Gedanke. Beiträge zur Philosophie des deutschen Idealismus, 1, 58–77.
Kvanvig, J. (2009). Assertion, knowledge, and lotteries. In P. Greenough & D. Pritchard (Eds.), Williamson on knowledge (pp. 140–160). Oxford: Oxford University Press.
Lackey, J. (2007). Norms of assertion. Noûs, 41, 594–626.
Martin-Löf, P. (1991). A path from logic to metaphysics. In Atti del Congresso Nuovi problemi della logica e della filosofia della scienza (Vol. II, pp. 141–149). Bologna: CLUEB.
Martin-Löf, P. (1995). Verificationism then and now. In W. DePauli-Schimanovich, E. Köhler, & F. Stadler (Eds.), The foundational debate (pp. 187–196). Dordrecht: Kluwer.
Martin-Löf, P. (1996). On the meanings of the logical constants and the justification of the logical laws. Nordic Journal of Philosophical Logic, 1, 11–61. (Originally presented in 1983.)
Martin-Löf, P. (1998). Truth and knowability: On the principles C and K of Michael Dummett. In H. G. Dales & G. Oliveri (Eds.), Truth in mathematics (pp. 105–114). Oxford: Clarendon Press.
Newman, J. H. (1870). An essay in aid of a grammar of assent. Oxford: Clarendon Press. (I used the edition of 1985.)
Ranta, A. (1994). Type-theoretical grammar. Oxford: Clarendon Press.
Searle, J. R. (1979). Expression and meaning. Cambridge: Cambridge University Press.
Sundholm, G. (1988). Oordeel en gevolgtrekking; bedreigde species? Inaugural lecture. Leiden: Leiden University.
Sundholm, G. (1994). Existence, proof and truth-making: A perspective on the intuitionistic conception of truth. Topoi, 13, 117–126.
Sundholm, G. (1999). MacColl on judgement and inference. Nordic Journal of Philosophical Logic, 3, 119–132.
Sundholm, G. (2004). Antirealism and the roles of truth. In I. Niiniluoto, M. Sintonen, & J. Wolenski (Eds.), Handbook of epistemology (pp. 437–466). Dordrecht: Kluwer.
van der Schaar, M. (2007). The assertion-candidate and the meaning of mood. Synthese, 159, 61–82.
van der Schaar, M. (2009). Judgement, belief and acceptance. In G. Primiero & S. Rahman (Eds.), Acts of knowledge: History, philosophy and logic; essays dedicated to Göran Sundholm (pp. 267–286). London: College Publications.
van der Schaar, M. (2010). The cognitive act and the first-person perspective; an epistemology for constructive type theory. Synthese. doi:10.1007/s11229-009-9708-4.
Weiner, M. (2005). Must we know what we say? Philosophical Review, 114, 227–251.
Williams, B. (1973). Deciding to believe. In Problems of the self (pp. 136–151). Cambridge: Cambridge University Press.
Williams, B. (2002). Truth and truthfulness. Princeton: Princeton University Press.
Williamson, T. (1996). Knowing and asserting. Philosophical Review, 105, 489–523.
Synthese (2011) 183:211–227 DOI 10.1007/s11229-010-9760-0
Paraconsistent vagueness: a positive argument Pablo Cobreros
Received: 23 December 2009 / Accepted: 29 June 2010 / Published online: 3 August 2010 © Springer Science+Business Media B.V. 2010
Abstract Paraconsistent approaches have received little attention in the literature on vagueness (at least compared to other proposals). The reason seems to be that many philosophers have found the idea that a contradiction might be true (or that a sentence and its negation might both be true) hard to swallow. Even advocates of paraconsistency about vagueness do not look very convinced when they consider this fact; they seem to have spent more time arguing that paraconsistent theories are at least as good as their paracomplete counterparts than giving positive reasons to believe in a particular paraconsistent proposal. But it sometimes happens that the weakness of a theory turns out to be its major ally, and this is what (I claim) happens in the case of a particular paraconsistent proposal known as subvaluationism. In order to make room for truth-value gluts, subvaluationism needs to endorse a notion of logical consequence that is, in some sense, weaker than standard notions of consequence. But this weakness allows the subvaluationist theory to accommodate higher-order vagueness in a way that is not available to other theories of vagueness (such as, for example, its paracomplete counterpart, supervaluationism). Keywords
Logical consequence · Paraconsistency · Vagueness · Subvaluationism
The subvaluationist theory of vagueness is the dual theory of the well-known supervaluationist theory. Where the supervaluationist reads ‘truth’ as ‘supertruth’ (truth in every precisification) the subvaluationist reads ‘truth’ as ‘subtruth’ (truth in some precisification). This dual reading of the notion of truth leads to a theory
P. Cobreros (B) University of Navarra, Pamplona, Spain e-mail:
[email protected]
of vagueness in which borderline sentences give rise to gluts of truth-value (by contrast with supervaluationism, in which borderline sentences give rise to gaps in truth-value). Subvaluationism is a paraconsistent theory in the sense that a sentence might be both true and false without triviality (i.e., the set of sentences { p, ¬ p} is subvaluationist-satisfiable); it is weakly paraconsistent in the sense that classical contradictions are not subvaluationist-satisfiable (the analogous dual remarks apply to supervaluationism). The subvaluationist theory has been defended by Dominic Hyde (in Hyde 1997 and in the more recent Hyde and Colyvan 2008), who exploits the duality between subvaluationism and supervaluationism to argue that the first is at least as good as the second and that, consequently, the neglect of paraconsistent theories in the literature lacks justification. Commenting on Hyde’s 1997 paper, Beall and Colyvan (2001) point out that Hyde could have gone further, arguing that truth-value gluts seem to have the upper hand in the case of paradoxes other than the sorites, such as the paradoxes of self-reference, in which truth-value gluts, unlike truth-value gaps, do not succumb to strengthened versions. Even if truth-value glut theories have the upper hand in the case of self-referential paradoxes, this might not constitute enough justification for a glut solution in the case of vagueness. There are other phenomena that suggest a gappy treatment, and there is no argument for the claim that gluts solve things everywhere. It seems to me that paraconsistent proposals about vagueness, subvaluationism in particular, deserve a positive argument: a justification of their own, not parasitic on a paracomplete dual. This paper provides such an argument, based on Fara’s (so-called) paradox of higher-order vagueness.
Fara (2003) shows that if the supervaluationist is committed to a rule of inference known as D-introduction, then she/he cannot endorse the complete hierarchy of gap-principles needed to explain the seeming absence of sharp transitions in sorites series. This paper argues that these gap-principles are equally compelling for other theories of vagueness in which the notion of a borderline case plays a key role. It then shows that these theories, if committed to a notion of logical consequence as strong as local consequence, cannot avoid a strengthened version of the paradox. But the subvaluationist can. The paper is divided into three sections. The first describes in a general way what is understood by a borderline-based theory of vagueness and proposes a general setting in which to define a notion of definiteness for this sort of theory. Different informal readings of the general setting will deliver different notions of truth for each theory and, consequently, different notions of logical consequence. The paper considers three alternatives: supervaluationist, local and subvaluationist consequence. The second section presents Fara’s paradox of higher-order vagueness as applied to the supervaluationist theory. The last section considers a strengthened version of Fara’s paradox and explains why theories committed to a notion of logical consequence as strong as local consequence cannot handle this version of the paradox while the subvaluationist theory can.
1 A general setting for borderline-based theories

1.1 Borderline-based theories of vagueness

Most theories of vagueness take the notion of a borderline case as central to the explanation of the phenomenon of vagueness.1 The general idea is that a vague predicate such as ‘bald’ leads to a situation in which competent speakers refuse to classify certain people as ‘bald’ and refuse at the same time to classify them as ‘not bald’. In this sense, borderline cases of a given vague predicate are objects which are within the significance range of the predicate but such that competent speakers manifest a kind of symmetry in their dispositions to apply the predicate or its negation. This rough characterization of borderline cases is intended to remain neutral among several different interpretations. We might read borderline cases in epistemic terms, in which case the symmetry in our dispositions to assent to the sentence ‘Tim is thin’ and to assent to the sentence ‘Tim is not thin’ is a manifestation of a particular sort of ignorance associated with vague expressions. In the truth-value gap reading, the symmetry manifests the fact that each sentence lacks a truth-value, and in the subvaluationist reading, the fact that both sentences are equally true (and false). An adequate theory of vagueness should provide an explanation of the sorites paradox. Consider a long sorites series for the predicate ‘tall’. The first member of the series is 2.5 m tall, the last is 1.5 m tall, and each member differs from its successor in the series by less than a millimeter. The paradox originates because, though the first element is clearly tall and the last is clearly not tall, there seems to be no sharp transition from the elements that are tall to those that are not tall. At this point a borderline-based theory will make use of borderline cases (reading this notion in the particular way of the theory).
The supervaluationist will explain that there is actually no such sharp transition, since between the poles of the series there are cases that are neither truly tall nor falsely tall (i.e., there are borderline cases, reading this notion in supervaluationist terms). The epistemicist bites the bullet, claiming that there is such a sharp transition, but explains that we cannot know where the transition lies, since there are cases in the series of which we cannot know whether they are tall (i.e., there are borderline cases, reading this notion in epistemic terms).

1.2 A general setting

Since the notion of a borderline case plays a key role in these theories, it is natural for them to employ a notion of definiteness to speak about borderline cases. Several of these proposals share the common framework of a possible-worlds semantics in which we can define the notion of definiteness.2 An interpretation for a propositional language with an operator for definiteness (‘D’ henceforth) is a triple ⟨W, R, ν⟩ where W is a non-empty set of worlds, R is a relation on W and ν is a truth-value assignment

1 Remarkable exceptions are Fara (2000) and Zardini (2008). 2 The use of possible world semantics for different theories of vagueness is now completely standard. See
for example Williamson (1994) and Williamson (1999).
to sentences at worlds such that classical operators are defined classically (though things are relative to worlds) and the D-operator is defined as the modal operator for necessity, that is,3

ν_w(ϕ → β) = 1 iff either ν_w(ϕ) = 0 or ν_w(β) = 1
ν_w(¬ϕ) = 1 iff ν_w(ϕ) = 0
ν_w(Dϕ) = 1 iff for every w′ such that wRw′, ν_w′(ϕ) = 1
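To make these clauses concrete, here is a minimal sketch in Python (mine, not the paper's; the formula encoding and all names are illustrative) of an interpretation ⟨W, R, ν⟩ and the evaluation clauses just given:

```python
# A minimal sketch (not from the paper) of the semantics just defined.
# Classical connectives are evaluated world by world, and D(phi) holds
# at w iff phi holds at every world accessible from w.
# Formulas are nested tuples; all names here are illustrative.

def holds(model, w, formula):
    """Return True iff the formula takes value 1 at world w."""
    op = formula[0]
    if op == 'atom':                      # ('atom', 'p')
        return model['v'][(w, formula[1])]
    if op == 'not':                       # ('not', phi)
        return not holds(model, w, formula[1])
    if op == 'imp':                       # ('imp', phi, beta)
        return (not holds(model, w, formula[1])) or holds(model, w, formula[2])
    if op == 'D':                         # ('D', phi)
        return all(holds(model, w2, formula[1])
                   for (w1, w2) in model['R'] if w1 == w)
    raise ValueError(op)

# Two precisifications that disagree about the atom 'tall':
model = {
    'worlds': {'w1', 'w2'},
    'R': {('w1', 'w1'), ('w1', 'w2'), ('w2', 'w1'), ('w2', 'w2')},
    'v': {('w1', 'tall'): True, ('w2', 'tall'): False},
}
tall = ('atom', 'tall')
```

In this toy model `holds(model, 'w1', tall)` is true while `holds(model, 'w1', ('D', tall))` is false: the sentence is locally true at w1 without being definitely true, which is the situation of a borderline case.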
At this stage the difference between theories concerns the different informal readings of the semantics. Epistemicism will read worlds as some sort of epistemic possibilities, contextualism as contexts of utterance, and supervaluationism and subvaluationism as precisifications. Each informal reading of the semantics motivates, however, a different reading of the relevant notion of truth for the theory. Supervaluationism is associated with supertruth, where ϕ is supertrue in an interpretation at a world w just in case it takes value 1 at every world. Epistemicism and contextualism are associated with local truth, where ϕ is locally true in an interpretation at a world w just in case it takes value 1 at w. Finally, subvaluationism is committed to subtruth, where ϕ is subtrue in an interpretation at a world w just in case it takes value 1 at some world. Since logical consequence is a matter of necessary preservation of truth, different commitments on the notion of truth render different commitments on logical consequence.

Definition 1 (Local consequence) A sentence ϕ is a local consequence of Γ, written Γ ⊨_l ϕ, just in case for every interpretation and world w in the interpretation: if every member of Γ is true at w then ϕ is true at w.

Definition 2 (Supervaluationist consequence) A sentence ϕ is a supervaluationist consequence of Γ, written Γ ⊨_SpV ϕ, just in case for every interpretation: if every member of Γ is true at every world then ϕ is true at every world.

Definition 3 (Subvaluationist consequence) A sentence ϕ is a subvaluationist consequence of Γ, written Γ ⊨_SbV ϕ, just in case for every interpretation: if every member of Γ is true at some world then ϕ is true at some world.

Note that the standards for satisfaction set by each notion of truth differ in strength. It is harder to supervaluationist-satisfy a set of sentences than to locally satisfy it.
If Γ is supervaluationist-satisfied in an interpretation, then it certainly is locally satisfied in that interpretation (if every γ ∈ Γ holds everywhere in the interpretation then Γ holds locally at any w in that interpretation). In turn, it is harder to locally satisfy a set of sentences than to subvaluationist-satisfy it. If Γ is locally satisfied in an interpretation, it certainly is subvaluationist-satisfied in that interpretation (if Γ holds at a particular world w in an interpretation, then each γ ∈ Γ holds somewhere in that interpretation). So subtruth is the weakest standard for satisfaction. This fact is crucial for our discussion of the satisfiability of gap-principles below. 3 For readability I will often write ‘ϕ is true at w’ or ‘ϕ holds at w’ instead of ‘ν_w(ϕ) = 1’.
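The ordering of the three satisfaction standards can be illustrated with a small self-contained sketch (again mine, not the paper's): in a two-precisification model the set {p, ¬p} is subvaluationist-satisfied but neither locally nor supervaluationist-satisfied, which is the weak paraconsistency noted in the introduction.

```python
# A self-contained illustration (mine, not the paper's) of the three
# satisfaction standards just compared. A world is represented simply as
# the set of atoms true at it; a formula is an atom or ('not', atom),
# which is all we need for the set {p, not-p}.

def true_at(world, formula):
    if isinstance(formula, tuple):        # ('not', atom)
        return formula[1] not in world
    return formula in world

def super_satisfied(worlds, gamma):
    """Every member of gamma true at every world (supervaluationist)."""
    return all(true_at(w, f) for w in worlds for f in gamma)

def locally_satisfied(worlds, gamma):
    """Every member of gamma true at one and the same world (local)."""
    return any(all(true_at(w, f) for f in gamma) for w in worlds)

def sub_satisfied(worlds, gamma):
    """Each member of gamma true at some world or other (subvaluationist)."""
    return all(any(true_at(w, f) for w in worlds) for f in gamma)

# Two precisifications disagreeing about p:
worlds = [{'p'}, set()]
gamma = ['p', ('not', 'p')]
```

Here `sub_satisfied(worlds, gamma)` is true while the local and supervaluationist checks fail, exhibiting subtruth as the weakest standard.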
Local consequence preserves local truth in every interpretation; it is, therefore, a well-defined notion of consequence for theories committed to local truth. In a similar manner, supervaluationist consequence is well defined for supervaluationism if this theory is committed to supertruth4 and subvaluationist consequence for the subvaluationist theory. Local consequence is the standard definition of logical consequence for the simple modal language.5 Every locally valid argument is supervaluationist-valid but not the other way around (in particular, ϕ ⊨_SpV Dϕ but ϕ ⊭_l Dϕ). On the other hand, not every locally valid argument is subvaluationist-valid (since {ϕ, ¬ϕ} ⊨_l ⊥ but {ϕ, ¬ϕ} ⊭_SbV ⊥), nor is every subvaluationist-valid argument locally valid (since ¬Dϕ ⊨_SbV ¬ϕ but ¬Dϕ ⊭_l ¬ϕ).6

2 Fara’s paradox of higher-order vagueness

In her 2003 paper Delia Graff Fara presents an argument against truth-value gap theories (supervaluationism in particular) concerning higher-order vagueness. Think again about our sorites series for the predicate ‘tall’. The first member of the series is clearly tall and the last is clearly not tall, but there seems to be no sharp transition from the members of the series that are tall to those that are not tall. A truth-value gap theory of vagueness (supervaluationism in particular) explains this fact by appealing to the presence of a truth-value gap in between; there seems to be no sharp transition from the tall to the not tall for the simple reason that there is no such sharp transition. Members of the series that are truly tall are not immediately followed by members that are falsely tall; i.e., the truly tall and the falsely tall are separated by a truth-value gap, making true the following gap-principle:

(GP for ‘tall’) Dtall(x) → ¬D¬tall(x′) (where x′ is the successor of x)

However, the problem of vagueness does not stop at that point, since there seems to be no sharp transition either from the truly tall to the non-truly tall.
4 The use of the expression ‘supervaluationist consequence’ might look tendentious, since not all authors agree on whether supervaluationism is committed to this form of consequence. Canonical supervaluationism identifies truth with supertruth and is (naturally) committed to this notion of consequence (for example Fine 1975, p. 290 and Keefe 2000, p. 176). Some authors, however, hold a supervaluationist-like treatment of vagueness but endorse local consequence instead (for example Varzi 2007 and Asher et al. 2009; see my Cobreros (2010a) for discussion of Varzi’s arguments). I hold that supervaluationism is committed to a third alternative between local and global consequence (in Cobreros 2008, 2010b). In order to concentrate on the arguments in this paper I leave the discussion of this issue for a better occasion, but offer just two quick remarks on the strategy of linking supervaluationism to local consequence. First, supertruth allows for truth-value gaps while local truth does not (at each world in every interpretation a sentence is either locally true or locally false); thus, linking supervaluationism to local consequence jeopardizes either the truth-value-gap explanation of vagueness or the idea that logical consequence is a matter of necessary preservation of truth. Second, as we shall see, local consequence does not escape the strengthened version of Fara’s paradox, so even if supervaluationism is committed to local consequence, it still has trouble handling the issue of higher-order vagueness. 5 See Blackburn et al. (2001, p. 31). 6 See the Appendix for a more detailed account of the relations between these notions of consequence.

Given that the
phenomenon seems to be of the same kind, the explanation must be carried out in analogous terms; i.e., the truly tall and the non-truly tall are separated by a gap, making true the following gap-principle:

(GP for ‘Dtall’) DDtall(x) → ¬D¬Dtall(x′)

Non-terminating higher-order vagueness should be treated, at least, as a logical possibility, yielding all the gap-principles of this form:

(GP for ‘Dⁿtall’) DDⁿtall(x) → ¬D¬Dⁿtall(x′)

In addition to the truth of each gap-principle, Fara claims that a truth-value gap theory (supervaluationism in particular) is committed to the following inference rule7:

D-introduction: ϕ ⊢ Dϕ

The reason for this commitment is the fact that for the truth-value gap theorist ‘D’ means something like ‘It is true that’, in which case it looks impossible for a sentence ϕ to be true while the sentence saying that ϕ is true (i.e. ‘Dϕ’) is not (Fara 2003, pp. 199–200). But Fara provides an argument to show that one cannot consistently combine the commitment to gap-principles with the commitment to this rule of inference. Her argument is as follows. Suppose we have a finite sorites series of m elements. The first element is clearly tall, the last is clearly not tall. The difference between each adjacent pair of members of the series is small enough to justify the truth of each instance of any of the previously mentioned gap-principles. Now from the fact that the last element, m, is not tall it follows, by D-introduction, that it is definitely not tall: D¬tall(m). But consider the following instance of the gap-principle for ‘tall’:

(GP for ‘tall’) Dtall(m − 1) → ¬D¬tall(m)

Making use of the contrapositive form and of Modus ponens, from the fact that D¬tall(m) it follows that ¬Dtall(m − 1). Making use of D-introduction again we obtain: D¬Dtall(m − 1).
And again we use an instance of a gap-principle, this time for 'Dtall':

(GP for 'Dtall')  DDtall(m − 2) → ¬D¬Dtall(m − 1)

As before, we obtain ¬DDtall(m − 2) using the contrapositive form of the gap-principle and Modus ponens. Making use of the m − 1 relevant instances of the relevant gap-principles we can construct an argument showing that ¬D^{m−1}tall(1). But this contradicts our first assumption that the first element of the sorites is tall since,

7 The rule of D-introduction must not be confused with the Necessitation rule of normal modal logics. D-introduction can be generally stated as: Γ ⊨ ϕ ⇒ Γ ⊨ Dϕ, while the Necessitation rule can be stated this way: Γ ⊨ ϕ ⇒ D(Γ) ⊨ Dϕ, where D(Γ) is {Dγ | γ ∈ Γ}.
Fig. 1 Fara’s argument
by m − 1 applications of D-introduction we should have D^{m−1}tall(1). The argument is graphically explained in Fig. 1. As Fara points out, her argument is inspired by an argument of Wright's (Wright 1987, 1992). Wright's original argument makes use of a second-order gap-principle:

∀x(DDT(x) → ¬D¬DT(x′))

from which he derives, with the aid of (a weakened form of) D-introduction,

∀x(D¬DT(x′) → D¬DT(x))

The derived sentence is as paradoxical as the original induction premise of the sorites argument. Thus, the argument makes it difficult for theories committed to D-introduction (or even a slightly weakened version of it) to accommodate higher-order vagueness. The problem with Wright's argument is that, though the rule of D-introduction is supervaluationist-valid, the argument itself is not supervaluationist-valid. The reason is that in the presence of D-introduction some classically valid rules of inference are not unrestrictedly valid (Williamson 1994, pp. 151–152). In particular, we cannot discharge premises with rules like conditional proof and reductio when we have used D-introduction in the sub-proofs (i.e., when we have applied D-introduction to the premises or to anything deduced from them), as Wright does.8 What makes Fara's argument particularly interesting is that the rules used in her argument are always supervaluationist-valid. In particular, Fara makes use of the contrapositive forms of the relevant instances of the relevant gap-principles, but by supervaluationist standards a conditional and its contrapositive form are logically equivalent. Fara then combines D-introduction with Modus ponens, but (unlike conditional proof or reductio) this rule remains supervaluationist-valid in the presence of D-introduction.
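The iterated step of Fara's argument can be traced mechanically. The sketch below is mine, not Fara's: it encodes '¬' as '-', and the function name is an assumption for illustration. For an m-element series it generates the chain of derived sentences, ending with ¬D^{m−1}tall(1).

```python
def fara_regress(m):
    """Trace the sentences derived in Fara's argument for an m-element
    sorites series ('-' stands for negation). The encoding is mine."""
    def D(n, s):  # prefix n copies of the definiteness operator
        return "D" * n + s

    steps = ["-tall(%d)" % m,    # the last element is not tall
             "D-tall(%d)" % m]   # D-introduction
    for k in range(1, m):
        # contraposed gap-principle for 'D^(k-1) tall' plus Modus ponens
        steps.append("-" + D(k, "tall(%d)" % (m - k)))
        if k < m - 1:
            steps.append("D-" + D(k, "tall(%d)" % (m - k)))  # D-introduction again
    return steps

print(fara_regress(4)[-1])  # -DDDtall(1), which clashes with the DDDtall(1)
                            # obtainable from tall(1) by D-introduction
```

Running it for m = 4 reproduces the regress of the main text: ¬tall(4), D¬tall(4), ¬Dtall(3), D¬Dtall(3), ¬DDtall(2), D¬DDtall(2), ¬DDDtall(1).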
In short, what makes Fara's argument interesting is that it is actually a proof showing that gap-principles are supervaluationist-unsatisfiable for finite sorites series.9 Thus, Fara's argument shows that the supervaluationist has problems with his

8 This observation concerning Wright's argument is to be found in Heck (1993). Further discussion of the argument can be found in Edgington (1993) and Sainsbury (1991).
9 The Appendix outlines a tableaux-based proof system for the notions of logical consequence discussed in this paper and shows, making use of this brute-force procedure, that gap-principles are supervaluationist-unsatisfiable.
own explanation of the seeming absence of sharp transitions in sorites series. If the supervaluationist is really committed to what we characterized as supervaluationist consequence above, then she/he is in a rather desperate position to respond to Fara's challenge.

3 Strengthening the paradox

3.1 Local consequence and gap-principles

Though Fara's objection is directed against the supervaluationist theory, the principles appealed to in the paradox are compelling for other borderline-based theories of vagueness (reading 'D' in the particular way of the theory). The general idea is that for a borderline-based theory of vagueness the presence of borderline cases among, say, the clearly tall and the clearly not tall is a constitutive part of the phenomenon of vagueness (reading the notion of borderlineness in the particular way of the theory). But when one considers the vagueness of 'D-tall' one is compelled to a borderline-based explanation claiming that there are borderline cases among the clearly D-tall and the clearly not D-tall (reading the notion of borderlineness in the particular way of the theory). And so the story goes, justifying the truth of each gap-principle, reading the notion of definiteness involved in those principles in each theory's preferred way. For example, for the epistemicist there is a sharp transition from tall members to non-tall members of the series. However, we cannot know where this transition lies, because there are members of the series of which we cannot know whether they are tall (i.e., there are borderline cases of 'tall', reading the notion of borderlineness in epistemic terms). In a similar way, we cannot know where the transition from the definitely tall to the non-definitely tall lies, since there are borderline cases, this time for 'definitely tall'...
The situation for theories committed to local consequence is not, perhaps, as desperate as the situation for theories committed to supervaluationist consequence, since we might consistently handle gap-principles given local consequence. Fara's argument works, as pointed out in Sect. 2, based on the relevant instances of the relevant gap-principles. To take a pocketsize example, imagine a sorites series of 3 elements: 1, 2, 3. The first element is clearly tall, the last is clearly not tall. The relevant instances of the relevant gap-principles,

(GP 1*)  Dtall(2) → ¬D¬tall(3)
(GP 2*)  DDtall(1) → ¬D¬Dtall(2)

should be true. The diagram in Fig. 2 depicts an interpretation in which all this holds. The first element of our "sorites" series is clearly tall and the last is clearly not tall. (GP 1*) takes value 1 at w0 since the antecedent takes value 0 at w0. (GP 2*) takes value 1 at w0 since the consequent takes value 1 at w0. This example shows that the relevant instances of relevant gap-principles are locally satisfiable for any given finite sorites series (of at least three elements). Note, however, that (GP 1*) takes value 0 at w1. This is not a particular feature of this model; rather, Fara's argument
Fig. 2 Local satisfaction of gap-principles
shows that gap-principles are not supervaluationist-satisfiable, and so any interpretation showing the local satisfiability of gap-principles will contain some world at which some gap-principle is false. We exploit this fact in the following section to formulate a strengthened version of the paradox.

3.2 The strengthened paradox

Definition 4 (Absolute definiteness) The absolute definitization of a sentence ϕ is the set {D^n ϕ | n ∈ ω}.

The idea is that the absolute definitization of ϕ is the set containing ϕ, Dϕ, DDϕ, etc. In the present setting, for example, the absolute definitization of a classically valid sentence is valid on any of our three notions of consequence. Making use of this notion of absolute definiteness, we can establish the following connection between supervaluationist and local consequence:

Claim 1 If Γ ⊨SpV ϕ then {D^n γ | γ ∈ Γ, n ∈ ω} ⊨l ϕ.

That is, ϕ is a supervaluationist consequence of Γ just in case ϕ is a local consequence of the absolute definitization of the elements in Γ.10

For any finite sorites series of m elements, let Λ be the set of premises of Fara's argument for that series. That is, for any such series, Λ contains 'tall(1)' and '¬tall(m)' plus all the instances of the relevant gap-principles used in the argument. We will call the absolute definitization of the elements of Λ, i.e. {D^n λ | λ ∈ Λ, n ∈ ω}, the strengthened paradox of higher-order vagueness. Since for a finite sorites series of m elements Λ ⊨SpV ⊥, it follows by Claim 1 above that the absolute definitization of the elements in Λ locally entails a contradiction. That is, theories committed to a notion of logical consequence as strong as local consequence succumb to the strengthened version of Fara's paradox of higher-order vagueness.11

10 A sketch of the proof of Claim 1 may be found in the Appendix.
11 Wright's argument in Wright (2010, pp. 537–539) is based on similar grounds. In their recent paper Asher
et al. point out that we can construct an argument appealing not to D-introduction but to the definiteness of gap-principles instead (Asher et al. 2009, p. 915). However, this conflicts with their observation according to which 'gap-principles can be made determinately-to-the-n true for any n' (Asher et al. 2009, p. 924) (in fact, the second diagram on p. 924 fails to show that the definitization of the first gap-principle (their G1) holds). The tableaux-based proof of the supervaluationist-unsatisfiability of gap-principles in the Appendix shows that there is a systematic connection between how definite we need gap-principles to be to run the argument and the length of the sorites series (for a sorites series of m elements we need just a limited number of iterations of D attached to each gap-principle for the argument to work).
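The pocketsize example of Sect. 3.1, and the limit on its definitization, can both be checked mechanically. Since the published Fig. 2 is not reproduced in this text, the model below is my own reconstruction of an interpretation with the features described there (two worlds, with w0 seeing both worlds and w1 seeing only itself); the evaluator and all names are mine.

```python
# Two worlds: w0 (here 0) sees {0, 1}; w1 (here 1) sees only itself.
# The valuation is my reconstruction, not the published diagram:
# tall(1) holds everywhere, tall(2) only at w1, tall(3) nowhere.
R = {0: {0, 1}, 1: {1}}
V = {"tall1": {0, 1}, "tall2": {1}, "tall3": set()}

def true_at(w, f):
    """Evaluate a formula at world w; formulas are atoms (strings) or
    tuples ('not', f), ('->', f, g), ('D', f)."""
    if isinstance(f, str):
        return w in V[f]
    if f[0] == "not":
        return not true_at(w, f[1])
    if f[0] == "->":
        return (not true_at(w, f[1])) or true_at(w, f[2])
    if f[0] == "D":  # definitely: true at every accessible world
        return all(true_at(v, f[1]) for v in R[w])

def N(f): return ("not", f)
def D(f): return ("D", f)

gp1 = ("->", D("tall2"), N(D(N("tall3"))))        # Dtall(2) -> ¬D¬tall(3)
gp2 = ("->", D(D("tall1")), N(D(N(D("tall2")))))  # DDtall(1) -> ¬D¬Dtall(2)

print(true_at(0, gp1), true_at(0, gp2))  # True True: both hold at w0
print(true_at(1, gp1))                   # False: (GP 1*) fails at w1
print(true_at(0, D(gp1)))                # False: so D(GP 1*) fails at w0
```

The last check makes the strengthened paradox concrete: because (GP 1*) is false at a world accessible from w0, its definitization D(GP 1*) already fails at w0, so the gap-principles cannot be made absolutely definite in this model.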
Fig. 3 Sub-satisfaction of absolutely definite gap-principles
How bad is this last result? Surely it is not as bad as lacking even the possibility of accepting gap-principles plainly (as happens in the case of supervaluationist consequence). But it seems to me that the result is bad enough. As pointed out before, gap-principles are compelling for each borderline-based theory of vagueness (reading 'D' in the preferred way of the theory). Now since for each theory the justification of the truth of gap-principles follows from considerations of the theory itself, claiming that gap-principles are not absolutely definite amounts to claiming that the theory itself is not absolutely definite (reading 'definite' in each theory's preferred way).12 For example, for the epistemicist it wouldn't be absolutely knowable whether the theory is right (the theory's going wrong would always remain an epistemic possibility). For the contextualist reading, more dramatically I think, there must be contexts where some gap-principles are false, and thus there are contexts where the theory itself goes wrong. At this point, the weakness of the subvaluationist's logic turns out to be a great advantage. In order to make room for truth-value gluts, logical consequence in the subvaluationist theory must be weaker than either local or supervaluationist consequence (since {p, ¬p} ⊨l ⊥ but {p, ¬p} ⊭SbV ⊥). The set of sentences {p, ¬p} is SbV-satisfiable for, in a given interpretation, p might hold in w and ¬p might hold in a different w′ (and, of course, this is what SbV-satisfiability requires). We already know that the set of premises in Fara's argument supervaluationist-entails a contradiction; by the duality of ⊨SpV and ⊨SbV (see the Appendix), a valid sentence subvaluationist-entails the negation of some of the premises. This means that, in any interpretation, some gap-principle will be subfalse.
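The subvaluationist satisfiability of {p, ¬p} can be spelled out in a few lines; the world names and valuation below are mine, for illustration only.

```python
# Two worlds, 0 and 1; the atom p holds at world 0 only (a toy valuation).
worlds = {0, 1}
p_worlds = {0}

def subtrue(holds_at):
    """A sentence is subtrue iff it is true at SOME world."""
    return any(holds_at(w) for w in worlds)

p = lambda w: w in p_worlds
not_p = lambda w: w not in p_worlds

# {p, ¬p} is SbV-satisfiable: each member is subtrue, at different worlds.
print(subtrue(p) and subtrue(not_p))  # True
# Yet no single world makes both true, so the set is locally unsatisfiable.
print(any(p(w) and not_p(w) for w in worlds))  # False
```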
Still, since for the subvaluationist 'being true' means 'being true somewhere', this does not show that gap-principles are not SbV-satisfiable (a principle might be subfalse at a world but subtrue at some different world). In fact, the absolute definitization of the premises in Fara's argument is SbV-satisfiable. Consider again our pocketsize example with a sorites series of three elements 1, 2, 3:

(GP 1*)  Dtall(2) → ¬D¬tall(3)
(GP 2*)  DDtall(1) → ¬D¬Dtall(2)

Figure 3 shows that (GP 1*) holds in w0 since, as before, the antecedent is false at w0. This time, moreover, the absolute definitization of (GP 1*) holds in w0 since it holds in

12 It is commonly accepted that definiteness is closed under logical consequence in the sense that if Γ ⊨ ϕ then {Dγ | γ ∈ Γ} ⊨ Dϕ. Thus, contrapositively, if ϕ is not definite, then some of the γ's is not definite. The justification of the truth of gap-principles is not, perhaps, a strict logical consequence of the theory itself; but these theories aim to provide a logic of definiteness for our everyday notion of consequence, and in this sense the claim in the text is well motivated.
every w accessible from w0 (namely, w0 itself). Similarly, the absolute definitization of (GP 2*) holds in w1 since it holds in every w1-accessible world (namely, w1 itself). Thus, the absolute definitization of the relevant instances of relevant gap-principles for a sorites series of three elements, plus the absolute definitization of the assumptions that the first element is tall and that the last is not, is subvaluationist-satisfiable. More generally, we can show that the absolute definitization of the relevant instances of relevant gap-principles for any finite sorites series is subvaluationist-satisfiable by constructing a model in which there is a world for each gap-principle and where these worlds relate only to themselves. At each of these worlds, of course, the absolute definitization of some other gap-principle will take value 0; but, again, showing that the absolute definitization of the premises is SbV-satisfiable requires just showing that, for each ϕ in that set, there is at least one world w at which ϕ takes value 1.

4 Conclusion

The capability of endorsing the absolute definitization of gap-principles looks like an appealing feature for any borderline-based theory of vagueness; but this capability is restricted to theories committed to a weak enough notion of logical consequence. If vagueness is to be explained in terms of borderline cases, the foregoing results constitute a good argument in favor of the subvaluationist theory of vagueness. Some philosophers will still find the commitment to paraconsistency hard to swallow and will probably consider that the result speaks against the whole borderline-based approach to vagueness. These philosophers think that, as Williamson says, 'dialetheism is a fate worse than death' (Williamson 2006, p. 387). To these I find appropriate Priest's own response to Williamson: 'I haven't died yet, so I'm not in a position to judge' (Priest 2007).
Appendix

Connection between supervaluationist and local consequence

We now give a sketch of the proof of Claim 1:

Proposition 1 If Γ ⊨SpV ϕ then {D^n γ | γ ∈ Γ, n ∈ ω} ⊨l ϕ.

The proof is based on the general fact that for any interpretation showing {D^n γ | γ ∈ Γ, n ∈ ω} ⊭l ϕ there is a corresponding generated submodel showing Γ ⊭SpV ϕ (see Blackburn et al. 2001, p. 56 for details on generated submodels).

Proof sketch Assume that {D^n γ | γ ∈ Γ, n ∈ ω} ⊭l ϕ; then there is a model with a world w0 such that ϕ takes value 0 at w0 and every member of Γ takes value 1 at every w accessible from w0 in any number of R-steps (including w0 itself). The only way in which this model might fail to show that Γ ⊭SpV ϕ is by containing worlds, not accessible from w0 in any number of R-steps, at which some of the members of Γ take value 0. However, since these worlds are not accessible from w0 in any number of R-steps, they are irrelevant to the truth-values of sentences at w0. So define a model
⟨W′, R′, ν′⟩ such that W′ is {w | w0 R^m w for some m} ∪ {w0} and R′, ν′ are the restrictions of R, ν to W′. It can be proved by induction over the set of wffs that ν and ν′ agree on the truth-values of sentences at w0. Since {D^n γ | γ ∈ Γ, n ∈ ω} holds at w0, each γ in Γ takes value 1 at every world in W′. In turn, since ϕ takes value 0 at w0, there is at least one world at which it takes value 0. So the new model is an interpretation showing Γ ⊭SpV ϕ.
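The generated-submodel construction in the proof can be sketched as follows. The model, function name, and the extra world 2 (playing the role of a world unreachable from w0) are my own illustrative assumptions.

```python
def generated_submodel(R, V, w0):
    """Restrict a model to w0 plus the worlds reachable from w0 in any
    number of R-steps, as in the proof of Proposition 1 (names are mine)."""
    reach, frontier = {w0}, {w0}
    while frontier:
        frontier = {v for w in frontier for v in R[w]} - reach
        reach |= frontier
    R2 = {w: R[w] & reach for w in reach}          # restrict accessibility
    V2 = {atom: ws & reach for atom, ws in V.items()}  # restrict valuation
    return reach, R2, V2

# A model with a world 2 that is unreachable from world 0 and falsifies p.
R = {0: {0, 1}, 1: {1}, 2: {2}}
V = {"p": {0, 1}}

W2, R2, V2 = generated_submodel(R, V, 0)
print(sorted(W2))    # [0, 1]: the irrelevant world 2 is dropped
print(0 in V2["p"])  # True: the truth-value of p at w0 is unaffected
```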
Some relations between ⊨l, ⊨SpV and ⊨SbV

For the classical propositional language (without modal operators) and for single-conclusion arguments, ⊨l coincides with ⊨SpV. Since for the classical propositional language the valid arguments of ⊨l are those of classical logic, this result amounts to the coincidence between supervaluationist and classical logic for a language without D or similar operators (see Fine 1975, pp. 283–284 and Keefe 2000, pp. 175–176). Things change when we consider multiple conclusions since, classically, the truth of {ϕ ∨ ψ} guarantees the truth of some member of {ϕ, ψ} (a rule sometimes named 'subjunction'), but we have that {ϕ ∨ ψ} ⊭SpV {ϕ, ψ} (since the disjunction might be supertrue without either disjunct being supertrue). For the classical propositional language (without modal operators) and for single-conclusion arguments, ⊨SbV is strictly weaker than ⊨l. On the one hand, not every locally valid argument is subvaluationist-valid, since {ϕ, ¬ϕ} ⊨l ⊥ but {ϕ, ¬ϕ} ⊭SbV ⊥ (since ϕ and ¬ϕ might both be subtrue in the same interpretation). On the other hand, every subvaluationist-valid argument is locally valid, since if Γ ⊭l ϕ then there is an interpretation with a world w at which every member of Γ takes value 1 and ϕ value 0; since there are no modal operators, the interpretation consisting of w as the single world in W is an interpretation showing Γ ⊭SbV ϕ (this is the same reasoning invoked to show that supervaluationist and local consequence coincide for the classical vocabulary and single conclusions).
Looking at the multiple-conclusions case, we have that subvaluationist consequence coincides with classical consequence for single-premise arguments (in the same way in which supervaluationist consequence coincides with classical logic for single-conclusion arguments) but not for multiple-premise arguments; for example, {ϕ, ψ} classically entails ϕ ∧ ψ (a rule sometimes named 'adjunction'), but we have that {ϕ, ψ} ⊭SbV {ϕ ∧ ψ} (since each of ϕ and ψ might be subtrue even in cases in which the conjunction is not). These facts are linked to the duality of ⊨SpV and ⊨SbV. For the simple modal language (i.e., for a language with 'D') supervaluationist consequence is strictly stronger than local consequence. Every locally valid argument is supervaluationist-valid, but not the other way around. In particular, as we already know, the inference from ϕ to Dϕ is supervaluationist-valid. Thus, for this language, not every supervaluationist-valid argument is locally valid. In a similar way, for this language, not every subvaluationist-valid argument is locally valid, since ¬Dϕ ⊨SbV ¬ϕ but this is not the case for local consequence. The duality of ⊨SpV and ⊨SbV can be fully expressed by extending the definitions of logical consequence to the multiple-conclusions case:
Definition 5 (Supervaluationist consequence: multiple conclusions) A set of sentences Δ is a supervaluationist consequence of a set of sentences Γ, written Γ ⊨SpV Δ, just in case, for every interpretation: if every member of Γ takes value 1 at every world, then some member of Δ takes value 1 at every world.

Definition 6 (Subvaluationist consequence: multiple conclusions) A set of sentences Δ is a subvaluationist consequence of a set of sentences Γ, written Γ ⊨SbV Δ, just in case, for every interpretation: if every member of Γ takes value 1 at some world, then some member of Δ takes value 1 at some world.

Now for a given set Γ let ¬(Γ) be {¬γ | γ ∈ Γ}, i.e., the result of attaching '¬' to each γ in Γ. Then,

Proposition 2 Γ ⊨SpV Δ iff ¬(Δ) ⊨SbV ¬(Γ).

Proof Assume Γ ⊨SpV Δ. Then, for any interpretation, if all the γ's are true everywhere, then some of the δ's is true everywhere. Contrapositively, if each of the δ's is false somewhere, then some of the γ's is false somewhere, i.e., ¬(Δ) ⊨SbV ¬(Γ).

Tableaux

Finally, we outline a procedure extending standard modal tableaux to ⊨SpV and ⊨SbV, and show that gap-principles are supervaluationist-unsatisfiable.

Definition 7 (Global modalities) For any interpretation ⟨W, R, ν⟩ and any w ∈ W, ν_w(□g ϕ) = 1 iff for all w′ ∈ W, ν_{w′}(ϕ) = 1. For any interpretation ⟨W, R, ν⟩ and any w ∈ W, ν_w(◇g ϕ) = ν_w(¬□g ¬ϕ).

Remark □g and ◇g are global modalities in the sense that their truth-conditions depend on what is going on at every world (whether accessible or not). Thus, global modalities cancel, so to speak, the local perspective characteristic of modal semantics, in the sense that for any sentence ϕ, interpretation ⟨W, R, ν⟩ and world w ∈ W, ν_w(□g ϕ) = 1 iff for every w′ ∈ W, ν_{w′}(□g ϕ) = 1 (and the same remark applies to '◇g'). Thus, ϕ holds at every w in an interpretation iff for any w in that interpretation ν_w(□g ϕ) = 1 (this remark is used in the lemma below). We write □g(Γ) for {□g γ | γ ∈ Γ} and ◇g(Γ) for {◇g γ | γ ∈ Γ}.

Lemma 1 For any sets of sentences Γ and Δ, Γ ⊨SpV Δ iff □g(Γ) ⊨l □g(Δ), and Γ ⊨SbV Δ iff ◇g(Γ) ⊨l ◇g(Δ).
Proof Γ ⊨SpV Δ iff for every interpretation ⟨W, R, ν⟩: if for all γ ∈ Γ and for all w ∈ W, ν_w(γ) = 1, then there is some δ ∈ Δ such that for all w ∈ W, ν_w(δ) = 1. Taking into account our previous remark on global modalities, the foregoing holds iff for every interpretation ⟨W, R, ν⟩ and world w ∈ W: if for all γ ∈ Γ, ν_w(□g γ) = 1, then for some δ ∈ Δ, ν_w(□g δ) = 1; that is, iff □g(Γ) ⊨l □g(Δ). An analogous reasoning shows that Γ ⊨SbV Δ iff ◇g(Γ) ⊨l ◇g(Δ).
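Lemma 1 can be spot-checked by brute force for a sample sequent, p ⊨SpV q → Dp (it reappears as Example 1 below). The encoding is mine, and enumerating only two-world models makes this a bounded check under stated assumptions, not a proof.

```python
from itertools import combinations, product

def subsets(xs):
    return [set(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def models(worlds, atoms):
    """Enumerate every Kripke model over the given worlds and atoms."""
    pairs = [(u, v) for u in worlds for v in worlds]
    for Rset in subsets(pairs):
        R = {w: {v for (u, v) in Rset if u == w} for w in worlds}
        for vals in product(subsets(worlds), repeat=len(atoms)):
            yield R, dict(zip(atoms, vals))

def ev(w, f, worlds, R, V):
    """Evaluate f (atom, ('not',f), ('->',f,g), ('D',f), ('Bg',f)) at w."""
    if isinstance(f, str): return w in V[f]
    if f[0] == "not": return not ev(w, f[1], worlds, R, V)
    if f[0] == "->":  return (not ev(w, f[1], worlds, R, V)) or ev(w, f[2], worlds, R, V)
    if f[0] == "D":   return all(ev(v, f[1], worlds, R, V) for v in R[w])
    if f[0] == "Bg":  return all(ev(v, f[1], worlds, R, V) for v in worlds)  # global box

worlds = [0, 1]
conc = ("->", "q", ("D", "p"))  # q -> Dp

# p |=SpV q -> Dp: if the premise is supertrue, the conclusion is supertrue.
spv = all((not all(ev(w, "p", worlds, R, V) for w in worlds))
          or all(ev(w, conc, worlds, R, V) for w in worlds)
          for R, V in models(worlds, ["p", "q"]))

# Bg(p) |=l Bg(q -> Dp): the local-consequence translation given by Lemma 1.
loc = all((not ev(w, ("Bg", "p"), worlds, R, V)) or ev(w, ("Bg", conc), worlds, R, V)
          for R, V in models(worlds, ["p", "q"]) for w in worlds)

# Without the global prefixes the entailment fails locally.
plain = all((not ev(w, "p", worlds, R, V)) or ev(w, conc, worlds, R, V)
            for R, V in models(worlds, ["p", "q"]) for w in worlds)

print(spv, loc, plain)  # True True False
```

The two translated checks agree, as Lemma 1 predicts, while the third shows that the □g prefixes are doing real work: without them p does not locally entail q → Dp.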
Modal tableaux constitute a systematic procedure to check whether a given set of sentences is locally satisfiable.13 In order to check whether Γ ⊨l Δ we construct a tableau to check whether the set Γ ∪ ¬(Δ) is locally satisfiable. Given our previous lemma, to check whether Γ ⊨SpV Δ we should check whether □g(Γ) ∪ ¬(□g(Δ)) is locally satisfiable. The rules for the global modalities are as follows:

□g ϕ, n
ϕ, m        (for every m in the tableau)

¬□g ϕ, n
¬ϕ, m       (for a new m)
The global character is reflected in the rules by the fact that we need neither an accessibility node to trigger the □g-rule nor to introduce an accessibility node after triggering the ¬□g-rule. The rules for ◇g and ¬◇g are analogous to these. Given our previous lemma, soundness and completeness proofs are analogous to those for the standard modal tableaux. A pair of examples:

Example 1 p ⊨SpV q → Dp

□g p, 0
¬□g(q → Dp), 0
¬(q → Dp), 1
q, 1
¬Dp, 1
1r2
¬p, 2
p, 2
⊗

Example 2 ¬(q → Dp) ⊨SbV ¬p

◇g ¬(q → Dp), 0
¬◇g ¬p, 0
¬(q → Dp), 1
q, 1
¬Dp, 1
1r2
¬p, 2
¬¬p, 2
⊗

Let us consider Fara's argument again. Consider a finite sorites series of m elements. The first element of the series is tall, the last element of the series is not tall. There are m − 1 relevant instances of relevant gap-principles at work in Fara's argument above (we write ϕ_n instead of tall(n)):

13 Here we assume familiarity with this procedure; for more details see Priest (2008).
(GP.1)  D¬D^{m−2}ϕ_2 → ¬D^{m−1}ϕ_1
(GP.2)  D¬D^{m−3}ϕ_3 → ¬D^{m−2}ϕ_2
(GP.3)  D¬D^{m−4}ϕ_4 → ¬D^{m−3}ϕ_3
⋮
(GP.m−2)  D¬Dϕ_{m−1} → ¬D^2 ϕ_{m−2}
(GP.m−1)  D¬ϕ_m → ¬Dϕ_{m−1}

Lemma 2 For any finite sorites series of m elements, 1, ..., m, the relevant instances of the relevant gap-principles are globally unsatisfiable.

Proof

□g ϕ_1, 1
□g ¬ϕ_m, 1
□g (D¬D^{m−2}ϕ_2 → ¬D^{m−1}ϕ_1), 1
□g (D¬D^{m−3}ϕ_3 → ¬D^{m−2}ϕ_2), 1
⋮
□g (D¬ϕ_m → ¬Dϕ_{m−1}), 1

¬D¬D^{m−2}ϕ_2, 1        |   ¬D^{m−1}ϕ_1, 1   ⊗
1r2
D^{m−2}ϕ_2, 2
¬D¬D^{m−3}ϕ_3, 2        |   ¬D^{m−2}ϕ_2, 2   ⊗
2r3
D^{m−3}ϕ_3, 3
⋮
(m−2)r(m−1)
Dϕ_{m−1}, m−1
¬D¬ϕ_m, m−1             |   ¬Dϕ_{m−1}, m−1   ⊗
(m−1)rm
ϕ_m, m   ⊗
Sentences prefixed with □g must hold everywhere in the tableau. We trigger the rule for the first instance of a gap-principle. The right branch closes given □g ϕ_1; the left branch is open and leads us to a new world where D^{m−2}ϕ_2 should hold. We trigger the rule for the second instance of a gap-principle in this new world (we can always do this since it is a □g-prefixed sentence). The right branch closes and the left branch leads us to a new world where we trigger the rule for the next instance of a gap-principle. Each time we do this, the right branch closes and the left branch leads
us to a new world. When we trigger the rule corresponding to the last instance of a gap-principle, the left branch leads us to a world m where ϕ_m should hold. The branch closes given the fact that □g ¬ϕ_m. The tree reveals that in fact we did not need as much as □g to run the argument: we just need the premises to be definite enough to reach the appropriate world in the tree (this fact relates to the connection between global and local validity expressed in Proposition 1). Contra Asher et al. (2009, p. 924), gap-principles cannot be made definitely-to-the-n true for any n; rather, for each sorites series of m elements, there is only a limited number of iterations of 'D' that keeps the premises locally consistent.

Acknowledgements Thanks to María Cerezo, Paloma Pérez-Ilzarbe and Dave Ripley for discussion of earlier versions of this paper, and to two anonymous referees of this journal for comments that improved the final version. Earlier material was presented at the VI Conference of the Spanish Society for Logic, Methodology and Philosophy of Science (SLMFCE); many thanks to the audience, particularly to Pepe Martínez and Javi Rubio. The research for this paper is part of the research project 'Borderlineness and Tolerance' (FFI2010-16984), funded by the Ministerio de Ciencia e Innovación, Government of Spain.
References

Asher, N., Dever, J., & Pappas, C. (2009). Supervaluations debugged. Mind, 118(472), 901–933.
Beall, J. C., & Colyvan, M. (2001). Heaps of gluts and Hyde-ing the sorites. Mind, 110(438), 401–408.
Blackburn, P., de Rijke, M., & Venema, Y. (2001). Modal logic. Cambridge: Cambridge University Press (Reprinted with corrections: 2004).
Cobreros, P. (2008). Supervaluationism and logical consequence: A third way. Studia Logica, 90, 291–312.
Cobreros, P. (2010a). Varzi on supervaluationism and logical consequence. Mind (forthcoming).
Cobreros, P. (2010b). Supervaluationism and Fara's paradox of higher-order vagueness. In P. Egré & N. Klinedinst (Eds.), Vagueness and language use. Palgrave Macmillan.
Edgington, D. (1993). Wright and Sainsbury on higher-order vagueness. Analysis, 53(4), 193–200.
Fara, D. G. (2000). Shifting sands: An interest-relative theory of vagueness. Philosophical Topics, 28(1), 45–81. Originally published under the name 'Delia Graff'.
Fara, D. G. (2003). Gap principles, penumbral consequence and infinitely higher-order vagueness. In J. C. Beall (Ed.), Liars and heaps: New essays on paradox (pp. 195–221). Oxford University Press. Originally published under the name 'Delia Graff'.
Fine, K. (1975). Vagueness, truth and logic. Synthese, 30, 265–300.
Heck, R. (1993). A note on the logic of (higher-order) vagueness. Analysis, 53(4), 201–208.
Hyde, D. (1997). From heaps and gaps to heaps of gluts. Mind, 106(424), 641–660.
Hyde, D., & Colyvan, M. (2008). Paraconsistent vagueness: Why not? Australasian Journal of Logic, 6, 107–121.
Keefe, R. (2000). Theories of vagueness. Cambridge: Cambridge University Press.
Priest, G. (2007). Review of Absolute Generality. Notre Dame Philosophical Reviews. http://ndpr.nd.edu/review.cfm?id=11144.
Priest, G. (2008). An introduction to non-classical logic: From if to is. Cambridge: Cambridge University Press.
Sainsbury, M. (1991). Is there higher-order vagueness? Philosophical Quarterly, 41(163), 167–182.
Varzi, A. (2007). Supervaluationism and its logic. Mind, 116(463), 633–676.
Williamson, T. (1994). Vagueness. London: Routledge.
Williamson, T. (1999). On the structure of higher-order vagueness. Mind, 108(429), 127–143.
Williamson, T. (2006). Absolute identity and absolute generality. In A. Rayo & G. Uzquiano (Eds.), Absolute generality (pp. 369–389). Oxford: Oxford University Press.
Wright, C. (1987). Further reflections on the sorites paradox. Philosophical Topics, 15(11), 227–290.
Wright, C. (1992). Is higher-order vagueness coherent? Analysis, 52(3), 129–139.
Wright, C. (2010). The illusion of higher-order vagueness. In R. Dietz & S. Moruzzi (Eds.), Cuts and clouds: Vagueness, its nature and its logic (pp. 523–549). Oxford: Oxford University Press.
Zardini, E. (2008). A model of tolerance. Studia Logica, 90(3), 337–368.
Synthese (2011) 183:229–247 DOI 10.1007/s11229-010-9767-6
Can determinable properties earn their keep? Robert Schroer
Received: 6 May 2008 / Accepted: 21 July 2010 / Published online: 20 August 2010 © Springer Science+Business Media B.V. 2010
Abstract Sydney Shoemaker’s ‘Subset Account’ offers a new take on determinable properties and the realization relation as well as a defense of non-reductive physicalism from the problem of mental causation. At the heart of this account are the claims that (1) mental properties are determinable properties and (2) the causal powers that individuate a determinable property are a proper subset of the causal powers that individuate the determinates of that property. The second claim, however, has led to the accusation that the effects caused by the instantiation of a determinable property will also be caused by the instantiation of the determinates of that property—so instead of solving the problem of mental causation, the Subset Account ends up guaranteeing that the effects of mental properties (and all other types of determinable property) will be causally overdetermined! In this paper, I explore this objection. I argue that both sides in this debate have failed to engage the question at the heart of the objection: Given that both a determinable property and its determinates have the power to cause some effect (E), does it follow that both will actually cause E when the relevant conditions obtain? To make genuine progress towards answering this question, we need to take a serious look at the metaphysics of causation. With the debate properly reframed and issues about the metaphysics of causation front and center, I explore the question of whether the Subset Account is doomed to result in problematic causal overdetermination. Keywords Determinable properties · Realization · Mental causation · Subset Account · Causation
R. Schroer (B) Department of English and Philosophy, Arkansas State University, Jonesboro, AR, USA e-mail:
[email protected]
1 The Subset Account of determinable properties and the realization relation There are determinable and determinate predicates—“having length” is an example of the former, while “having a length of precisely 2.5 millimeters” is an example of the latter. A predicate that is a determinable relative to one predicate might be a determinate relative to another; “being red” is a determinable of “being scarlet” while being a determinate of “being colored”. Although much more can be said about determinable and determinate predicates and the various relations between them,1 I’m going to skip all that and move on to the following question: Are there determinable and determinate properties? For someone who is ontologically serious about properties, the real question here is whether there are any determinable properties beyond what we might call the “maximally specific determinate properties”—i.e. determinate properties that do not, themselves, have any determinates. In this paper, I examine a recent attempt to defend the existence of determinable properties given by Sydney Shoemaker. For reasons that will become clear shortly, I will refer to Shoemaker’s account as the “Subset Account”. Many philosophers are drawn to the so-called “Eleatic Principle”—the idea that to be real is to possess causal powers.2 When it is applied to properties, the Eleatic Principle tells us that real properties make a causal difference to the particulars that instantiate them and that a “property” that makes no such difference isn’t really a property after all. Many of the participants in the debate over the Subset Account (including Shoemaker himself) accept the Eleatic Principle; indeed, discussions of (and objections to) the Subset Account are typically carried out using the Eleatic Principle as a backdrop. 
I will follow this trend and frame my investigation with the following assumption:

(E) Properties are individuated from one another in terms of the causal powers they contribute to whatever possesses them.

A quick note: (E) is actually stronger than the Eleatic Principle, for the claim that properties are individuated by their causal powers is considerably stronger than the claim that properties must contribute causal powers to their bearers. (For instance, it is consistent with the latter claim that the connection between a property and the causal powers it contributes is contingent.3) This difference, however, is irrelevant for the purposes of this paper. With assumption (E) in place, we can state the basic idea of Shoemaker's Subset Account. The collection of causal powers individuating a given determinable property is a (non-empty) proper subset of the causal powers that individuate the various determinate properties of that determinable.4 (To be clear, this is a simplified statement of the Subset Account. For purposes of this paper, however, this simplified version

1 See, for example, the discussions in Johnson (1940) and Armstrong (1997).
2 See, for example, Shoemaker (1980, 2007), Armstrong (1997), Kim (1998), and Heil (2003a).
3 See, for example, Armstrong (1997).
4 See Shoemaker (2001, 2007). A similar account of determinable properties can be found in Fales (1990).
Synthese (2011) 183:229–247
captures what’s important about Shoemaker’s view.) The causal powers of having mass, for example, are a proper subset of the causal powers of having a mass of exactly 2.5 g. Shoemaker uses the Subset Account of determinable properties to offer a new account of the realization relation and a solution to the problem of mental causation. Simplifying greatly, the problem of mental causation arises in virtue of the claim that mental and physical properties are distinct from one another (as revealed by the phenomenon of multiple realizability5 ) and the claim that the physical world is causally closed.6 Taken together, these claims seem to imply that mental properties are either causally irrelevant to physical effects (such as bodily movement) or that they causally overdetermine these effects along with various physical properties. (For the record, I think that the relata of the causal relation are events. So strictly speaking, it is events (or instantiations of properties by particulars) that cause physical effects. Talk of properties as causing effects is simply shorthand.) Shoemaker (2001) claims that realized properties (such as mental properties) are always determinable properties and that the properties that realize them are determinates of those determinables.7 The relation between determinable and determinate properties, in turn, is explicated along the lines of the Subset Account—a determinable property is individuated by a proper subset of the causal powers that individuate its determinates. (Recently, Shoemaker has reformulated the connection between determinable/determinate properties, the realization relation, and the Subset Account (see Shoemaker 2007); he continues to explicate the realization relation along the lines of the Subset Account, but he now maintains that the relation between determinable and determinate properties is just one (special) case of the realization relation. This change is irrelevant for the arguments of this paper.)
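The set-theoretic core of the Subset Account can be pictured with a small toy model. Everything here is an illustrative stand-in: the property names, the listed powers, and the representation of causal powers as strings are my own simplifications, not Shoemaker's formalism.

```python
# A toy model of the Subset Account: properties are individuated by
# sets of causal powers, represented here as frozensets of strings.
# The particular powers listed are illustrative placeholders only.

scarlet = frozenset({"cause_pecking", "reflect_620nm", "match_scarlet_samples"})
crimson = frozenset({"cause_pecking", "reflect_640nm", "match_crimson_samples"})

# On the Subset Account, a determinable is individuated by a (non-empty)
# proper subset of the powers of each of its determinates; in this toy
# model, the powers the determinates share.
red = scarlet & crimson

def is_determinable_of(determinable, determinate):
    # The determinable's powers must be a non-empty proper subset
    # of the determinate's powers.
    return bool(determinable) and determinable < determinate

print(is_determinable_of(red, scarlet))    # True
print(is_determinable_of(red, crimson))    # True
print(is_determinable_of(scarlet, red))    # False
```

On this picture a mental property would stand to each of its physical realizers as red stands to scarlet here: the powers individuating the mental property are a proper part of the powers individuating each realizer.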
This means that the causal powers that individuate a mental property are a proper subset of the causal powers that individuate its various physical realizers. Indeed, Shoemaker says we can think of a determinable property as being a part of each of its determinates.8 This, in turn, means that we can think of a mental property as being a part of each of its physical realizers. How does treating a mental property as being a determinable property and treating the physical properties that realize it as being its determinates help with the problem of mental causation? In general, we do not think that wholes and their parts double up on one another with respect to various effects that each might bring about. Rather, depending upon the effect in question, sometimes we think that the whole caused it and sometimes we think that only a part of the whole caused it. If determinable properties are parts of determinate properties, then these two kinds of property will not
5 See Putnam (1980). 6 In claiming that the physical world is causally closed, all I mean is that every physical event has a sufficient
physical cause. Heil and Robb (2003) refer to this notion as “Completeness”. 7 Yablo (1992) also maintains that the relationship between a mental property and the various physical properties that realize it is that of a determinable property to its determinates. Unlike Shoemaker, however, Yablo does not explicate his account in terms of a determinable property being individuated by a subset of the causal powers that individuate its determinates. 8 More specifically, Shoemaker maintains that an instance of a determinable property is a part of the
instances of the determinates of that property. I ignore this complication in what follows.
double up on one another and overdetermine their various effects. (More carefully, the instantiations of each kind of property in a given particular will not overdetermine the same effect.) Rather, depending upon the effect in question, sometimes the determinate property will be the cause and sometimes the determinable property will be the cause. Shoemaker illustrates this idea using Yablo’s (1992) example of Sophie, a pigeon who has been trained to peck at red things. Consider a case where Sophie pecks at a scarlet triangle: …given that Sophie’s pecking was a consequence of the instance of scarlet, we can ask whether what caused it was this instance as a whole or some proper part of it…And here it seems appropriate to say that it was a part of it, namely the instance of red, that did the causing, because it was the conditional powers conferred by that part that were relevant to the effect… (Shoemaker 2001, p. 81) Following the lead of Yablo, Shoemaker claims that when dealing with determinate and determinable properties, causes should be proportional to their effects. In the case of Sophie, for instance, proportionality favors the determinable property of being red, not the determinate property of being scarlet, as the cause of Sophie’s pecking. Why? Because the property of being scarlet has too many causal powers that are irrelevant to Sophie’s pecking, while the property of being red does not. In summary, according to Shoemaker mental properties are not guaranteed to cause effects that are also caused by physical properties (even though mental properties are realized by physical properties). 
Given that mental properties are determinable properties whose determinates are physical properties, and given that in the case of determinable/determinate properties causes should be proportional to their effects, then, depending upon the effect in question, either the mental property will be the cause or the physical property will be the cause, but not both.9

2 An objection to the Subset Account

There are a number of challenges that can be raised to the Subset Account. For instance, does it really make sense to say that an instance of red is a part of an instance of scarlet? And is the Subset Account best described as a version of non-reductive physicalism given that it posits a common physical element (a determinable property, which is individuated by a subset of the causal powers that individuate various physical properties) to all tokens of a type of mental state?10 In what follows, however, I set these questions aside and focus on the following concern instead: Since mental properties (which are determinable properties) are individuated by a proper subset of the causal powers that individuate the physical properties that realize them (their determinates), 9 It is natural to wonder how all this relates to the claim that the world is physically closed. If Sophie’s pecking is a physical event, then (given proportionality) it appears that this event is not caused by another physical event; rather, it is caused by a mental event. In reply to this kind of concern, Yablo (1992) maintains that every event is “causally determined” by a physical event (and that for every event there is some physical event that is “causally sufficient” for it) but rejects the claim that every event, in fact, has a physical cause. 10 For more on this objection, see Heil (2003b).
isn’t it inevitable that the instantiation of these properties will causally overdetermine a range of effects? This concern has been voiced by a number of people. Consider, for example, what Gillett and Rives (2005) say about the Subset Account. With the Eleatic Principle firmly in view, they claim that …applying Occam’s Razor we should only posit as many properties as we need to account for the causal powers of individuals. (p. 491) In this context, failure to abide by Occam’s Razor will result in positing properties whose causal powers are overdetermined.11 And this, in turn, leads to the charge that the effects of these properties will be causally overdetermined. Walter (2007) also thinks that the Subset Account has a problem with overdetermination. (Walter expresses this concern in terms of epiphenomenalism, not overdetermination. But in this context, the charge of overdetermination and the charge of epiphenomenalism are two sides of the same coin.) …given such a framework, the reason why determinates screen off their determinables could be put as follows: The causal powers of determinables must be a subset of the causal powers of their determinates…which straightforwardly entails that determinates, in contrast to determinables, are guaranteed to contribute all required causal powers to their bearers. (p. 231, footnote 10) And although he doesn’t end up endorsing this objection, McLaughlin (2007) says that the Subset Account: …invites the question whether mental causation is always redundant. The forward-looking causal features of any mental property will be a subset of the forward-looking causal features of each of its physical S-realizers…Given that, why wouldn’t mental causation be redundant causation, at least where the effects of the kinds specified in the forward-looking causal features of mental properties are concerned? (p. 156) Indeed, even Shoemaker (2007) acknowledges that the Subset Account invites the charge of causal overdetermination. 
It may seem that the account endorses an objectionable sort of overdetermination. Suppose that one of the forward-looking causal features of P is its aptness in circumstances C to produce effect E, and that this is one of the causal features it shares with its realizers, including property Q. And suppose P is instantiated in virtue of Q being instantiated, and that effect E is produced. Won’t it be true on this account that two different property instantiations, that of P and that of Q, caused effect E? And won’t this be overdetermination? (p. 13) 11 Of course, if we fail to abide by Occam’s Razor in virtue of positing epiphenomenal properties, the
result will not involve the overdetermination of causal powers. But such a move is at odds with the Eleatic Principle.
For what it’s worth, I have heard similar arguments given informally on a number of occasions. What are we to make of this objection? First off, there is the obvious point that positing both determinate and determinable properties involves positing more properties than would be posited if we just posited determinate properties. But the bare fact that the former option results in positing more properties isn’t damning as long as determinable properties don’t create any trouble. The potentially damning objection is that positing determinable properties leads to trouble. And the kind of trouble that is the focus of the above quotations (and that I want to focus on) is that of causal overdetermination. (There is another, related form of trouble looming here—namely, that positing determinable properties leads to the “double-counting” of causal powers (Gillett and Rives 2005) or a “piling problem” with respect to causal powers (Hofmann 2007). The worry is that under the Subset Account a given object is guaranteed to have a set of causal powers twice over—once in virtue of having a particular determinable property and again in virtue of having the determinate of that determinable property. I don’t want to take the time to explore this objection in any detail; instead, I will focus exclusively on the question of whether positing determinable properties leads to causal overdetermination.12 ) At this point, I’d like to introduce some terminology: Let’s say that determinable properties would ‘earn their keep’ if their instantiation would cause various effects that would not be causally overdetermined by the instantiation of the corresponding determinate properties. The charge that I am exploring is that under the Subset Account it’s inevitable that determinable properties will not earn their keep. 
The argument goes like this: If (as the Subset Account mandates) a determinable property and its determinate both contribute a causal power to bring about the same effect (E), then when the relevant conditions obtain (i.e. the conditions that make the causal power operative) they will both cause E. Hence, the causal overdetermination of E in those conditions is inevitable. This argument assumes that a given property’s status as a cause of E (in the relevant conditions) is settled by whether one of the causal powers individuating that property is the power to cause E. Let’s label this idea as “α”. α : When the relevant conditions obtain (i.e. the conditions that make the causal power operative), a property’s status as a cause of E is settled by whether the causal powers individuating it include the power to cause E. According to the Subset Account, both a determinable property and the corresponding determinate property have the power to cause (let’s say) E. And according to α, when the relevant conditions obtain, both these properties will cause E. So E is causally overdetermined. 12 Hofmann (2007) argues that if we take a reductionist view and posit that properties are reduced to causal powers the above objection vanishes. Under a reductionist approach, there are not two different entities on the scene that each confer (some of) the same causal powers to a given object; rather, that object has the set of causal powers that constitute the determinate property and in virtue of having those very powers the object also has the causal powers that constitute the determinable property.
The problem with this argument for the overdetermination of E is that α is a nontrivial assumption about a property’s status as a cause of E. Indeed, it is easy to read Shoemaker as offering a competing principle along the following lines: β : When the relevant conditions obtain (i.e. the conditions that make the causal power operative), a property’s status as a cause of E is settled by whether the causal powers individuating it (1) include the power to cause E, and (2) are, as a collection of causal powers, proportional to E. β (and its appeal to proportionality) is controversial as a thesis about causation, but so is almost every other thesis about causation, including α. And, more importantly for our purposes, there’s nothing in the Eleatic Principle that favors α over β. It is consistent with the Eleatic Principle that an object could have two properties—a determinable property and one of its determinates—that are both individuated (in part) by the power to cause E and that one of the properties is operative as the cause of E while the other isn’t. So to press the objection that determinable properties will not earn their keep against the Subset Account, you need to argue for α and against β. But the versions of this objection that I’ve encountered seem to just assume the truth of α; they seem to just assume that since both a determinate property and its corresponding determinable are individuated by the power to cause E, they will both cause E when the relevant enabling conditions obtain. Hence, as it is stated above (and as it is typically presented, at least in my experience), this objection against the Subset Account is seriously incomplete. In order for this objection to have genuine bite, it is not enough to point out that both the mental property and the physical property that realizes it contain the power to cause E; you must also make the case that both properties will actually cause E when the relevant enabling conditions obtain.
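The difference between α and β can be made concrete with a crude sketch. Reading proportionality as "the property's individuating powers include no powers irrelevant to the effect" is only one rough interpretation, adopted here purely for illustration; it is not Shoemaker's official formulation, and the power names are placeholders.

```python
# A crude sketch of the rival principles. Properties are frozensets of
# causal powers; which powers count as "relevant" to an effect is simply
# stipulated, as a rough stand-in for proportionality.

scarlet = frozenset({"cause_pecking", "reflect_620nm"})
red = frozenset({"cause_pecking"})

def causes_by_alpha(prop, effect_power):
    # alpha: having the power to cause E settles being a cause of E
    # once the enabling conditions obtain.
    return effect_power in prop

def causes_by_beta(prop, effect_power, relevant_powers):
    # beta: the property must have the power to cause E AND its powers,
    # as a collection, must be proportional to E (read here, crudely,
    # as containing no powers irrelevant to E).
    return effect_power in prop and prop <= relevant_powers

relevant_to_pecking = frozenset({"cause_pecking"})

# Under alpha, red and scarlet both cause the pecking: overdetermination.
print(causes_by_alpha(red, "cause_pecking"))       # True
print(causes_by_alpha(scarlet, "cause_pecking"))   # True

# Under beta, proportionality disqualifies scarlet, so only red causes it.
print(causes_by_beta(red, "cause_pecking", relevant_to_pecking))      # True
print(causes_by_beta(scarlet, "cause_pecking", relevant_to_pecking))  # False
```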
And in order to make the latter argument, you need to go beyond the question of what causal powers each property has and delve into the actual metaphysics of causation. There’s a flipside to all this. In order to defend the Subset Account from the objection that determinable properties will not earn their keep, you need an argument in favor of (something like) β and against α. To his credit, Shoemaker does provide some argument here—he tries to motivate the claim that causes should be proportional to their effects via intuitive examples (like the earlier one involving Sophie).13 But given that this claim about proportionality is the linchpin to repelling the charge that determinable properties will fail to earn their keep, Shoemaker says surprisingly little in favor of it. In addition to the concern that it is not adequately supported, there is a more general concern about Shoemaker’s appeal to proportionality. To say that causes must be proportional to their effects is not to say anything about what causation is. This, in turn, leads to doubts about whether proportionality is well motivated as a thesis about the metaphysics of causation—since we haven’t been told what causation is, we might wonder whether the claim that causes should be proportional to their effects really 13 Shoemaker recycles several intuitive examples in favor of proportionality originally offered by Yablo
(1992).
reflects a fact about the metaphysics of causation (as opposed to merely reflecting an explanatory preference). Brian McLaughlin, for instance, states that …I myself think that rather than a constraint on causation, proportionality is a pragmatic constraint on explanation. Too much causal detail or too little causal detail makes for a poor or misleading explanation in a context. (2007, p. 165) Let me be clear about the complaint I’m levelling here. I’m not complaining that there is no evidence for the idea that in some sense we should think of causes as being proportional to their effects. I am willing to allow that the intuitive examples offered by Shoemaker (and Yablo) could be evidence in favor of treating proportionality as a pragmatic constraint on explanation (as McLaughlin claims). My complaint, rather, is that the idea of proportionality is not univocally supported by the previously mentioned examples as a thesis about the metaphysics of causation. To show that proportionality is a constraint on the actual metaphysics of causation (and to make some progress rebutting the charge that it is merely a pragmatic constraint on explanation), what we need (at a bare minimum) is an account of what causation is that supports the claim that causes are proportional to their effects. But Shoemaker makes no attempt to say what causation is.14 Given how things stand, then, we cannot take it for granted that proportionality is well supported as a thesis about the metaphysics of causation. Let’s take a step back and review. According to the Subset Account, a determinable property is individuated by a subset of the causal powers that individuate its various determinates. 
One objection that has been levelled against this account is that since both a determinable property and its determinate are individuated by the power to cause (say) E, they will both cause E when the relevant enabling conditions obtain; in short, it is claimed that determinable properties will not earn their keep. To tell whether this objection has any real bite, we need to move beyond the fact that a determinable property and its determinate each have the power to bring about E and focus on the status of each of these properties as the actual cause of E. In short, we need to delve into the metaphysics of causation. To date, however, the debate has exhibited little movement in this direction: Opponents of the Subset Account tend to assume that causal overdetermination is an inevitable outcome of that position (in virtue of assuming the truth of something akin to α). Shoemaker, in turn, presents a defense of the idea that determinable properties will earn their keep that turns on a claim—that causes are proportional to their effects—that (1) does not itself receive much defense in Shoemaker’s writings, and (2) potentially flounders on the charge that it confuses epistemology for metaphysics. Can we remedy this disappointing stalemate? We can, if we explicitly investigate the status of determinable and determinate properties as causes. But there is a serious difficulty facing such a project—exploring the issue in this way requires getting clear on the metaphysics of causation and the metaphysics of causation is anything but clear. Fortunately, this difficulty is not insurmountable. While there is a panoply of theories of causation, there is a growing consensus that (viable) theories of causation fall into one of two basic camps: Causation-as-production and 14 Yablo (1992) also does not say what causation is; like Shoemaker, he does not give us an account of the metaphysics of causation.
causation-as-counterfactual-dependence. So although we can’t examine the question of whether determinable properties will earn their keep from the perspective of every extant theory of causation, we can explore it from the perspective of these two general camps on causation. And, as we shall see, doing so allows us to reach some important conclusions about the Subset Account.
3 The Subset Account and two notions of causation

Hall (2004) has recently argued that there are two basic varieties (or concepts) of causation. He characterizes the distinction in the following way: Causation, understood as a relation between events, comes in at least two basic and fundamentally different varieties. One of these, which I call “dependence”, is simply that: counterfactual dependence between wholly distinct events. In this sense, event c is a cause of (distinct) event e just in case e depends on c. That is, just in case, had c not occurred, e would not have occurred. The second variety is rather more difficult to characterize, but we evoke it when we say of event c that it helps to generate or bring about or produce another event e, and for that reason I call it “production”. (p. 225, his emphasis) Hall characterizes production accounts of causation as accounts that are committed to the ideas of Locality (causes are connected to their effects via spatiotemporally continuous sequences of causal intermediaries), Intrinsicness (the causal structure of a process is determined by its intrinsic, non-causal character (together with the laws)), and Transitivity (causation is a transitive relation).15 For expositional simplicity, I will refer to these ideas as “L”, “I” and “T”. How does Hall argue for the distinction between these two kinds of causation? One challenge to counterfactual approaches to causation involves cases where there appears to be causation even though the two events involved do not stand in the appropriate counterfactual relation to one another. Here’s a simple example of one such case—a case of ‘preemption’—taken from Hall (2004, p. 235): Suzy and Billy both throw rocks at a glass bottle. Both rocks are thrown accurately, but Suzy’s throw is made a split second before Billy’s and, thus, arrives at the bottle first and shatters it.
It seems obvious that Suzy’s throw was the cause of the bottle’s breaking, but it is not true that if Suzy had not thrown the rock, the bottle would not have broken (due to Billy’s throw). In response to cases like this, defenders of counterfactual theories of causation might be tempted to supplement their accounts with L or I.16 Indeed, Hall argues that supplementing counterfactual accounts with L or I is the best available strategy for dealing with these cases. 15 See Hall (2004, p. 225). 16 Lewis (1986b), for example, appeals to something akin to Intrinsicness in this context. Here’s the
basic idea: Suzy’s throw counts as the cause of the shattering because it is intrinsically just like another process (i.e. the process we’d have if Suzy made the same throw and Billy did not throw at all) that, under a counterfactualist approach, would qualify as being a causal process of a throw breaking a bottle.
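A minimal sketch of the preemption case may help. The functions below simply encode the case's setup (who throws, whose rock arrives first), not any particular theory of causation, and the modeling choices are my own illustrative stipulations.

```python
# A toy model of Hall's preemption case: a deliberately crude
# illustration of why simple counterfactual dependence misses
# preempted causes.

def bottle_shatters(suzy_throws, billy_throws):
    # The bottle breaks if anyone throws (both throws are accurate).
    return suzy_throws or billy_throws

def actual_breaker(suzy_throws, billy_throws):
    # Suzy's rock arrives first, so it does the breaking if she throws.
    if suzy_throws:
        return "suzy"
    if billy_throws:
        return "billy"
    return None

# Actually: both throw, and Suzy's rock breaks the bottle.
print(actual_breaker(True, True))  # suzy

# Counterfactual test: had Suzy not thrown, the bottle would still have
# shattered (thanks to Billy). So the shattering does not counterfactually
# depend on Suzy's throw, even though her throw caused it.
print(bottle_shatters(False, True))  # True
```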
But there is a problem with supplementing counterfactual accounts with L, I, and/or T. Hall presents several other examples—examples of ‘double prevention’ and examples of causation of/by omissions—where the heart of the counterfactual approach (Dependence) ends up being at odds with L, I, and T. Consider, for example, the following case of double prevention (taken from Hall 2004, pp. 241–248): Suzy and Billy are piloting separate planes during a bombing run. A third pilot—Enemy—comes into view and is shot down by Billy. Suzy then completes her bombing run. If Billy had not shot down Enemy, Enemy would have shot down Suzy and the bombing would not have occurred.17 It seems right to credit Billy with being a cause of Suzy’s successful bombing—after all, if he hadn’t shot down Enemy, Suzy would not have successfully completed her bombing mission. As Hall argues, however, it’s difficult to see how the causal relation between Billy’s shooting down Enemy and Suzy’s bombing accommodates the ideas of L, I and T. In short, there is an intuitive pull to say that causation is present (in virtue of a form of counterfactual dependence being present), but given the way the case is constructed accepting this intuition forces us to conclude that L, I, and T are false as theses about the nature of causation. The result is an uneasy dance between Dependence and L, I, and T: In some cases (like the above example of preemption), our intuitions say that causation is present despite the absence of counterfactual dependence. (As mentioned earlier, Hall argues that the best response to these cases is to supplement the counterfactual approach with an appeal to L or I.) But in other cases (like the above example of double prevention), our intuitions say that causation is present in virtue of the presence of a counterfactual dependence between events despite that counterfactual dependence being at odds with L, I, and T. So what are we to make of all this? 
Hall argues that what this uneasy dance shows is that “causation” is not a univocal concept—there is one concept of causation focused on the (unembellished) idea of counterfactual dependence, and another focused on the idea of production, as characterized by L, I, and T. What the above examples reveal, in turn, are some of the ways in which these two concepts can come apart from one another. Now that we’ve sketched Hall’s argument, let’s take a closer look at each conception of causation. I take it that the basic idea of the counterfactualist account of causation is straightforward enough.18 But we need to hear a little more about the production account. Toward that end, I turn to Jonathan Schaffer’s recent paper “Causes need not be Physically Connected to their Effects”. In this paper, Schaffer explores the distinction between theories of causation that maintain that causes are “physically connected” to their effects and theories that do not (the counterfactualist approach being the leading version of the latter).19 The former theories I take to be (at least roughly) the same class of theories that Hall has in mind when he speaks of production accounts of causation. To my ear, to say that causes are “physically connected” to
17 This is a case of ‘double prevention’ because Billy’s shooting Enemy down prevents Enemy from shooting down Suzy which, in turn, would have prevented the bombing. 18 Lewis (1973) is the seminal paper on this approach to causation. 19 More specifically, Schaffer is exploring this distinction via the question of whether absences (or omis-
sions) can be causes.
their effects is awfully close to saying that there is something physical in causes that “generate or bring about or produce” their effects.20 Here is Schaffer’s brief, but useful, overview of theories that posit a physical connection between cause and effect: There are three thematically related versions of the physical connections view. First, there is the idea that causation requires transference, of a property, or more specifically of energy-momentum. This idea has been developed by such philosophers as Jerrold Aronson (1971), David Fair (1979), and Hector-Neri Castaneda (1984). Secondly, there is the idea that causation requires processes. This idea traces back to Bertrand Russell (1948). It was developed by Wesley Salmon (1984), who characterizes a causal process as a continuous qualitative persistence that is capable of transmitting a mark, of propagating structure. This idea was further developed by Phil Dowe (1992, 1995, 2000; see also Salmon 1994, 1998), who characterizes a causal process as a world-line of an enduring conserved-quantity-bearing object. Thirdly, there is the idea that causation requires an intrinsic tie. This idea had been developed by J.L. Mackie (1974). Douglas Ehring (1997) specifies this tie as the persistence line of a trope, and Max Kistler (1998, 2001) further develops this thought, while bringing it closer to Dowe’s view, by restricting the persisting tropes to those of conserved quantities. These three approaches owe their distinctive aspects as much to historical pedigree as to philosophical difference. All understand physical connections as lines of persistence. They differ only in what is said to persist: unspecified for Russell and Mackie, properties for Aronson, tropes for Ehring and Kistler, energy for Fair and Castaneda, structure for Salmon, and objects (those instantiating conserved quantities) for Dowe. (pp. 203–204, his emphasis) Let me summarize the discussion so far.
In order to assess whether determinable properties (as conceived of by the Subset Account) can earn their keep, I need to delve into the metaphysics of causation. In doing so, I will follow Hall in thinking that (viable) theories of causation boil down to one of two general camps: Counterfactual dependence or production. (And I will follow Schaffer in thinking of the latter camp as positing a physical connection between cause and effect.) I will not argue for Hall’s claim nor will I attempt to provide a more thorough characterization of the two camps beyond what I have already done. For this reason, the conclusions I ultimately reach about the Subset Account will be contingent upon Hall’s conclusion and will also be a little open-ended. But, as I mentioned earlier, there is a growing consensus that Hall’s conclusion is sound, so it is my hope that in assuming it I will not be offending too many of my readers. (It is worth noting that a prominent reductionist (Kim 2007) and a prominent non-reductionist (Loewer 2007) have both accepted Hall’s conclusion.) 20 One natural way of understanding the requirement of Locality (which Hall uses to characterize the notion of production) is that causes are spatiotemporally connected to their effects in virtue of being physically connected to them. (I will not explore how the theories that Schaffer characterizes as positing a physical connection between cause and effect relate to the ideas of Intrinsicness and Transitivity.)
With these assumptions in place, let’s return to the question of whether determinable properties will earn their keep. Following Shoemaker, let’s focus on the specific question of whether determinable properties will earn their keep as causes of determinable effects. Given that both a determinable property and the corresponding determinate property will have the power to cause a determinable effect E, is there something about the nature of causation (viewed as counterfactual dependence or viewed as production) that disqualifies the determinate property as the actual cause of E? For expositional purposes, I will frame my investigation of this question around the following case: A building is built in such a way that it can survive an earthquake that registers lower than 5 on the Richter scale but is guaranteed to collapse given an earthquake that registers at 5 or greater. An earthquake of a magnitude of 5.4 occurs and the building collapses.21 We are assuming that there are both determinate and determinable properties in the mix. This means that the collapse of the building involves a determinable and a determinate event: The determinable event of the building’s (general) collapse and the determinate event of the specific way the building collapsed. Our focus is on the determinable event of the building’s collapse. There is both a determinate event (a 5.4 earthquake) and a determinable event (a 5 or greater earthquake) on the scene with the power to cause the determinable event of the collapse. So does that mean that they are both causes of that collapse? Setting appeals to proportionality aside, is there anything about the metaphysical nature of causation that disqualifies the determinate event as the cause of the determinable collapse?
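Anticipating the counterfactualist analysis in the next subsection, the case can be sketched as a toy model. The hard collapse threshold and the hand-picked lists of "closest worlds" are crude illustrative assumptions, not a serious Lewisian similarity metric; the magnitudes come from the example itself.

```python
# A toy model of the earthquake case. The building survives quakes below
# magnitude 5 and collapses at 5 or greater; an actual 5.4 quake occurs.

COLLAPSE_THRESHOLD = 5.0

def collapses(magnitude):
    # Stipulated threshold rule from the example.
    return magnitude >= COLLAPSE_THRESHOLD

actual_magnitude = 5.4
assert collapses(actual_magnitude)

# Closest worlds where the determinate event (a 5.4 quake) does not occur:
# worlds with a 5.3 or a 5.5 quake. The building still collapses there,
# so the collapse does not counterfactually depend on the 5.4 quake.
print(all(collapses(m) for m in [5.3, 5.5]))  # True

# Closest world where the determinable event (a quake of 5 or greater)
# does not occur: say, a 4.9 quake. There the building survives, so the
# collapse does counterfactually depend on the determinable event.
print(collapses(4.9))  # False
```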
3.1 The collapse, viewed from the counterfactualist perspective

In this subsection, I explore the question of what the building’s collapse looks like from the perspective of a counterfactualist about causation. Before doing this, however, I want to acknowledge that there are serious questions about how well-suited, in general, counterfactualist accounts of causation are to discussions of mental causation. Here’s the concern: If we treat causation as being a relatively simple form of counterfactual dependence, then Cartesian Dualism, parallelism, and epiphenomenalism could all be consistent with mental causation. (Under all of these theories, there can be a counterfactual dependence between mental events and physical events which, given a counterfactualist account of causation, means that the former events cause the latter events.22 ) But this result is odd; intuitively, the first theory of mind is supposed to present a major obstacle to mental causation, while the latter two are actually
21 This is a slightly modified version of a case described by Yablo (1992). 22 Kim (1998, p. 70; 2007, p. 234), for example, argues that under an epiphenomenalist theory of mind
there can be counterfactual dependences obtaining between mental events and physical events. Similar arguments can be generated for dualist and parallelist theories of mind.
supposed to be inconsistent with it.23 Although this is a serious concern, I will not explore it any further in this paper. Instead, I will focus on the more specific question of what the Subset Account looks like when it is wedded to a counterfactualist account of causation. I will show that regardless of the viability of counterfactualist accounts of mental causation in general, there is something particularly worrisome about the marriage of the Subset Account and counterfactualism. With that out of the way, let’s turn our attention back to the collapse of the building. For a counterfactualist about causation, the question of which event causes the (determinable) collapse transforms into the following: Which event does the (determinable) collapse of the building counterfactually depend upon? The answer is that the collapse counterfactually depends on the (determinable) event of a 5 or greater magnitude earthquake and not on the (determinate) event of a 5.4 earthquake—after all, had the earthquake registered at 5.3 or at 5.5 the building still would have collapsed. Let me put this point a bit more carefully. Under the Lewis/Stalnaker account of counterfactuals, A counterfactually depends upon B iff A and B both occur and at the closest possible world where B does not obtain, A does not obtain (where “A” and “B” are distinct events).24 Applying this account to the case at hand, we get the following: The determinable event of the collapse counterfactually depends on a 5.4 earthquake iff at the closest possible world without a 5.4 earthquake the building does not collapse. I assume that a possible world that is just like ours but with a 5.3 or a 5.5 earthquake is the relevant closest world. In that world, however, the building would still collapse. (Of course, in that world the building will not collapse in exactly the same way that it does in this world.
But this fact is irrelevant if what we are looking for is the cause of the determinable event of the building collapsing.) So if we conceive of causation as counterfactual dependence, then even though the 5.4 earthquake has the power to cause the determinable effect of the collapse, it doesn’t actually end up being the cause of that collapse. Since it fails to stand in the appropriate counterfactual relationship to the (determinable) collapse, the (determinate) event of a 5.4 earthquake is disqualified as the cause of that collapse. This result generalizes beyond the example at hand: If we add determinable properties to a metaphysical worldview that conceives of causation as counterfactual dependence, (1) these determinable properties will cause certain determinable effects and (2) determinate properties will be disqualified as causes of those same effects.25 In short, if we view causation as being counterfactual dependence, then determinable properties will earn their keep as causes of determinable effects.26 23 Loewer (2007) argues that given the right understanding of counterfactuals, there will not be a counterfactual dependence between mental and physical events under epiphenomenalism. (I ignore the question of whether a similar reply can be generated with respect to dualism and parallelism.) 24 See Stalnaker (1968) and Lewis (1973, 1979). 25 Both Yablo (1992) and Lewis (1986a) give similar examples of how determinable properties play a role
in various counterfactual dependencies between events that their determinates do not and, in virtue of doing so, are better candidates as being the causes of certain effects. 26 Interestingly, it appears that determinate properties will not earn their keep as the cause of determinate effects under a simple counterfactualist approach. Why not? Whenever a determinate property is instantiated, a corresponding determinable property will also be instantiated (because the instantiation of a determinate property necessitates the instantiation of a determinable property). This, in turn, means that
Although it secures the claim that determinable properties will earn their keep, the combination of the Subset Account with a counterfactualist account of causation faces a serious problem. If we think causation is counterfactual dependence, then there isn’t a problem of mental causation in the first place. To put it another way, the presence of overdetermination in mental causation is worrisome only if you view causation as production. As Loewer (2007) puts the point:

…if causation is understood as production then it does seem that causal exclusion is, as Kim says, “virtually analytic”. If P(y,t) produces Q(y,t) how can a distinct event M(y,t) also produce Q(y,t)? As Kim likes to put it, there is “no work for a distinct mental event to do”….It seems obvious that if the brain event produces the bodily motion, a distinct mental event has nothing more to do. (p. 253, his emphasis)

…there is no problem of overdetermination if causation is understood as dependence. On Lewis’s account of counterfactuals a particular event (or the value of a range of possible events) can depend on many co-occurring events. The motions of one’s body, for example, the motions of a person’s arms and hands when reaching into the refrigerator, depend counterfactually both on her mental states (which snacks she wants) and on her brain (and other bodily) states and on a myriad of other states and events…there is no temptation to say if B depends on P it can’t also depend on M since “there is no work for M to do”. (pp. 255–256)

If we view causation as production, then it’s genuinely worrisome if our theory entails that two distinct events (a mental event and a physical event) are each producing the same effect. If the physical event is perfectly capable of producing the effect on its own, then it appears that the mental event “has no work to do”.
If, in contrast, we view causation as counterfactual dependence, the fact that the same effect counterfactually depends upon two distinct events (a mental event and a physical event) doesn’t generate the same intuitive concern about the mental event “having no work to do”. For if we view causation as counterfactual dependence, we’re not saying that each of these events generates the same effect (and thus that each of these events essentially “does the same work” to bring that effect about). Rather, we’re only saying that there is a relationship of counterfactual dependence between each of these events and the effect.27 At the heart of all counterfactualist accounts is the idea that causation is a form of counterfactual dependence between distinct events. And, as we have just seen, if Footnote 26 continued whatever counterfactual relationships the instantiation of the determinate property stands in, the instantiation of the corresponding determinable property is fated to stand in the same relationships. (I will not press this problem, however, for reasons that will become clear shortly.) 27 I believe that even Jaegwon Kim, a leading opponent of non-reductive physicalism, would accept the claim that counterfactualist theories of causation dissolve the problem of mental causation. When Kim examines versions of non-reductive physicalism that appeal to counterfactualist accounts of causation, he does not argue that these accounts should be rejected because they entail a problematic form of causal overdetermination or because they entail that mental events have no work to do. Rather, he argues (in a variety of ways) that mental causation involves something more, something “thicker”, than just counterfactual dependence between wholly distinct events. See, for example, Kim (1998, pp. 67–72; 2007).
causation is just counterfactual dependence, then there isn’t really a problem of mental causation in the first place, for under counterfactualist accounts of causation, the fact that an effect has multiple causes isn’t especially problematic. Earlier, we saw that if we combined a counterfactualist account of causation with the Subset Account, determinate properties will be disqualified as the causes of determinable effects, thus ensuring that determinable properties will be the only causes of those effects. But now we see that the key ingredient in securing the claim that determinable properties will earn their keep—namely, the claim that causation is counterfactual dependence—ends up dissolving any worries we might have had about an effect having multiple causes in the first place. For if causation is just a form of counterfactual dependence, then the possibility that an effect might be caused by (i.e. counterfactually depend upon) two or more events is no longer intuitively problematic. So when the Subset Account is combined with a counterfactualist approach to causation, the counterfactualist approach dissolves the problem of causal overdetermination and the question of whether there is an effect (such as the building’s collapse) that is caused by both a determinable and a determinate event no longer matters. If this line of thought is correct, then as a solution to the problem of mental causation the Subset Account ends up being pointless when it is wedded to a counterfactualist account of causation. It’s pointless because the claim that allows the Subset Account to defend itself from the charge of problematic overdetermination—namely, the claim that causation is counterfactual dependence—ends up dissolving the very problem that the Subset Account (and its appeal to proportionality) was intended to be a solution to. 
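The closest-world test that drives this disqualification argument can be made concrete in a small sketch. The following Python toy model is purely illustrative: the world set, the distance-based closeness ordering, and the collapse threshold are simplifying assumptions of mine, not part of the paper’s formal apparatus.

```python
# Toy model of the closest-world counterfactual test used in Sect. 3.1.
# Worlds are earthquake magnitudes; the building collapses iff the
# magnitude is 5 or greater.

ACTUAL = 5.4  # the actual magnitude

def collapses(magnitude):
    """The (determinable) effect: the building collapses."""
    return magnitude >= 5.0

def counterfactually_depends(effect, event, worlds, actual=ACTUAL):
    """effect depends on event iff, at the closest world where the
    event does not obtain, the effect does not obtain either."""
    non_event_worlds = [w for w in worlds if not event(w)]
    closest = min(non_event_worlds, key=lambda w: abs(w - actual))
    return not effect(closest)

worlds = [4.9, 5.3, 5.4, 5.5]

exactly_5_4 = lambda w: w == 5.4      # determinate event
five_or_greater = lambda w: w >= 5.0  # determinable event

# The collapse does not depend on the determinate event: the closest
# world without a 5.4 quake (5.3 or 5.5) still sees a collapse.
print(counterfactually_depends(collapses, exactly_5_4, worlds))      # False
# It does depend on the determinable event: the closest world without
# any 5-or-greater quake (4.9) sees no collapse.
print(counterfactually_depends(collapses, five_or_greater, worlds))  # True
```

On this toy rendering of causation-as-dependence, the determinate 5.4 earthquake is disqualified as the cause of the determinable collapse while the determinable earthquake is not, which is exactly the pattern the text describes.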
If causation is mere counterfactual dependence, then there is no such thing as problematic causal overdetermination and, hence, no “problem” for the Subset Account to solve. (This is not to say that the Subset Account is pointless tout court. It could still have benefits besides its account of mental causation, benefits that are unscathed by the above argument.28 )

3.2 The collapse, viewed from the productionist perspective

Now consider the collapse of the building from the perspective of someone who thinks of causation as production. For expositional simplicity, let’s work with a (simplified) version of the productionist account that posits that causes and effects are physically connected in virtue of the transference of energy from cause to effect.29 Under such an account, the question “what caused the (determinable) collapse?” transforms into the question “where did the energy responsible for the (determinable) collapse come from?” According to the Subset Account, both the determinable event of a 5 or greater earthquake and the determinate event of a 5.4 earthquake have the power to cause the determinable event of the collapse. As we saw in the preceding section, when we conceive of causation as being counterfactual dependence the determinate event of the 5.4 earthquake ends up being disqualified as the actual cause of the (determinable)

28 I am indebted to an anonymous referee for this point. 29 For more on this particular version of the production approach to causation, see Fair (1979).
collapse. But under the energy-transference account, the fact that the earthquake was precisely 5.4 and thus contained more than enough energy to ensure the collapse of the building does not disqualify that fully determinate event as being the cause of the (determinable) collapse; the fact that the (determinable) effect of the building’s collapse would have occurred even if the earthquake had registered a 5.3 or a 5.5 on the Richter scale is simply irrelevant to the question of whether the 5.4 earthquake was the cause of that collapse. So unlike the counterfactual conception, under the energy-transference conception of causation there’s nothing that disqualifies the determinate event (the 5.4 earthquake) as the cause of the determinable collapse. This, in turn, means that there is a legitimate worry about the causal overdetermination of that collapse—both the determinable earthquake and the determinate earthquake have the power to cause it (in that both events have enough energy to cause it) and nothing in the metaphysical nature of causation (understood as energy-transference) disqualifies either of these events as being the actual cause of it. This result extends to productionist accounts beyond the energy-transference account. For under all production accounts of causation, the question about the cause of the (determinable) collapse is a different kind of question than it is under counterfactual accounts. Under production accounts, causation involves a physical connection between cause and effect or a line of persistence connecting cause to effect. (As noted earlier, there are competing accounts of what persists: perhaps it is energy, structure, a trope, a property, an object, etc.) In identifying the cause of a given effect, what matters is the question of where the effect receives the thing (be it energy, structure, a trope, a property, an object, etc.) that connects it to its cause.
Once we’ve identified where a given effect received this persisting entity from, we’ve determined its cause. And, as we have seen, when we conceive of causation in this manner there’s nothing that disqualifies the 5.4 earthquake as being the cause of the (determinable) collapse. I have shown that, when it is viewed as production, there is nothing in the metaphysics of causation that disqualifies the 5.4 earthquake as being the cause of the (determinable) collapse. In response to this point, it could be claimed that there is something else—something outside of the metaphysics of causation—that disqualifies the 5.4 earthquake as the cause of the (determinable) collapse. If this were the case, the claim that there is nothing in the metaphysics of causation that disqualifies the 5.4 earthquake as being the cause would not guarantee that the (determinable) event of the building’s collapse is causally overdetermined. Of course, the suggestion that something else might disqualify the 5.4 earthquake as the cause of the (determinable) collapse is a promissory note that needs to be cashed. Part of the difficulty in cashing this note is that attempts to disqualify the 5.4 earthquake as the cause of the (determinable) collapse that aren’t grounded in the metaphysics of causation run the risk of confusing epistemology for metaphysics. Consider, for example, a position that views causation as production and that posits that causes must be proportional to their effects. According to this position, determinate events would be disqualified as the causes of determinable effects by something (the appeal to proportionality) that is separate from the metaphysics of causation, for there is nothing in the basic idea that causation involves a line of persistence connecting cause to
effect that supports the claim that causes should be proportional to their effects.30 But now the concern, discussed earlier in this paper, about proportionality merely reflecting a pragmatic constraint on explanation, and not a genuine constraint on causation, kicks in. The fact that citing the 5.4 earthquake as the cause of the (determinable) collapse violates a pragmatic constraint on explanation doesn’t explain what, in fact, disqualifies this event as being the cause of the determinable collapse. To be fair, there could be other ways of motivating something like the proportionality constraint, ways that do not founder on the charge of confusing epistemology for metaphysics.31 Consider, for instance, a productionist account that understands causation in terms of the persistence of a trope. Under such an account, the identity conditions of tropes might dictate that only determinable events can cause determinable effects. More specifically, if the causation of a determinable event involves the persistence of a determinable trope, you might think that the cause must be a determinable event—i.e. the event that instantiates the previously mentioned determinable trope. The problem with this line of reasoning, however, is that it assumes that the identity conditions of tropes will guarantee that a determinate trope cannot persist as a determinable trope. In order to cash out the promissory note completely, you need an actual argument for thinking that tropes are individuated in a way that prohibits a determinate trope from persisting as a determinable trope. The previous discussion contains an important lesson. Any productionist account that wishes to defend something like the proportionality constraint inherits the burden of producing an argument that shows that this constraint is not motivated solely by epistemic or pragmatic considerations.
In the absence of such an argument, there is a legitimate concern that, under productionist accounts of causation, a determinable effect receives what it needs (be it energy, structure, a trope, a property, an object, etc.) twice over. Under productionist accounts, then, worries about the Subset Account leading to causal overdetermination are well placed.

4 Conclusion: An assessment of the Subset Account and the charge of causal overdetermination

The Subset Account offers a new way of understanding determinable properties and their relationship to determinate properties. It also offers a new way of understanding the realization relation and a (relatively) new solution to the problem of mental causation that plagues non-reductive physicalism. At the heart of this account is the claim that the causal powers that individuate a determinable property are a proper

30 It is an interesting question whether the metaphysics of causation, if viewed as counterfactual dependence, would support an appeal to proportionality. As we saw in Sect. 3.1, a counterfactual approach to causation will disqualify determinate events as causes of determinable effects. In this sense, then, a counterfactualist approach will secure the idea that determinable events should be the (sole) cause of determinable effects. But as we saw in footnote 26, a counterfactualist approach will not secure the idea that determinate events should be the (sole) cause of determinate effects, for under a counterfactualist approach it appears that certain determinable events will not be disqualified as also being causes of those determinate effects. (I am indebted to an anonymous referee for raising this issue.) 31 I am indebted to an anonymous referee for pressing me on this point and for the specific example that follows.
subset of the causal powers that individuate the various determinates of that property. This claim, however, has led some to object that the Subset Account is guaranteed to result in causal overdetermination; it is claimed that the Subset Account guarantees that when a mental property causes some effect (E) so will the physical property that realizes it. I have argued that in order to make real progress in evaluating this objection, we need to move beyond the fact that both the mental property and the physical property have the power to cause E and focus our attention on the question of which property actually causes E. And to accomplish this, we need to delve into the metaphysics of causation. Following the lead of Hall (2004), I have assumed that all (viable) theories of causation fall into one of two basic camps: Causation-as-production or causation-as-counterfactual-dependence. With this assumption in place, I examined the question of whether determinable properties will earn their keep as causes of determinable effects. Here are my findings: If we view causation as counterfactual dependence, then the Subset Account ends up being pointless as a solution to the problem of mental causation. It is pointless because if causation is mere counterfactual dependence, then concerns about the presence of problematic overdetermination in the case of mental causation do not arise in the first place. If, in contrast, we view causation as production, then concerns about effects having more than one cause come back into play. And it appears that the Subset Account is committed to such a form of problematic causal overdetermination—more specifically, it does not appear that determinable properties will earn their keep (because determinate properties are not disqualified as also being causes of determinable effects). So if we view causation as production, the Subset Account fails to provide a solution to the problem of mental causation.
So either way, the Subset Account is seriously tarnished as a solution to the problem of mental causation—it is either pointless (because the problem it is meant to solve ends up being dissolved by a subsidiary assumption) or a failure (because the problem it is meant to solve is still in play, but the Subset Account fails to solve it).32 Acknowledgments The inspiration for this paper occurred while I was a resident in John Heil’s 2006 N.E.H. Summer Seminar, “Mind and Metaphysics”, at Washington University (in St. Louis). An early version was presented at the Alabama Philosophical Society in 2007. I would like to thank Chuck Carr, John Heil, Brendan O’Sullivan, and two anonymous referees for their comments.
References Armstrong, D. M. (1997). A world of states of affairs. Cambridge: Cambridge University Press. Aronson, J. L. (1971). On the grammar of “cause”. Synthese, 22, 414–430. Castaneda, H. N. (1984). Causes, causity, and energy. In P. French, T. Uehling, & H. Wettstein (Eds.), Midwest studies in philosophy IX (pp. 17–27). Minneapolis: University of Minnesota Press. Dowe, P. (1992). Wesley Salmon’s process theory of causality and the conserved quantity theory. Philosophy of Science, 59, 195–216. Dowe, P. (1995). Causality and conserved quantities: A reply to Salmon. Philosophy of Science, 62, 321–333. Dowe, P. (2000). Physical causation. Cambridge: Cambridge University Press. 32 So which is it, pointless or a failure? If you’re asking me, the answer is: Causation is production, so the
Subset Account is a failure.
Ehring, D. (1997). Causation and persistence. Oxford: Oxford University Press. Fair, D. (1979). Causation and the flow of energy. Erkenntnis, 14, 219–250. Fales, E. (1990). Causation and universals. London: Routledge. Gillett, C., & Rives, B. (2005). The non-existence of determinables: Or, a world of absolute determinates as default hypothesis. Nous, 39(3), 483–504. Hall, N. (2004). Two concepts of causation. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and counterfactuals (pp. 225–276). Cambridge, MA: The MIT Press. Heil, J. (2003a). From an ontological point of view. Oxford: Oxford University Press. Heil, J. (2003b). Multiply realized properties. In S. Walters & H.-D. Heckmann (Eds.), Physicalism and mental causation (pp. 11–30). Charlottesville, VA: Imprint Academics. Heil, J., & Robb, D. (2003). Mental causation. In E. N. Zalta (Ed.), Stanford encyclopedia of philosophy (Spring 2008 edition). http://plato.stanford.edu/archives/spr2008/entries/mental-causation/. Hofmann, F. (2007). Causal powers, realization, and mental causation. Erkenntnis, 67, 173–182. Johnson, W. E. (1940). Logic. New York: Cambridge University Press. Kim, J. (1998). Mind in a physical world. Cambridge, MA: The MIT Press. Kim, J. (2007). Causation and mental causation. In B. McLaughlin & J. Cohen (Eds.), Contemporary debates in philosophy of mind (pp. 227–242). Malden: Blackwell. Kistler, M. (1998). Reducing causality to transmission. Erkenntnis, 48, 1–24. Kistler, M. (2001). Causation as transference and responsibility. In W. Spohn, M. Ledwig, & F. Siebelt (Eds.), Current issues in causation (pp. 115–133). Paderborn: Mentis. Lewis, D. (1973). Causation. Journal of Philosophy, 70, 556–567. Lewis, D. (1979). Counterfactuals. Cambridge, MA: Harvard University Press. Lewis, D. (1986a). Events. In D. Lewis (Ed.), Philosophical papers (Vol. II, pp. 241–269). Oxford: Oxford University Press. Lewis, D. (1986b). Postscripts to causation. In D. Lewis (Ed.), Philosophical papers (Vol. 
II, pp. 172–213). Oxford: Oxford University Press. Loewer, B. (2007). Mental causation, or something near enough. In B. McLaughlin & J. Cohen (Eds.), Contemporary debates in philosophy of mind (pp. 243–264). Malden: Blackwell. Mackie, J. L. (1974). The cement of the universe. Oxford: Oxford University Press. McLaughlin, B. (2007). Mental causation and Shoemaker-realization. Erkenntnis, 67, 149–172. Putnam, H. (1980). The nature of mental states. In N. Block (Ed.), Readings in philosophy of psychology (Vol. I, pp. 223–231). Cambridge, MA: The MIT Press. Russell, B. (1948). Human knowledge: Its scope and limits. New York: Simon and Schuster. Salmon, W. (1984). Scientific explanation and the causal structure of the world. Princeton: Princeton University Press. Salmon, W. (1994). Causality without counterfactuals. Philosophy of Science, 61, 297–312. Salmon, W. (1998). Causality and explanation. Oxford: Oxford University Press. Schaffer, J. (2004). Causes need not be physically connected to their effects: The case for negative causation. In C. Hitchcock (Ed.), Contemporary debates in philosophy of science (pp. 197–216). Malden: Blackwell. Shoemaker, S. (1980). Causality and properties. In P. van Inwagen (Ed.), Time and cause (pp. 109–135). Dordrecht: Reidel. Shoemaker, S. (2001). Realization and mental causation. In C. Gillett & B. Loewer (Eds.), Physicalism and its discontents (pp. 74–98). Cambridge: Cambridge University Press. Shoemaker, S. (2007). Physical realization. Oxford: Oxford University Press. Stalnaker, R. (1968). A theory of conditionals. In N. Rescher (Ed.), Studies in logical theory (pp. 98–112). Oxford: Blackwell. Walter, S. (2007). Determinables, determinates, and causal relevance. Canadian Journal of Philosophy, 37(2), 217–244. Yablo, S. (1992). Mental causation. The Philosophical Review, 101, 245–280.
Synthese (2011) 183:249–276 DOI 10.1007/s11229-010-9770-y
Being realistic about common knowledge: a Lewisian approach Cedric Paternotte
Received: 29 March 2009 / Accepted: 29 July 2010 / Published online: 22 August 2010 © Springer Science+Business Media B.V. 2010
Abstract Defined and formalized several decades ago, widely used in philosophy and game theory, the concept of common knowledge is still considered problematic, although not always for the right reasons. I suggest that the epistemic status of a group of human agents in a state of common knowledge has not been thoroughly analyzed. In particular, every existing account of common knowledge, whether formal or not, is either too strong to fit cognitively limited individuals, or too weak to adequately describe their state. I provide a realistic definition of common knowledge, based on a formalization of David Lewis’ seminal account, and show that it is formally equivalent to probabilistic common belief. This leads to a philosophical analysis of common knowledge which answers several common criticisms and sheds light on its nature.

Keywords Common knowledge · Probabilistic belief · Rationality · David Lewis · Fallibilism
1 Introduction The concept of common knowledge of an event purports to describe its public nature— the fact that it is transparent for everyone and goes without saying. This paper provides a definition of ordinary common knowledge, that is, of common knowledge among a group of human beings with normal cognitive abilities. This definition will be both formally expressed and philosophically justified. The need for such an analysis has a twofold origin. First, the concept of common knowledge has been, and still is, regularly attacked for being unrealistic from a cognitive point of view. The main two criticisms claim that since common knowledge is equivalent to an infinite conjunction of embedded epistemic statements, it could only C. Paternotte (B) University of Bristol, Bristol, UK e-mail:
[email protected]
be attained by ideally rational agents with unlimited computational power; and that since human agents can never know each other’s mental states, common knowledge cannot be defined by referring to mere individual knowledge. As a consequence, it would seem that common knowledge cannot exist (Parikh 2005) or that it should be replaced by a weaker concept. The latter possibility is usually followed, especially when common knowledge plays a major role, such as in analytical definitions of joint action. Indeed, a cognitively unrealistic concept could not be part of definitions that state with great care the precise set of mental states agents are in when they accomplish a joint action. One should thus look for a less demanding substitute for common knowledge (Tollefsen 2005; Peacocke 2005). Second, there exist several accounts of common knowledge. Philosophical definitions [found in Lewis 1969; Gilbert 1989, but also in early linguistic works such as Schiffer (1972) and Clark and Marshall (1981)] are subtle but informal, while formal definitions [whether in game theory (Aumann 1976) or in logic] are useful but quite coarse. Moreover, these different versions are not independent: since formal common knowledge was partly built from the intuitions grounding informal common knowledge, formal and informal definitions have common features as well as important differences. Finding which one is undermined by a given criticism can be tricky, all the more so given that the accusations that common knowledge is unrealistic are based on formal definitions, while the informal ones are neglected.1 As Cubitt and Sugden (2003) indeed noticed: ‘[…] although Lewis is usually credited with priority, his work has had relatively little influence on later [common knowledge and convention] developments.’
Moreover, there are also approximations or weakenings of common knowledge, both on the formal (Monderer and Samet 1989; Morris 1999; Bonnay and Egré 2009) and informal (Sperber and Wilson 1986) sides. The overall landscape is one of multiple notions, which cannot all be compared but are supposed to represent more or less the same concept. It is therefore natural to ask what connection holds between them, and particularly which one could more faithfully represent ordinary common knowledge - that is, the intuitive notion of a public fact. A quote from Gilbert (1989), though given twenty years ago, still sums up the situation: ‘The best way to describe or define ‘common knowledge’ however, is still somewhat moot.’ (p. 188) My objective is to understand the intuitive concept of common knowledge among a group of human agents by reconciling in the same analysis its formal and informal aspects. Before that, a preliminary objection must be dismissed: for it might be the case that there is no such thing as intuitive common knowledge, that the concept is a purely formal one used to model coordination or cooperation behaviors. In this case, my approach would surely be meaningless. The answer is easy though, because common knowledge is just another name for publicity. Situations where an event is public abound in our daily life: from direct publicity, as in conversations, announcements, loud noises…to indirect publicity, such as advertising, salient features of a shared environment, etc.

1 So much so that Sperber and Wilson (1986) innocently remark that the informal alternative they propose to formal common knowledge is very close to Lewis’ original concept.
Synthese (2011) 183:249–276
And human beings seem to perfectly understand that these situations do not bring mere individual knowledge or even just shared knowledge. Chwe (2001) gives numerous examples where the fact that an event is public, and not only shared, makes a clear difference. For those reasons, common knowledge has a strong intuitive appeal. In what follows, I will choose an appropriate, philosophically justified characterization of common knowledge and express it formally as precisely as possible while maintaining coherence with the classical versions. This approach does not exhaust the concept of common knowledge though. In a seminal paper, Barwise (1988) distinguished three questions concerning common knowledge that are of equal interest but nonetheless distinct: (1) its right definition, (2) the conditions in which it appears, and (3) the way individuals use it. Failing to deal with these matters separately, or at least to bear in mind that they are not synonymous, can create confusion - as, for instance, when trying to answer (1) by answering (2). My analysis will focus on the first question. This paper is organized as follows. In the next section, I present the usual formal definitions of common knowledge and their two main criticisms, as well as the existing formal solutions and the new problems to which they lead. In Sect. 3, I turn to philosophical definitions of common knowledge, especially Lewis' one. I suggest that formal common knowledge, if interpreted correctly, escapes previous criticisms and offers a path towards an adequate description of ordinary common knowledge. In Sect. 4, I propose a formalization of Lewisian common knowledge by using probabilistic belief operators, which allows one to compare it to classical versions. I check that this definition (1) satisfies some desired properties and (2) is equivalent to the classical approximation of common knowledge. 
I then provide a definition of ordinary common knowledge, and deal with the consequences of the previous work for its philosophical analysis.

2 Formal approaches

Formal common knowledge is usually defined in the following framework. From a set Ω of states, a set I of N agents, and for each i ∈ I a partition Π_i of Ω, one defines knowledge operators K_i : 2^Ω → 2^Ω such that K_i E = {w ∈ Ω : Π_i(w) ⊆ E}, where Π_i(w) is the element of Π_i which contains w. K_* E = ⋂_{i=1}^{N} K_i E defines the shared knowledge operator K_*, expressing what all agents know. The common knowledge operator can then be defined in two ways. The first possible intuition is to say that an event E is common knowledge between several agents if they all know E, they all know they know E, and so on. In other words, agents have no doubt whatsoever about their respective knowledge about the event and about each other's knowledge. Common knowledge amounts to an infinity of conditions that I will from now on call epistemic iterations. An epistemic iteration is a statement composed of nested knowledge operators. According to the second intuition, common knowledge appears in case of a public event - an event such that whenever it happens, everyone knows it. This leads to the following formal definitions: E is said to be common knowledge in w (written w ∈ C(E)) if and only if2:

2 Taken from Osborne and Rubinstein (1994), p. 73.
Iterative definition: w ∈ K_*^m E for every integer m (that is, w ∈ K_* E, w ∈ K_* K_* E, w ∈ K_* K_* K_* E, etc.)3,4
Fixed-point definition: There exists a self-evident event F (F ⊆ K_* F) such that w ∈ F and F ⊆ K_* E.5
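The operators just defined are easy to compute on a finite model. The following sketch is a toy encoding of my own (the four states, the two partitions, and the event E are illustrative choices, not taken from the paper) of K_i, K_*, and the iterative computation of C(E):

```python
# Toy partition model: 4 states, 2 agents. All values here are illustrative.
from functools import reduce

STATES = frozenset({1, 2, 3, 4})

# Each agent's partition of the state space, as a list of cells.
PARTITIONS = {
    "A": [frozenset({1, 2}), frozenset({3, 4})],
    "B": [frozenset({1, 2, 3}), frozenset({4})],
}

def cell(agent, w):
    """Pi_i(w): the cell of agent i's partition that contains state w."""
    return next(c for c in PARTITIONS[agent] if w in c)

def K(agent, event):
    """Knowledge operator: K_i E = {w : Pi_i(w) is a subset of E}."""
    return frozenset(w for w in STATES if cell(agent, w) <= event)

def K_star(event):
    """Shared knowledge: the intersection of K_i E over all agents."""
    return reduce(frozenset.intersection, (K(i, event) for i in PARTITIONS))

def common_knowledge(event):
    """Iterative definition: iterate K_* E, K_* K_* E, ... On a finite state
    space the sequence is decreasing (by veridicity), so it reaches a fixed
    point, which equals the infinite intersection."""
    current = K_star(event)
    while K_star(current) != current:
        current = K_star(current)
    return current

E = frozenset({1, 2, 3})
print(sorted(K("A", E)))            # states where A knows E
print(sorted(K_star(E)))            # states where everyone knows E
print(sorted(common_knowledge(E)))  # states where E is common knowledge
```

On this model K_* E = {1, 2}, but iterating K_* drives the set to the empty set, so E is common knowledge nowhere; by the equivalence noted above, the same answer comes from searching for a self-evident event included in K_* E.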
A public event is obviously self-evident. For instance, F could be the event that the statement 'The president is dead' is being publicly made, and E the event that the president is dead. These two definitions are formally equivalent in a classical framework where the knowledge operator satisfies veridicity and positive and negative introspection.6 As we will see, most problems of interpretation stem from this double definition.

2.1 Cognitive idealization, fixed point and finite iterations

The most famous criticism of common knowledge bears on the iterative characterization, according to which common knowledge is composed of an infinity of epistemic conditions. Such an account ignores human agents' cognitive limitations: agents have neither the memory capacity to store an infinite number of statements nor the time to infer every one of them. In other words, only agents with ideal cognitive abilities could ever reach common knowledge. Remembering Barwise's three questions, note that this problem concerns the definition of common knowledge. It does not concern the way in which common knowledge appears: even if an infinity of conditions cannot be obtained in a finite time by a step-by-step process, it can be entailed by a finite number of conditions. And it does not concern the way agents use common knowledge either: agents may need to use only a small finite number of epistemic iterations. These last two caveats suggest two solutions to this problem. The first one is quite obvious: since the iterative definition is unsatisfactory, let us choose the fixed-point one, which is formally equivalent and intuitively entails the former. The epistemic iterations are thus seen as mere consequences of the real common knowledge. Theorists—especially logicians—are well aware of this solution. 
3 This definition is adequate since, for two agents, K_* K_* E = K_1(K_1 E ∩ K_2 E) ∩ K_2(K_1 E ∩ K_2 E) = K_1 K_1 E ∩ K_1 K_2 E ∩ K_2 K_1 E ∩ K_2 K_2 E - knowledge operators preserve conjunction of events: K_i(E ∩ F) = K_i E ∩ K_i F.
4 What is usually referred to as the fixed-point definition of common knowledge is the following: E is common knowledge if and only if everyone knows E and everyone knows that E is common knowledge (see Fagin et al. 1995, p. 34). This also describes Barwise's shared situation account (see Lismont 1995, p. 287). My terminology is different: I call this a fixed-point definition simply because it is based on an event F that is a fixed point of the shared belief operator K_*.
5 We choose this definition to emphasize that F entails the knowledge of E. The classical definition of common knowledge by self-evident events only demands that F ⊆ E (see Osborne and Rubinstein 1994, p. 73). However, this is actually equivalent to our condition that F ⊆ K_* E. First, the latter entails the former, for it is a property of the knowledge operator that K_i E ⊆ E (see Osborne and Rubinstein 1994, p. 70, axiom K4). Second, the converse is also true: if F ⊆ E, then K_* F ⊆ K_* E (Osborne and Rubinstein 1994, p. 69, axiom K2); since F = K_* F by definition of a self-evident event, we get that F ⊆ K_* E.
6 For a proof of this equivalence, see for example Osborne and Rubinstein (1994), p. 74, Proposition 74.2.

Lismont and Mongin (1995) for instance point out that epistemic iterations are not actual but only potential: these are not statements that agents know explicitly but that they could deduce from
the situation at hand if they had the adequate cognitive capacities—whence the futility of criticizing it for its unrealistic flavour. This solution is even stronger since the formal equivalence between the two definitions disappears when the properties required of the knowledge operator are weakened, so that the fixed-point definition entails the iterative one without being entailed by it anymore—Barwise (1988) showed that the equivalence could only be preserved by extending the iterations to a transfinite level. This suggests that the 'real' definition of common knowledge is the fixed-point one and that the iterative one only states some of its consequences. The problem thus receives a satisfying answer inside the classical formal framework. Anyone attacking common knowledge on the basis of its iterative definition would have to give up the objection as soon as he turns to the fixed-point definition: the duality of common knowledge makes it more complex but also more robust. The problem persists mainly because the iterative definition is simpler and more popular than the fixed-point one, which tends to be neglected. This first solution is not sufficient though. While the fixed-point definition is built from a self-evident event, there are situations where common knowledge exists without any such event. The coordinated attack problem and the e-mail game of Rubinstein (1989) are such cases, where no simultaneous public event is possible and agents have to coordinate by using successive messages. Though formal common knowledge is unattainable, agents typically manage to coordinate from a finite number of messages—there seems to be ordinary common knowledge. This suggests that the idealization problem should receive a solution within the scope of the iterative definition. 
The second solution thus consists in weakening the iterative definition: since an infinity of iterations is unattainable, it would suffice to truncate the definition at a given finite level to obtain a concept of ordinary common knowledge. According to this view, humans do behave as if they had common knowledge of a fact as soon as they reach a certain finite number of epistemic iterations. This solution is coherent with the previous one: even in public event cases, agents would only be able to infer a small number of epistemic statements from (fixed-point) common knowledge. It also has a strong intuitive appeal: we often reason on the basis of only a few epistemic iterations (I know she believes this, but she does not know I know.) However, this last argument, at least, is specious because, as Heal (1978) remarked, the cases where a finitely iterative reasoning is explicitly used seem to be precisely the ones occurring when common knowledge is absent. After all, in the description of a situation, the presence of one 'A does not know B' is enough to undermine common knowledge. Resorting to a finite number of iterations is frequent in simplified explanations but is rarely a full justification for an agent's actions in real cases of common knowledge7—as has been well known since Clark and Marshall (1981). For now, it is enough to note that something like the finitely iterative definition seems necessary at least in cases where no public event is at hand.
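The truncated proposal can be made operational. In the toy model below (illustrative partitions of my own, chosen so that each additional iteration strictly shrinks the relevant set), an agent who checks only m levels of iteration can 'see' common knowledge where the full iterative definition denies it:

```python
# Truncated common knowledge: require the epistemic iterations only up to a
# finite cognitive threshold. Toy model; states and partitions are illustrative.
STATES = frozenset({1, 2, 3, 4})
PARTITIONS = {
    "A": [frozenset({1, 2}), frozenset({3, 4})],
    "B": [frozenset({1}), frozenset({2, 3}), frozenset({4})],
}

def cell(agent, w):
    """Pi_i(w): the cell of agent i's partition containing w."""
    return next(c for c in PARTITIONS[agent] if w in c)

def K_star(event):
    """Shared knowledge K_* E: states where every agent's cell lies inside E."""
    return frozenset(
        w for w in STATES if all(cell(i, w) <= event for i in PARTITIONS)
    )

def truncated_ck(event, level):
    """States satisfying K_*^1 E, ..., K_*^level E: the iterations are
    checked only up to the given finite threshold."""
    result, current = STATES, frozenset(event)
    for _ in range(level):
        current = K_star(current)
        result &= current
    return result

E = frozenset({1, 2, 3})
for m in (1, 2, 3):
    print(m, sorted(truncated_ck(E, m)))
```

Here level 1 yields {1, 2}, level 2 yields {1}, and level 3 yields the empty set: full iterative common knowledge of E never obtains, yet an agent in state 1 who stops checking at level 2 would treat E as common knowledge.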
7 Reasoning from a finite number of iterations can be rationally useful, but this hardly ever happens in
situations of full common knowledge.
In the coordinated attack problem,8 two generals A and B must simultaneously attack a common enemy and can only communicate through messages that can be intercepted. Whenever one general receives a message, he has to acknowledge it by sending a confirmation, and so on. Consequently, it can never become common knowledge that the enemy is vulnerable: every received message adds a level of epistemic iteration, but the generals would need an infinity of them to attain common knowledge.9 This example shows the contrast between our intuition and the analysis that game theory provides. Indeed, after having received a high number of messages, an agent will plausibly think he is certain enough about the other's state and will decide to attack; for receiving one more message would make no epistemic difference to a cognitively limited agent. In other words, for human agents, 'truncated' common knowledge would be equivalent to classical common knowledge.10 There are several reasons to doubt that ordinary common knowledge is adequately described by truncated common knowledge, although none seems decisive. First, by distinguishing between epistemic states and the behavior to which they lead, one could argue that while truncated common knowledge obviously guarantees the same actions that common knowledge does, it is epistemically different. However, in a state of truncated common knowledge, agents do not merely know a few epistemic iterations. Since their cognitive limits prevent them from understanding what the additional iterations would add to their knowledge, they see them as 'more of the same'. Consequently, agents end up in the same epistemic state as if they knew all the epistemic iterations, which can be seen as automatically generated by the lowest-level ones. In other words, agents' cognitive limits lead them to make false inferences which ultimately end at (iteratively defined) common knowledge. 
Another objection is this: if ordinary common knowledge is to be defined as truncated common knowledge, the level at which it is truncated has to be determined. Do agents stop understanding statements with three embedded knowledge operators, five, or ten? Not all agents necessarily have the same limit. For instance, children younger than three years old do not seem to be able to reason about other people's knowledge11: they stop distinguishing between epistemic iterations at level one. An extremely clever agent may reach level four or five. If different agents have different limits, how could they be said to reach common knowledge? This objection is not decisive. In a game-theoretical setting for instance, it would be easy to model different types according to their cognitive level, with beliefs about each other's types, which
8 See Rubinstein (1989). 9 Rubinstein (1989) showed that in a game theoretical framework (that is, given a few additional parameters
such as the utility of the agents, the probability that a reply is lost…), a unique Nash equilibrium exists in which no one attacks. Whatever the number of exchanged messages, attacking is never the best solution. 10 This idea must not be mistaken for the fact that it can be rational to attack when only a finite number of messages have been sent. For example, Morris and Shin (1997) showed that by taking into account the probability that the enemy is vulnerable, Bayesian agents would decide to attack for a low enough probability that a message is lost, and that their attack would almost always occur when the enemy is actually vulnerable. One sent message is actually enough in this case to have both generals attacking. This case has nothing to do with cognitive limitations, since agents perfectly understand what every message means. 11 More precisely, they do not seem to distinguish between what they believe and what others believe. See
Wimmer and Perner (1983).
they would always take to be equal or inferior to their own. In the e-mail game, common knowledge would be reached whenever the number of exchanged messages exceeds every agent's threshold. The corresponding agents would then falsely believe that there is common knowledge. From the formal point of view, the idealization problem thus seems to be solved by the fixed-point definition or by truncated common knowledge, depending on the case. Arguments merely based on the difficulty of formalizing the latter do not prevent us from considering it a legitimate representation of ordinary common knowledge.12

2.2 Fallibilism and probabilistic common belief

2.2.1 The fallibilist objection

The second criticism is less prevalent but stronger. It starts from a fallibilist stance concerning what we know about other people's knowledge. Let us suppose that the statement E is publicly made before agents A and B without anything obstructing their perception. In this case, E will be said to be common knowledge among them. To define common knowledge is to find precisely what A and B know. Classical accounts imply that A and B will both know E, know that they both know E, and so on. It is clear that since they perceived E without interference, A and B know E: they have direct evidence on which their knowledge is grounded. But how could they know what the other knows? There is no way for A to be sure that B correctly heard the statement E: B might have misunderstood E or failed to hear it, being lost in his thoughts even while seeming to pay attention. Such cases are quite easy to construct. It is a common experience to realize that someone who gave every sign of understanding actually does not know what has been said—the knowledge that one agent thought he had about the other proved fallible. It is thus impossible to claim that there is not the slightest doubt about an agent's knowledge. There is no such thing as knowledge about another agent's knowledge. 
But the classical concept of common knowledge defines it as a set of individual knowledge statements both about the event and about others' knowledge—an account which is demolished by the existence of even the slightest such doubt. Consequently, classical common knowledge cannot exist among a group of humans. In other words, if something such as common knowledge for human beings exists, it is not adequately defined by the classical formal accounts. Note that this criticism must be distinguished from a skeptical stance which would deny the certainty of any knowledge, attacking the very possibility that knowledge exists. Fallibilism is a special case of the principle that other people's states of mind are opaque, itself a special case of the statement that knowledge of a fact is impossible when the available information cannot exclude the possibility of the contrary fact. Different kinds of knowledge seem to be part of common knowledge. In the case of a public event, one must set apart agents' perceptual knowledge—about the event—and 
their inferential knowledge—about each other's knowledge. Agents make inferences from the situation they perceive, which are not perceptually founded since there is no proof of what others know. Perceptual knowledge of an event would be expressed by formulas like K_i E, where E is not itself an event whose expression contains a knowledge operator; while inferential knowledge would be expressed by any formula containing nested knowledge operators (expressing knowledge about knowledge), for instance K_j K_i E.13 It should be clear from what precedes that the fallibilist principle only applies to inferential knowledge: it is precisely because inferential knowledge about other agents' mental states is not based on perceptual proofs but only on hints and personal interpretations that it is fundamentally fallible. Still, note that the cases where the fallibilist thesis is really worth mentioning are cases of perceptually based knowledge. It is when we perceive something that we most often tend to suppress any doubt about our knowledge—or more precisely, tend to think that we have acquired knowledge.

2.2.2 Probabilistic common belief

A second approximation of common knowledge is obtained by replacing knowledge by probabilistic belief in the classical characterizations—thus obtaining a concept of probabilistic common belief (or p-common belief). First, we formally define probabilistic belief. As for the knowledge operator, we start from a set Ω of states, a set I of N agents, and for each i ∈ I a partition Π_i of Ω. We now add a probability measure P over Ω.14 We can then define the belief at degree p (or p-belief) that E as follows: B_i^p E = {w : P(E | Π_i(w)) ≥ p}, where Π_i(w) is the element of Π_i which contains w. To state it informally: at world w, i believes with degree p that E is the case whenever the conditional probability that E is the case, given what he knows in w, is at least p. 
Similarly to the knowledge case, B_*^p E = ⋂_{i=1}^{N} B_i^p E defines the shared belief operator B_*^p, expressing what all agents believe with degree p. Probabilistic common belief is then straightforwardly obtained by replacing the knowledge operator with this new belief operator in the iterative definition as well as in the fixed-point one. This leads to the following couple of definitions: E is said to be common p-belief in w (written w ∈ C^p(E)) if and only if15:

Iterative definition: w ∈ (B_*^p)^m E for every integer m (that is, w ∈ B_*^p E, w ∈ B_*^p B_*^p E, w ∈ B_*^p B_*^p B_*^p E, etc.)
Fixed-point definition: There exists a p-evident event F (F ⊆ B_*^p F) such that w ∈ F and F ⊆ B_*^p E.
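The probabilistic operators can be sketched executably as well. In the toy model below, the prior P and the partitions are illustrative values of my own, not the paper's:

```python
# p-belief operators over a toy model: B_i^p E = {w : P(E | Pi_i(w)) >= p}.
STATES = (1, 2, 3, 4)
P = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}   # illustrative prior over the states
PARTITIONS = {
    "A": [{1, 2}, {3, 4}],
    "B": [{1, 2, 3}, {4}],
}

def cell(agent, w):
    """Pi_i(w): the cell of agent i's partition containing w."""
    return next(c for c in PARTITIONS[agent] if w in c)

def B(agent, event, p):
    """States where i's conditional probability of E is at least p."""
    out = set()
    for w in STATES:
        c = cell(agent, w)
        cond = sum(P[v] for v in c if v in event) / sum(P[v] for v in c)
        if cond >= p:
            out.add(w)
    return frozenset(out)

def B_star(event, p):
    """Shared p-belief: intersection of B_i^p E over all agents."""
    result = frozenset(STATES)
    for i in PARTITIONS:
        result &= B(i, event, p)
    return result

def common_p_belief(event, p, depth=30):
    """Finite truncation of the intersection of (B_*^p)^m E; on this small
    model the sequence stabilizes after a couple of steps."""
    result, current = frozenset(STATES), frozenset(event)
    for _ in range(depth):
        current = B_star(current, p)
        result &= current
    return result

E = frozenset({1, 2, 3})
print(sorted(common_p_belief(E, 0.75)))  # nonempty at p = 0.75
print(sorted(common_p_belief(E, 1.0)))   # empty: p = 1 behaves like knowledge
```

On these numbers, E is common 0.75-belief at states 1 and 2, while at p = 1 the set is empty; this illustrates the collapse of the probabilistic notions into common knowledge at p = 1 noted in footnote 19.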
An example of a p-evident event is a partially public event. For instance, F could be the event that the statement 'The president is dead' is being publicly made through a
13 Perceptual and inferential knowledge should also follow different axioms, as Dokic and Egré (2009) convincingly show. 14 To be precise, P is defined over the σ-field generated by the agents' partitions. 15 Taken from Morris (1999), p. 388.
television channel which is massively watched, without viewers being certain whether everyone else is watching. Probabilistic common belief was introduced precisely to answer the fallibilist problem16: even if agents cannot know what others know, they can often believe it to a high enough degree (in an ordinary conversation, the probability that an agent does not hear correctly or does not listen is arguably quite low). However, this solution still faces several problems of interpretation. First, the degree of belief is identical whatever the iteration level. In particular, there is no reason for an agent to believe to the same degree a fact, other agents' belief in this fact, their beliefs about A's belief, and so on. Indeed, common belief seems to require several degrees of belief: the degree of publicity of an event F, the degree of the belief in E entailed by a p-evident event F, the degree of the initial belief that F is the case…and these should not necessarily be the same. In particular, perceptual and inferential belief should not be treated similarly. Second, Morris (1999) formally shows that defining common belief is more complex than it seems. In fact, there are at least three possible probabilistic approximations of common knowledge: common p-belief (defined above), iterated p-belief and weak p-common belief. The difference between the first two is the following. 
Whereas in common p-belief, both agents believe that both agents believe…, in iterated p-belief, each agent believes that the other agent believes…That is, common p-belief is expressed by nested shared belief operators and iterated p-belief by nested individual belief operators.17 The last version relies on a surprising technical remark from Morris (1999) that 'more information can reduce the degree of common belief of an event'—agents being said to have more information when their knowledge partitions become finer.18 Morris then defines weak p-common belief as what is common p-belief given agents' current information or any worse information. This represents the highest degree of common p-belief that agents could attain given a partition, taking into account the fact that they may not be aware of some of their available information. All these notions are minimally satisfactory approximations since all are equivalent to common knowledge when p equals 1.19 The problem is that each of these approximations is justified by a different intuition, that for each there exist specific cases where it successfully warrants the same technical results as does common knowledge, but that they are not equivalent when p < 1. Moreover, each one has both iterative and fixed-point definitions.
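The contrast between the first two approximations can be checked directly. The sketch below (a toy two-agent model with illustrative values of my own, not from the paper) computes common p-belief via nested shared operators and iterated p-belief via the alternating individual chains of note 17, and verifies that the former entails the latter:

```python
# Common p-belief (nested shared operators) vs iterated p-belief (alternating
# individual operators, as in note 17). Toy two-agent model, illustrative values.
STATES = (1, 2, 3, 4)
P = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
PARTITIONS = {
    "A": [{1, 2}, {3, 4}],
    "B": [{1, 2, 3}, {4}],
}

def cell(agent, w):
    return next(c for c in PARTITIONS[agent] if w in c)

def B(agent, event, p):
    """B_i^p E = {w : P(E | Pi_i(w)) >= p}."""
    return frozenset(
        w for w in STATES
        if sum(P[v] for v in cell(agent, w) if v in event)
           / sum(P[v] for v in cell(agent, w)) >= p
    )

def B_star(event, p):
    """Shared p-belief."""
    result = frozenset(STATES)
    for i in PARTITIONS:
        result &= B(i, event, p)
    return result

def common_p_belief(event, p, depth=12):
    """Finite truncation of the intersection of (B_*^p)^m E."""
    result, current = frozenset(STATES), frozenset(event)
    for _ in range(depth):
        current = B_star(current, p)
        result &= current
    return result

def chain(outer, k, event, p):
    """Alternating chain of k individual operators, outermost = `outer`."""
    order, cur = [], outer
    for _ in range(k):
        order.append(cur)
        cur = "B" if cur == "A" else "A"
    e = frozenset(event)
    for agent in reversed(order):   # apply from innermost to outermost
        e = B(agent, e, p)
    return e

def iterated_p_belief(event, p, depth=12):
    """Intersection of all alternating chains up to length `depth`."""
    result = frozenset(STATES)
    for outer in ("A", "B"):
        for k in range(1, depth + 1):
            result &= chain(outer, k, event, p)
    return result

E = frozenset({1, 2, 3})
c, i = common_p_belief(E, 0.75), iterated_p_belief(E, 0.75)
print(sorted(c), sorted(i))
assert c <= i   # common p-belief entails iterated p-belief
```

On this particular model the two sets coincide; in general the failure of the conjunction property for B_i^p (note 17) allows them to come apart, while the entailment always holds.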
16 See Monderer and Samet (1989). 17 Formally: I^p E = B_1^p E ∩ B_2^p E ∩ B_1^p B_2^p E ∩ B_2^p B_1^p E ∩ B_1^p B_2^p B_1^p E ∩ B_2^p B_1^p B_2^p E…This last version is different since the p-belief operator does not have the conjunction property anymore. That is, B_i^p(E ∩ F) ⊆ B_i^p E ∩ B_i^p F, while this was an equality for the knowledge operator; see note 3. 18 In other words, in a given state and for a given p, the set of what is common p-belief may become smaller as the information partitions become finer. As a consequence, an event that was common p-belief with the original partitions may not stay so with finer ones. 19 As Morris (1999) states: 'Iterated 1-belief, common 1-belief and weak common 1-belief are all equivalent.' (p. 390); and see Proposition 11, p. 391.
As a result, no purely formal argument can help us decide which one adequately describes ordinary common knowledge. Weak p-common belief can be rejected, for it expresses a potential state: a common belief that could be reached but is not necessarily so. As for common p-belief and iterated p-belief, the choice is harder since these concepts are intuitively quite close to each other: their only difference is that in the former, agents have beliefs about the conjunction of their lower-level beliefs, not only about each other's beliefs. It is therefore straightforward to show that the former entails the latter. But do we want to describe ordinary common knowledge as the strongest reachable epistemic state or as the weakest one? Ordinary common knowledge could be taken either as what is closest to classical common knowledge or as its minimal core, which could be less vulnerable to cognitivist criticisms. No decisive technical argument can lead to the 'good' approximation without an understanding of what ordinary common knowledge should be. Consequently, I now turn to a philosophical analysis of common knowledge.
3 Informal approaches

3.1 Lewisian common knowledge

The concept of common knowledge was created by David Lewis in his seminal book about conventions. While the success of this concept's subsequent formalization had the side effect of concealing its original version, two recent papers (Cubitt and Sugden 2003; Sillari 2005) do justice to Lewis by reexplaining his approach, arguing for its unique characteristics and defending its relevance. Let me briefly recall the context. Lewis defined convention in a simple game-theoretical framework, as a solution to a coordination problem. To put it simply, a coordination problem exists when agents have to choose between several actions, such that everyone prefers that everyone acts similarly but where several acceptable, distinct combinations of actions exist. Whence the term 'coordination': agents who do not choose the same action as the others will be harmful to all. For instance, drivers can drive on the right or the left side of the road; which side is chosen ultimately does not matter, as long as almost everyone makes the same choice. In such situations, expectations of what others will do, and of what they will themselves expect, are thus crucial for an agent's choice. In most cases, Lewis took such expectations to be created by 'agreement, salience or precedent' (p. 52). He then built common knowledge to explain how mutual expectations between agents can arise from these kinds of situation. Common knowledge is defined as follows:

Definition 1 It is common knowledge in a population P that __ if and only if some state of affairs A holds such that:
1. Everyone in P has reason to believe that A holds.
2. A indicates to everyone in P that everyone in P has reason to believe that A holds.
3. A indicates to everyone in P that __ .
This account has several striking features and consequences. First, though informal, this definition is of the fixed-point kind, as shown by condition 2.20 Rather than being the infinite set of embedded expectations, common knowledge is what fundamentally causes them. These expectations are what agents need to coordinate, and common knowledge is but a sufficient condition leading to them. As Cubitt and Sugden (2003) remark: 'Lewis presents a set of conditions which he shows are sufficient to make [iterated reason to believe] true; that set of conditions is his definition of common knowledge' (p. 185). Second, common knowledge is built on the fundamental notion of 'reason to believe' and not on actual belief or knowledge - even the second crucial concept of 'indication' is wholly explained on this basis. Common knowledge provides justifications for agents to have certain beliefs but does not require that they actually form these beliefs. In other words, common knowledge has an externalist flavor: it depends more on the state of the world than on the agents' mental states. Or to put it differently again, it does not describe the epistemic state of a group but only their informational state (in the sense of the information which is available)21: it states the inferences that agents could make given the available hints. Third, Lewisian common knowledge is not grounded on deductive rationality. Quoting Lewis: 'What A indicates to x will depend […] on x's inductive standards and background information' (p. 53). It is sufficient that agents have the same reasoning standards, whether deductive or not; it is even sufficient that agents know each other's reasoning standards when they have different ones. Note that since indication depends on agents' personal reasoning and knowledge, the Lewisian conception of common knowledge is actually less externalist than it seemed at first sight. 
Lewisian common knowledge and formal common knowledge thus share a similar structure, but the former uses reason to believe and common reasoning standards whereas the latter requires knowledge or beliefs and deductive reasoning. Even if these are major differences, the gap between the two accounts is not that wide. Replacing general reasoning standards by deductive reasoning, for one, is a simplification that makes formal common knowledge a mere special case of Lewis’ one. And though replacing ‘reason to believe’ by ‘knowledge’ or belief seems to have more serious consequences, I want to stress that nothing in the definitions of formal common knowledge says that the embedded knowledge operators correspond to agents’ actual knowledge.22 It is a matter of interpretation of the partitional structures, which can be taken as expressing agents’ actual knowledge as well as their available information, or potential knowledge. That the first interpretation is preferred in technical works is understandable, because game theorists are usually interested in justifying people’s behavior. However, people are taken to act on the basis of their actual beliefs,
20 In the fixed-point definition of Sect. 2, F entails that everyone believes F. Lewis' condition 2 can be obtained by replacing 'entails' by 'indicates' and 'believes' by 'has reason to believe'. 21 It could also be called an evidential state. 22 I here oppose actual knowledge and information. This distinction is different from the one between
explicit and implicit knowledge. A piece of information can be available in a given context without even being implicit knowledge—for instance when the agent does not perceive a fact though it is salient. It is a piece of information that the person sitting next to me has a red shirt, but it will not even become implicit knowledge if she never enters my line of sight.
so the informational interpretation could hardly be used to explain or predict any behavior. This is not a limitation of the formal framework itself though. My point is that the gap between Lewisian and formal common knowledge is less due to a conceptual loss when passing from an informal description to a formal one than to the particular interpretation that theorists tend to make (and to the widespread domination of the iterative account). The previous three points were already emphasized in the two aforementioned papers. But several other aspects of Lewisian common knowledge are relevant to our discussion. The first, quite obvious one is that this concept escapes the two cognitivist criticisms, namely idealization and fallibilism. The idealization criticism becomes innocuous because Lewisian common knowledge constrains the agents' mental states much less.23 Even if it did constrain them, it has a fixed-point structure and explicitly makes mutual expectations—and thus potential epistemic iterations—mere consequences of common knowledge. Since there is no demand whatsoever about the inferences that agents should make, this common knowledge can easily be attained even by cognitively limited individuals. Nor does Lewisian common knowledge suffer from the fallibilist criticism: irreducible doubt about other people's mental states fits perfectly a description based on 'reason to believe'. Even if an agent cannot be sure of what another perceived or understood, she usually has good enough reason to believe that the other perceived or understood correctly what actually happened—at least in basic situations where a public statement has been made. But there is a price to pay for this realism. Lewisian common knowledge looks somewhat weaker than even ordinary common knowledge should be: for the definition guarantees no actual knowledge or belief. 
In a situation A satisfying all conditions (1)–(3) but such that no agent realizes that it holds,24 Lewisian common knowledge would exist, which seems counterintuitive. Moreover, the concepts of reason to believe and indication seem to admit of degrees. For example, a situation where a public statement is made by a man in a crowd can more or less indicate that it is being made, according to the context: the surrounding noise and the visibility of the speaking individual have an influence on its salience. Similarly, a reason to believe something can be more or less convincing: for example, an event observed ten, a hundred or a thousand times in the past does not equally indicate that it will happen again. When indications or reasons to believe are weak, one should not expect a situation of common knowledge.
This problem stems from the fact that Lewis wanted to warrant agents' choices of action. The main purpose of Lewisian common knowledge is to justify that conventions exist, that is, that agents facing a coordination problem keep acting according to whatever agreement, precedent or salience indicates. In other words, common knowledge has to justify why a certain choice is most often made in a certain situation. Consequently, it is far from being epistemically well defined, simply because different epistemic states can justify the same action. Morris (1999) showed that probabilistic common belief can sometimes lead to the same results as common knowledge. For this
23 It still weakly constrains them, since it determines what agents may come to believe in a given situation.
24 This is because indication is considered as an objective, or external, relation. E can indicate F to i
without i forming the relevant belief that F; i can have reason to believe E without believing it.
reason, in a situation of Lewisian common knowledge, even if agents come to actually believe what they have reason to believe, this belief may have a degree strictly less than one. However, a common knowledge definition should at least make sure that the beliefs to which it can lead are solid enough. In other words, common knowledge should not be defined on the mere basis of its consequences in terms of action. In a nutshell, Lewisian common knowledge seems weak enough to avoid the cognitivist criticisms, but not strong enough to adequately describe ordinary common knowledge. Its formalization (in Sect. 4) will confirm that this is indeed the case. But before making that step, it is necessary to clarify some essential aspects of common knowledge.
3.2 Orthodox interpretations
Independently of the assets and drawbacks of Lewis' account, the existing literature about common knowledge reveals that one of its consequences has been, if not misunderstood, at least underestimated. There is widespread agreement that epistemic iterations play a major role in defining common knowledge, and in particular that classical common knowledge, taken iteratively, represents ideal common knowledge, attainable by cognitively unbounded agents. Normal agents, able to infer but a few epistemic iterations, would have to make do with ordinary common knowledge. This I call the orthodox interpretation. I will argue for the opposing, unorthodox position that iterations, though important in defining common knowledge, have no impact on its fullness, and in particular that ideal agents do not reach a fuller common knowledge than ordinary agents.
As remarked in the previous section, according to Lewis epistemic iterations are consequences of common knowledge. Nonetheless, they are necessary to define it, since common knowledge has to produce these iterations, in the form of mutual expectations in coordination problems. Hence the intuition that the more epistemic iterations are actually known by agents, the closer they are to full or ideal common knowledge. This view has been entertained by most philosophers tackling the concept of common knowledge. Lewis (1969), to begin with, asserts that 'Anyone who has a reason to believe something will come to believe it, provided he has a sufficient degree of rationality' (p. 55) and that 'degrees of rationality we are required to have, to have reason to ascribe, etc., obviously increase quickly. This is why expectations of only the first few orders are actually formed' (p. 56). Cubitt and Sugden (2003) showed that agents who 'reason faultlessly' (viz. such that every reason to believe they have becomes actual belief) could reach any level of nested actual beliefs.
Sillari (2005), providing a formal framework where the transition from agents' reasons to believe to actual beliefs is stated in terms of awareness of formulas, holds that 'An ideal agent […] would be unboundedly (epistemically) rational', that is, would believe everything he has reason to believe. Gilbert (1989) contains a detailed, Lewisian account of common knowledge which illustrates my point more precisely. She defines ordinary common knowledge (that is, common knowledge for normal human beings) as the state in which ideal agents could reach classical common knowledge. What is interesting is that Gilbert shares my objective to
define ordinary common knowledge, and avoids all the aforementioned traps about common knowledge: she uses a fixed-point-like definition, takes common knowledge as a situation, ensures that agents perceive the basic necessary facts but requires no additional actual belief. However, her definition refers to full common knowledge and ideal agents. Why is this needed? Defining ordinary common knowledge of E, as she does, by saying that agents perceive each other perceiving E ensures the necessary fixed-point structure; from that basis, ideal agents would obviously hold an infinity of nested beliefs. The point is that it is the very same situation that leads normal agents to ordinary common knowledge and ideal agents to the infinite set of epistemic iterations. Defining the former from the latter is thus unnecessary at best.
None of these accounts claims that common knowledge amounts to a stack of epistemic iterations; all clearly state that it simply entails them. Rather, they overestimate the link between ideal rationality and common knowledge by constantly associating the two. But as a matter of fact, common knowledge is no more reachable by ideally rational agents than by rationally bounded ones. Why? Not because inductive reasoning standards differ from deductive reasoning standards. As Sillari (2005) remarks, ideal agents would hold any belief or knowledge that can be produced by using their reasoning standards, whatever those are. Replacing deduction by induction does not harm the power of ideal agents. I see three reasons why ideally and normally rational agents are on a par in a situation of common knowledge.
First, the fallibilist stance, which implies that agents can never be sure of what others actually know, believe or even perceive, affects agents without regard to their rationality.
That someone actually hears what seems to be within her hearing, understands what seems clear, notices what is in her line of sight, is ultimately contingent. Ideally rational agents have no more reason than normal agents to believe that this is the case. The path leading from a reason to believe to a belief is not ruled by rationality; failing to form an actual belief from a reason to believe is not being irrational.
Second, passing from a reason to believe to an actual belief is not a mere matter of reasoning but can depend on the context. If we agreed to meet tomorrow and you have never missed any of our previous meetings, I would have not only a reason to believe but also a full belief that you will come tomorrow. I can abandon the very weak belief that you might not come; your not coming is unlikely and would have no catastrophic consequences anyway, so it would not influence my decision to come. But I might refuse to bet a million euros on the fact that you will come, which indicates that I am not certain that you will. In other words, context has an influence over the inductive inferences we actually make, and this affects ideal agents as well as normal ones.
Third, the set of epistemic iterations is a very peculiar set of consequences. I claim that whether agents actually draw inferences from a situation of common knowledge has no influence on their level of common knowledge. Let us draw an analogy with mathematical reasoning. Consider the case of induction proofs: from a base case and an induction hypothesis (a property assumed for a given integer) that entails the property for the following integer, the property is proven to hold for every integer. In such cases, knowing the induction proof of a theorem suffices. No one would argue that agents do not know that the property holds for all integers because they cannot be
aware of the infinity of properties it implies.25 Similarly, agents need only acknowledge that a common knowledge situation is such in order to take the whole set of epistemic iterations for granted. A common knowledge situation is a manifest sign that epistemic iterations are reached at every level. There are not many different types of common knowledge situations, and their consequences are all generated identically: so agents do not need to infer even a small part of them.26 Consequently, their level of rationality is irrelevant. In a situation of common knowledge, actual beliefs are not used to take a decision but to rationally justify it.
As a direct consequence of these remarks, truncated (or finitely iterated) common knowledge is dismissed as a candidate to represent ordinary common knowledge. It could be compared with the situation of an engineer who, having built a jumping robot and observed that it jumps too high, corrects it by lowering the ceiling. What matters is not merely whether epistemic iterations are actually reached, but whether they can be reached. Moreover, as Lewis (1969) notices: '[…] one might guess that common knowledge is the only possible source of higher-order expectations. But it is not; there is a general method for producing expectations of arbitrarily high order in isolation' (p. 59). In other words, epistemic iterations are neither necessary nor sufficient for common knowledge. My analysis does not imply that truncated common knowledge is a useless concept but simply, paradoxical as it may sound, that it does not concern common knowledge.27 Fundamentally, attaining ordinary common knowledge is not a matter of rationality.
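The induction analogy can be made concrete with a textbook proof (my example, not one from the paper): two finite steps settle infinitely many instances at once.

```latex
\textbf{Claim.} For every integer $n \ge 1$:
$1 + 2 + \cdots + n = \tfrac{n(n+1)}{2}$.

\textbf{Base case.} For $n = 1$: $1 = \tfrac{1 \cdot 2}{2}$.

\textbf{Induction step.} Assume the claim for $k$. Then
\[
  1 + \cdots + k + (k+1) \;=\; \tfrac{k(k+1)}{2} + (k+1)
  \;=\; \tfrac{(k+1)(k+2)}{2},
\]
which is the claim for $k+1$.
```

Knowing these two finite steps is knowing the result for every n; no one surveys the instances one by one, just as recognizing a common knowledge situation settles the epistemic iterations at every level.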
This heterodox point of view is, to my knowledge, missing from existing papers, even though it implies important properties of common knowledge.28
4 Formalizing Lewisian common knowledge
4.1 Preliminaries
The common knowledge concept provided by Lewis seems to answer the cognitivist objections; it should also allow one to take into account a wide range of types of reasoning. While this is promising, it now needs a proper formalization. Lewis did
25 My claim may seem similar to the statement that as long as agents know a given set of axioms and reason according to certain deductive rules, their knowledge is the same whether they come to know all the theorems that can be demonstrated from these axioms and rules, only some of these theorems, or no theorem at all. That, however, is blatantly absurd and would amount to saying that there is no difference between logically omniscient and, say, logically ignorant agents. The case of induction proofs is different.
26 This answers the potential worry as to how agents could ever be in a cognitive state corresponding to the fixed-point definition, given that it is logically stronger than the infinity of iterations, themselves corresponding to an unrealistic cognitive state.
27 My account thus goes against the temptation to interpret Rubinstein (1989)'s 'almost common knowledge' and Bonnay and Egré (2009)'s 'token semantics' common knowledge as close to real common knowledge. Note that the latter state explicitly that their formalism is 'neutral' as to 'whether what this [semantics] accounts for is a cognitive illusion regarding common knowledge, or whether this gives a characterization of how common knowledge is actually attained' (p. 207). My account favours the cognitive illusion view.
28 The orthodox view still underpins many recent criticisms of common knowledge, such as those given
or implied by Parikh (2005), Peacocke (2005) and Tollefsen (2005).
not use any logical formulas, but his analysis is so precise that it is almost formal. I will now suggest that his vocabulary and definitions can be translated into the probabilistic belief framework, which will allow me to compare his account to the formal approximations of common knowledge presented in the previous section.
There are at least two intuitive ways to formalize realistic common knowledge with modal operators. The first, following Monderer and Samet (1989) and already described earlier, consists in using probabilistic modal operators B^p to represent beliefs, which allows one to deal with the fallibilist objection. The second, recently illustrated in Sillari (2005), concentrates on the cognitive idealization objection: it uses two distinct modal operators referring to agents' reasons to believe and awareness, such that agents form actual beliefs in a formula whenever they have reason to believe it and are aware of it (the set of basic formulas an agent is aware of is given from the start).
The latter way would be of little use here. Sillari (2005) managed to describe effectively the logic of agents drawing conclusions on the basis of their awareness that they are rationally bounded. But this kind of formalization cannot help us for two reasons. First, since reaching common knowledge is not a matter of attaining epistemic iterations, nothing significant can be accomplished by supposing that agents are aware of their bounded rationality. Second, the analysis in terms of awareness lacks any real descriptive power. Since the formulas of which agents are aware are given from the start in such a model, it leaves unanswered the question of what agents actually believe (are actually aware of) in situations of common knowledge.
These reasons do not directly support the use of probabilistic modal operators, though. That common knowledge has little to do with awareness does not help to choose between classical and probabilistic modal operators.
The question is the following: in common knowledge situations, can reasons to believe be represented by a unique operator, or must they be considered as having various degrees? The introduction of degrees makes a formalism more cumbersome; such a choice therefore has to be justified. I claim that these two kinds of operators are not in opposition but rather complement each other. A fact can be more or less supported by evidence. The strength of this support depends both on the logical links between the evidence and the fact, and on an agent's private prior information. In that sense, reasons to believe admit of degrees. But agents also reason according to inductive inference rules that allow them to draw simplified or generalized conclusions from what they perceive: in this regard, what is objectively a strong reason to believe a fact could actually constitute a full reason to believe, that is, be such that it always leads a human agent to form a full belief that this fact is the case. This inductive link from partial reasons to full reasons could only be formalized by a satisfactory inductive logic, which is notoriously lacking. Nonetheless, one who wishes to fully describe a situation of common knowledge has to go down to the partial reasons to believe, that is, to understand the situation without any inductive bias. My point will be that common knowledge can exist even when some reasons to believe are partial (that is, have a strength strictly less than 1). Thus my analysis neither opposes nor dismisses that of Sillari (2005) but is simply located at a different level.
4.2 First steps
4.2.1 Reason to believe
To formalize Lewis' fundamental notion of 'reason to believe' it is necessary to avoid a purely deductive view of 'reason' and to give some space to an inductive one. I will take 'reason to believe' as '(sufficiently) good reason(s) to believe'. At the least, one has good reason to believe a fact when one's reasons to believe the opposite fact are less compelling. A straightforward approach thus consists in translating 'the agent i has reason to believe E' by: ∃p > 1/2 : B_i^p E. The value 1/2 is taken as a limit simply because B_i^p E = B_i^{1−p} \E: if an agent believes E with degree p > 1/2, she will believe the opposite event \E with degree 1−p < 1/2;29 she thus has minimally sufficient good reason to believe E. A good reason should be strong enough to exclude the opposite conclusion. The limit 1/2 must not be seen as uniquely right, though; it is simply the minimal one. Nothing in the following discussion actually depends on the exact value 1/2: any limit L ≥ 1/2 could be chosen. If the chosen condition were p = 1, only a deductively valid certainty could be a good enough reason to believe, which would lead to classical common knowledge. If a small p > 0 were chosen, it would amount to taking 'reason to believe' as 'any reason to believe', which would not fit, since one can have a weak reason to believe a rather improbable event.
Two potential objections must be answered at this point. First, the modeling may seem flawed already, for it uses a belief operator to represent reasons to believe, which, I emphasized, are considerably different and in particular do not invariably lead to beliefs. The immediate answer is that what changes is the interpretation given to the partition corresponding to the epistemic states of the agents: the cells here represent available information that is not necessarily perceived or inferred by the agent.
Second, it may seem more appropriate to replace the knowledge operator with a non-probabilistic belief operator, by merely removing the condition that what is known must be true, rather than to introduce probabilities that make the model more complex. The probabilistic move seems necessary though, both because it is an intuitively appealing answer to the fallibilist problem and because what I want to describe is the epistemic state of agents before they make the inductive inferences that lead to belief formation. In other words, the focus is put on the agents' evidential state, and for that reason taking into account the strength of the evidence is a natural move. Introducing probabilities also allows one to formally derive realistic properties of reasons to believe that had not been emphasized before.30
4.2.2 Indication
The next notion is that of indication. Reading Lewis again: 'Let us say that A indicates to someone x that __ if and only if, if x had reason to believe that A held, x would
29 In other words, if we use a modal operator R to formalize 'reason to believe', the requirement p > 1/2 entails the usual property of consistency, that is, that Rφ → ¬R¬φ.
30 See the discussion of Cubitt and Sugden’s axioms in Sect. 4.3.
thereby have reason to believe that __. What A indicates to x will depend, therefore, on x's inductive standards and background information.' That means that any good reason to believe A must entail a good reason to believe __. The previous formalization of a 'reason to believe' straightforwardly leads to a formalization of indication. I will say that F indicates to i that E if and only if:

∃f_i : (1/2, 1] → (1/2, 1] ∀p > 1/2 : B_i^p F ⊆ B_i^{f_i(p)} E

Whenever i has reason to believe F (viz. with a given degree p > 1/2), he must have a good reason to believe E (with degree p′ = f_i(p) > 1/2).31 The functions f_i are needed in order to allow for every possible relation between the degrees of belief in F and in E.32 According to Lewis' definition and to our interpretation, there will be an indication whenever both degrees are greater than 1/2.
The f_i functions aim at expressing the nature of the relation between F and E. A problem arises from the fact that they are not unique. If f_i, with f_i(p) = q > 1/2, is an indication function for F and E, then for instance f_i′ defined by f_i′(p) = (f_i(p) + 1/2)/2 is another one, since for all p, f_i(p) > f_i′(p) and f_i′(p) > 1/2. The chosen function has to be a maximal one, that is, such that for all p > 1/2 there exists no q > f_i(p) with B_i^p F ⊆ B_i^q E. Such functions will be said to represent the indication of E by F for agent i. This maximality property will be implicit in everything that follows. As a consequence, the f_i functions are non-decreasing.
Would not some simpler formulas adequately translate Lewis' concept of indication? The following examples suggest that this is not the case:

F ⊆ E : This is not an indication relation but one of logical consequence. It entails any formula of the kind B_i^p F ⊆ B_i^{p′} E with p ≥ p′. It is thus limited to particular f_i functions satisfying p ≥ f_i(p).

F ⊆ B_i^q E : This would entail any formula B_i^p F ⊆ B_i^q E, even when p is very small. If F indicates E to i in that sense, then believing with a very small degree in F provides a good reason to believe E: but if a volcano eruption indicates that the city below will be destroyed, I should not decide to move to another city only because I am not completely sure that the volcano will not erupt during the next fifty years. In other words, if F indicates E, F being implausible should not entail a strong belief in E.

P(E|F) ≥ p : This formula can be taken as representing the material, or objective, indication relation, which represents the causal links between F and E.33 However, Lewis' indication is relative to an agent's knowledge and to his background information, so it does not coincide with the material indication. Given P(E|F) = p, one could build an information structure where B_i^1 F ⊆ B_i^0 E, and another where B_i^q F ⊆ B_i^1 E with a very small q. The former formula implies that even if, in general, smoke materially indicates that there is a fire with a probability p, it may not indicate it to agent i when it is a particular smoke not related to a fire, which i is able to recognize (because, say, of his experience as a fireman). In general, P(E|F) = p will entail B_i^1 F ⊆ B_i^p E only when E is an element of i's information partition set, that is, when i is perfectly able to perceive the event E, nothing more, nothing less.

31 The above formula may seem complex; this is due to the fact that a reason to believe has been defined from a strict limit. Suppose that we instead define 'i has reason to believe E' by B_i^p E where p ≥ L (and L > 1/2). The formula becomes: ∃f_i : [L, 1] → [L, 1] ∀p ≥ L : B_i^p F ⊆ B_i^{f_i(p)} E. Note that this condition holds as soon as B_i^{1/2} F ⊆ B_i^L E with L > 1/2. Along with the property B_i^q E ⊆ B_i^p E if q ≥ p, this formula ensures that for all p > 1/2, B_i^p F ⊆ B_i^L E, so the general indication formula will be satisfied with the constant function f_i(p) = L. Though intuitively straightforward, the formal definition of indication is made more complex by choosing p > 1/2 instead of p ≥ 1/2. But the definition of 'reason to believe' makes this choice necessary.
32 Note that the formulation ∃f_i : (1/2, 1] → (1/2, 1] ∀p > 1/2 : B_i^p F ⊆ B_i^{f_i(p)} E is equivalent to ∀p > 1/2 ∃q > 1/2 : B_i^p F ⊆ B_i^q E. Although it seems more complex, the former should be preferred because using a function allows one to capture the nature of the indication relation between F and E. There are multiple ways in which an event can indicate another, that is, in which the ps and the qs can be related. The functions f_i encompass all the relations between degrees of belief in F and degrees of belief in E.
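On a finite model these definitions can be checked mechanically: each B_i^p F is a union of partition cells, so the quantification over thresholds p reduces to a cell-by-cell test. The following sketch is my own toy illustration (the model, names and numbers are not the paper's):

```python
from fractions import Fraction as Fr

# Toy model (my own): four worlds with a prior, one partition per agent.
worlds = range(4)
prob = {0: Fr(2, 10), 1: Fr(3, 10), 2: Fr(3, 10), 3: Fr(2, 10)}
# Agent 0 distinguishes {0,1} from {2,3}; agent 1 distinguishes nothing.
partitions = {0: [{0, 1}, {2, 3}], 1: [{0, 1, 2, 3}]}

def cell(i, w):
    """The element of agent i's partition containing world w."""
    return next(c for c in partitions[i] if w in c)

def cond(event, given):
    """P(event | given), events being sets of worlds."""
    return sum(prob[w] for w in event & given) / sum(prob[w] for w in given)

def B(i, event, p):
    """Belief operator B_i^p: worlds where i's information gives the
    event probability at least p."""
    return {w for w in worlds if cond(event, cell(i, w)) >= p}

def has_reason_to_believe(i, w, event):
    """Exists p > 1/2 with w in B_i^p(event): i's cell gives the event
    probability strictly above 1/2."""
    return cond(event, cell(i, w)) > Fr(1, 2)

def indicates(F, E, i):
    """F indicates E to i: for every p > 1/2 some q > 1/2 gives
    B_i^p(F) within B_i^q(E). Since B_i^p(F) is a union of cells, this
    holds iff every cell giving F probability above 1/2 also gives E
    probability above 1/2."""
    return all(cond(E, c) > Fr(1, 2)
               for c in partitions[i] if cond(F, c) > Fr(1, 2))
```

Here agent 0 can tell {0, 1} from {2, 3}, so {0, 1} indicates {0, 1, 2} to her, while the fully uninformed agent 1 has no reason to believe {0, 1} (her only cell gives it probability exactly 1/2).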
Lewis' indication relation thus is not objective but partly subjective. Whether F indicates E to agents depends not only on the relation between F and E, but also on the agents' private information.34 Note finally that even if the above formalism contains modal operators with degrees, the concepts of 'having reason to believe' and 'indication' do not. This is consistent with the analysis given in the previous section: while the (partly subjective) evidence for an event can be more or less strong, a reason to believe is what would lead an agent to form an actual non-probabilistic belief by means of inductive inferences.
4.3 Common knowledge
Lewis defines common knowledge as follows: 'Let us say that it is common knowledge in a population P that __ if and only if some state of affairs A holds such that:
33 Formalizing Dretske (1988)'s indication relation, Godfrey-Smith (2009) states: 'For s to indicate that p, Dretske requires that the probability of p be at least close to 1, given signal s […] How close to 1 is "close"? Why not be generous?' (p. 289). He then locates the limit at 1/2, while noting that some authors chose an even smaller one. Independently of the value of the limit, the crux is that this is a notion of objective indication.
34 One might object that indication should not be described by a material conditional but by a conjunction such as 'i believes F' and some consequences of this fact. But this interpretation goes against Lewis, who needs a separate clause to specify that the agents have reason to believe that F is the case. Moreover, such an interpretation is not needed, since indication is taken as a partly objective relation between events: the indication relation holds between one event and the epistemic consequences it would have for an agent if it were believed to be the case.
(1) Everyone in P has reason to believe that A holds.
(2) A indicates to everyone in P that everyone in P has reason to believe that A holds.
(3) A indicates to everyone in P that __.'
The population is the set I of agents. The state of affairs A will be described as a set of possible worlds and will not formally differ from a proposition: sets of possible worlds can equally represent facts, events or whatever fits a description. That A holds means nothing else than that the actual world belongs to A. This leads to the following formal definition of Lewisian common knowledge:

Definition 2 Let I be a set of agents, Ω a set of possible worlds, for all i of I a partition Π_i of Ω, P a probability measure and B_i the associated belief operators. E ⊆ Ω is said to be common knowledge among I in w ∈ Ω if and only if there exists F ⊆ Ω such that:

(a) w ∈ F
(b) ∃r > 1/2 ∀i ∈ I : w ∈ B_i^r F
(c) ∃s > 1/2 ∀i, j ∈ I ∃f_{i,j} : (1/2, 1] → (1/2, 1] ∀p > 1/2 : B_i^p F ⊆ B_i^{f_{i,j}(p)} B_j^s F
(d) ∀i ∈ I ∃g_i : (1/2, 1] → (1/2, 1] ∀p > 1/2 : B_i^p F ⊆ B_i^{g_i(p)} E

Condition (a) expresses the fact that F holds; (b), (c) and (d) are the respective translations of Lewis' conditions (1), (2) and (3). What this definition says is that whenever there exist variables F, r, s, f and g satisfying these relations for an event E, then E can be said to be common knowledge among the agents. This framework is usually used to define approximate common knowledge, but what it approximates is ideal common knowledge. The above definition renders explicit the core of ordinary common knowledge, as I will argue later. In other words, full ordinary common knowledge is adequately represented by approximate formal common knowledge.
The variable r represents the minimum degree with which every agent has reason to believe that F holds in the current state. It corresponds to the general reason among agents for believing that F is the case; and f is where the fallibilist limitation is expressed. If we rather chose to base the definition on sets of subjective variables r_i and s_i, r and s could still be defined as their respective minima. This choice makes the already unwieldy formulas a little easier to read. In contrast, although f_{i,j} and g_i could have been replaced by general functions f and g, indexed functions were conserved to emphasize the subjective aspect of the indication relation. This leads to a definition of (r, s, f, g)-common knowledge, depending on four parameters.35 There is reason to think that the number of parameters would not bother Lewis, as he himself ended up defining convention with nothing less than six parameters.36

35 Note that in the definition of common knowledge, r, s, f and g are variables rather than parameters. Equivalently, we could give a definition of (r, s, f, g)-common knowledge where r, s, f and g are parameters given from the start; the conditions would be obtained from those of Definition 2 by merely removing the existential clauses. For that reason, I refer to them as parameters in the remainder of the paper.
36 Lewis (1969), p. 79.
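Definition 2 can likewise be checked on a finite model. The sketch below is my own toy example, not the paper's implementation: r and s are supplied as parameters in the spirit of (r, s, f, g)-common knowledge, and the functions f and g are absorbed into a cell-wise indication test.

```python
from fractions import Fraction as Fr

# Toy model (my own illustration): two agents who both observe a basis
# event F = {0} and can distinguish it perfectly.
worlds = range(2)
prob = {0: Fr(9, 10), 1: Fr(1, 10)}
partitions = {0: [{0}, {1}], 1: [{0}, {1}]}
agents = list(partitions)

def cell(i, w):
    return next(c for c in partitions[i] if w in c)

def cond(event, given):
    return sum(prob[w] for w in event & given) / sum(prob[w] for w in given)

def B(i, event, p):
    return {w for w in worlds if cond(event, cell(i, w)) >= p}

def indicates(F, E, i):
    # For every p > 1/2, B_i^p(F) lies in some B_i^q(E) with q > 1/2:
    # equivalent, on a finite model, to the cell-wise test below.
    return all(cond(E, c) > Fr(1, 2)
               for c in partitions[i] if cond(F, c) > Fr(1, 2))

def lewis_common_knowledge(E, w, F, r, s):
    """(r, s, f, g)-common knowledge of E at w via basis F, following
    Definition 2; f and g are handled inside `indicates`."""
    a = w in F                                              # (a) F holds
    b = all(w in B(i, F, r) for i in agents)                # (b)
    c = all(indicates(F, B(j, F, s), i)                     # (c)
            for i in agents for j in agents)
    d = all(indicates(F, E, i) for i in agents)             # (d)
    return a and b and c and d
```

With F = {0} observed at w = 0, the event E = {0} comes out as common knowledge, while E = {1} does not, since F gives no agent reason to believe it.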
Note that this definition is not limited to common knowledge arising from public events, but can easily be adjusted to fit common knowledge arising from typical events. This latter kind exists among agents who do not need to know each other, but who reach common knowledge because they know they observed similar events. For instance, it describes the common knowledge among several French people that Nicolas Sarkozy is the French president. Being French implies having had a great number of opportunities to learn who the president is, and knowing that every other French person has had as many opportunities, although different people did not acquire this knowledge during a common event. As Cubitt and Sugden (2003)'s formalization suggests, a slight modification using a set of evident events (one for each agent) leads to an adapted definition of common knowledge. I will not provide this definition, since its structure is basically identical to that of common knowledge arising from a public event.
Does this definition comply with what is required of common knowledge? Cubitt and Sugden (2003) enumerated six logical properties that a correct formal indication relation should possess, with these notations: 'R_i(x)' means that i has reason to believe x, and 'A ind_i x' that A indicates to i that x. The following properties hold for all persons i, j, for all states of affairs A, A′, and for all propositions x, y:
(A1) [R_i(A holds) ∧ (A ind_i x)] ⇒ R_i(x).
(A2) [(A holds) entails (A′ holds)] ⇒ A ind_i (A′ holds).
(A3) [(A ind_i x) ∧ (A ind_i y)] ⇒ A ind_i (x ∧ y).
(A4) [(A ind_i [A′ holds]) ∧ (A′ ind_i x)] ⇒ A ind_i x.
(A5) [(A ind_i x) ∧ (x entails y)] ⇒ A ind_i y.
(A6) [(A ind_i R_j[A holds]) ∧ R_i(A ind_j x)] ⇒ A ind_i R_j(x).
(A1), (A2), (A4) and (A5) are straightforwardly verified by the belief-operator-based indication. But some issues arise with (A3) and (A6). (A3) is not satisfied. Note that this applies to any type of probabilistic common belief. It is a natural property of the B_i^p operators that they are not stable under conjunction: I can believe with degree p that it is raining in Tokyo and also with degree p that it is raining in London without believing with the same degree p that it is raining in London and in Tokyo at the same time. Yet it seems an intuitive property of indication that if a situation indicates one fact and another, then it indicates both facts. Still, there are cases where this does not hold. Imagine that I observe a friend listening to a talk; for every sentence, the situation indicates to me that he heard and understood it. But does the situation indicate that he heard and understood all the sentences? Given a long enough talk, the small probability that he could have been distracted at a given moment leads to a much greater probability that he was distracted at least once. Nonetheless, unless there is a hint that he actually was distracted, the situation seems to indicate that he was not. This question pertains to the more general problem of formalizing inductive reasoning, which is notoriously difficult and still open.37
37 Moreover, also note that Cubitt and Sugden (2003) only need (A3) to prove one result, the 'iterated belief theorem' (p. 193; see the proof p. 205), which implies that agents who reason faultlessly will form nested actual beliefs of any level.
123
270
Synthese (2011) 183:249–276
This conclusion does not mean that the formalization work failed. My aim is to describe precisely the epistemic state in a situation of ordinary common knowledge. At the least, the formal account of common knowledge is entailed by a situation of common knowledge. To have reason to believe A entails that I could have a strong enough belief that A holds. The problem stems from the fact that induction leads agents to replace these strong beliefs with certain beliefs: whereas they only have grounds to entertain a p-belief, they simply take it as a sure thing. The model describes the epistemic state before the inductive reasoning38; for that reason, it can only make manifest some consequences of having reason to believe and of indication, but it cannot provide necessary and sufficient conditions for them. The formalization goes from the world to the model: it describes the epistemic consequences of a situation, but it does not allow one to detect reasons to believe or indication relations between events in a given information structure.
The case of (A6) is different. (A6) is trivially satisfied but cannot be made explicit, for the probabilistic belief framework does not allow us to formalize a reason to believe that a fact indicates something: reasons to believe concern events (sets of states) but not relations between events. That creates no problem, though: (A6) basically means that an agent knows what kind of reasoning the others make, granting the possibility that different agents reason differently. To quote Cubitt and Sugden (2003), 'the inductive standards that i uses […] may be different from the inductive standards used by others'. But the (A6) property is automatically satisfied in a framework where all agents have only one available kind of reasoning, or more exactly, kinds of reasoning that lead them to form the same beliefs from the same evidence.
(A6) is thus satisfied, though not explicitly so, for Cubitt and Sugden's (2003) approach is more general, while I chose one mode of reasoning and expressed it through its semantic consequences. I do not provide an analysis of inductive reasoning, but properties of the beliefs of agents who reason inductively.
The formalization of Lewisian common knowledge seems close to formal p-common belief. Let us call the former (r, s, f, g)-(L), and the latter p-(C). What is their relation? An equivalence result exists, as the following propositions show39:
Proposition 1 ∀u ∃r, s, f, g [u-(C) ⇒ (r, s, f, g)-(L)].
Proposition 2 ∀r, s, f, g ∃u [(r, s, f, g)-(L) ⇒ u-(C)].
Proposition 3 ∀u ∃r, s, f, g [u-(C) ⇔ (r, s, f, g)-(L)].
This shows that for every common p-belief (for p > 1/2), a Lewisian common knowledge exists, and vice versa. The first two propositions alone provide a weak
Footnote 37 continued: epistemic iterations of any level (when the fact that they reason faultlessly is self-evident). However, this conclusion is precisely one that previous sections argued should be avoided. In other words, in Cubitt and Sugden's (2003) reconstruction of common knowledge, the only result that rejecting (A3) threatens is one that is not desirable in the first place.
38 Indeed, the uncertainties that arise from the fallibilist objection can only be present in a model that formalizes the epistemic state of agents before the inductive reasoning, because what inductive reasoning does is precisely transform such high-degree beliefs into plain beliefs.
39 The proofs, together with more explicit formulations of the propositions, are given in the appendix.
equivalence; starting from u-(C) gives us (r, s, f, g)-(L) (by Proposition 1), which in turn entails another u′-(C) by Proposition 2, but there is no a priori reason why u′ should be equal to u. However, it is straightforward to prove Proposition 3, which states that there is a strong equivalence as well. Common knowledge will never disappear by repeated application of Propositions 1 and 2: it will never go down to a u-(C) with u < 1/2. We now have a precise and detailed definition; it is a descriptive one, but it stays linked to the convenient common p-belief, so the great number of technical results to which the latter leads are not lost.

4.4 Fine-tuning common knowledge

I now proceed towards the definition of ordinary common knowledge. Since Lewis' definition has some flaws, and since common knowledge can hardly be distinguished from common belief by observing the actions it entails, the surest way towards a satisfactory definition starts from Lewis' account and strengthens it to come as close as possible to classical common knowledge. For that, I first explain how certain parameters of the formalized Lewisian common knowledge should be interpreted. The fallibilist criticism relies on the impossibility of certainty regarding other people's knowledge or beliefs about an event, even when one believes with certainty that the event has occurred. Degrees of belief can take that into account by letting them express our estimation of the chances of a misunderstanding, of distraction, or of forgetfulness. In the previous definitions, the fallibilist uncertainty is easily expressed through the p-evidence property of an event: what fallibilism destroys is the certainty about the knowledge others have of an event, public as it may seem. But the formal characterizations do not all have the same clarity. Their properties are:
For common p-belief:
(pE) F ⊆ B_*^p F
For the Lewis-inspired formalization:
(c) ∃s > 1/2 ∀i, j ∈ N ∃f_ij : (1/2, 1] → (1/2, 1] ∀p > 1/2: B_i^p F ⊆ B_i^{f_ij(p)} B_j^s F
Common p-belief seems to allow one to formalize ordinary common knowledge by expressing the fallibilist uncertainty through a degree p = 1 − ε < 1. But we saw that this uncertainty also affects the way agents perceive an event F: (ii) entails that everyone will only have good reasons to believe F occurred. What (ii) says is that the degree of belief is at least p, so it could be 1; but what is needed is a definition saying that the degree of belief in the event must be 1. Common knowledge implies something like common p-belief, but is not entailed by it. The limited precision of common p-belief is the price of its practical utility. On the other hand, cumbersome as it may seem, my formalization provides the required precision, although that was not its main goal: remember that it is a mere formal translation of Lewis' view. Setting all the parameters of (r, s, f, g)-(L) equal to 1 would entail common 1-belief, which is formally as close to classical knowledge as can be. But every parameter must actually be adjusted separately.
The r parameter expresses the degree of awareness that F holds in the actual world. It has to be set to 1, and the belief has to be taken as an actual one: in common knowledge cases, agents must know that F is the case. The g parameter expresses the strength with which F indicates E; it is sufficient that g(p) be greater than or equal to p, so that agents always have at least as much reason to believe E as F. s represents the global publicity degree of F, which depends on the way all agents perceive F; it must also be set to 1. f represents the degree of belief one can have about the beliefs of other agents: since this is the role played by the fallibilist limit, f(p) can never be greater than 1 − ε for a small ε > 0. As a result we get the following definition of ordinary common knowledge:

Definition 3 Let I be a set of agents, Ω a set of possible worlds, and for all i of I a partition Q_i of Ω. A probability measure and belief operators B_i^p being given, E ⊆ Ω is said to be common knowledge among I in w ∈ Ω if and only if there exist F ⊆ Ω and ε ∈ (0, 1) such that:
(a′) w ∈ F
(b′) ∀i ∈ N, w ∈ B_i^1 F
(c′) ∀i, j ∈ N ∃f_ij : (1/2, 1] → (1/2, 1 − ε] ∀p > 1/2: B_i^p F ⊆ B_i^{f_ij(p)} B_j^1 F
(d′) ∀i ∈ N ∀p > 1/2: B_i^p F ⊆ B_i^{g(p)} E (where ∀p, g(p) ≥ p)
The major differences between classical and ordinary common knowledge appear in two distinctive properties. First, when ordinary common knowledge holds, agents do not know that it holds but at best strongly believe it, since they cannot be sure of each other's correct perception and understanding. This is why common knowledge is an externalist notion: not only because it depends on the mental states of more than one agent, but because its existence can only be proven by the omniscient modeler, who has access to agents' mental states. Common knowledge partly depends on the available evidence or information, that is, on the state of the world. Second, and consequently, inferences about ordinary common knowledge are defeasible.40 In most ordinary common knowledge situations, inductive reasoning leads agents to behave as if they had classical knowledge, ignoring the faint possibility that someone might not correctly grasp the situation. They can thus realize afterwards that there was actually no common knowledge among them. This case is different from that of a false common belief, in which what is false is the belief; in a false assumption of common knowledge, the knowledge is falsely thought of as common.
5 Conclusion

By providing an account of ordinary common knowledge, I intended to show that most attacks against the concept are not justified. At the price of an interpretation of Lewis' analysis that stresses the inductive nature of the reasoning used for common
40 To my knowledge, Gilbert (1989) was the first and only one to notice this.
knowledge, I obtained a satisfying formalization of the real common knowledge phenomenon. It precisely describes the kind of epistemic state the members of a group are in. The price to pay for the descriptive virtues of this formalization is the increased difficulty of manipulating it. Still, the weak equivalence result allows us to keep using the handier common p-belief to study theoretical cases. The accuracy of the concept is thus increased without impairing its explanatory power. Formal approximate common knowledge seems to be the right formalization of the ordinary common knowledge phenomenon for human beings.
The overall account is the following: common knowledge is not reducible to mutual expectations or epistemic iterations, whether of infinite or finite number. It is made of few explicit beliefs; it is mainly characterized by a given situation, filtered through agents' personal information and reasoning standards. Common knowledge has an irreducible externalist side, and it is defeasible.
The formalization participates in reconciling the formal and informal analytical trends in the study of common knowledge, which has always been a distributed concept: used in various fields, taken intuitively or formally depending on the case, used in analyses of real-world phenomena as well as to establish mathematical results. Game theorists and economists developed formal versions, while linguists and philosophers tended to favour non-formal ones. These separations, though real, do not mean that there are several concepts of common knowledge, nor that different accounts cannot be compared. The subtleties of Lewis' analysis are not an obstacle to the formalization of a common knowledge concept, once slightly modified. My hope is that the nature of common knowledge will be perceived more clearly now that it stands within a more unified background, cleared of some misdirected criticisms.

6 Appendix

6.1 Proofs

We first recall the definition of p-belief.
Ω is a set of states and I a set of N agents; for each i ∈ I, Q_i is a partition of Ω, S_i the sigma-field generated by Q_i, and P a probability measure over the sigma-field S generated by the intersection of the agents' partitions. Q_i(w) is the element of Q_i which contains w. The belief at degree p (or p-belief) that E is defined by B_i^p E = {w : P(E | Q_i(w)) ≥ p}. The shared belief operator is B_*^p E = ∩_{i=1}^N B_i^p E.
The belief operator has the following elementary properties (taken from Morris (1999), p. 387, and relabelled):
(P1) If F ∈ S_i then F = B_i^p F.
(P2) If q ≥ p then B_i^q E ⊆ B_i^p E.
(P3) If F ⊆ E then B_i^p F ⊆ B_i^p E.
(P4) ∀i, B_*^p F ⊆ B_i^p F.
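Since the operators just recalled are purely set-theoretic, they can be illustrated computationally on a finite model. The following is an editorial sketch, not part of the original text: the four-state space, uniform measure, and partitions are all assumptions chosen for illustration. It checks properties (P2)–(P4) by brute force, and also exhibits the failure of conjunction-stability discussed in connection with (A3).

```python
# Toy finite model (editorial illustration): four states, uniform probability,
# one information partition per agent.  All numbers are assumptions.
OMEGA = frozenset({0, 1, 2, 3})
P = {w: 0.25 for w in OMEGA}

PARTITIONS = {
    1: [frozenset({0, 1}), frozenset({2, 3})],
    2: [frozenset({0, 2}), frozenset({1, 3})],
    3: [frozenset({0, 1, 2}), frozenset({3})],
}

def cell(i, w):
    """The element of agent i's partition that contains state w."""
    return next(c for c in PARTITIONS[i] if w in c)

def prob(event):
    return sum(P[w] for w in event)

def B(i, p, event):
    """p-belief operator: the states w at which P(event | cell(i, w)) >= p."""
    return frozenset(w for w in OMEGA
                     if prob(event & cell(i, w)) / prob(cell(i, w)) >= p)

def B_star(p, event):
    """Shared p-belief: intersection of all agents' p-belief events."""
    out = OMEGA
    for i in PARTITIONS:
        out &= B(i, p, event)
    return out

E = frozenset({0, 1})
assert B(1, 0.9, E) <= B(1, 0.5, E)                             # (P2)
assert B(1, 0.6, frozenset({0})) <= B(1, 0.6, E)                # (P3)
assert all(B_star(0.6, E) <= B(i, 0.6, E) for i in PARTITIONS)  # (P4)

# p-belief is not stable under conjunction (the point made about (A3)):
E1, E2 = frozenset({0, 1}), frozenset({1, 2})
assert B(3, 0.6, E1) == frozenset({0, 1, 2})   # each conjunct is 0.6-believed
assert B(3, 0.6, E2) == frozenset({0, 1, 2})
assert B(3, 0.6, E1 & E2) == frozenset()       # their conjunction is not
```

Agent 3's three-state cell plays the role of the talk example: each "sentence" event has conditional probability 2/3 ≥ 0.6, but their intersection only 1/3.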
Definition 1 (u-(C)) E is common knowledge in w if and only if there exist u > 1/2 and F such that:
(i) w ∈ F
(ii) F ⊆ B_*^u F
(iii) F ⊆ B_*^u E

Definition 2 ((r, s, f, g)-(L)) E is Lewisian common knowledge in w if and only if there exists F such that:
(a) w ∈ F
(b) ∃r > 1/2 ∀i ∈ I, w ∈ B_i^r F
(c) ∃s > 1/2 ∀i, j ∈ I ∃f_ij : (1/2, 1] → (1/2, 1] ∀p > 1/2: B_i^p F ⊆ B_i^{f_ij(p)} B_j^s F
(d) ∀i ∈ I ∃g_i : (1/2, 1] → (1/2, 1] ∀p > 1/2: B_i^p F ⊆ B_i^{g_i(p)} E
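Definition 1 can be checked mechanically on a finite model by searching over all candidate events F. The sketch below is illustrative only; the shared partition and the threshold u = 0.9 are assumptions chosen so that common u-belief of E holds at some states and fails at others.

```python
# Self-contained toy check of Definition 1 (u-(C)): is E common u-belief at w?
from itertools import combinations

OMEGA = frozenset({0, 1, 2, 3})
P = {w: 0.25 for w in OMEGA}
PARTITIONS = {1: [frozenset({0, 1}), frozenset({2, 3})],
              2: [frozenset({0, 1}), frozenset({2, 3})]}  # assumed shared partition

def B(i, p, event):
    def cell(w):
        return next(c for c in PARTITIONS[i] if w in c)
    return frozenset(w for w in OMEGA
                     if sum(P[v] for v in event & cell(w))
                        / sum(P[v] for v in cell(w)) >= p)

def B_star(p, event):
    out = OMEGA
    for i in PARTITIONS:
        out &= B(i, p, event)
    return out

def common_u_belief(u, E, w):
    """Search all candidate events F for one satisfying (i)-(iii)."""
    states = sorted(OMEGA)
    for r in range(1, len(states) + 1):
        for combo in combinations(states, r):
            F = frozenset(combo)
            if w in F and F <= B_star(u, F) and F <= B_star(u, E):
                return F          # an evident witness event
    return None

E = frozenset({0, 1})
assert common_u_belief(0.9, E, 0) == frozenset({0, 1})  # holds at w = 0
assert common_u_belief(0.9, E, 2) is None               # fails at w = 2
```

The brute-force search over subsets is of course only feasible on tiny models; it is meant to make conditions (i)–(iii) concrete, not to be an efficient algorithm.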
Proposition 1 u-(C) ⇒ (u, u, Id, U)-(L), where U : (1/2, 1] → (1/2, 1] is the constant function such that ∀p ∈ (1/2, 1], U(p) = u, and Id : (1/2, 1] → (1/2, 1] is the function such that ∀p ∈ (1/2, 1], Id(p) = p.

Proof
– (i) ⇒ (a) is straightforward, taking the same F.
– Take r = u: then (i) + (ii) ⇒ (b).
– We start from (iii): F ⊆ B_*^u E.
∀i, B_*^u E ⊆ B_i^u E (property (P4))
⇒ ∀i, F ⊆ B_i^u E
⇒ ∀i ∀p > 1/2: B_i^p F ⊆ B_i^p B_i^u E (by property (P3))
And we have B_i^p B_i^u E = B_i^u E by property (P1) (since B_i^u E ∈ S_i)
⇒ B_i^p F ⊆ B_i^u E.
By defining g_i such that ∀p ∈ (1/2, 1], g_i(p) = u, we get (d).
– We start from (ii): F ⊆ B_*^u F.
∀j, F ⊆ B_*^u F ⊆ B_j^u F (property (P4))
⇒ ∀i, j ∀p > 1/2: B_i^p F ⊆ B_i^p B_j^u F (by property (P3))
By defining ∀p ∈ (1/2, 1], f_ij(p) = p and s = u, we get (c).

Proposition 2 (r, s, f, g)-(L) ⇒ min[min(r, s), f(min(r, s)), g(min(r, s))]-(C)

Proof Define m = min(r, s); we of course have m > 1/2. A witness F′ of (r, s, f, g)-(L) is given; let F = B_*^m F′.
– We start from (b): ∀i ∈ I, w ∈ B_i^r F′
⇒ w ∈ B_*^r F′
⇒ w ∈ B_*^m F′ (by property (P2))
⇒ w ∈ F, since B_*^m F′ = F: we get (i). Therefore we have (b) ⇒ (i) for m-(C).
– We start from (d): ∀i ∈ I ∃g_i : (1/2, 1] → (1/2, 1] ∀p > 1/2: B_i^p F′ ⊆ B_i^{g_i(p)} E.
In particular, B_i^m F′ ⊆ B_i^{g_i(m)} E.
And ∀i, F = B_*^m F′ ⊆ B_i^m F′ (property (P4))
⇒ ∀i, F ⊆ B_i^{g_i(m)} E
⇒ ∀i, F ⊆ B_i^{g(m)} E, where ∀p > 1/2, g(p) = min_i(g_i(p)) (property (P2))
⇒ F ⊆ B_*^{g(m)} E (by definition of the B_*^p operators).
So (d) ⇒ (iii) for g(m)-(C).
– We start from (c): ∃s > 1/2 ∀i, j ∈ I ∃f_ij : (1/2, 1] → (1/2, 1] ∀p > 1/2: B_i^p F′ ⊆ B_i^{f_ij(p)} B_j^s F′.
In particular, ∀i, j: B_i^m F′ ⊆ B_i^{f_ij(m)} B_j^s F′.
We have B_j^s F′ ⊆ B_j^m F′ by property (P2)
⇒ B_i^{f_ij(m)} B_j^s F′ ⊆ B_i^{f_ij(m)} B_j^m F′ (property (P3)).
So ∀i, j: B_i^m F′ ⊆ B_i^{f_ij(m)} B_j^m F′
⇒ ∀i, j: B_i^m F′ ⊆ B_i^{f(m)} B_j^m F′, where ∀p > 1/2, f(p) = min_{i,j}(f_ij(p)) (by property (P2))
⇒ ∀i: B_i^m F′ ⊆ B_i^{f(m)} B_*^m F′ (by definition of the B_*^p operators)
⇒ B_*^m F′ ⊆ B_*^{f(m)} B_*^m F′ = B_*^{f(m)} F (by property (P4))
We have F = B_*^m F′, so F ⊆ B_*^{f(m)} F.
So (c) ⇒ (ii) for f(m)-(C).
We finally get: (b), (c), (d) ⇒ u-(C) with u = min(m, f(m), g(m)).
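On a finite model, the construction of Proposition 1 can be spot-checked numerically. The sketch below is an editorial illustration, not the author's: it takes a witness of u-(C) on a toy model (the partitions, u = 0.9, and the events are all assumptions) and verifies conditions (b)–(d) of Definition 2 for the tuple (u, u, Id, U). The quantifier over p is sampled on a finite grid, which suffices here because the operators take only finitely many values.

```python
# Spot-check of Proposition 1's construction on an assumed toy model:
# from a witness of u-(C), the tuple (u, u, Id, U) satisfies Definition 2.
OMEGA = frozenset({0, 1, 2, 3})
P = {w: 0.25 for w in OMEGA}
PARTITIONS = {1: [frozenset({0, 1}), frozenset({2, 3})],
              2: [frozenset({0, 1}), frozenset({2, 3})]}

def B(i, p, event):
    def cell(w):
        return next(c for c in PARTITIONS[i] if w in c)
    return frozenset(w for w in OMEGA
                     if sum(P[v] for v in event & cell(w))
                        / sum(P[v] for v in cell(w)) >= p)

u, w0 = 0.9, 0
E = F = frozenset({0, 1})                 # witness of u-(C) at w0 on this model
grid = [0.55, 0.6, 0.7, 0.8, 0.9, 1.0]    # finite sample of p in (1/2, 1]

# (b) with r = u
assert all(w0 in B(i, u, F) for i in PARTITIONS)
# (c) with s = u and f_ij = Id:  B_i^p F  ⊆  B_i^p B_j^u F
assert all(B(i, p, F) <= B(i, p, B(j, u, F))
           for i in PARTITIONS for j in PARTITIONS for p in grid)
# (d) with g_i = U (constant u):  B_i^p F  ⊆  B_i^u E
assert all(B(i, p, F) <= B(i, u, E) for i in PARTITIONS for p in grid)
```

The same machinery, run in the other direction with F = B_*^m F′, would illustrate Proposition 2's construction.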
Proposition 3 For all u, there exist r, s, f, g such that u-(C) ⇔ (r, s, f, g)-(L).

Proof We know from Proposition 1 that u-(C) ⇒ (u, u, Id, U)-(L). We then get from Proposition 2 that (u, u, Id, U)-(L) ⇒ min[u, Id(u), U(u)]-(C). But Id(u) = U(u) = u by definition, so min[u, Id(u), U(u)] = u. So u-(C) ⇔ (u, u, Id, U)-(L).

Acknowledgements For their comments at various stages of the draft, I thank Samir Okasha, Philippe Mongin, Paul Egré, and Jacques Dubucs, as well as two anonymous referees whose detailed report was extremely helpful. This work was supported by the Arts and Humanities Research Council of the UK, Grant No. AH/F017502/1, which I gratefully acknowledge.
References

Aumann, R. J. (1976). Agreeing to disagree. The Annals of Statistics, 4(6), 1236–1239.
Barwise, J. (1988). Three views of common knowledge. In M. Y. Vardi (Ed.), Proceedings of the 2nd conference on theoretical aspects of reasoning about knowledge (pp. 365–379). San Francisco: Morgan Kaufmann Publishers.
Bonnay, D., & Egré, P. (2009). Inexact knowledge with introspection. Journal of Philosophical Logic, 38, 179–227.
Chwe, M. S.-Y. (2001). Rational ritual: Culture, coordination and common knowledge. Princeton: Princeton University Press.
Clark, H., & Marshall, C. (1981). Definite reference and mutual knowledge. In A. Joshi, B. Webber, & I. Sag (Eds.), Elements of discourse understanding (pp. 10–63). Cambridge: Cambridge University Press.
Cubitt, R., & Sugden, R. (2003). Common knowledge, salience and convention: A reconstruction of David Lewis's game theory. Economics and Philosophy, 19, 175–210.
Dokic, J., & Egré, P. (2009). Margin for error and the transparency of knowledge. Synthese, 166, 1–20.
Dretske, F. (1988). Explaining behavior. Cambridge, MA: MIT Press.
Fagin, R., Halpern, J. Y., Moses, Y., & Vardi, M. Y. (1995). Reasoning about knowledge. Cambridge, MA: MIT Press.
Gilbert, M. (1989). On social facts. Princeton: Princeton University Press.
Godfrey-Smith, P. (1992). Indication and evolution. Synthese, 92, 283–312.
Heal, J. (1978). Common knowledge. Philosophical Quarterly, 28, 116–131.
Lewis, D. (1969). Convention: A philosophical study. Cambridge: Harvard University Press.
Lismont, L. (1995). Common knowledge: Relating anti-founded situation semantics to modal logic neighbourhood semantics. Journal of Logic, Language, and Information, 3(4), 285–302.
Lismont, L., & Mongin, P. (1995). Belief closure: A semantics of common knowledge for modal propositional logic. Mathematical Social Sciences, 30, 127–153.
Monderer, D., & Samet, D. (1989). Approximating common knowledge with common beliefs. Games and Economic Behavior, 1, 170–190.
Morris, S. (1999). Approximate common knowledge revisited. International Journal of Game Theory, 23, 385–408.
Morris, S., & Shin, H. S. (1997). Approximate common knowledge and co-ordination: Recent lessons from game theory. Journal of Language, Logic and Information, 6, 171–190.
Osborne, M., & Rubinstein, A. (1994). A course in game theory. Cambridge, MA: MIT Press.
Parikh, R. (2005). Logical omniscience and common knowledge: What do we know and what do WE know? In R. van der Meyden (Ed.), Proceedings of the 10th conference on theoretical aspects of rationality and knowledge (pp. 62–77). National University of Singapore.
Peacocke, C. (2005). Joint attention: Its nature, reflexivity and relation to common knowledge. In N. Eilan, C. Hoerl, T. McCormack, & J. Roessler (Eds.), Joint attention: Communication and other minds. Oxford: Clarendon Press.
Rubinstein, A. (1989). The electronic mail game: Strategic behavior under 'almost common knowledge'. The American Economic Review, 79, 385–391.
Schiffer, S. R. (1972). Meaning. Oxford: Oxford University Press.
Sillari, G. (2005). A logical framework for convention. Synthese, 147, 379–400.
Sperber, D., & Wilson, D. (1986). Relevance: Communication and cognition. Oxford: Blackwell.
Tollefsen, D. (2005). Let's pretend! Children and joint action. Philosophy of the Social Sciences, 35(1), 75–97.
Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition, 13(1), 103–128.
Synthese (2011) 183:277 DOI 10.1007/s11229-009-9690-x ERRATUM
Erratum to: Synthese special issue: representing philosophy Colin Allen · Tony Beavers
Published online: 6 November 2009 © Springer Science+Business Media B.V. 2009
Erratum to: Synthese DOI 10.1007/s11229-009-9664-z
The affiliation of Colin Allen should read: Department of History and Philosophy of Science, Indiana University, Goodbody Hall 130, Bloomington, IN 47405-7000, USA.
The online version of the original article can be found under doi:10.1007/s11229-009-9664-z. C. Allen Department of History and Philosophy of Science, Indiana University, Goodbody Hall 130, Bloomington, IN 47405-7000, USA e-mail:
[email protected] T. Beavers (B) Department of Philosophy and Religion, The University of Evansville, 1800 Lincoln Avenue, Evansville, IN 47722, USA e-mail:
[email protected]
Synthese (2011) 183:279 DOI 10.1007/s11229-010-9852-x ERRATUM
Erratum to: What could be caused must actually be caused Christopher Gregory Weaver
Published online: 24 November 2010 © Springer Science+Business Media B.V. 2010
Erratum to: Synthese DOI: 10.1007/s11229-010-9814-3
On page 15 of the published article, after the following sentence, there should have been a reference to footnote 47: “But, the quantum state attributed to the system allows one to infer that a given one of these outcomes will arise with a certain probability (see footnote 47).”
On page 15 of the published article, after the following sentence, there should have been a footnote: “The Kochen proof attempted to show that, the interrelationship among measured values predicted by quantum mechanics, are incompatible with any possibility of the values being fully determined by underlying values of hidden parameters.”1
Thanks to Lawrence Sklar (via email correspondence) for being so nice in encouraging me not to worry about these accidental omissions. Note: These subtle omissions occurred accidentally during the proof correction process.
1 Sklar (1992, p. 212).
The online version of the original article can be found under doi:10.1007/s11229-010-9814-3. C. G. Weaver (B) Rutgers University, New Brunswick, USA e-mail:
[email protected]
Synthese (2011) 183:281 DOI 10.1007/s11229-010-9858-4 ERRATUM
Erratum to: Iterative information update and stability of strategies Takuya Masuzawa · Koji Hasebe
Received: 9 November 2010 / Accepted: 9 November 2010 / Published online: 18 January 2011 © Springer Science+Business Media B.V. 2011
Erratum to: Synthese DOI 10.1007/s11229-010-9835-y
During the production of the article, some errors were overlooked. In many places, the author's name in the reference “Rational dynamics and epistemic logic in games” should be spelled “van Benthem”, instead of “Van Benthem”, when it is not the first word of a sentence.
Page 7, line 7 should read: 1. W = {U : U is a S(ϕ)-maximal consistent set.},
Page 14, line 7 should read: Lemma 11 (Idempotency Lemma) Assuming that q∗ ∈ R, . . . (i.e., “R” should be “R”).
Page 14, line 10 should read: E ⇒ E¬K_i q∗ for all i ∈ N.
Page 14, line 20 should read: Theorem 12 Let q∗ ∈ R . . .
The online version of the original article can be found under doi:10.1007/s11229-010-9835-y. T. Masuzawa Faculty of Economics, Osaka University of Economics, 2-2-8 Osumi, Higashiyodogawa-ku, Osaka 533-8533, Japan e-mail:
[email protected] K. Hasebe (B) Graduate School of Systems and Information Engineering, University of Tsukuba, 1-1-1 Tennoudai, Tsukuba 305-8573, Japan e-mail:
[email protected]