Synthese (2011) 182:335–347 DOI 10.1007/s11229-010-9751-1
Intervention, determinism, and the causal minimality condition Jiji Zhang · Peter Spirtes
Received: 12 May 2009 / Accepted: 30 October 2009 / Published online: 20 June 2010 © Springer Science+Business Media B.V. 2010
Abstract We clarify the status of the so-called causal minimality condition in the theory of causal Bayesian networks, which has received much attention in the recent literature on the epistemology of causation. In doing so, we argue that the condition is well motivated in the interventionist (or manipulability) account of causation, assuming the causal Markov condition which is essential to the semantics of causal Bayesian networks. Our argument has two parts. First, we show that the causal minimality condition, rather than an add-on methodological assumption of simplicity, necessarily follows from the substantive interventionist theses, provided that the actual probability distribution is strictly positive. Second, we demonstrate that the causal minimality condition can fail when the actual probability distribution is not positive, as is the case in the presence of deterministic relationships. But we argue that the interventionist account still entails a pragmatic justification of the causal minimality condition. Our argument in the second part exemplifies a general perspective that we think commendable: when evaluating methods for inferring causal structures and their underlying assumptions, it is relevant to consider how the inferred causal structure will be subsequently used for counterfactual reasoning. Keywords Causation · Causal Bayesian network · Determinism · Markov condition · Intervention · Probability
J. Zhang (B) Lingnan University, Tuen Mun, NT, Hong Kong e-mail:
[email protected] P. Spirtes Carnegie Mellon University, Pittsburgh, PA, USA
1 Introduction It is commonplace that causation and intervention are closely related concepts, but taking the connection seriously proves to have nontrivial implications. The recent revival of the interventionist account of causation, for example, has shed new light on a variety of issues in the philosophy of causation and explanation, including especially the nature of causal explanation in special and social sciences (Woodward 2003; Steel 2006a; Campbell 2007). Besides interpretive work, the interventionist perspective also underlies a powerful framework for causal modeling and reasoning, known as causal Bayesian networks (e.g., Spirtes et al. 1993; Pearl 2000). The framework not only systemizes counterfactual reasoning based on information about a causal structure, but also stimulated a chunk of work on inference of the causal structure from data. As we will describe in detail below, causal Bayesian networks are defined by a postulate that relates counterfactual probability distributions of a set of variables that would result from various interventions, to the actual probability distribution of the variables, mediated by a given causal structure. From the postulate one can derive interesting rules for counterfactual reasoning under various circumstances. On the other hand, a crucial component of the postulate is a condition that specifies, for a given causal structure, a set of probabilistic independence (and conditional independence) relations that must hold of a joint probability distribution if the distribution is generated from the structure. Known as the causal Markov condition, it provides a basis for causal inference based on the actual probability distribution. Hausman and Woodward (1999, 2004a,b) presented an interventionist defense of the causal Markov condition. The details of their arguments are controversial,1 and we do not wish to adjudicate on that dispute here. 
What is appealing to us, however, is their attempt to reveal an “intimate connection” between the interventionist (or manipulability) account of causation, which relates counterfactual circumstances, and the causal Markov condition, which relates the actual circumstance. To the extent that it is successful, their argument can improve our understanding of how the interventionist account of causation bears on the epistemology of causal inference in the actual circumstance. To this matter we aim to contribute two items in the present paper. First, assuming the actual probability distribution is strictly positive, we show that the defining axiom of causal Bayesian networks—which combines the interventionist ideas of invariance and modularity with the causal Markov condition—and the interventionist interpretation of causal structure entail that the true causal structure is a minimal structure compatible with the actual distribution and the Markov condition. This consequence is known in the literature as the causal minimality condition (Spirtes et al. 1993; Hitchcock 2002), but its logical connection to the interventionist ideas seems hitherto unknown.
1 For criticisms of their argument, see Cartwright (2002) and Steel (2006b). For a response to Cartwright, see Hausman and Woodward (2004a). Although in spirit our pursuit in the present paper can be viewed as an extension of Hausman and Woodward’s work, the main point we wish to make—the point that anyone who is willing to do causal inference based on the causal Markov condition has good reason to assume the causal minimality condition as well—does not depend on the success of their arguments.
The logical connection fails when the actual probability distribution is not strictly positive, which is always the case when there are deterministic relationships between variables in the given system. Our second point is to argue, based on additional theorems without the positivity assumption, that even when the causal minimality condition is false, it would be a harmless assumption if we restricted the range of counterfactual reasoning we do with inferred causal structures. The argument exemplifies a general perspective we think commendable: when evaluating methods for inferring causal structures and their underlying assumptions, it is relevant to consider how the inferred causal structure will be subsequently used for counterfactual reasoning.
2 A proof of the causal minimality condition Throughout this paper, we consider causal relations between variables, and to keep things simple, we assume variables under consideration are all discrete with a finite number of possible values, though the result in this section can be readily generalized to continuous variables. The causal structure for a set of variables V is meant to be the set of direct causal relations between variables in V. Thus understood, the causal structure can be conveniently represented by a directed graph: take each variable in V as a vertex, and put a directed edge or arrow (→) between two variables X and Y if and only if X is a direct cause of Y relative to V. We call such causally interpreted directed graphs causal graphs, and we use “causal graph” and “causal structure” interchangeably in this paper. Some graph theoretical terminology will prove useful. In a directed graph, if there is an edge X → Y, X is called a parent of Y and Y a child of X. A directed path is an ordered sequence of two or more distinct vertices such that every vertex (except for the last one) in the sequence is a parent of its successor in the sequence. X is called an ancestor of Y and Y a descendant of X if X = Y or there is a directed path from X to Y. A directed cycle occurs in the graph if there is an arrow X → Y but also a directed path from Y to X. A directed graph is called acyclic if there is no directed cycle in the graph. We will only consider acyclic (a.k.a. recursive) causal structures, represented by directed acyclic graphs (DAGs). Given a DAG G over V and a joint probability distribution P over V, G and P are said to be Markov to each other if the Markov property is satisfied: according to P, every variable is probabilistically independent of its non-descendants in G given its parents in G. The causal Markov condition is simply that the DAG that represents the causal structure over V and the joint distribution over V are Markov to each other. 
Causal Markov Condition: Given a set of variables V whose causal structure is represented by a DAG G, every variable in V is probabilistically independent of its non-effects (non-descendants in G) given its direct causes (parents in G). This condition holds only if V does not leave out any variable that is a common direct cause (relative to V) of two or more variables in V, that is, only if V is causally sufficient (Spirtes et al. 1993). Throughout the paper we assume V includes enough variables to be causally sufficient.
Applying the chain rule of the probability calculus, it is easy to show that the causal Markov condition entails a factorization of the joint probability distribution P as follows:
$$P(\mathbf{V}) = \prod_{X \in \mathbf{V}} P(X \mid \mathrm{Pa}_G(X))$$
where G is the DAG that represents the causal structure of V, and $\mathrm{Pa}_G(X)$ denotes the set of X's parents in G. In words, the joint probability is a product of local pieces of conditional probability: the distribution of each variable conditional on its direct causes (for variables without a parent, it is simply the unconditional probability). We now describe some key notions in the interventionist account of causation needed for our purpose. For any S ⊂ V and a value setting s (a vector of values, one for each variable in S), we assume there is a (hypothetical) intervention that sets the value of S to s. We denote the intervention by S := s, and denote the counterfactual or post-intervention probability distribution that would result from the intervention by $P_{\mathbf{S}:=\mathbf{s}}$. (Stipulation: $P_{\emptyset}$ is simply the actual probability distribution P.) Clearly, if the intervention S := s does what it is supposed to do, then $P_{\mathbf{S}:=\mathbf{s}}(\mathbf{S} = \mathbf{s}) = 1$. The intervention is supposed to be precise, in the sense that it directly affects only its immediate targets: variables in S. If it affects other variables at all, it is via the causal influence variables in S have on other variables. This precision can be achieved only if the intervention is implemented by a force external to V. This external force is often modeled as an intervention variable (taking value ON or value OFF), which is supposed to be statistically independent of all those variables in V that are not in S or causally influenced by any variable in S. (For a careful characterization of the notion of intervention, see, e.g., Woodward 2003.) An important interventionist notion is that of modularity (Pearl 2000; Woodward 2003). The idea is roughly that each variable and its causal parents form a local mechanism, and each mechanism can be independently manipulated without affecting other mechanisms.
In particular, if S ⊂ V is manipulated, the mechanisms for variables not in S (i.e., variables in V\S) remain unchanged.2 It is also very natural to interpret causal arrows in the causal structure in terms of interventions: X is a direct cause of Y relative to V iff there is some way of fixing all other variables in V to particular values, such that some change in X by intervention would be followed by a change in Y (Pearl 2000; Woodward 2003). In the present setup, the obvious formulation is:
Interventionist Interpretation of Direct Cause (IIDC): For any X, Y ∈ V and Z = V\{X, Y} (where ‘\’ stands for the set-theoretical difference), X has a direct causal influence on Y relative to V iff there exist two values $x \neq x'$ for X and a value setting z for Z, s.t. $P_{X:=x,\mathbf{Z}:=\mathbf{z}}(Y) \neq P_{X:=x',\mathbf{Z}:=\mathbf{z}}(Y)$.
2 Of course the notion needs to be qualified: for example, some extreme intervention on a cause may well destroy the causal mechanism it figures in (Woodward 2003). Since we are interested in the consequence of the notion, and more importantly, since causal reasoning is primarily, if not exclusively, concerned with effects of normal, non-extreme interventions, we will simply assume that the interventions we consider meet the qualifications.
In plain words, X is a causal parent of Y relative to V if and only if, fixing all other variables to some value, there exist two interventions on X that would result in different distributions for Y. Note that these interventionist ideas, modularity and IIDC, both involve counterfactual or post-intervention probability distributions, and so they cannot be directly appealed to in causal inference from the actual probability distribution, the distribution we can obtain samples from without having an external intervention on the causal system in question.3 However, Hausman and Woodward (1999, 2004a,b) argued that there is an intimate connection between these ideas and the causal Markov condition,4 which relates the causal structure to the actual probability distribution and hence is relevant to causal inference from the actual distribution. Is the causal Markov condition the only such inferentially relevant condition that is “intimately connected” to the interventionist ideas? We think there is at least one other. That condition is best motivated by considering a limitation of the causal Markov condition for causal inference. Given two DAGs G and G′, call G′ a (proper) subgraph or substructure of G, and G a (proper) supergraph or superstructure of G′, if they have the same vertices and the set of arrows in G′ is a (proper) subset of the set of arrows in G. A simple well-known fact is that if a probability distribution P is Markov to a DAG G, then P is Markov to every DAG that is a supergraph of G. It follows that the causal Markov condition by itself does not warrant an inference to the absence of causal arrows.5 Given this limitation, the following condition should sound natural.
Causal Minimality Condition: Given a set of variables V whose actual joint distribution is P, and whose causal structure is represented by a DAG G, P is not Markov to any proper subgraph of G.
3 It is usually emphasized to be observational or non-experimental. We think “actual” is a better qualifier, because there may well be some experimental control built in the causal structure under investigation. The point is that no further intervention external to the causal system of interest is involved in generating the distribution. 4 One of their arguments (in 2004b), for example, is that conditioning on the direct causes (i.e., parents in the causal graph) of a variable simulates an external intervention on the variable (because the remaining variation after conditioning on direct causes in V can only be explained by external influence). So, if after conditioning on the direct causes of a variable X, the variable is still probabilistically associated with another variable Y, then it simulates the situation in which an intervention on X makes a difference to Y, and hence X should be expected to have a causal influence on Y, by the interventionist account of causation (which is a natural generalization of IIDC). In other words, if Y is not causally influenced by X, then Y should be independent of X conditional on the direct causes of X, which is the central implication of the causal Markov condition. 5 For the simplest illustration, consider the canonical scenario of causal inference with a randomized experiment. Suppose we are interested in the causal effect of a variable X on another variable Y. We randomly assign values to X for sufficiently many samples, and observe the corresponding values of Y. The experimental setup, thanks to randomization, rules out Y → X , or a common cause structure X ← C → Y . If, furthermore, we establish by analyzing the data that X and Y are not probabilistically independent, we can, based on the causal Markov condition (or its well known special case, the principle of common cause), infer that X has a causal influence on Y. 
However, if the data analysis tells us that X and Y are probabilistically independent, the causal Markov condition alone does not license the inference to “no effect”, because the structure in which there is an arrow from X to Y is still compatible with the condition.
In other words, the condition states that if the causal Markov condition is true, then the true causal structure is a minimal structure that is Markov to P. Note that the causal minimality condition is also about the actual probability distribution, and hence is inferentially relevant in the sense that the causal Markov condition is. The causal minimality condition sounds like a methodological assumption of simplicity, but we shall argue that it is not just an add-on assumption, but is very well motivated in the interventionist framework. To see this, we first need to motivate the defining axiom of causal Bayesian networks, which specifies how a counterfactual distribution $P_{\mathbf{S}:=\mathbf{s}}$ is related to the actual distribution P according to the causal structure G. This axiom is easy to motivate given what we have already said. The intervention S := s breaks the causal arrows into the variables in S in the causal structure G, because variables in S do not causally depend on their original causal parents any more, and we have $P_{\mathbf{S}:=\mathbf{s}}(\mathbf{S} = \mathbf{s}) = 1$. For every other variable X ∉ S, the notion of modularity implies that (1) X has the same causal parents as before the intervention; and (2) the probability distribution of X conditional on its causal parents remains the same under the intervention. Point (1) tells us the causal structure after the hypothetical intervention S := s, which is the structure resulting from deleting all arrows into members of S in G. Moreover, if the causal Markov condition holds in the actual situation, we should expect it to hold in the post-intervention situation. If so, $P_{\mathbf{S}:=\mathbf{s}}$ should factorize as follows:
$$P_{\mathbf{S}:=\mathbf{s}}(\mathbf{V}) = \prod_{S \in \mathbf{S}} P_{\mathbf{S}:=\mathbf{s}}(S) \prod_{X \in \mathbf{V}\setminus\mathbf{S}} P_{\mathbf{S}:=\mathbf{s}}(X \mid \mathrm{Pa}_G(X)).$$
Moreover, given point (2) above and the fact that $P_{\mathbf{S}:=\mathbf{s}}(\mathbf{S} = \mathbf{s}) = 1$, we get the following principle6:
Intervention principle (IP): Given a set of variables V whose causal structure is represented by a DAG G, for any proper subset S of V and any value setting s for S, the post-intervention probability distribution for V\S, $P_{\mathbf{S}:=\mathbf{s}}(\mathbf{V}\setminus\mathbf{S})$, is related to the actual distribution P in the following way7:
$$P_{\mathbf{S}:=\mathbf{s}}(\mathbf{V}\setminus\mathbf{S}) = \prod_{X \in \mathbf{V}\setminus\mathbf{S}} P(X \mid \mathrm{Pa}_G(X))$$
where $\mathrm{Pa}_G(X)$ is the set of parents of X in G. The intervention principle highlights the epistemic role of causal structure: it enables calculation of counterfactual, post-intervention distributions from the actual, pre-intervention distribution. This is what causal structure is used for. Our argument in the next section relies heavily on this point.
6 Spirtes et al. (1993) call this the “manipulation theorem”, because it is derived from the (extended) causal Markov condition plus the invariance properties. But the derivation is almost a restatement.
7 A more common formulation is this: the post-intervention joint distribution for V is related to P as follows: $P_{\mathbf{S}:=\mathbf{s}}(\mathbf{V} = \mathbf{v}) = \prod_{X \in \mathbf{V}\setminus\mathbf{S}} P(X \mid \mathrm{Pa}_G(X))$ when v is consistent with S = s; otherwise $P_{\mathbf{S}:=\mathbf{s}}(\mathbf{V} = \mathbf{v}) = 0$. This implies the formulation we use in the main text, which is more convenient for our proofs later.
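The intervention principle can be read as a recipe for computation. The following is a minimal sketch of that recipe, not the authors' code: the function names `conditional` and `post_intervention`, the dictionary encoding of the DAG, and the two-variable distribution over X → Y are all assumptions made here for illustration.

```python
def conditional(joint, names, var, val, given):
    """Return P(var = val | given) from a joint table, or None if undefined."""
    idx = {n: i for i, n in enumerate(names)}

    def matches(v):
        return all(v[idx[k]] == g for k, g in given.items())

    den = sum(p for v, p in joint.items() if matches(v))
    if den == 0:
        return None  # conditioning event has zero probability
    num = sum(p for v, p in joint.items() if matches(v) and v[idx[var]] == val)
    return num / den


def post_intervention(joint, names, parents, do, target):
    """IP product: multiply P(X | Pa_G(X)) over every X that is not intervened on."""
    full = {**do, **target}  # one value for every variable in V
    result = 1.0
    for x in names:
        if x in do:
            continue  # intervened variables contribute no factor
        given = {p: full[p] for p in parents[x]}
        c = conditional(joint, names, x, full[x], given)
        if c is None:
            return None  # a factor is undefined, so the IP equation does not apply
        result *= c
    return result


# Hypothetical strictly positive distribution over (X, Y) with graph X -> Y.
names = ("X", "Y")
joint = {(1, 1): 0.45, (1, 0): 0.05, (0, 1): 0.15, (0, 0): 0.35}
parents = {"X": [], "Y": ["X"]}

p_do_x = post_intervention(joint, names, parents, {"X": 1}, {"Y": 1})
p_do_y = post_intervention(joint, names, parents, {"Y": 1}, {"X": 1})
p_obs = conditional(joint, names, "X", 1, {"Y": 1})
print(p_do_x, p_do_y, p_obs)
```

In this assumed example, setting X changes the distribution of Y, whereas setting Y leaves X at its marginal probability, which differs from the observational conditional P(X = 1 | Y = 1). The `None` return value mirrors the proviso that the equation applies only when all the conditional probabilities are defined.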
One issue is that $P(X \mid \mathrm{Pa}_G(X))$ may be undefined (when $P(\mathrm{Pa}_G(X)) = 0$).8 The equation in the intervention principle is understood to apply only when all the conditional probabilities are well defined. For the following theorem, we assume that the actual joint distribution P is strictly positive, i.e., every value setting for V has a non-zero probability under P, so that the conditional probabilities are all well defined. We will see the consequence of dropping this assumption in Sect. 3.
Theorem 1 Let V be a set of variables. Suppose its actual joint distribution P is strictly positive, and its causal structure as defined according to IIDC is represented by a DAG G. Then the IP implies the causal minimality condition.
Proof Suppose the IP holds but the causal minimality condition is violated. That means there is a proper subgraph G′ of G that is also Markov to P. It follows that there is an edge, say X → Y, that is in G but not in G′. Since G represents the causal structure of V according to the IIDC, the fact that X → Y is in G implies that there exist $x \neq x'$ and z, s.t. $P_{X:=x,\mathbf{Z}:=\mathbf{z}}(Y) \neq P_{X:=x',\mathbf{Z}:=\mathbf{z}}(Y)$, where Z = V\{X, Y}. However, given IP,
$$P_{X:=x,\mathbf{Z}:=\mathbf{z}}(Y) = P(Y \mid \mathrm{Pa}_G(Y)) = P(Y \mid \mathrm{Pa}_{G'}(Y), X = x)$$
$$P_{X:=x',\mathbf{Z}:=\mathbf{z}}(Y) = P(Y \mid \mathrm{Pa}_G(Y)) = P(Y \mid \mathrm{Pa}_{G'}(Y), X = x').$$
Since by supposition P is also Markov to G′, Y is independent of X conditional on $\mathrm{Pa}_{G'}(Y)$ under P. This, together with the positivity of P, implies that $P(Y \mid \mathrm{Pa}_{G'}(Y), X = x) = P(Y \mid \mathrm{Pa}_{G'}(Y), X = x')$. Hence $P_{X:=x,\mathbf{Z}:=\mathbf{z}}(Y) = P_{X:=x',\mathbf{Z}:=\mathbf{z}}(Y)$, a contradiction.
Theorem 1 suggests that, appearances to the contrary, the causal minimality condition is not just an add-on methodological preference for simplicity, but, assuming the actual probability distribution is positive, is a substantive implication of the interventionist account of causation (and the causal Markov condition). Even if Hausman and Woodward did not succeed in providing insights beyond what is usually offered to defend the Markov condition, anyone who actually performs causal inference based on the causal Markov condition has, in light of Theorem 1, an overwhelming reason to also assume the causal minimality condition, when the actual distribution is positive.
8 A perhaps better way to understand this is to take the conditional probabilities of a variable given its causal parents as primitives, which represent stable propensities of the local causal mechanism. The problem of “undefined conditional probability” is then the problem that in the actual population, some value setting of the causal parents may never be realized, and hence the propensity associated with that setting does not manifest in the actual population, and cannot be inferred from the actual distribution. To rigorously formulate this line would occupy much more space than allowed here.
3 Causal minimality condition in the presence of determinism The positivity assumption needed for Theorem 1 requires that for each variable X in V, no value combination of its causal parents in V should completely determine what
Fig. 1 A counterexample to the causal minimality condition. (a), referred to as graph G, is the true causal graph according to the mechanisms described in the text; (b) and (c), referred to as graph G1 and graph G2 respectively, are two proper subgraphs of (a) that satisfy the causal Markov condition
the value of X is, or what the value of X is not; that is, every possible value of X has a non-zero probability under every value setting of its causal parents. When some local mechanism is deterministic,9 this assumption is immediately violated, and the causal minimality condition can fail. Here is an extremely simple example. Suppose we have three switches, X, Y and Z, each of which is in one of two possible states, on (1) or off (0). The mechanisms are:
(a) X’s state is determined by a toss of a fair coin: P(X = 1) = P(X = 0) = 0.5.
(b) Y is turned on iff X is turned on.
(c) Z is turned on with probability 0.8 if X and Y are both turned on; otherwise, Z is turned on with probability 0.2.
Notice that the local mechanism for Y is deterministic. Given this description, the causal structure in accord with the IIDC is obviously represented by the graph G in Fig. 1a. The joint probability distribution is also easy to calculate: P(X = 1, Y = 1, Z = 1) = 0.4, P(X = 1, Y = 1, Z = 0) = 0.1, P(X = 0, Y = 0, Z = 1) = 0.1, P(X = 0, Y = 0, Z = 0) = 0.4, and other value settings have zero probability. The causal minimality condition fails. It is easy to check that the actual distribution is also Markov to G1 (Fig. 1b) and to G2 (Fig. 1c), both of which are proper subgraphs of G.10 Moreover, besides the presence of a deterministic mechanism, there is nothing special about this example, and it does not take long to convince oneself that when there are deterministic relationships, the failure of the condition is the rule rather than the exception.11 Are we then necessarily guilty if we make use of the causal minimality condition in causal inference? Let us take a closer look at the two incorrect structures.
9 Again, by “local mechanism” we mean that composed of the variables in V. So X might be (objectively) governed by a deterministic mechanism, but if some of its causes are left out of V, the local mechanism for X (relative to the variables in V) may still be indeterministic.
10 G1 entails that Y and Z are independent conditional on X. G2 entails that X and Z are independent conditional on Y. Both are true of the actual distribution.
11 We should note that the failure of positivity does not require deterministic causal mechanisms in the system. It could just be that for some variables in the system, some combination of values has zero probability, for whatever reason. And the failure of positivity does not entail the failure of the causal minimality condition. What we claim here is simply that the causal minimality condition typically fails when positivity fails as a result of the presence of deterministic relationships. We thank a referee for pressing this point.
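The claims about the three-switch example can be checked mechanically. The sketch below is an illustration added here, not part of the original argument: it rebuilds the joint distribution from mechanisms (a) to (c) and tests the two conditional independences that footnote 10 attributes to G1 and G2. The helper names `marginal` and `cond_indep` are hypothetical.

```python
from itertools import product

# Joint distribution built from the three mechanisms; keys are (x, y, z).
joint = {}
for x, y, z in product((0, 1), repeat=3):
    p_x = 0.5                                    # (a) fair coin
    p_y = 1.0 if y == x else 0.0                 # (b) Y is on iff X is on
    p_z1 = 0.8 if (x == 1 and y == 1) else 0.2   # (c) chance that Z is on
    p_z = p_z1 if z == 1 else 1.0 - p_z1
    joint[(x, y, z)] = p_x * p_y * p_z


def marginal(fixed):
    """Probability that the variables at the given positions take the given values."""
    return sum(p for v, p in joint.items()
               if all(v[i] == val for i, val in fixed.items()))


def cond_indep(a, b, c, tol=1e-12):
    """Check whether variable a is independent of variable b given variable c."""
    for va, vb, vc in product((0, 1), repeat=3):
        p_c = marginal({c: vc})
        if p_c == 0:
            continue  # nothing to check when the condition never occurs
        lhs = marginal({a: va, b: vb, c: vc}) * p_c
        rhs = marginal({a: va, c: vc}) * marginal({b: vb, c: vc})
        if abs(lhs - rhs) > tol:
            return False
    return True


X, Y, Z = 0, 1, 2  # tuple positions of the three switches
markov_g1 = cond_indep(Y, Z, X)  # independence entailed by G1
markov_g2 = cond_indep(X, Z, Y)  # independence entailed by G2
# X and Z are not independent unconditionally.
xz_dependent = abs(marginal({X: 1, Z: 1}) - marginal({X: 1}) * marginal({Z: 1})) > 1e-9
print(markov_g1, markov_g2, xz_dependent)
```

Both conditional independences hold here only because the deterministic mechanism for Y makes conditioning on X equivalent to conditioning on Y.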
Take G1 for example. Suppose we mistake G1 for the true causal structure. What is the consequence for counterfactual reasoning, that is, for calculating post-intervention probabilities from the actual probability distribution? Here are some sample calculations according to the IP, taking G1 as the true causal structure:
$$P_{X:=1}(Y = 1, Z = 1) = P(Y = 1 \mid X = 1)\,P(Z = 1 \mid X = 1) = 0.8,$$
$$P_{Y:=0}(X = 0, Z = 1) = P(X = 0)\,P(Z = 1 \mid X = 0) = 0.1,$$
$$P_{Z:=1,X:=1}(Y = 0) = P(Y = 0 \mid X = 1) = 0.$$
All of them, one can easily check, are correct answers (calculated according to the true causal structure). This is also the case if we calculate these quantities using G2. Despite the simplicity of the examples, the correctness reflects a general fact:
Theorem 2 Suppose G is the true causal structure for V, and the IP holds. Let G′ be any subgraph of G that is also Markov to the actual distribution. Then for any post-intervention probability $P_{\mathbf{S}:=\mathbf{s}}(\mathbf{T} = \mathbf{t})$ (S, T ⊂ V), if it can be inferred from the actual distribution based on G,12 it can be calculated correctly based on G′.
Proof Consider any intervention S := s. By the IP,
$$P_{\mathbf{S}:=\mathbf{s}}(\mathbf{V}\setminus\mathbf{S}) = \prod_{X \in \mathbf{V}\setminus\mathbf{S}} P(X \mid \mathrm{Pa}_G(X)),$$
whenever the conditional probabilities on the right hand side are defined. On the other hand, if we calculated $P_{\mathbf{S}:=\mathbf{s}}(\mathbf{V}\setminus\mathbf{S})$ based on G′, we would get
$$P_{\mathbf{S}:=\mathbf{s}}(\mathbf{V}\setminus\mathbf{S}) = \prod_{X \in \mathbf{V}\setminus\mathbf{S}} P(X \mid \mathrm{Pa}_{G'}(X)).$$
So it suffices to show that for every X in V, $P(X \mid \mathrm{Pa}_{G'}(X)) = P(X \mid \mathrm{Pa}_G(X))$ whenever the latter is defined. But this directly follows from the fact that $\mathrm{Pa}_{G'}(X) \subseteq \mathrm{Pa}_G(X)$ (because G′ is a subgraph of G) and the fact that P is Markov to G′ (and so X is independent of $\mathrm{Pa}_G(X)\setminus\mathrm{Pa}_{G'}(X)$ conditional on $\mathrm{Pa}_{G'}(X)$).
12 That is, when all the relevant conditional probabilities according to G are defined.
Therefore, whatever counterfactual probability can be calculated from the actual probability distribution according to the true causal structure can also be calculated correctly according to any substructure of the true one, as long as the substructure is still Markov to the actual distribution. This does not mean, however, that the incorrect causal structure does not get any counterfactual probability wrong. According to G1, for example, $P_{X:=1,Y:=0}(Z = 1) = P(Z = 1 \mid X = 1) = 0.8$. This is obviously far off given the specified mechanism for Z: $P_{X:=1,Y:=0}(Z = 1) = 0.2$. In this case, the true causal structure fares better, not by giving the right answer,
but by signaling that this quantity is not calculable from the actual distribution, because P(Z | X = 1, Y = 0) is not defined. Thus the incorrect causal structure may overshoot. Can we control that? There is of course no point in recommending that one should only calculate those counterfactual quantities that can be calculated based on the true causal structure. Is there any criterion one can use that does not depend on knowing the true causal structure? Here is one such criterion.
Corollary 3 Suppose the IP holds. Let G′ be a substructure of the true causal structure that is Markov to the actual distribution. For any S and T with S ∪ T = V, if P(S = s, T = t) > 0, then the post-intervention probability $P_{\mathbf{S}:=\mathbf{s}}(\mathbf{T} = \mathbf{t})$ can be calculated correctly from the actual distribution based on G′.
The corollary follows immediately from Theorem 2, because when P(S = s, T = t) > 0, every conditional probability needed to calculate $P_{\mathbf{S}:=\mathbf{s}}(\mathbf{T} = \mathbf{t})$ according to G is defined. Why is all this relevant to the causal minimality condition? Suppose the condition fails. That means some proper substructure of the true causal structure also satisfies the Markov condition. Then there exists a minimal substructure of the true causal structure that satisfies the Markov condition with the actual distribution. Consider any such minimal substructure. The foregoing discussion is supposed to show that this minimal substructure would still be useful for counterfactual reasoning. Corollary 3, in particular, gives us a manageable criterion for judging when we can use this object to do correct counterfactual reasoning. We expect that the criterion can be further improved,13 but that is a separate technical issue. The important epistemological point is that, for the purpose of counterfactual reasoning, the minimal substructure in question is a respectable target for causal inference. With respect to this target, there is nothing wrong with assuming the causal minimality condition.
Hence, even in the presence of determinism, the causal minimality condition would be a harmless assumption, as long as we keep in mind the limitation of the targeted causal structure in aiding our counterfactual reasoning, and such criteria as the one stated in Corollary 3 that can guide our legitimate use of what is inferred based on the causal minimality condition. Therefore, even when the causal minimality condition is literally false, the intervention principle entails a sort of methodological or pragmatic justification of the condition. This justification becomes especially compelling when the actual probability distribution is strictly positive. In that case, the sufficient condition for correctness given in Corollary 3 is always satisfied, which entails that, in terms of counterfactual reasoning, nothing is lost by appealing to a minimal substructure of the true causal structure, in the sense that all post-intervention probabilities can be correctly calculated.
13 Consider the sample calculations given earlier in this section. For $P_{X:=1}(Y = 1, Z = 1)$ and $P_{Y:=0}(X = 0, Z = 1)$ the criterion is applicable, but the criterion does not authorize calculating $P_{Z:=1,X:=1}(Y = 0)$, even though the latter can be calculated correctly. We are searching for more powerful criteria. In the best scenario, the criterion could tell us exactly whether a counterfactual quantity is calculable from the (unknown) true causal structure. If so, using a minimal structure would entail no loss in power for calculating counterfactual distributions from the actual distribution. We are not optimistic about finding such a criterion, but we cannot prove it is impossible at this point.
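The interplay between Theorem 2, Corollary 3 and the overshooting case can be traced numerically on the three-switch example. The following sketch is an illustration under assumptions made here (the helper names `prob`, `ip` and `criterion`, and the dictionary encoding of the graphs, are hypothetical): it evaluates four post-intervention queries under the true structure G and under G1, and reports whether the Corollary 3 criterion applies.

```python
# Joint distribution of the three switches as given in the text; keys are (x, y, z).
joint = {(1, 1, 1): 0.4, (1, 1, 0): 0.1, (0, 0, 1): 0.1, (0, 0, 0): 0.4}
names = ("X", "Y", "Z")
G = {"X": [], "Y": ["X"], "Z": ["X", "Y"]}   # true structure (Fig. 1a)
G1 = {"X": [], "Y": ["X"], "Z": ["X"]}       # proper substructure (Fig. 1b)


def prob(event):
    """Probability of a partial assignment such as {"X": 1, "Z": 0}."""
    return sum(p for v, p in joint.items()
               if all(v[names.index(k)] == val for k, val in event.items()))


def ip(parents, do, target):
    """IP product for a query where do and target together cover V; None if undefined."""
    full = {**do, **target}
    result = 1.0
    for x in names:
        if x in do:
            continue  # no factor for intervened variables
        given = {p: full[p] for p in parents[x]}
        den = prob(given)
        if den == 0:
            return None  # conditional probability undefined
        result *= prob({**given, x: full[x]}) / den
    return result


def criterion(do, target):
    """Corollary 3 test: the combined assignment has positive actual probability."""
    return prob({**do, **target}) > 0


queries = [
    ({"X": 1}, {"Y": 1, "Z": 1}),
    ({"Y": 0}, {"X": 0, "Z": 1}),
    ({"Z": 1, "X": 1}, {"Y": 0}),
    ({"X": 1, "Y": 0}, {"Z": 1}),
]
for do, target in queries:
    print(do, target, ip(G, do, target), ip(G1, do, target), criterion(do, target))
```

For the first two queries the criterion applies and the two structures agree. For the third the criterion does not apply although both structures agree, which is the situation footnote 13 describes. For the fourth, the true structure returns `None` (the quantity is not calculable from the actual distribution) while G1 returns a value, and the criterion correctly withholds authorization.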
It is worth noting that, when positivity holds, this pragmatic justification is not entirely redundant given what we have shown in Sect. 2. The subtlety is that the pragmatic justification does not depend on the interventionist interpretation of causal arrows: the proof of Theorem 2 does not make use of IIDC. All it depends on is a conception of causal structure in terms of its epistemic role defined by the intervention principle. To bring the last point home, the following result helps:

Theorem 4 Let $V$ be a set of variables. Suppose its actual joint distribution $P$ is positive and some (acyclic) structure over $V$ satisfies the IP. Then the structure as defined according to IIDC is the uniquely minimal structure that satisfies the IP.

Proof Let $G_d$ denote the structure defined by IIDC. We first show that any structure $G$ that satisfies the IP is a superstructure of $G_d$. Suppose, for the sake of contradiction, $G$ is not a superstructure of $G_d$. Then there is an arrow $X \to Y$ in $G_d$ but not in $G$. Since $G_d$ is defined according to IIDC, the presence of $X \to Y$ in $G_d$ means that there exist $x \neq x'$ and $z$ such that

$$P_{X:=x,\,Z:=z}(Y) \neq P_{X:=x',\,Z:=z}(Y),$$

where $Z = V \setminus \{X, Y\}$. But $G$ satisfies the IP, which means that

$$P_{X:=x,\,Z:=z}(Y) = P_{X:=x',\,Z:=z}(Y) = P(Y \mid \mathrm{Pa}_G(Y)),$$

because $X \notin \mathrm{Pa}_G(Y)$ (as $X \to Y$ is not in $G$). Hence we have a contradiction. So the original supposition is false, and $G$ is a superstructure of $G_d$.

Next, we show that $G_d$ satisfies the IP (if any structure does). Let $G$ be a minimal structure that satisfies the IP, that is, a structure that satisfies the IP such that no proper substructure of $G$ satisfies the IP. From the previous argument it follows that $G$ is a superstructure of $G_d$. We now show that $G = G_d$. Suppose not. Then $G$ is a proper superstructure of $G_d$. So there is an edge $X \to Y$ in $G$ which is not in $G_d$. Let $G'$ be the same structure as $G$ except that the edge $X \to Y$ is removed. We claim that $G'$ also satisfies the IP.

To show this, it suffices to show that

$$P(Y \mid \mathrm{Pa}_{G'}(Y)) = P(Y \mid \mathrm{Pa}_{G'}(Y), X) = P(Y \mid \mathrm{Pa}_G(Y)),$$

because other vertices have the exact same parents in $G'$ as they do in $G$. But we have that for any value $pa$ for $\mathrm{Pa}_{G'}(Y)$, and any $x \neq x'$,

$$\begin{aligned} P(Y \mid \mathrm{Pa}_{G'}(Y) = pa, X = x) &= P_{\mathrm{Pa}_{G'}(Y):=pa,\,X:=x,\,Z:=z}(Y) \\ &= P_{\mathrm{Pa}_{G'}(Y):=pa,\,X:=x',\,Z:=z}(Y) = P(Y \mid \mathrm{Pa}_{G'}(Y) = pa, X = x'), \end{aligned}$$

where $Z = V \setminus (\{X, Y\} \cup \mathrm{Pa}_{G'}(Y))$, and $z$ is any value for $Z$. This is true because otherwise $X$ would be a direct cause of $Y$ according to IIDC, and hence would be a parent of $Y$ in $G_d$. It follows that $P(Y \mid \mathrm{Pa}_{G'}(Y)) = P(Y \mid \mathrm{Pa}_G(Y))$, which implies that $G'$ also satisfies the IP. But $G'$ is a proper substructure of $G$, which contradicts our choice of $G$. So the supposition is false, and $G = G_d$.

In more plain words, Theorem 4 says that the structure defined by IIDC is the simplest structure that satisfies the intervention principle (if any acyclic structure satisfies the principle). So, if we view causal structure functionally as that which can be used for counterfactual reasoning according to the intervention principle, then the structure defined by IIDC stands out as the natural candidate for being the “true” causal structure. In other words, when the probability distribution is positive, there is an excellent pragmatic motivation for IIDC itself, and with it, the causal minimality condition.
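As a rough illustration of Theorem 4 (a sketch under stated assumptions, not part of the original argument), the following Python snippet sets up a hypothetical three-variable model with a strictly positive distribution, in which a structure G carries a redundant arrow W → Y. The variable names, the probabilities and the helper functions are all invented for the example. Applying the IIDC test to the post-intervention distributions that the IP assigns under G recovers a proper substructure of G.

```python
from itertools import product

# Hypothetical toy model over binary variables (all names and numbers are
# illustrative assumptions, not taken from the paper).
VARIABLES = ["X", "W", "Y"]

# A structure G that satisfies the IP but is not minimal: W is listed as a
# parent of Y although Y's conditional probability never varies with W.
PARENTS_G = {"X": [], "W": [], "Y": ["X", "W"]}


def prob_one(var, parent_values):
    """P(var = 1 | parents); every value lies strictly between 0 and 1."""
    if var == "X":
        return 0.3
    if var == "W":
        return 0.6
    # Y responds to X only; the value of W is deliberately ignored.
    return 0.8 if parent_values["X"] == 1 else 0.2


def post_intervention(target, setting):
    """P(target = 1) after setting every other variable, as given by the IP:
    the result is P(target | Pa_G(target)) at the intervened parent values."""
    parent_values = {p: setting[p] for p in PARENTS_G[target]}
    return prob_one(target, parent_values)


def iidc_arrows():
    """Arrows x -> y such that changing x changes y's distribution for some
    fixed setting z of all remaining variables (the IIDC test)."""
    arrows = set()
    for x, y in product(VARIABLES, repeat=2):
        if x == y:
            continue
        rest = [v for v in VARIABLES if v not in (x, y)]
        for z in product([0, 1], repeat=len(rest)):
            fixed = dict(zip(rest, z))
            p0 = post_intervention(y, {**fixed, x: 0})
            p1 = post_intervention(y, {**fixed, x: 1})
            if abs(p0 - p1) > 1e-12:
                arrows.add((x, y))
                break
    return arrows


arrows_g = {(p, child) for child, pas in PARENTS_G.items() for p in pas}
arrows_d = iidc_arrows()
print(sorted(arrows_d))     # the IIDC-defined structure G_d
print(arrows_d < arrows_g)  # proper-subset test: is G_d strictly inside G?
```

In this toy case the redundant arrow plays the role of the edge removed to obtain G′ in the proof: dropping it leaves every post-intervention distribution unchanged.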
4 Conclusion

To summarize, the causal minimality condition in the theory of causal Bayesian networks is very well motivated in an interventionist account of causation. The connection can be seen in two ways when the actual distribution is positive. On the one hand, the condition is a logical consequence of the IIDC and the IP, the latter of which is but a combination of the interventionist notion of modularity with the causal Markov condition. On the other hand, it follows from the IP alone that the causal minimality condition would be a completely harmless assumption if all we care about is the correctness of counterfactual reasoning. The latter line of argument can be extended to situations where the actual distribution is not strictly positive, as we showed in Sect. 3. Although the technical results there may be further improved, the philosophical message should be clear: even when the causal minimality condition fails, assuming it in the inference of causal structures would be fine, if we put necessary restrictions on the subsequent use of inferred causal structures for counterfactual reasoning.

This point is general. The epistemology of causation has two sides: hunting causes and using them, to borrow the title of a recent book (Cartwright 2007). In evaluating a method for hunting causal structure, or its underlying assumptions, it is important to take into account how the inferred structure will be subsequently used. Conversely, how the structure is learned is also crucially relevant to how it should be used. The risk of false assumptions in hunting causes may be controlled in using them.

Finally, we should note that although the causal minimality condition clearly adds something to the causal Markov condition, the exact power and value of the condition for causal inference are not completely understood. One thing that is clear is that the condition is much less powerful than the better known and logically stronger causal faithfulness condition. However, it has been established recently that, assuming the causal minimality condition, a significant portion of the causal faithfulness condition is empirically testable (Zhang and Spirtes 2008). For this reason at least, it is practically valuable to provide independent justifications for the causal minimality condition, regardless of whether it is also appropriate to assume the causal faithfulness condition.
Acknowledgements We thank Clark Glymour, Christopher Hitchcock, James Woodward, and two anonymous referees for valuable comments. Versions of this paper were presented at the twenty-first Philosophy of Science Association biennial meeting and the monthly Monday meetings of formal epistemology at the University of Konstanz. We thank the audience, especially Michael Baumgartner, Franz Huber, and Wolfgang Spohn, for helpful criticisms and discussions.
References

Campbell, J. (2007). An interventionist approach to causation in psychology. In A. Gopnik & L. Schulz (Eds.), Causal learning: Psychology, philosophy and computation (pp. 58–66). New York: Oxford University Press.
Cartwright, N. (2002). Against modularity, the causal Markov condition and any link between the two: Comments on Hausman and Woodward. British Journal for the Philosophy of Science, 53, 411–453.
Cartwright, N. (2007). Hunting causes and using them: Approaches in philosophy and economics. Cambridge: Cambridge University Press.
Hausman, D., & Woodward, J. (1999). Independence, invariance and the causal Markov condition. British Journal for the Philosophy of Science, 50, 521–583.
Hausman, D., & Woodward, J. (2004a). Modularity and the causal Markov condition: A restatement. British Journal for the Philosophy of Science, 55, 147–161.
Hausman, D., & Woodward, J. (2004b). Manipulation and the causal Markov condition. Philosophy of Science, 71, 846–856.
Hitchcock, C. (2002). Probabilistic causation. In E. Zalta (Ed.), Stanford Encyclopedia of philosophy. http://plato.stanford.edu/entries/causation-probabilistic/.
Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge: Cambridge University Press.
Spirtes, P., Glymour, C., & Scheines, R. (1993). Causation, prediction and search. New York: Springer (2000, 2nd ed., Cambridge, MA: MIT Press).
Steel, D. (2006a). Methodological individualism, explanation, and invariance. Philosophy of the Social Sciences, 36, 440–463.
Steel, D. (2006b). Comment on Hausman and Woodward on the causal Markov condition. British Journal for the Philosophy of Science, 57, 219–231.
Woodward, J. (2003). Making things happen: A theory of causal explanation. Oxford: Oxford University Press.
Zhang, J., & Spirtes, P. (2008). Detection of unfaithfulness and robust causal inference. Minds and Machines, 18(2), 239–271.
Synthese (2011) 182:349–374 DOI 10.1007/s11229-010-9745-z
A dialogue system specification for explanation Douglas Walton
Received: 4 February 2010 / Accepted: 9 April 2010 / Published online: 22 April 2010 © Springer Science+Business Media B.V. 2010
Abstract This paper builds a dialectical system of explanation with speech act rules that define the kinds of moves allowed, like requesting and offering an explanation. Pre and post-condition rules for the speech acts determine when a particular speech act can be put forward as a move in the dialogue, and what type of move or moves must follow it. A successful explanation has been achieved when there has been a transfer of understanding from the party giving the explanation to the party asking for it. The dialogue has an opening stage, an explanation stage and a closing stage. Whether a transfer of understanding has taken place is tested by a dialectical shift to an examination dialogue.

Keywords Argumentation · Formal dialogue system · Artificial intelligence · Examination dialogue · Models of explanation · Scripts · MOPs · Scientific explanation

D. Walton (B) University of Windsor, Windsor, ON, Canada e-mail: [email protected]

Dialogue models of argumentation of the kind developed in Walton and Krabbe (1995) are now proving their worth as tools useful for solving many problems in argumentation studies, artificial intelligence, and multi-agent systems. Many formal dialogue systems have been built (Bench-Capon 2003; Prakken 2005, 2006), and through their applications (Verheij 2003), we are getting a much better idea of the general requirements for such systems, and how to build them. Reed (2006) has provided a dialogue system specification that enables anyone to construct a formal dialogue model of argumentation by specifying its components and how they are combined (Reed 2006, p. 26). This dialogue system specification provides a more convenient method for setting up formal dialogue systems of kinds that are useful for modeling argumentation in computing and that have been built and are currently being built for various applications. According to the argument of this paper, a variant on Reed’s dialogue system specification can also be applied to dialogue systems for explanation, and it offers a logical and philosophical basis for the notion of explanation employed in case-based systems of explanation (Leake 1992; Schank et al. 1994).

Dialogue models of explanation in computing are based on examples of dialogical sequences of questions and answers in which one party tries to explain to another how some machinery works (Cawsey 1992; Moore 1995). The dialogues incorporate user feedback that enables the explanation process to recover from misunderstandings. A more abstract prototype dialogue theory of explanation CE has been built in Walton (2007a). According to this theory, both asking for and providing an explanation consist of special types of moves (speech acts) that have pre and post condition rules in dialogues. This paper builds on these models, and extends them in a particular direction, especially by solving one central problem. The problem can be posed by getting a first rough idea of how the sequence of events in the dialogue system of explanation will generally run.

• Both parties know about some account, a coherent story about an event, for example, and understand the account generally.
• However, the one party finds an anomaly in the account, something that she does not understand, and assumes that the explainer understands it and can explain it.
• She asks a question requesting an explanation, and he replies by attempting to give an explanation.
• Either the explanation is successful in transferring understanding or not.
• If it is successful the dialogue stops.
• If the attempt is not successful, the dialogue may stop, but it may also be useful for it to continue, depending on the circumstances at that point.

One of the main problems concerns the last event in the sequence.
We don’t want the dialogue to go on forever, but we want to leave it open enough so that the explanation offered can be tested and repaired, so it might eventually culminate in a successful explanation. So how do we set the right conditions for the termination of the dialogue so that this need for flexibility can be accommodated by conditions for closure that are precise and workable? This will be the main problem set for the specification system for explanation dialogue built in the paper, but because of the fundamental and interdisciplinary nature of the topic, other problems arise for which there is little space for discussion. A short account of the main unsolved problems is given at the end.

1 Two examples

We begin this section with two examples of explanations of the kind that might be classified under the category of everyday explanations that we all encounter and use on a daily basis in conversational exchanges. These examples give the reader an idea of the target we are aiming at in providing a theory of explanation. The first example, an explanation by a science teacher to an audience of students (Unsworth 2001, p. 589), is used in science education. The explanation assumes that the students can be expected to know that coal is widely used as an energy source,
that it is black and fairly hard, and that it is found in the earth. It also assumes that the students may not be familiar with the process of how coal is formed in the earth. Here is the explanation given by the teacher to the students: “Coal is formed from the remains of plant material buried for millions of years. First the plant material is turned into peat. Next the peat turns into brown coal. Finally the brown coal turns into black coal”. The explanation is concise, but it relies on some other implicit elements as well, in addition to the one already mentioned. It is assumed that the students know what coal is, that they know what plant material is, that they know what peat is, and that they know that one material can change into another in the earth. The anomaly for the students that gives rise to their lack of understanding is that they also know that plant material is soft and brown, whereas they know that coal is hard and black. How could something that is hard and black come from something that is soft and brown? It is this anomaly that provokes the need for an explanation. Showing the intervening link of the peat helps the students come to understand enough about the process so that the anomaly is resolved. If not, and they ask further questions, very likely the science teacher can tell them more about the process, assuming that he or she has further scientific knowledge about the subject they lack. In the second example, somebody asks why the radiators are usually located under windows in a room, when windows are the greatest source of heat loss. The following explanation is offered. The windows are the coldest part of a room and when air in the room comes in contact with them, it falls to the floor. The cold air from the window is heated when it passes the radiator, then it rises and a moving current of air continuously circulates around the room. 
If the radiator were placed against an inside wall, that inside area of the room would stay warmer than the coldest part of the room, the area where the windows are. We would have a noticeable temperature difference in the two areas that would not be comfortable for those in the room.

Here again, the explainer assumes that the two of them share common knowledge about many implicit assumptions not stated in the explanation as given. For example, the explainer assumes that the questioner already knows that when warm and cold air are combined in an enclosed space, the warm air tends to rise and the cold air tends to fall. The question presents an anomaly. If the windows are the greatest source of heat loss, then putting the radiators under the windows in a room would seem to be wasteful of energy. So why is it so commonly done? To grasp the anomaly, you have to be aware of the common knowledge that building practices generally avoid doing things that are wasteful of energy. The respondent, in his explanation, puts forward a connected account showing how placement of the radiator under the window in a room generally leads to a convection current that circulates the warm and cold air around the room, mixing it together and providing a moderate temperature throughout the room that makes it comfortable for the people in it. Just as in the first example, the person offering the explanation expects that the person to whom the explanation was directed already knows quite a bit about a kind of situation familiar to both of them. The question expresses an anomaly posed by the situation of the hot radiator under the window causing a lot of heat to be wasted, if the windows are the greatest source of heat loss. This doesn’t make sense because conservation of energy is a well-known goal
in designing human habitation. Unnecessary heat loss is a bad thing, and so why the normal placement of radiators would lead to such apparently unnecessary heat loss is puzzling. The explanation solves the puzzle by giving an account of heat circulation in a room, showing that the heat loss is not as great as the questioner initially appeared to assume, and that putting the radiators elsewhere in the room would have negative consequences.

The aim of this paper is to build a dialectical system of explanation, primarily meant to be applicable to everyday examples like these two, out of the following components.

• Opening Move: this move starts the explanation process when a request for an explanation is made by one party.
• Speech Act Rules: these rules define the different speech acts (kinds of moves) that are allowed in the dialogue.
• Pre and Post Condition Rules: these rules determine, respectively, (a) the conditions under which a speech act can be put forward as a move in the dialogue, and (b) which type of move (or moves) must follow it.
• Success Criterion: it determines when an explanation is successful, i.e. when transfer of understanding can be taken to have been achieved.
• Closing Move: this point occurs either when the explanation that was offered is successful, or when no explanation can be given, and therefore the dialogue should end. The former occurs when the dialogue has proceeded through a testing stage (if required) showing that the success criterion has been met.
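To make the role of these components more concrete, here is a minimal Python sketch of how an opening move, speech act rules, and pre and post condition rules might be encoded. It is only an illustration under stated assumptions: the inventory of speech acts and all identifiers are hypothetical, and the testing stage is left out, so the explainee's signal of understanding simply stands in for the success criterion.

```python
from enum import Enum, auto


class Act(Enum):
    """Speech acts (kinds of moves); this particular inventory is hypothetical."""
    REQUEST_EXPLANATION = auto()      # opening move
    OFFER_EXPLANATION = auto()
    CANNOT_EXPLAIN = auto()
    SIGNAL_UNDERSTANDING = auto()
    SIGNAL_NO_UNDERSTANDING = auto()


# Post-condition rules: for each act, the acts allowed as the next move.
# Read the other way round, the same table gives the pre-conditions: an act
# may be put forward only if the previous move lists it as a permitted reply.
PERMITTED_REPLIES = {
    Act.REQUEST_EXPLANATION: {Act.OFFER_EXPLANATION, Act.CANNOT_EXPLAIN},
    Act.OFFER_EXPLANATION: {Act.SIGNAL_UNDERSTANDING, Act.SIGNAL_NO_UNDERSTANDING},
    Act.SIGNAL_NO_UNDERSTANDING: {Act.OFFER_EXPLANATION, Act.CANNOT_EXPLAIN},
    Act.SIGNAL_UNDERSTANDING: set(),  # closing move: explanation taken as successful
    Act.CANNOT_EXPLAIN: set(),        # closing move: no explanation can be given
}


def is_legal(moves):
    """Check a sequence of moves against the opening and reply rules."""
    if not moves or moves[0] is not Act.REQUEST_EXPLANATION:
        return False
    return all(nxt in PERMITTED_REPLIES[prev] for prev, nxt in zip(moves, moves[1:]))


def is_closed(moves):
    """A legal dialogue is closed when its last move permits no reply."""
    return is_legal(moves) and not PERMITTED_REPLIES[moves[-1]]


dialogue = [
    Act.REQUEST_EXPLANATION,
    Act.OFFER_EXPLANATION,
    Act.SIGNAL_NO_UNDERSTANDING,
    Act.OFFER_EXPLANATION,
    Act.SIGNAL_UNDERSTANDING,
]
print(is_legal(dialogue), is_closed(dialogue))
```

Encoding the rules as a reply table keeps the pre and post conditions in one place; a fuller model would presumably also track each party's knowledge and add the testing stage before closure.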
As indicated by the success criterion, a successful explanation has been achieved when there has been a transfer of understanding from the party giving the explanation to the party asking for it. The purpose of offering an argument to another party is to give the other party a reason to accept a claim doubted by that other party. It is a proposition that is at issue, or is unsettled. The purpose of offering an explanation is to help the other party who indicates by his questioning that he doesn’t understand something. If the explanation is to be helpful, it should help the questioner to come to understand something that he did not understand before. A successful explanation should make the questioner come to understand, by relating what he fails to understand to what he already understands. This statement of the goal of an explanation is a normative ideal, however. In real instances one party can mislead the other by giving an explanation that she knows to be wrong, or by accepting an inadequate explanation. Alternatively the party who receives the information may say she understands, or may even think she understands, but be wrong. It is assumed in the model that both participants will follow rules for co-operative dialogue, but as we will see, this Gricean assumption can be violated in real instances of explanations, and so real cases need to be tested for success. But there are some hard questions posed by this way of defining the notion of explanation. How is it to be determined when such a transfer has taken place? What is understanding? This question seems like an especially hard one, as it could be rephrased as, ‘How can we understand understanding?’ Another question is how it can be tested whether an explanation is successful. There are some important limitations to the scope of the paper. One is that there is not enough space to apply the system to an extensively developed set of case studies of
real explanations found in texts of everyday discourse of the kind that can be found in Cawsey (1992), Moore (1995) and Leake (1992). The other is that although studying explanations in special fields is an important part of the topic, there is no space here to include topics like scientific explanation and historical explanation. However, in the problems for further research section, there are suggestions for further research on these matters, and some problems are posed that suggest how to extend the findings of the paper in these directions.
2 Basic components of an explanation dialogue

Von Wright (1971) described explanations that convey understanding of an action or event. Understanding, in this sense, should not be taken to refer merely to a feeling of personal confidence that one has understood something. Since then the notion of understanding has become a component in case-based explanation in artificial intelligence (Schank 1986; Schank and Abelson 1977; Schank and Riesback 1981; Schank et al. 1994). These case-based models of explanation are dialectical in that they involve a transfer of understanding between two parties who can communicate with each other. They also involve a sense of ‘understanding’ that is reconstructive in the sense that one party in a dialogue can use understanding of familiar situations to fill gaps in the understanding of another. In this sense, understanding should be taken to have a dialectical meaning that can be modeled in a framework of two parties reasoning together who share some common knowledge about how things normally go in stereotypical situations. To grasp this dialectical sense of understanding, we look to the formal dialogue models used to represent various aspects of argumentation (Reed 2006). To grasp how transfer of understanding can be modeled in a formal rule-governed dialogue structure, we need to build on Hamblin’s notion of the commitment store of a participant in a dialogue as analyzed in Walton and Krabbe (1995). As each partner in a dialogue makes a move, statements are inserted into his/her commitment store, or deleted from it. For example, if a party asserts statement A, then A is inserted into her/his commitment set. A commitment store is basically just a set of statements, but inferences can be drawn from these statements representing implicit commitments. If an agent is committed to one statement, then the other party to the dialogue can often assume justifiably that he must be committed to other related statements as well.
Of course, she can always ask him. But in many cases she can assume that he is committed to some statement indirectly, based on what he said. For example, suppose Bob went to a pizzeria and ordered a pizza. It can normally be assumed that he is committed to paying for the pizza before he leaves the pizzeria. Also, the retraction of one commitment often requires a stability adjustment, meaning that other statements implying this commitment will also have to be retracted in order to preserve consistency (Walton and Krabbe 1995, pp. 144–149). In a rigorous persuasion dialogue (RPD), the moves and responses are restricted tightly by the rules so that what is allowed is precisely indicated as a small number of options at each move. For example, only yes–no questions can be asked, and the only answer allowed is yes or no. In a permissive persuasion dialogue (PPD), participants have more choices in what kinds of moves they can make at each turn, and how many things they can say at a given move (Walton and Krabbe
1995, p. 126). Also, responses to a previous move are less strictly determined. For example, a party may be allowed to put forward an argument and ask a question at the same move. In either type of dialogue, commitment sets do not always have to be consistent, but if one party’s commitment set can be shown by the other party to be logically inconsistent, the first party needs to remove the inconsistency, and perhaps also retract other commitments related to it. The rules governing the operations of commitment sets in the Walton and Krabbe systems are used as a basis in this paper to show a way toward representing transfer of understanding in an explanation dialogue. At the beginning of an explanation dialogue, each party is assumed to have a knowledge base that operates more or less like a commitment store in an argumentation dialogue. Each knowledge base is a set of statements, including particular statements and general statements that can act as rules to draw inferences by applying to other statements. The participants must also share a common knowledge base containing general and particular common knowledge about the event that is to be explained. This common knowledge base contains common-sense procedural knowledge that enables a language user to understand how things typically happen in stereotypical situations, enabling her/him to fill in missing elements not explicitly stated in a given text of discourse. These commonly known normal ways of doing things in familiar situations were codified in early work in AI (Schank and Abelson 1977) using what they called scripts, based on the theory that much common sense reasoning is based on unstated assumptions in a text of discourse that can be added in to fill gaps to make chains of reasoning explicit. Their standard illustration is the restaurant example, consisting of the following set of seven explicit statements. (1) John went to a restaurant. (2) The hostess seated John. (3) The waitress gave John a menu. 
(4) John ordered a lobster. (5) He was served. (6) He left a tip. (7) He left the restaurant. The account implicit in this set of statements can be made explicit by filling in gaps by drawing plausible inferences. We can infer defeasibly that lobster was listed on the menu. Maybe it was a special item not listed on the menu, and the waitress told John about it. Still, from statements 3 and 4 in the list, we can derive the implicit statement by inference that lobster was listed on the menu. Normally restaurant customers get their information about what is available from the menu they are given. It is also reasonable to infer defeasibly that John ate the lobster. We can fill in gaps by inserting implicit statements based on implicit assumptions about the normal ways of doing things when a person goes to a restaurant. A more flexible way to represent familiar routines that represent common knowledge is to use smaller modules called MOPs, or memory organization packages (Schank 1986). These also represent stereotyped sequences of events, but are smaller than scripts and can be combined in a way that is appropriate for the situation when they are needed. For example, the space launch MOP includes a launch, a space walk and a re-entry (Leake 1992, p. 73) as parts of a package of connected events. MOPs are used in case-based reasoning (CBR), a pragmatic approach to explanation used in AI. CBR is the process of solving new problems based on the solutions of similar past problems. A mechanic who fixes an engine by recalling the cases of another car with a similar problem uses CBR. Scripts and MOPs can be used to build or amplify what is here called an account or is often called a story, a connected sequence of events or
actions that hangs together, is ordered as a sequence, and that contains gaps that can be filled in. A special type of account commonly found in everyday explanations is the anchored narrative, described by Wagenaar et al. (1993) in their theory of anchored narratives. If a questioner raises doubts about such an account, the answerer can support the acceptability of the account by giving reasons or “anchors” that ground the account in some independent facts or considerations that support it. The notion of an anchored narrative is more complex than that of a script, because it also involves justifying parts of the account that are questionable, or may even be dubious. In such a case, the explanation that was given may need to be filled out by making implicit parts of it explicit, and some parts of it may also have to be justified by producing arguments to back them up. Here we are dealing not just with explanations, but also with arguments used to support an explanation. This aspect will turn out to be important later. Each participant’s understanding of the anomaly being discussed will change and evolve over the course of a dialogue. At the beginning of an explanation dialogue both participants share a common knowledge base containing the MOPs needed for the explanation queries and attempts that will follow. As the part of the dialogue where the explanation is asked for and provided proceeds, MOPs will be brought forward from the knowledge base that was there at the beginning. The MOPs are inserted for use by the participants and deleted when they are not in use. Hence they operate in a way comparable to the way that commitment stores operate in an argumentation dialogue. The MOPs at the beginning of a dialogue represent the way things can normally be expected to go in kinds of situations that are familiar to both parties.
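The gap-filling role of scripts in the restaurant example discussed earlier can be sketched in a few lines of Python. This is a hedged illustration only: the rule format, the function name and the choice of trigger statements for the second rule are assumptions made for the example, not part of Schank and Abelson's formalism. Each rule adds an implicit statement defeasibly, so the inference is withdrawn when there is information to the contrary.

```python
# Explicit statements of the restaurant story, in the order given in the text.
STORY = [
    "John went to a restaurant",
    "The hostess seated John",
    "The waitress gave John a menu",
    "John ordered a lobster",
    "He was served",
    "He left a tip",
    "He left the restaurant",
]

# Hypothetical script rules: if all trigger statements are in the account,
# the conclusion may be added as a defeasible (retractable) assumption.
SCRIPT_RULES = [
    ({"The waitress gave John a menu", "John ordered a lobster"},
     "Lobster was listed on the menu"),
    ({"John ordered a lobster", "He was served"},
     "John ate the lobster"),
]


def fill_gaps(explicit, rules, defeaters=frozenset()):
    """Return the implicit statements licensed by the rules, skipping any
    conclusion that is defeated by information to the contrary."""
    account = set(explicit)
    implicit = []
    for triggers, conclusion in rules:
        if triggers <= account and conclusion not in defeaters:
            implicit.append(conclusion)
    return implicit


print(fill_gaps(STORY, SCRIPT_RULES))
# If the waitress mentioned an off-menu special, the first inference is withdrawn.
print(fill_gaps(STORY, SCRIPT_RULES, defeaters={"Lobster was listed on the menu"}))
```

A MOP-based variant would presumably split such rules into smaller packages that are inserted into and deleted from the shared knowledge base as the dialogue proceeds.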
During the later part of the dialogue one party puts forward an account of something that happened, a kind of story that may or may not be true in reality, but that neither party wants to dispute. The other party may find something puzzling in the account, something that does not look normal or quite right, and ask for an explanation of the perceived anomaly. As the dialogue proceeds, statements will be inserted into or deleted from each party’s knowledge base as each of them makes moves in the dialogue. What triggers the need for an explanation is that one party fails to understand something in the account the other is taken to understand. Then the other party is expected to amplify the account in a way that will provide the required understanding. CBR explanation systems have already been implemented that roughly fit the dialogue framework so far sketched out. For example, ACCEPTER (Leake 1992) is a computer system for story understanding, anomaly detection and explanation evaluation. Explanations are directed towards filling knowledge gaps revealed by anomalies. ACCEPTER has two special features (xii). (1) Explanations are built from uncertain inferences based on plausible reasoning. (2) Context (including explainer beliefs and goals) is crucial to explanation evaluation. The examples of stories processed by ACCEPTER include the death of a race horse, the death of a basketball star, the explosion of the space shuttle Challenger, the recall of Audi 5000 cars for transmission problems, a fictional story about a lame racehorse that wins a race, and an account of an airliner that leaves from the wrong departure gate (Leake 1992, p. 38). Although ACCEPTER fits some parts of the explanation dialogue system built below, it does
not fit all of them. Some problems in building the system will especially bring out features that arguably do not fit with ACCEPTER. In this paper, the aim is not to build a formal dialectical model representing any particular type of explanation, nor is it to build an implemented explanation system for computing like ACCEPTER. Rather the aim is to build a general stencil or format, a dialogue system specification for explanation. Reed (2006) has already specified the general requirements for a dialogue system specification as follows. A dialogue is a set of moves from a first one to a last one, where the two parties (in the simplest case) take turns making moves. The system needs to set out what locutions (speech acts) are permitted for the participants to make at each move. The pre-conditions are the conditions that must be met before one of the locutions can be legally uttered. The specification also needs to set out conditions defining what counts as an acceptable reply (next move) to any given type of move. These are called the post-conditions of a move. A dialogue system can be captured completely according to Reed (2006, p. 26) by specifying the pre-conditions and post-conditions of every possible locution, along with two other factors. One is the set of rules governing the participants’ commitment stores and the other is a list of the termination states of the dialogue: “Pre and post conditions can be completely specified by listing those dialogic obligations, commitment store entries and structural conditions that their locutions depend upon or establish”. Reed’s specifications are intended to apply to formal dialogue systems for argumentation, and the question is whether comparable conditions can be adapted to a dialogue system for explanation. We model all the types of dialogue as having three stages, an opening stage, an argumentation stage and a closing stage.
The model of explanation dialogue proposed here will have three corresponding stages, an opening stage, an explanation stage and a closing stage. The goal of an explanation type of dialogue is for there to be a transfer of understanding from the one party to the other. At the opening stage, the participants agree to take part in a certain type of dialogue, and to follow the rules and conventions of the dialogue, which they both understand and accept. At this stage, it should be clear, for example, that they are engaging in an explanation dialogue, as opposed to some other type of dialogue like an argumentation dialogue, or some dialogue in which information is simply to be exchanged. During the explanation stage, a request for an explanation is made, and then the other party responds to the request. Following these moves, the two parties make other kinds of moves that are ideally supposed to lead to the closing stage, where the explanation is judged to be successful or not. In the explanation dialogue system CE of Walton (2007a), the closing stage had two rules. The first states that if the explainee makes the reply ‘I don’t understand’ in response to an explanation offered by the explainer, the dialogue can continue. The second rule states that if the explainee makes the reply ‘I understand’ in response to an explanation, the dialogue ends at that move. This attempt to provide closure rules was based on the assumption that the criterion for the successful completion of the dialogue is the explainee’s being satisfied with the explanation given by the explainer. The problem with this criterion is that the explainee could be faking, or could simply be mistaken. Even though he says he now understands what he formerly did not, this may simply not be true. Even though he has the psychological feeling that he understands, it may well be that he does not really understand the explanation that was
offered. In other words, we need a better test for the success of an explanation than its being acceptable or feeling right to the explainee.

3 The shift to examination dialogue

Scriven (1972, p. 32) provided a different way of testing the success of an explanation in the following quoted remark, expressed in the form of a dialogue. “How is it that we test comprehension or understanding of a theory? We ask the subject questions about it, questions of a particular kind. They must not merely request recovery of information that has been explicitly presented (that would test mere knowledge, as in knowing the time or knowing the age of the universe). They must instead test the capacity to answer new questions.” Based on this remark, we now formulate Scriven’s Test: the success of an explanation is judged by the explainee’s capacity to answer new questions, shown in an extension of the dialogue sequence where probing questions are put to the explainee. Using Scriven’s test for the success of an explanation, the closure rules for CE need to be modified. The explainee needs to show real understanding, and not merely claimed understanding. But how is real understanding to be judged? How can Scriven’s test be implemented in some method that would tell us when real understanding has been achieved so that the explanation can be judged to have been successful? The proposal made here is to use something called an examination dialogue. The examination dialogue is embedded into the original explanation type of dialogue to provide a continuation of it in which the explanation offered and accepted in the explanation dialogue is tested. Examination discourses (peirastikoi logoi) were defined by Aristotle (1928) in On Sophistical Refutations (165b4–165b6) as consisting of questions and replies designed to test an answerer’s claims to knowledge.
Such a dialogue is “based on opinions held by the answerer and necessarily known to one who claims knowledge of the subject involved.” The aim of this kind of dialogue, according to Aristotle (On Sophistical Refutations 172a33), is to “attempt to test those who profess knowledge.” Socrates’ use of his skills of examination in the Platonic dialogues provides the classic examples. Lawyers are familiar with the use of examination skills in trials, for example in questioning an expert witness. But we also need to use examination skills in practical affairs of everyday life. For example, this type of dialogue takes place when you communicate with your physician, or other expert advisers, when they give you advice or recommend a particular course of action when you are trying to decide what to do.

An analysis of the structure of examination dialogue was presented in Walton (2006). Examination dialogue was shown to have two goals, the extraction of information and the testing of the reliability of this information. The first goal is carried out by the asking of questions in order to obtain information from the respondent, and by an exegetical function used to obtain a clear account of what the respondent means to say. The testing goal is carried out with critical argumentation used to judge whether the information elicited is reliable. To perform this function, the information is tested against the respondent’s other statements, known facts in the case, and other
information thought to be true. This type of dialogue was shown in Walton (2006) to be most prominent in law and in both legal and non-legal arguments based on expert opinion. It was also shown to be central to dialogue systems for questioning and answering in expert systems in artificial intelligence. The examples studied also included exegetical analyses and criticisms of religious and philosophical texts as well as legal examinations and cross-examinations conducted in a trial setting. Dunne et al. (2005) have built a formal model of examination dialogue in which one party, called the questioner, elicits statements from another party called the responder. The questioner has the aim of discovering the responder’s position on some topic being discussed. The questioner may do this either to gain insight into the responder’s understanding of the topic, or to expose an inconsistency in the responder’s position. Their system is designed to model the process in which one party scrutinizes the other party’s position to reveal internal inconsistencies in it. The examiner wins if she shows that the responder is committed to an inconsistency. This finding is achieved if the party being questioned replies that he denies a particular proposition or has no comment on it, but then the examiner shows that he has already revealed through his previous replies, or by evidence already accepted in the case, that he is committed to this proposition. According to their classification, examination dialogue is embedded in an information-seeking dialogue, and it is also seen, in some cases, as a prelude to persuasion dialogue (Dunne et al. 2005, p. 1560). Further work (Bench-Capon et al. 2008) has shown how commitment in examination dialogue can be modeled using value-based argumentation frameworks. There can be dialectical shifts, or changes of context from one type of dialogue to another during the same continuous sequence of argumentation (Walton and Krabbe 1995). 
Consider the case of a contractor and a homeowner engaged in negotiation dialogue on a proposal to install a concrete basement in a house where the contractor begins to inform the homeowner about the city regulations on thickness of concrete for house basements. The standard example (Parsons and Jennings 1997) is the case where two agents have a joint intention to hang a picture. One has the picture and a hammer, and knows where the other can get a nail. They have a deliberation dialogue but can’t agree on who should do which task. They then shift to a negotiation dialogue in which the one agent proposes that he will hang the picture if the other agent will go and get the nail. There can be many different kinds of dialectical shifts of this kind in everyday discussions. In some cases, the new dialogue contributes to the success of the previous one. This kind of case is classified as a functional embedding of the one dialogue into the other. In other cases, the one dialogue is an interruption in the first one, but there is no serious problem because the first dialogue can easily be resumed once the second one has finished. However, in some cases, the advent of the second dialogue blocks the progress of the first one, or seriously interferes with it, and presents a serious obstacle to its progress. These kinds of cases are classified as illicit dialectical shifts (Walton and Krabbe 1995). However, the shift in examination dialogue from information-seeking to persuasion dialogue, of the kind noted by Dunne, Doutre and Bench-Capon, is an embedding of a highly typical and especially significant sort. It was shown in Walton (2006) that examination dialogue can be of two basic types, and each one was named after terms used in Greek philosophy. Guthrie (1981, p. 155) drew a distinction between two types of examination, defining peirastic discussion
as “testing or probing” and exetastic discussion as “examining critically”. Guthrie described the distinction between these two types of examination as a component of the Aristotelian method of dialectical discussion used for testing and investigating (p. 155). In the peirastic type, the aim is merely to get an account representing what the respondent is supposedly claiming, based on the available textual evidence of the discourse. In this type of dialogue the one party in a dialogue tries to make sense of what the other has said by interpreting and reconstructing what was said. The exetastic type is more argumentative. The questioner probes into the weak points of the answerer’s account, asking critical questions, and even questioning statements and implicit assumptions in the account. The aim of this process is to reveal implausible statements, internal inconsistencies, logical weaknesses and gaps in the account. Both types of examination can be used to test an explanation, but the second type is the harder test to pass.

The goal of an explanation dialogue is for there to be a transfer of understanding from the one party to the other. This goal defines what it is for an explanation attempt to be successful in that type of dialogue. It is assumed that both parties accept this goal as part of the opening stage when they agree to take part in an explanation. This implies that both parties desire a transfer of understanding to take place, and that both will be co-operative in politely following the rules of the dialogue. The general goal of an examination dialogue is quite different. Its twin goals are to extract information from the respondent and to test the reliability of this information. Examination dialogue is more adversarial than explanation dialogue. The examiner uses questions to test the reliability of the information obtained from the respondent.
To carry out such a test one means at the disposal of the questioner is to try to trap the respondent into committing to an inconsistency, or into committing to a statement that is not plausible. These moves may make the respondent look foolish, or may even make it appear that the respondent is lying. Thus examination dialogue can become quite aggressive in some instances and even appear to be hostile. In some instances it even shifts to interrogation dialogue, an even more adversarial type of dialogue with different goals (Walton 2003). An interesting aspect of explanation dialogue to study concerns cases where the dialogue goes wrong, and participants show that they are not well-intentioned or cooperative. These include cases where one party seeks to mislead the other, either by giving a false explanation or by accepting an inadequate one. They even include cases where one party seeks to maliciously waste the other party’s time and energy by being whimsical or acting capriciously. These same sorts of difficulties can occur in the applicability of explanation models to computer systems. The computer system, even if it is designed and created with good intentions, may be bug-ridden, and so act in a manner that appears irrational or malicious to an independent observer. In other cases, something disguised as an explanation may really function as a different type of dialogue. In recommender systems, something that is offered as an explanation to the user may really be an attempt to sell something to him by guiding him to the purchase of a product that is available online. For these reasons, examination dialogue can provide a means of testing whether a transfer of understanding has really taken place or not in an explanation dialogue. If the explainee merely desires to convince the explainer that such a transfer has taken
place when it has not, the explainer might be able to expose this failure by probing into the explanation by shifting to an examination dialogue. On the other side, if the explainer seeks to confuse, to obfuscate, to prevaricate, or even to intimidate the explainee, rather than to transfer understanding, the explainee can critically probe into the offered explanation to reveal the defects and problems in it, and possibly even reveal it as spurious. Hence we now turn to a consideration of how explanations can be tested to see whether they are really successful or not, by means of a shift to an examination dialogue.

4 The shift model and two objections

We now return to Scriven’s test, which says that the success of an explanation is judged by the explainee’s capacity to answer new questions, shown in an extension of the dialogue sequence where probing questions are put to the explainee. The hypothesis now put forward is that Scriven’s test can be implemented using the model of a dialectical shift from an explanation dialogue to an examination dialogue. We begin with a rough outline of how a sequence of explanation dialogue typically runs in the system, and is evaluated in it as successful or not, as shown in Fig. 1. The sequence begins with two requirements set at the opening stage of the dialogue. The first is the explainer’s offering an account, a set of assumed facts or accepted statements that are connected together by inferences (box 1 in Fig. 1). The second is that the explainee has found an anomaly in the account, something in it that he does not understand (box 2). Then (box 3) the explanation stage is set into motion, where the explainee asks a question asking for understanding of the anomaly, and the explainer offers an explanation that attempts to provide the requested understanding (box 4).
Fig. 1 Typical dialogue sequence in Explan. Boxes: 1 Account Presented by Explainer; 2 Explainee Fails to Understand Anomaly; 3 Explainee Asks for Understanding; 4 Explainer Offers Explanation; 5 Explanation Tested in Examination Dialogue Interval; 6 Test Passed; 7 Shift Back to Explanation Dialogue; 8 Transfer of Understanding; 9 Explanation Successful; 10 Test Failed; 11 Explanation Unsuccessful.

Then (box 5) there is a shift to a different type of dialogue in which the explainee’s comprehension of the explanation is tested by the explainer’s asking a series of probing questions designed to see if the explainee now understands the account or not. If
the test is passed (box 6), it can be taken that the required understanding has been achieved, and the dialogue can then shift back from examination to the main explanation dialogue (box 7). If transfer of understanding has been carried out (box 8), the explanation can be evaluated as successful (box 9). What happens if the test carried out during the examination interval is failed (box 10)? This shows that the explanation was unsuccessful (box 11). So now what should be done? Should the dialogue stop there? The solution shown in Fig. 1 is that the dialogue can be continued. The explainee can try to rephrase the question by indicating better what he failed to understand, in light of the previous examination dialogue (box 3). Then the explainer can offer a different explanation, modified to better suit the needs of the explainee (box 4). This explanation improvement cycle shown in Fig. 1, {3, 4, 5, 10, 11, 3}, can go around several times, as the two parties move collaboratively to better and better explanations until enough success has been achieved so that transfer of understanding has taken place. Failure occurs when the two parties remain stuck in this feedback loop because the examination dialogue keeps failing. In such a case, once the shift is made back to explanation dialogue, the explanation dialogue still fails. How can the parties break this failure cycle? An answer to this question will be given in the next section.

Before getting there, two objections to the shift model need to be replied to. The first objection is that confining explanations to attempts to resolve anomalies is too narrow, because in some instances of explanations, there is no anomaly, just something that the explainee cannot understand. For example, suppose a person gets a financial statement from his investment counselor about the current worth of his investments in a mutual fund, and the document is so complex that he cannot understand it.
There is no anomaly in the document that he can pinpoint, but still, he does not understand it. He phones the counselor and asks her how to make sense of what is in the document. Wouldn’t we say that he is asking for an explanation, and if so, doesn’t this show that an explanation does not have to be of an anomaly? To deal with such cases, a distinction needs to be drawn between an explanation and a clarification, as distinctive types of dialogues (Walton 2007b). Both explanation and clarification involve transfer of understanding from one party to another in a dialogue, but explanation can be of an event, or of an anomaly of any sort. A clarification dialogue occurs where one party has made some move in the dialogue, a verbal move or speech act, and there is something that is unclear to the second party. Then the second party, at his next move, declares that he does not understand what was said, and then he requests that the first party provide the understanding needed to remove the obscurity. The purpose of a clarification dialogue is to achieve clarity about something that is unclear (obscure) to the one party. Removing obscurity is one kind of transfer of understanding, but there are many other kinds as well. While an explanation responds to a perceived anomaly, a clarification responds to an obscurity. Another difference is that an explanation arises from an account, very often of some reported event, whereas a clarification arises from a previous message in a dialogue. Schlangen (2004, p. 137) brings out this point very well when he writes that what examples of clarification have in common is that unlike normal questions, they are “not about the state of the world in general, but about aspects of previous utterances”. Further work needs to be done on
giving an illustration, which appears to be different from explanation and clarification. These issues are included in the problems for further research section below. The second objection can be posed by imagining the hypothetical case of a science teacher who is excellent in every way, except that his knowledge base is riddled with falsehoods. His students, who know no better, accept his explanations, and let’s even assume that when examined on them, they answer the questions well, showing that they understand the explanations the teacher offered. The objection that might be raised when considering this hypothetical case is that it shows that Explan makes the success of an explanation too explainee-relative. The objection suggests that in addition to the success conditions, truth conditions should need to be met in order to make an explanation a good one. The solution to this problem is provided by the shift from the explanation dialogue to the exetastic type of examination dialogue when required. In this type of dialogue, questions are raised on whether the statements in the given explanation are true, or factually accurate. An exetastic dialogue, like an anchored narrative, is argumentative. It probes critically into the weak points in an account. It requests justifications (supporting arguments) for claims made. For example in a scientific explanation, this kind of examination includes consideration of whether the explanation in question fits with existing data, including the use of experimental results to test the explanation. Another part of the answer to this objection comes from considering objections arising from a second problem.
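The box sequence of Fig. 1 described earlier in this section, including the improvement cycle {3, 4, 5, 10, 11, 3}, can be traced with a short sketch. This is a hedged illustration only: the function name `run_fig1` is hypothetical, the outcome of each examination interval is supplied from outside as a boolean, and stopping when the supplied outcomes run out is an assumption standing in for whatever external limit ends a real dialogue.

```python
from typing import Iterable, List


def run_fig1(test_results: Iterable[bool]) -> List[int]:
    """Return the Fig. 1 box numbers visited, in order.

    Each element of `test_results` is the outcome of one examination
    interval (box 5). The walk ends at box 9 after the first passed test,
    or at box 11 once the supplied outcomes are exhausted.
    """
    path = [1, 2]                    # account presented, anomaly found
    for passed in test_results:
        path += [3, 4, 5]            # request, explanation attempt, examination
        if passed:
            path += [6, 7, 8, 9]     # test passed, shift back, transfer, success
            return path
        path += [10, 11]             # test failed, explanation unsuccessful
    return path


# One failed examination followed by a passed one goes once around the cycle.
trace = run_fig1([False, True])
```

A run in which every supplied outcome is `False` never leaves the loop, which is the failure cycle the text goes on to discuss.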

5 The problem of the failure cycle

A third problem is how the Explan system can deal with the failure cycle displayed in Fig. 1 depicting the typical sequence of dialogue in Explan. Failure occurs when the two parties remain stuck in this feedback loop because the examination dialogue keeps failing. In such a case, once the shift is made back to explanation dialogue, the explanation dialogue still fails. How can the parties break this failure cycle? The solution to this problem is to be found by incorporating a double dialectical shift from explanation dialogue to examination dialogue and then back again, and by providing a success criterion for the original explanation dialogue that can be achieved through the success of the intervening examination dialogue. The problem of the failure cycle {3, 4, 5, 10, 11, 3}, shown in Fig. 1, occurs where the examination dialogue interval turns out to be unsuccessful at point 11 in the sequence. What should happen here? For example, in the science teaching dialogue on coal, suppose the student examines the explanation offered by the teacher as well as she can, and the teacher answers her questions as well as he can, but the examination dialogue fails to throw any light on the explanation offered. The student is not convinced that the teacher’s explanation has stood up to critical scrutiny and concludes that the teacher does not know what he is talking about. The teacher is convinced that the student has not asked the right questions in her examination interval and still has not understood how his explanation has resolved the anomaly she questioned in it. Perhaps continuing to reopen the examination dialogue might eventually lead to success, but there needs to be some sequence of moves leading up to closure to solve
the problem of how to formulate the post-condition rules of Explan that were left open in Sect. 4. The solution to the problem lies in more fully formulating the closure conditions for examination dialogue when such a dialogue is embedded in an explanation dialogue. The criterion suggested by Scriven’s test is that the explainee must have proved her capacity to answer new questions, but we now have to add to this test. For the examination dialogue to be good enough to be closed off before the shift, both parties have to have performed well enough. The explainee has to have asked the right questions to show that she has understood the explanation well enough to probe into it critically, and the explainer has to have dealt with questioning well enough to show that he really knows what he is talking about. When this has taken place in a given case is discussed in Dunne et al. (2005) and Walton (2006). In real cases, however, there are often practical limits on the process imposed by costs and circumstances. The solution is provided by breaking the structure of an explanation dialogue into a characteristic sequence of fourteen substages leading to the closure of the explanation dialogue. The sequence is linear, up to substage 4, but then there is a choice point so that the sequence becomes a tree with two branches. The explanation dialogue can be closed off in two ways, depending on which branch is followed.

1. Explainer has put forward some account, a coherent story about an event.
2. Explainee finds an anomaly in the account, and assumes that explainer understands it and can explain it.
3. Explainee asks for an explanation of the anomaly, and explainer replies by attempting an explanation.
4. Explainee is satisfied with the explanation or not.
5. Either option can lead to a continuation of the dialogue.
6. If the explainee is not satisfied, she can ask further questions about the account.
7. This option leads to a continuation of the explanation dialogue where the explainer is questioned by the explainee.
8. If the explainee is satisfied, the explainer can ask further questions to test whether she really understands the account or not.
9. This option leads to a shift to an examination dialogue in which the explainee is questioned by the explainer.
10. If the examination dialogue is unsuccessful, then so is the original explanation dialogue.
11. The original explanation dialogue is now closed.
12. If the examination dialogue is successful, there is a dialectical shift back to the explanation dialogue.
13. The results of the gain in understanding can now be carried over to the continuation of the original explanation dialogue.
14. If the results of the gain in understanding from the examination dialogue are sufficient for a transfer of understanding of the kind required by the original explanation dialogue, the explanation is successful. The original explanation dialogue is now closed.
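The choice point after substage 4 and the two ways of closing the dialogue can be summarized as a small decision function. This is a sketch under stated assumptions rather than part of Explan: the names `Outcome` and `after_attempt` are hypothetical, and the case in which the examination succeeds but the gain in understanding is not sufficient is not settled by the substages above, so treating it as a continuation is an assumption.

```python
from enum import Enum


class Outcome(Enum):
    CONTINUE = "explanation dialogue continues"
    CLOSED_UNSUCCESSFUL = "explanation dialogue closed, explanation unsuccessful"
    CLOSED_SUCCESSFUL = "explanation dialogue closed, explanation successful"


def after_attempt(satisfied: bool,
                  exam_successful: bool = False,
                  gain_sufficient: bool = False) -> Outcome:
    """Follow the branches that open once an explanation has been attempted."""
    if not satisfied:
        # Explainee keeps questioning the explainer inside the explanation dialogue.
        return Outcome.CONTINUE
    # Explainee is satisfied: shift to an examination dialogue run by the explainer.
    if not exam_successful:
        return Outcome.CLOSED_UNSUCCESSFUL
    # Examination succeeded: shift back and carry the gain in understanding over.
    if gain_sufficient:
        return Outcome.CLOSED_SUCCESSFUL
    # Assumption: the source does not say what happens here; we let the dialogue go on.
    return Outcome.CONTINUE
```

For example, `after_attempt(True, exam_successful=True, gain_sufficient=True)` yields the successful closure, while `after_attempt(True)` yields the unsuccessful one.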
According to this way of plotting the path of the explanation dialogue to closure, both parties need to have passed dialectical tests. Both need to have contributed to the
examination well enough so that light is thrown on the understanding of both. The account of the explainer has to stand up to scrutiny, and the scrutiny undertaken by the explainee has to show that she understands how the account works, what its implicit elements are, and how they fit in with the parts explicitly stated. Also, both parties have to prove that they can critically evaluate the account by asking and responding to probing questions about what appear to be the weak points in it. Explan incorporates a modified version of Scriven’s test that takes both performances into account. The solutions to these three problems proposed in Sects. 4 and 5 make it possible to construct a dialogue system specification for explanation that meets the requirements set out above. As noted in Sect. 2, an explanation dialogue has three stages, an opening stage, an explanation stage and a closing stage.

6 The opening stage

For it to be clear that the two participants are starting an explanation dialogue, four requirements must be met. The first two are more general and the second two are more specific, requiring introduction of some other notions. The first requirement is that the two parties to the dialogue share understanding of some things, and especially that they share and accept some common knowledge about the way things normally work in some domain they are familiar with. The second requirement is that one party, called the explainer, is presumed to have understanding of something that the second party, called the explainee, lacks. In the example of the teacher explaining to students how coal is formed, it is presumed that the teacher has scientific understanding of this process and that the students do not. This example also shows that the situation is not so simple. For example, it is assumed that there is more than one student in the class. But to make as simple a structure as possible for the basic notions, the dialogue system specification assumes that there are only two participants in a dialogue representing the roles of explainer and explainee. We need to stress though that in real cases of explanations, what is represented as one party in the dialogue may in fact be a large group. The third requirement is that there has to be an account that both parties have access to. In the radiators example the explainer puts forward a connected account showing how placement of the radiator under the window in a room generally leads to a convection current that circulates the warm and cold air around the room, mixing it together and providing a moderate temperature throughout the room that makes it comfortable for the people in it. Normally in an explanation dialogue the account is given by the explainer to the explainee, but in any case, both of them have to have access to it.
An account is a set of statements in which there are inferences from some statements to others but it needs to be stressed that there can also be implicit statements drawn by inference from the explicit statements. In the radiators example, the explainer assumes that the explainee already knows that when warm and cold air are combined in an enclosed space, the warm air tends to rise and the cold air tends to fall. Just as in the first example, the person offering the explanation expects that the person to whom the explanation was directed already knows quite a bit about a kind
of situation familiar to both of them. In the dialogue system, accounts are based on scripts, MOPs or stories. The fourth requirement for the opening stage of an explanation dialogue is that the explainee has to detect an anomaly in the account, something that doesn’t fit in with the account. An anomaly is something the explainee does not understand in an account, even though she understands the rest of the account. For example, it may be an inconsistency, or a statement in the account that appears implausible. The explainee’s question in the radiators example presents an anomaly. If the windows are the greatest source of heat loss, then putting the radiators under the windows in a room would seem to be wasteful of energy. So why is it so commonly done? To grasp the anomaly, you have to be aware of the common knowledge that building practices generally avoid doing things that are wasteful of energy. It would be anomalous for rooms to be normally configured with radiators under the window if, as it appears, this leads to wasteful heat loss.

7 The explanation stage

The explanation stage is initiated by the explainee’s putting forward a special type of speech act. There can be various kinds of explanation questions that ask for different kinds of explanations. However, the system Explan is meant to be a simple and basic dialogue system specification on which specialized and more complex systems can be built, and so there is only one kind of explanation speech act in it. It has the form ‘ExplanAnom x’, where x is an anomaly in an account that has been given by the other party. The speech act ‘ExplanAnom x’ makes a request to the explainer to provide understanding concerning the anomaly x. The explanation dialogue is opened by the explainee’s putting forward the speech act ‘ExplanAnom A1’, where A1 is an instance of x. The dialogue proceeds to the second move when the explainer makes an attempt to explain the anomaly. At the third move, the explainee can accept the explanation or not. But other responses are also allowed. The explainee may still not understand what she needs to understand, and so she may have to ask further questions about aspects of the explanation that appear puzzling.

Moulin et al. (2002, pp. 174–176) showed that there are three kinds of explanations that are common in AI, trace explanations, strategic explanations and deep explanations. Let’s begin with trace explanations. In expert systems, the system produces an explanation in response to a user’s how or why questions by producing an execution trace, a sequence of inferences leading from statements in the knowledge base to the statement queried. Strategic explanations place an action in context by revealing the problem-solving strategy of the system used to perform a task. Deep explanations require two separate knowledge bases and a transfer from the system’s base to the user’s that fills in gaps in the user’s knowledge base. The system has to know what the user knows, to fill in the gaps.
It is this third type that best fits the dialogue model. A chain of inferences in an account is called a sequence of reasoning. Of the three kinds of explanations mentioned above, the simplest is the trace explanation, and we use this type as an illustration of an account here. In a trace explanation, a statement A that has been queried is traced by chaining backward in a knowledge base to the set
of facts (statements) and rules (of inference) in the knowledge base. A is derived by a chain of inferences from the facts, where the process is viewed as forward chaining. Looked at in reverse, such a chain of arguments is an explanation. This kind of explanation fits the covering law model, as long as the inferences in the chain are only of the deductive or inductive sort. But there are other kinds of explanation. In other cases, an account can take the form of a script, an account that has gaps in it because not all the connections in the account are stated explicitly. These gaps have to be filled in by making assumptions about common knowledge shared by a speaker and hearer.

A speech act is a type of move made by one or the other party as a dialogue proceeds. One speech act is the request by one party to the other party to offer an explanation of an anomaly. For each type of move, there are pre-condition rules that set the conditions under which a party is allowed to make that type of move, and post-condition rules that set the allowable replies to each type of move by the other party. Generally, the participants take turns as follows. The explainee makes the first move by asking for an explanation, and then the explainer gets a chance to respond by offering one. If the explainer offers one, the explainee can simply accept it by saying ‘I understand’, but if she replies by saying she does not understand, she can then proceed to ask questions about it. At this point, the dialogue shifts to a different type of dialogue as explained in the section on the closing stage below.

Speech acts allowed

Assertion: Putting forward a statement, A, B, C, …, is a permissible locution, and truth-functional compounds of statement-letters are also permissible locutions.1
Factual Question: The question ‘A?’ asks ‘Is it the case that A is true?’
Explanation Request: The speech act ‘ExplanAnom x’ makes a request to the explainer to provide understanding concerning some anomaly x.
Explanation Attempt: a response to a previous explanation request made by the explainee that purports to convey understanding to the explainee. Inability to Explain Response: ‘I can’t explain it’, concedes that the explainer has no explanation attempt at this point to offer of the statement asked about. Positive Response: A response claiming that the hearer understands an explanation. Negative Response: A response claiming that the hearer does not understand an explanation. Pre-condition rules Pre-condition Rule for an Explanation Request: In order the speech act ‘ExplanAnomx’ to be put forward, the statements fitting in for the x variable must constitute an anomaly. Pre-condition Rule for an Explanation Attempt: The previous move by the other party must be a request for an explanation. 1 Assertions include only statements (propositions), and do not include promises, commands, and so forth.
Pre-condition Rule for an Inability to Explain Response: The previous move by the other party must be a request for explanation.
Pre-condition for the Positive Response: The previous move by the other party must be an explanation attempt.
Pre-condition for the Negative Response: The previous move by the other party must be an explanation attempt.
Post-condition rules
Post-condition Rule for an Explanation Request: An explanation request must be followed at the next move by an explanation response.
Post-condition Rules for an Explanation Attempt: An explanation response must be followed at the next move by the other party’s saying ‘I understand it’ or ‘I don’t understand it’.
Post-condition for the ‘I understand it’ Response: to be determined below.
Post-condition for the ‘I don’t understand it’ Response: to be determined below.
The last two post-condition rules are not formulated yet, because of a problem that arises in formulating the rules for the closing stage. This problem is solved in Sect. 8 once the rules for the closing stage have been formulated.
8 The closing stage
It appears that there can be two different ways of determining when the closing stage has been arrived at. On one view, the closing stage is reached when the explainer has offered an explanation and the explainee is satisfied with it. The dialogue system for explanation CE (Walton 2007a) was built on the following two rules for the success of an explanation attempt.
CESR1. If after any explanation attempt is made, the explainee replies by saying ‘I understand’, the explainer’s explanation attempt is judged to be successful.
CESR2. If after any explanation attempt is made, the explainee replies by saying ‘I don’t understand’, the explainer’s explanation attempt is judged to be unsuccessful.
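The speech acts, the pre-condition and post-condition rules, and the two CE success rules stated above can be sketched as a small legality check. This is only an illustrative reading, not part of the Explan specification: the names `Move`, `is_legal` and `cesr_verdict` are hypothetical, the anomaly test is reduced to a boolean flag, and "explanation response" is assumed to cover both an explanation attempt and an inability-to-explain response. The two post-conditions left "to be determined" are deliberately not encoded.

```python
from enum import Enum, auto


class Move(Enum):
    """The speech acts allowed (identifier names are illustrative)."""
    ASSERTION = auto()
    FACTUAL_QUESTION = auto()
    EXPLANATION_REQUEST = auto()
    EXPLANATION_ATTEMPT = auto()
    INABILITY_TO_EXPLAIN = auto()
    POSITIVE_RESPONSE = auto()   # 'I understand'
    NEGATIVE_RESPONSE = auto()   # 'I don't understand'


# Pre-condition rules: the previous move by the other party that licenses a move.
REQUIRED_PREVIOUS = {
    Move.EXPLANATION_ATTEMPT: Move.EXPLANATION_REQUEST,
    Move.INABILITY_TO_EXPLAIN: Move.EXPLANATION_REQUEST,
    Move.POSITIVE_RESPONSE: Move.EXPLANATION_ATTEMPT,
    Move.NEGATIVE_RESPONSE: Move.EXPLANATION_ATTEMPT,
}

# Post-condition rules: the replies allowed at the next move.
# Assumption: an "explanation response" is an attempt or an inability response.
ALLOWED_REPLIES = {
    Move.EXPLANATION_REQUEST: {Move.EXPLANATION_ATTEMPT, Move.INABILITY_TO_EXPLAIN},
    Move.EXPLANATION_ATTEMPT: {Move.POSITIVE_RESPONSE, Move.NEGATIVE_RESPONSE},
}


def is_legal(move, previous=None, is_anomaly=False):
    """Check a move against the pre-condition and post-condition rules."""
    if move is Move.EXPLANATION_REQUEST:
        pre_ok = is_anomaly  # the statements asked about must constitute an anomaly
    elif move in REQUIRED_PREVIOUS:
        pre_ok = previous is REQUIRED_PREVIOUS[move]
    else:
        pre_ok = True  # no pre-condition is stated for the remaining speech acts
    post_ok = previous not in ALLOWED_REPLIES or move in ALLOWED_REPLIES[previous]
    return pre_ok and post_ok


def cesr_verdict(reply):
    """CESR1 and CESR2: judge an explanation attempt by the explainee's reply."""
    if reply is Move.POSITIVE_RESPONSE:
        return "successful"
    if reply is Move.NEGATIVE_RESPONSE:
        return "unsuccessful"
    return None  # the success rules are silent on any other reply
```

On this reading, `is_legal(Move.POSITIVE_RESPONSE, previous=Move.EXPLANATION_REQUEST)` is rejected, because a positive response presupposes an explanation attempt.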
These success rules are used in CE to define the closing stage of an explanation dialogue, based on the assumption that the closing stage is reached once the explanation attempt carried out in the dialogue is judged to be successful or unsuccessful. The problem with this way of setting up rules for the closing stage is that in many of the most significant cases, determining success or failure on the basis of whether the explainee says she understands the explanation offered is not enough to close the dialogue. The “feels-right” explanation is often associated with bias (Trout 2002, pp. 223–228). On another view, the dialogue should only be closed when the explanation has been tested, and has been found to have passed the tests that should be required of it. Only then can it be said whether the explanation is truly successful or not. But what is the test? In science, ideally, the test is to collect all the data required to conclusively test the explanation experimentally. But for practical purposes, with many
of the explanations we give in everyday conversations that are good enough for what is required, resources are not available for collecting more data necessary for a satisfactory tentative explanation to be offered. In many instances, for practical purposes, collecting more data to test the explanation further would be too costly, or would just not be useful because of the limitations of the present needs and circumstances. On this view, testing the explanation by critically probing into gaps and questionable parts in it, based on what is already known, would be good enough to provisionally accept it. Examination dialogue can fit either of these methods of testing. It can proceed by critical questioning in argumentation or by the collection and examination of further data, for example by experimental testing. The context concerning what the purpose of the explanation is supposed to be plays a role in deciding which of the two views is applicable. If the context is that of a scientific inquiry, further testing by collecting of data may be the best criterion for closure. If the explanation is part of an everyday conversational exchange, conducting experimental tests or launching into a detailed scientific explanation might not be appropriate. These moves may even impede the transfer of understanding. Thus we should not take a ‘one shoe fits all’ approach to this problem. In order to keep to the most general approach of building a simple system as a starting point for developing other more complex models of explanation dialogue, we have proposed a middle view between the two views outlined above. This view is tailored to seeing explanation as based on defeasible reasoning that leads to a plausible explanation based on the known facts, but is open to correction or improvement as more data is brought in to fill out an account or support it by external evidence. 
On this view, an explanation is successful if it is tested by the explainee’s critical questioning that probes into its weak spots, or by examining further data, and if it survives this testing process by answering all the questions satisfactorily. An explanation is unsuccessful if it fails this testing process. The closure rules are meant to solve the problem of the failure cycle illustrated by the possibility of the feedback cycle {3, 4, 5, 10, 11, 3}, illustrated in the typical explanation sequence in Fig. 1, and the problem of the unsuccessful explanation in the example of the science teaching dialogue presented in Sect. 4. For an explanation dialogue to be successful, understanding has to be transferred from the explainer to the explainee. What is the evidence that this transfer has been achieved? It is to be found in the shift to an examination dialogue. However, as noted above, the need to test an explanation, and the extent to which it needs to be tested, vary with the context. In a science class, the anomaly may be posed by a simple misunderstanding that can be explained briefly, and that everyone is satisfied with. In a context of scientific research, the anomaly may be a wicked problem and the explanation of it may be lengthy, complex, and involve experimental testing. Thus the closure rules must allow for such pragmatic variations. The closure rules need to fit the 14-step sequence that leads to closure set out in Sect. 5. If both parties are satisfied with the explanation offered, that can be the end of the dialogue. There may be no need to go into more depth. However, if either party is not satisfied, he or she can ask more questions, extending the dialogue. If the explainee is not satisfied, she can ask more questions, and may need to (step 5). The explanation should proceed in this direction, ideally until the explanation finally makes sense to
the explainee. This may never happen, so in practice some limit will need to be set on the time or cost. The explanation is only successful, however, if the anomaly is removed and the explainee understands what she asked about. If the explainer is not satisfied that the explainee really understands, then as shown at step 7, there may need to be a shift to an examination dialogue. This way of handling explanation attempts suggests the following closure rules.
Closure Rule 1: If both parties are satisfied, the dialogue can be closed.
Closure Rule 2: If the explainee is not satisfied, she should ask further questions, continuing the dialogue until it has reached a point where either (a) she is satisfied or (b) her questioning must be closed off for practical reasons.
Closure Rule 3: If the explainer is not satisfied, there should be a shift to an examination dialogue in which the explainee’s understanding of the explanation is tested.
Closure Rule 4: The examination dialogue terminates when either (a) the explainer is satisfied or (b) his questioning must be closed off for practical reasons.
Closure Rule 5: When the examination dialogue ends, there is a shift back to the continuation of the original explanation dialogue.
Closure Rule 6: The explanation dialogue terminates when either (a) there has been a transfer of understanding of the kind required or (b) it must be closed off for practical reasons.
These rules are meant to be realistic, in that they allow for the possibility that the dialogue may need to be terminated even though it is not known whether it has been successful or not. The examination dialogue tests the understanding of both parties, and ideally expands it, so that these gains can be transferred into a better explanation. One might still want to object that these closure rules might work well enough for everyday conversational explanations and practical explanations, where the issue is not how deeply either party understands.
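Closure Rules 1 to 6 can be read as a small state machine over two phases, the explanation dialogue and the embedded examination dialogue. The sketch below is one possible reading under stated assumptions, not a definitive formalization: the function name `next_phase`, the phase labels, and the single `must_stop` flag standing for "closed off for practical reasons" are all hypothetical, and letting the explainee's questions (Rule 2) take priority over the shift to examination (Rule 3) when both parties are dissatisfied is an assumption the rules themselves do not settle.

```python
def next_phase(phase, explainee_satisfied, explainer_satisfied, must_stop):
    """Return the next phase: 'explanation', 'examination', or 'closed'."""
    if phase == "examination":
        # Rule 4: the examination ends when the explainer is satisfied or
        # questioning is cut off; Rule 5: control then returns to the
        # original explanation dialogue.
        if explainer_satisfied or must_stop:
            return "explanation"
        return "examination"
    if phase != "explanation":
        raise ValueError(f"unknown phase: {phase!r}")
    if must_stop:
        return "closed"       # Rule 6(b): closed off for practical reasons
    if explainee_satisfied and explainer_satisfied:
        return "closed"       # Rule 1 and Rule 6(a)
    if not explainee_satisfied:
        return "explanation"  # Rule 2: the explainee keeps asking questions
    return "examination"      # Rule 3: the explainer tests the explainee
```

Because the examination phase always hands control back to the explanation phase, the dialogue can only close from the explanation phase, which matches the order of Closure Rules 5 and 6.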
A good enough explanation to do the job, or to move the conversation forward, is all that may be required. However, one might object that in cases of scientific explanations, not in a teaching setting necessarily, but in the context of scientific research and investigation, objective standards are necessary. Whether each party is individually satisfied is not a high enough standard. A way to respond to this objection is to introduce a third party into the dialogue. According to the dialectical model proposed by Pera (1994, p. 133), the structure of scientific argumentation is a dialogue structure with three participants, an inquiring community C1, nature N, and another community C2. C1 is an inquirer who asks a question, poses a problem, or puts forward a hypothesis h and tries to support it with observations or experimental results. N provides data e. Then a discussion takes place between C1 and C2 in a framework F that Pera calls “the factors of scientific dialectics”. F can perhaps be seen as a set of dialogue rules appropriate for the discussion. These rules define what counts as evidence, what sorts of argument are allowed as relevant, and what the standards of proof are. Because of the importance of debate in this model, and because of the role of “dialectical techniques of confutation and persuasion” in it, Pera (1994, p. 133) says he will call it the dialectical model. Pera’s dialectical model provides an elegant way of extending the basic two-party dialogue structure of the Explan system specification to a three-party dialogue structure that could be used to model scientific explanation. However, the reader will recall
that it is beyond the scope of this paper to use the Explan system to build more complex dialectical systems that can be used to model more specific contexts of explanation like scientific explanation. This task must be left as a problem for future research.
9 Problems for further research
The following problems for further research are singled out as the most important.
#1 How can Explan help us to determine whether something in a text is an argument or an explanation?
#2 How can we build a useful typology of types of explanations for use in Explan?
#3 How well does Explan apply to explanation of human actions, for example in history and law?
#4 Can Explan model understanding in science, and apply it to case studies of scientific explanations?
With respect to problem #1, it can be said, broadly speaking, that the goal of an argument is to remove doubt, whereas the goal of an explanation is to convey understanding of an anomaly in a given account. But how do we determine whether the purpose of some discourse fits one or the other of these goals? We have to examine the text of the case carefully for textual indicators of the kind studied by Snoeck Henkemans (1992). However, the key to doing this lies in the pre and post-conditions for the speech acts. An argument is not only put forward in a different way from an explanation, but is reacted to in a different way as well. How can the Explan system be helpful for this job, when it is carried out in a way comparable to the work on identifying arguments in texts? The aim of this investigation was not to provide a typology of different types of explanations. There was no space here for this project, even though it is a prerequisite for building formal dialogue systems based on different kinds of explanation questions, like how questions, why questions, questions asking about human actions, and so forth.
With respect to problem #2, it needs to be said that there are typologies of explanation questions, but there is little agreement among them, and none of them seems especially useful for developing the Explan system in this direction. Perhaps the reason for the heterogeneous variety is that they come from different fields, like logic, computing, linguistics and psychology, and they seem to have different purposes in mind for using explanations. It can be suggested, however, that a good place to start is the categorization scheme for types of explanations given by Kass and Leake (1987), based on their large collection of examples of anomalies and explanations. The classification of different types of explanations given in the categorization scheme of Kass and Leake (1987, pp. 3–4) provides a hierarchy of types of explanations divided at the top level into three types of explanations.
• Explanations involving intentional actions, for example an explanation of a person’s decision to drop out of school. Such explanations involve plans and goals.
• Explanations involving material forces, for example, an explanation of an unexpected snow storm caused by material forces. This type of explanation also includes cases like device problems and the lack of a resource necessary for an event to take place.
• Explanations involving social forces, for example an explanation of an increase in the crime rate. This type of explanation does not involve plans and goals, and excludes explanations of goal-directed actions by institutions. It involves behavior that results from the interactions of many independent agents whose actions are not coordinated.
Kass and Leake (1987, p. 3) note, however, that in some cases more than one type of explanation may be applicable. For example, if we are trying to explain why the government wastes money, we might offer an intentional explanation, like “they think they can solve every problem by throwing money at it”, or we might offer a social explanation, such as “the interaction of branches of government causes huge overhead”. This categorization scheme, along with the many examples of everyday explanations collected by Kass and Leake, is a good place to begin the study of different types of explanations. The category of intentional actions brings us to problem #3. There is a huge literature on problem #3, both in computing, especially in the field of planning, and in philosophy, especially philosophy of history. Collingwood (1946) called the simulative process used by the historian “re-enactment” (Dray 1995). Dray (1964, pp. 11–12) described the components of Collingwood’s theory of re-enactment in these words: “Clearly the kinds of thoughts which Collingwood’s theory requires are those which could enter the practical deliberations of an agent trying to decide what his line of actions should be”. There are some nice resources in argumentation and computing that arise from the argumentation scheme for practical reasoning (Atkinson et al. 2006). Explanation of human actions, of the kind especially common in history and law, is typically based on goal-directed reasoning. One agent explains the actions of the other by attributing presumed goals to the other. Goal-directed or means-end reasoning, called practical reasoning, is used in planning in AI (Bratman et al. 1988). Value-based argumentation frameworks employ schemes for practical reasoning in a dialogue framework (Bench-Capon 2003). Pera’s dialectical model of science, as noted at the end of Sect.
8, provides an elegant way of extending Explan to confront problem #4, but any attempt to move in the direction of applying Explan to scientific explanations also takes us to the problem of precisely defining the notion of scientific understanding. In a case of scientific explanation, say explaining friction as a macro-phenomenon by talking about the micro-properties of surfaces, “it is clear that we are now constrained to explanations using the primitives and laws of physics” (Scriven 2002, p. 50). It is the phenomena of everyday experience that need to be understood in a special way, and it is the laws and primitives of physics that are taken to be understood. Hence, as Scriven points out, scientific explanation is not reduction to the familiar, but transfer of a special kind of understanding required by a special kind of explanation. There is a growing literature on helping us to better understand the special notion of understanding in the natural sciences (Friedman 1974; Trout 2002; Moulin et al. 2002). Finocchiaro (1980) has used case studies of scientific discovery to show how scientific explanation can be viewed as a dialectical process of growth of understanding
as questions are asked and hypotheses are offered as answers that require experimental testing. The Explan system offers a syntactic structure for explanation dialogue by specifying the form each move must take at each of the three stages of such a dialogue, and by giving pre and post-conditions for each move in such a dialogue, but it does not yet define a precise semantics for the system. A semantic structure is also needed that specifies the units of understanding, and how they are sent as messages in the dialogue from the one party to the other. So far this structure has not yet been provided, in any precise way. The best we have been able to do so far is to use existing resources of case-based reasoning to model in a general way how understanding is successfully transferred. This process is successfully carried out when an anomaly in an existing script is queried by one party and then resolved by the other party by patching up the existing script to fit it all together better so that it now makes sense to the questioner. The outcome should be a change from a script that was previously fragmented (in the understanding of the questioner) to a script that is fitted back together. It can be noted that legal explanations also have a three-party dialogue structure consisting of the pro side, the contra side, and a third party trier, a judge or a jury. The third party listens to the arguments put forward and queried by the other two parties and weighs them as weaker or stronger. In audience-specific value-based models of persuasion dialogue (Bench-Capon et al. 2007), the audience is identified with an ordering of values. A given argument is assessed by the audience in accordance with its preferred values. In Gordon and Walton (2009), the audience weighs the relative strength of arguments presented to them, and an argument evaluation structure associates an audience with a stage of dialogue and assigns proof standards to propositions. 
When Explan is extended by adding a third party audience, this audience uses standards for the success of an explanation to judge whether the given explanation is more satisfactory or less satisfactory. Precisely how legal explanations can best be modeled in three-party dialogues along these lines, however, remains a problem for further research.
10 Conclusions
This paper has defined the components needed for a dialectical system specification of explanation discourse called Explan, and showed how to combine these components to produce the system specification. It offers a dialogue structure with three stages, an opening stage, an explanation stage and a closing stage. One problem encountered was that of the failure cycle that can occur in the closing stage, and this problem was solved by carefully specifying the rules for the closing stage. Another problem was to devise a means for testing the success of an explanation. This problem was solved by embedding an examination dialogue into the explanation dialogue. In legal explanations in a courtroom setting, there are rules for examinations and cross-examinations. In scientific explanations, the process of examination involves close scrutiny of the data provided by nature, and the designing and running of experiments to test a hypothesis. By solving these problems, the system specification builds a process model of explanation in which two parties take turns making moves according to procedural rules. The rules set out a normative model for explanation so that any example of a
real explanation can be evaluated as reasonable or not according to standards set by the stages, the rules, and the goal of the dialogue. For example, circular explanations can be evaluated as unsuccessful on the basis that they fail to transfer understanding in the way required of a successful explanation in the Explan system. There has not been enough space to test the Explan system with many examples of explanations, but resources for such a testing process have been given by Kass and Leake (1987). They built up a corpus, the Yale explanation corpus of 170 anomalies, with one or more explanations for each, yielding a total of over 350 explanations. The model is a system specification that can be used to build specific dialectical systems meant to be applicable to realistic cases of explanations of different kinds. The intent is to produce a dialogue system specification that is very general so that it can accommodate many different formal models of explanation dialogue that fit the general pattern of the system, and many different dialectical contexts of use, like everyday conversational explanations, scientific explanations, explanations in special scientific fields like computing, historical explanations, legal explanations, and so forth. Because of the extreme generality of such a project, and because of its wide breadth of application to so many different kinds of explanations in different contexts and fields, many unsolved problems have had to be left for future research. Still, the Explan system specification provides a way of moving forward in a direction that is different from the traditional one but that has been attracting more and more interest in the past few years.
Acknowledgements This work was supported by a Research Grant, ‘Argumentation in AI and Law’, from the Social Sciences and Humanities Research Council of Canada.
I would like to thank the members of CRRAR for comments made when I read an earlier draft of this paper to them, especially Steven Patterson and Jin Rongdong, whose comments led to some especially significant improvements.
References
Aristotle. (1928). On sophistical refutations. Loeb classical library. Cambridge, MA: Harvard University Press.
Atkinson, K., Bench-Capon, T. J. M., & McBurney, P. (2006). Computational representation of practical argument. Synthese, 152, 157–206.
Bench-Capon, T. J. M. (2003). Persuasion in practical argument using value-based argumentation frameworks. Journal of Logic and Computation, 13, 429–448.
Bench-Capon, T. J. M., Doutre, S., & Dunne, P. E. (2007). Audiences in argumentation frameworks. Artificial Intelligence, 171(1), 42–71.
Bench-Capon, T. J. M., Doutre, S., & Dunne, P. E. (2008). Asking the right question: Forcing commitment in examination dialogues. In P. Besnard, S. Doutre, & A. Hunter (Eds.), Computational models of argument: Proceedings of COMMA 2008 (pp. 49–60). Amsterdam: IOS Press.
Bratman, M., Israel, D., & Pollack, M. (1988). Plans and resource-bounded practical reasoning. Computational Intelligence, 4(3), 349–355.
Cawsey, A. (1992). Explanation and interaction: The computer generation of explanatory dialogues. Cambridge, MA: MIT Press.
Collingwood, R. G. (1946). The idea of history. Oxford: Clarendon Press.
Dray, W. (1964). Philosophy of history. Englewood Cliffs: Prentice-Hall.
Dray, W. (1995). History as re-enactment: R. G. Collingwood’s idea of history. Oxford: Oxford University Press.
Dunne, P. E., Doutre, S., & Bench-Capon, T. J. M. (2005). Discovering inconsistency through examination dialogues. Proceedings IJCAI-05 (pp. 1560–1561). Edinburgh.
Finocchiaro, M. (1980). Scientific discoveries as growth of understanding: The case of Newton’s gravitation. In T. Nickles (Ed.), Scientific discovery, logic, and rationality (pp. 235–255). Dordrecht: Reidel.
Friedman, M. (1974). Explanation and scientific understanding. The Journal of Philosophy, LXXI, 5–19.
Gordon, T. F., & Walton, D. (2009). Proof burdens and standards. In I. Rahwan & G. Simari (Eds.), Argumentation and artificial intelligence (pp. 239–260). Berlin: Springer.
Guthrie, W. K. C. (1981). A history of Greek philosophy. Cambridge: Cambridge University Press.
Kass, A., & Leake, D. (1987). Types of explanations. Technical Report ADA183253. Alexandria, VA: U.S. Department of Commerce.
Leake, D. B. (1992). Evaluating explanations: A content theory. Hillsdale: Erlbaum.
Moore, J. D. (1995). Participating in explanatory dialogues. Cambridge, MA: MIT Press.
Moulin, B., Irandoust, H., Belanger, M., & Desbordes, G. (2002). Explanation and argumentation capabilities. Artificial Intelligence Review, 17, 169–222.
Parsons, S., & Jennings, N. R. (1997). Negotiation through argumentation: A preliminary report. In M. Tokoro (Ed.), Proceedings of the second international conference on multi-agents systems (pp. 267–274). Menlo Park, CA: AAAI Press.
Pera, M. (1994). The discoveries of science. Chicago: The University of Chicago Press.
Prakken, H. (2005). Coherence and flexibility in dialogue games for argumentation. Journal of Logic and Computation, 15, 1009–1040.
Prakken, H. (2006). Formal systems for persuasion dialogue. The Knowledge Engineering Review, 21, 163–188.
Reed, C. (2006). Representing dialogic argumentation. Knowledge-Based Systems, 19(1), 22–31.
Schank, R. C. (1986). Explanation patterns: Understanding mechanically and creatively. Hillsdale, NJ: Erlbaum.
Schank, R. C., & Abelson, R. P. (1977). Scripts, plans, goals and understanding. Hillsdale, NJ: Erlbaum.
Schank, R. C., Kass, A., & Riesbeck, C. K. (1994). Inside case-based explanation. Hillsdale, NJ: Erlbaum.
Schank, R. C., & Riesbeck, C. K. (1981). Inside computer understanding. Hillsdale, NJ: Erlbaum.
Schlangen, D. (2004). Causes and strategies for requesting clarification in dialogue. In M. Strube & C. Sidner (Eds.), Proceedings of the 5th SIGdial workshop on discourse and dialogue (pp. 136–143). East Stroudsburg, PA: Association for Computational Linguistics. http://acl.ldc.upenn.edu/hlt-naacl2004/sigdial04/pdf/schlangen.pdf.
Scriven, M. (1972). The concept of comprehension: From semantics to software. In J. B. Carroll & R. O. Freedle (Eds.), Language comprehension and the acquisition of knowledge (pp. 31–39). Washington: W. H. Winston & Sons.
Scriven, M. (2002). The limits of explication. Argumentation, 16, 47–57.
Singh, M. P. (1999). A semantics for speech acts. Annals of Mathematics and Artificial Intelligence, 8, 47–71.
Snoeck Henkemans, F. (1992). Analyzing complex argumentation: The reconstruction of multiple and coordinatively compound argumentation in a critical discussion. Amsterdam: SICSAT.
Trout, J. D. (2002). Scientific explanation and the sense of understanding. Philosophy of Science, 69(2), 212–233.
Unsworth, L. (2001). Evaluating the language of different types of explanations in junior high school texts. International Journal of Science Education, 23, 585–609.
Verheij, B. (2003). Dialectical argumentation with argumentation schemes: an approach to legal logic. Artificial Intelligence and Law, 11, 167–195.
von Wright, G. H. (1971). Explanation and understanding. Ithaca, NY: Cornell University Press.
Wagenaar, W. A., van Koppen, P. J., & Crombag, H. F. M. (1993). Anchored narratives: The psychology of criminal evidence. Hertfordshire: Harvester Wheatsheaf.
Walton, D. (2003). The interrogation as a type of dialogue. Journal of Pragmatics, 35, 1771–1802.
Walton, D. (2006). Examination dialogue: An argumentation framework for critically questioning an expert opinion. Journal of Pragmatics, 38, 745–777.
Walton, D. (2007a). Dialogical models of explanation. Explanation-aware computing: Papers from the 2007 AAAI workshop. Technical Report WS-07-06 (pp. 1–9). Menlo Park, CA: AAAI Press.
Walton, D. (2007b). Clarification dialogue. Studies in Communication Sciences, 7, 165–197.
Walton, D., & Krabbe, E. C. W. (1995). Commitment in dialogue. Albany: State University of New York Press.
Synthese (2011) 182:375–391 DOI 10.1007/s11229-010-9749-8
Disproportional mental causation Justin T. Tiehen
Received: 27 January 2010 / Accepted: 27 April 2010 / Published online: 21 May 2010 © Springer Science+Business Media B.V. 2010
Abstract In this paper I do three things. First, I argue that Stephen Yablo’s influential account of mental causation is susceptible to counterexamples involving what I call disproportional mental causation. Second, I argue that similar counterexamples can be generated for any alternative account of mental causation that is like Yablo’s in that it takes mental states and their physical realizers to causally compete. Third, I show that there are alternative nonreductive approaches to mental causation which reject the idea of causal competition, and which thus are able to allow for disproportional mental causation. This, I argue, is a significant advantage for such noncompetitive accounts.
Keywords Mental causation · Nonreductive physicalism · Proportionality · Yablo
Gilmore is in agony. His toothache keeps getting worse, and now he has broken down and started to cry. If nonreductive physicalism is true there must be some physical realizer of Gilmore’s pain; call the type of realizer ‘P1’. Then, letting M be the proposition that Gilmore is in pain, P1 that he is in a P1 state, and E (for effect) that he cries, consider the following counterfactual.
[Cry]: (M & ∼P1) > E.
That is, if Gilmore had been in pain but not P1, he still would have cried. Here is one way this could be true. Imagine that pain is multiply realized, with P1 and P101 being its two nomologically possible physical realizers. Imagine next that there is a physical law that P101 states causally necessitate crying. Then, putting things in terms of the standard Stalnaker–Lewis analysis of counterfactuals, the closest world where Gilmore is in pain but not P1 will be a world where his pain is realized by P101.
J. T. Tiehen (B) University of Puget Sound, Tacoma, WA, USA e-mail:
[email protected]
Assuming that the physical law just mentioned obtains in this closest world, Gilmore will cry there. Thus, [Cry] is true. A number of philosophers think that counterfactuals like [Cry] are crucially important in accounting for mental causation, and more specifically in solving the causal exclusion problem facing nonreductive physicalists.1 Especially influential on this front has been Stephen Yablo, who in a series of well known papers has defended an account of mental causation based on such counterfactuals.2 In this paper I argue that such an approach is misguided. In Sects. – , I argue that accounts of mental causation that appeal to counterfactuals like [Cry] are susceptible to counterexamples of a certain sort: counterexamples involving what I call disproportional mental causation. In Sect. , I argue that the core problem with an account like Yablo’s is not its appeal to counterfactuals per se, but rather its guiding idea that mental and physical states causally compete. Any nonreductive view that posits such causal competition will be susceptible to counterexamples involving disproportional mental causation. Finally, in Sect. , I show that there are alternative nonreductive approaches to mental causation that do not posit causal competition and which thus are able to account for disproportional mental causation. This point is developed into a novel argument in favor of such non-competitive approaches.
I
We begin by reviewing Yablo’s account of mental causation, which comes in two parts.
First part: Yablo claims that the realization relation that obtains between mental and physical states is either identical or at least very similar to the determination relation that obtains between determinables and determinates.3 On this view, the sense in which scarlet and crimson are different ways for an object to be red is either identical to or at least very much like the sense in which being in P1 and being in P101 are different ways for the subject in our example to be in pain. If, as Yablo further contends, determinables and their determinates do not compete for causal influence—which Yablo understands as “encompassing everything from causal relevance to causal sufficiency”4 —then this initial move of construing realization in terms of determination promises to go some way by itself toward dissolving the exclusion problem.5 For the sake of my argument I am willing to grant Yablo this component of his view. I will assume for now that realization just is determination. The second part of the view is Yablo’s claim that causes generally are proportional to their effects, meaning, inter alia, they do not incorporate detail irrelevant to those effects. This idea is captured with the following principle. 1 Kim (1998) provides the classic presentation of the problem. 2 See Yablo (1992, 1997, 2003). Other authors who appeal to counterfactuals relevantly like [Cry] include
LePore and Loewer (1987), Loewer (2001), Mills (1996), Bennett (2003) and Bealer (2007). 3 In his (1992) he claims that realization is determination, while in his (1997) Yablo weakens it to the claim
of strong similarity. 4 Yablo (1992, p. 274). 5 Although see Gillett and Rives (2005), who raise exclusion worries for determinables.
[PP]: A state D incorporates detail that is irrelevant with respect to an effect E, and so does not cause E, if there is some state C such that C is a determinable of D and the following counterfactual is true: (C&∼D) > E. To see [PP] in action, consider Sophie the pigeon who pecks whenever she sees red.6 When presented with a scarlet triangle, Sophie pecks. Is the triangle’s being scarlet properly regarded as causing Sophie’s pecking? No it is not, says [PP]. For red is a determinable of scarlet, and had the triangle been red without being scarlet—for instance, had it been crimson—Sophie still would have pecked. The intuition you are invited to have here is that it is the triangle’s being red and not its being scarlet which causes the pecking.7 The Yablo view, then, is that while determinables and determinates do not compete over causal influence—again, understood as encompassing causal relevance and sufficiency—they do compete when it comes to causation itself. In the present case, the triangle’s being red wins the causal competition; its being scarlet loses. The same dynamic is alleged to arise in the mental/physical case. Reconsidering Gilmore’s pain, and granting again that realization is determination, [PP] entails that if [Cry] is true then Gilmore’s P1 state incorporates detail irrelevant to his crying, and so is not properly regarded as causing his crying.8 This makes room for the possibility that what causes Gilmore’s crying is his pain, just as in the Sophie case what causes the pecking is the triangle’s being red rather than its being scarlet. Again, the intuition you are invited to have here is that the truth of [Cry] shows that the crucial thing for causing Gilmore’s crying is the pain itself, not the particular way it happens to be realized. When a counterfactual like [Cry] is false, on the other hand, the account entails that it is the underlying physical realizer which causes the effect in question, not the mental state being realized.
Suppose there is a P1-detector pointed at Gilmore while he undergoes his pain, and let B be the (true) proposition that the detector beeps, registering the presence of a P1 state. Then the following counterfactual is false: [Beep]: (M&∼P1) > B. Had Gilmore’s pain been realized by anything other than a P1 state—for instance, had it been realized by a P101 state—the P1-detector would not have beeped. On Yablo’s account the falsity of [Beep] shows that it is the P1 state and not the pain which causes the beeping. In this case at least, the amount of detail the P1 state incorporates is just right.9 Let’s put the two parts of the view together. Again, Yablo claims that realization just is the determinable/determinate relation, and that this entails mental and physical states do not compete over causal influence, understood as encompassing causal
6 Yablo (1992, p. 257).
7 On Yablo’s view the triangle’s being scarlet is causally sufficient for Sophie’s pecking, but does not cause the pecking. There are serious questions about how exactly Yablo understands the relation between causation and causal sufficiency, but I will not press them here.
8 The P1 state is still causally sufficient for the crying. See footnote 7.
9 Compare Yablo’s (1992, pp. 277–278) discussion of the epiphenomenalist neuroscientists.
sufficiency and relevance. Supposing this is right, then even if we grant that for any physical effect there is always some purely physical state causally sufficient for that effect—as we must, given the physicalist thesis that the physical realm is causally closed—this by itself does not exclude us from holding that irreducible mental states are still causally relevant to those physical effects. However, merely to say this is not excluded is not yet to make a positive case for mental causation. And it is here, in making the positive case, that causal competition enters the picture, for on Yablo’s view mental states do indeed compete with their physical realizers when it comes to causation itself. No worries though. Mental epiphenomenalism is avoided because mental states often win these competitions. The way to assess which side wins a causal competition is by appealing to proportionality considerations, which involves evaluating counterfactuals like [Cry] and [Beep]. When such counterfactuals are true, as [Cry] is, the winner of the competition is the mental state: Gilmore’s pain, and not his P1 state, causes his crying. When such counterfactuals are false, as [Beep] is, the winner is the physical realizer: Gilmore’s P1 state, and not his pain, causes the detector to beep. II Sometimes critics of this general sort of counterfactual-based approach complain that the counterfactuals in question could be true even while mental states were causally inert. In our example these critics would argue that even if [Cry] is true this does not establish that Gilmore’s pain causes him to cry. Instead, what the truth of [Cry] might reflect is just that if Gilmore’s pain had been physically realized in some other way, that alternative physical realizer would have caused him to cry, causally excluding his pain. 
In short, these critics charge that the approach fails to come to grips with the true depth of the exclusion problem.10 My objection runs in roughly the opposite direction. What I will argue is that there are cases in which the relevant counterfactual is false and yet we nevertheless have compelling reason to say that there is mental causation. In a sense, my charge is that Yablo takes the exclusion problem too seriously; he concedes too much to those opponents of nonreductive physicalism who push the problem. Getting well ahead of myself, I believe that Yablo’s mistake is to concede that there is at least some domain in which causal competition takes place between mental states and their physical realizers. Nonreductive physicalists should insist there is no form of causal competition at all. For the purpose of constructing a counterexample to Yablo’s account, let’s suppose that the laws of nature are such that there are exactly four nomologically possible physical realizers of pain: P1 , P2 , P101 , and P102 . The subscripted numerals here are meant to track physical similarity: P1 and P2 are very similar to one another, while each is quite different from P101 and P102 . To flesh out the story we can imagine that P1 and P2 are physical realizers found only in human beings, while P101 and P102 are found only in Martians. Next suppose that while P1 , P101 , and P102 states all 10 See for instance Leiter and Miller (1994), who object to LePore and Loewer (1987) along these lines.
causally necessitate crying, P2 states do not. In fact, we can even add, it is nomologically impossible for a being in a P2 state to cry. Again to flesh out the story, we can imagine that people with P2 realizers behaviorally manifest their pains in various ways. They wince, they scream, they gnash their teeth, and so on. But never do they cry.11 Given this setup, [Cry] is false. Gilmore is a human being, and so if he had been in pain but not P1 his pain would have been realized by a P2 state.12 It is nomologically impossible for subjects in P2 to cry though. Therefore, if Gilmore had been in pain but not P1 he would not have cried. Even though [Cry] is false, however, we still have compelling reasons to say that Gilmore’s pain causes his crying in the scenario set out. Sects. – are devoted to spelling out these reasons in detail, but as a first pass the guiding idea can be put as follows. Gilmore’s actual pain is realized by P1 , not by P2 . But then, I say, the causal status of his pain should not turn on how things go with P2 . It should turn only on how things go with P1 -realized pains. Since P1 causally necessitates crying, we should conclude that P1 -realized pains, like Gilmore’s, cause crying. If I can defend this view successfully, I will have established a counterexample to Yablo’s account: Gilmore’s pain causes his crying even though [Cry] is false, and thus even though it is not proportional to his crying.
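The evaluation of [Cry] in this four-realizer scenario can be sketched as a toy closest-world computation. This is only an illustrative model under stated assumptions, not part of the argument itself: each world is identified with the realizer of Gilmore's pain, closeness to actuality is measured by the gap between subscripts (the text says only that the subscripts track physical similarity), and the names `SUBSCRIPT`, `NECESSITATES_CRYING`, `closest_alternatives`, `cry_is_true`, and `proportionality_winner` are hypothetical.

```python
# Toy closest-world evaluation of [Cry]: (M & ~P1) > E.
# Assumptions (not from the text): a world is identified with the realizer of
# Gilmore's pain, and distance from actuality is the gap between subscripts.

SUBSCRIPT = {"P1": 1, "P2": 2, "P101": 101, "P102": 102}
# Stipulated laws of the scenario: which realizers causally necessitate crying.
NECESSITATES_CRYING = {"P1": True, "P2": False, "P101": True, "P102": True}


def closest_alternatives(actual, subscript=SUBSCRIPT):
    """Realizers other than `actual` whose subscripts are nearest to it."""
    gaps = {r: abs(n - subscript[actual]) for r, n in subscript.items() if r != actual}
    nearest = min(gaps.values())
    return sorted(r for r, gap in gaps.items() if gap == nearest)


def cry_is_true(actual, laws=NECESSITATES_CRYING):
    """[Cry] holds iff crying follows in every closest pain-without-actual world."""
    return all(laws[r] for r in closest_alternatives(actual))


def proportionality_winner(actual, laws=NECESSITATES_CRYING):
    """Verdict of the proportionality test as summarized in the text."""
    return "pain" if cry_is_true(actual, laws) else actual


print(closest_alternatives("P1"))    # ['P2']
print(cry_is_true("P1"))             # False
print(proportionality_winner("P1"))  # P1
```

On these assumptions the nearest pain-without-P1 world is a P2 world, so [Cry] comes out false and the proportionality test awards the causal role to the P1 state rather than to the pain, which is the verdict the text goes on to dispute.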
III There are two possible strategies a defender of Yablo could adopt in trying to resist my line of attack. The first is to argue that I have failed to describe a scenario in which [Cry] really would be false; the second is to concede that [Cry] is false but contend that this is no embarrassment to Yablo’s view since Gilmore’s pain does not cause his crying. In this section I will address the first strategy. One way to advance the first strategy would be to raise broadly functionalist worries about whether P2 could qualify as a physical realizer of pain given that it is nomologically impossible for subjects in P2 states to cry. According to functionalism, a physical state is a pain realizer only if it occupies the causal role associated with pain. If a capacity to cause crying is an essential part of that role, then P2 ’s inability on this
11 To make things most realistic, we could suppose not that P1, P101, and P102 states by themselves causally necessitate crying, but rather that there is some background condition BC such that any one of these states taken together with BC necessitates crying. By extension we could then suppose not that P2 by itself is nomologically incompatible with crying, but rather that it taken in conjunction with BC is. So, people with P2 realizers do sometimes cry, but never when BC obtains—P2 is incompatible with crying given BC. For ease of exposition I leave out explicit reference to background conditions in the text, but I invite you to read them into my discussion if it makes the example more compelling.
12 The present argument could make do with the weaker claim that if Gilmore had been in pain but not P1 he might have been in P2. On the standard analysis of counterfactuals this would be enough to ensure the falsity of [Cry]. I make the stronger claim in the text because it allows me to simplify my presentation and because I find the stronger claim not implausible. See footnote 20 though.
front disqualifies it as a pain realizer. If this is right, then the Gilmore scenario we have imagined is impossible upon closer inspection.13 In response, I note that we can assume that P2 does an otherwise perfect job at occupying pain’s causal role. P2 states are causally necessitated by tissue damage, they causally necessitate wincing, screaming, teeth gnashing, and so on. The only blemish on their causal role résumé is the part about the crying. If P2 does such a near-perfect job at occupying pain’s functional role, it is utterly implausible to disqualify it as a realizer of pain on functionalist grounds. In case this implausibility needs to be drawn out a bit, consider an intuition pump. Imagine I come up with an evil new medical experiment I want to run on human beings. The good news is that the experiment promises to increase our medical knowledge ever so slightly; the bad news is that it is excruciatingly painful for typical subjects to participate in. Now, as it turns out, in one tenth of all human beings the closest thing there is to a perfect realizer of pain’s functional role is the merely near-perfect P2. Just to be clear: this bit of the story begs no question against proponents of the present functionalist-inspired line, for that line does not entail that a state with P2’s causal profile is impossible, nor does it entail that such a state could not be found in human beings. Rather, what the view in question entails is that if a tenth of the population were to have P2 states, then that tenth would not really undergo pain, since they would have no physical state perfectly realizing pain’s functional role. Given the functionalist-inspired view in question, then, it would seem there could be no strong moral objection to me performing my medical experiments on the tenth of the population with P2 states.
For, although the tenth will wince and scream and gnash their teeth during the experiment; although they will hide from us to try to avoid participating in it, and then later beg and plead with us after we find them; although they will act like typical pained human beings in almost all respects, they will not really be in pain. They cannot really be in pain, according to this functionalist-inspired view, since they do not cry. This alone is grounds enough to deny that their P2 states are pain realizers, and thus to deny that they feel pain. They do not cry, so we can start the medical experiments with an easy conscience! Thankfully, most functionalists are not committed to such a crazy view. Following David Lewis, many functionalists appreciate that the psychological theory used to functionally define mental states (i.e., the theory functionalists Ramsify in generating their functional definitions) might not be perfectly realized, but that if it’s close enough—if, for instance, the disjunction of the conjunctions of most of the clauses of the theory is true—we should still say that the states defined by the theory obtain (i.e., that the mental terms of the theory refer). This is what we find in our present scenario with P2 : the vast majority of the clauses of the defining psychological theory will be true of people with P2 states, with the only false clause being that pain causes crying in them. Adopting terminology from Lewis, what we should say in such a case is that since P2 does such a near-perfect job occupying pain’s functional role, it counts as a 13 Fodor (1991, p. 25) briefly considers this sort of objection in response to an argument made by Schiffer
(1991), but ultimately sets it aside. According to Fodor, a state like P2 could qualify as a pain realizer, despite its inability to cause crying, if it has something sufficiently in common with other pain realizers. As I’m about to argue, this holds in our example.
near-realizer of pain, and thus as a realizer-simpliciter of pain.14 On this very familiar and most plausible version of functionalism, beings in P2 states are in pain, just as my objection to Yablo supposes. Moving on, here is another way one might argue that I have failed to describe a scenario in which [Cry] is false. Even if one grants that P2 qualifies as a pain realizer, one might contend that it is a comparatively unusual realizer given that it does not causally necessitate crying. From there, one could then argue that the nearest worlds in which Gilmore is in pain but not P1 are not worlds where his pain is realized by a P2 state. Those worlds are far away. The closest worlds where his pain is alternatively realized are worlds where it is realized by P101 or P102 , since these are the more typical realizers. If so, then [Cry] is true since P101 and P102 states causally necessitate crying. In short, the idea is that the unusualness of P2 as a physical realizer of pain trumps its physical similarity to P1 when it comes to determining the proximity of worlds for the sake of evaluating [Cry]. In response, I note that we can simply stipulate that P2 is no more unusual as a realizer of pain than any of the other realizers are. We can stipulate that it is nomologically impossible for P1 states to causally necessitate wincing, for P101 states to necessitate screaming, and for P102 states to necessitate teeth-gnashing. Each of these realizers is then just like P2 in doing a merely near-perfect job occupying pain’s associated causal role. If so, then P2 isn’t a comparatively unusual realizer of pain after all, and the present line of defending Yablo fails to get off the ground. I cannot think of any other promising argument for denying that [Cry] is false in the scenario described, so at this point let’s shift to consider the second strategy that a defender of the Yablo account could adopt. 
She could concede that [Cry] is false but contend that this poses no problem for the account because the correct verdict in the Gilmore case is that Gilmore’s pain does not cause him to cry. I have two independent arguments against this suggestion.
IV First, consider Counterpart Gilmore, an intrinsic duplicate of Gilmore’s who inhabits a different possible world. Counterpart Gilmore’s world is as much like Gilmore’s as possible except that there P2 states do causally necessitate crying. Suppose then that, like Gilmore, Counterpart Gilmore suffers a pain, his pain is realized by a P1 state, and he cries. When evaluated with respect to Counterpart Gilmore, [Cry] will thus be true, not false. Had Counterpart Gilmore been in pain but not P1 , his pain would have been realized by P2 , and P2 states causally necessitate crying at Counterpart Gilmore’s world. Therefore, had Counterpart Gilmore been in pain but not P1, he would have cried. Yablo’s account thus entails there is a deep causal difference between Gilmore’s pain and Counterpart Gilmore’s pain. Since [Cry] is true when evaluated with respect to Counterpart Gilmore but false when evaluated with respect to Gilmore, it follows that Counterpart Gilmore’s pain is proportional to crying while Gilmore’s pain is not. 14 See Lewis’s (1970, p. 432) discussion of near-realization.
In turn it follows that Counterpart Gilmore’s pain causes his crying but Gilmore’s pain does not. It is implausible, however, that the two pains could causally differ in any important way. The only difference between them is that one takes place in a world where P2 states causally necessitate crying while the other does not. How could this difference make a difference? After all, neither Gilmore nor Counterpart Gilmore is in a P2 state. We can further suppose that neither Gilmore nor Counterpart Gilmore will ever in their lives be in a P2 state, or even have the slightest causal interaction with one. P2 states are as utterly unconnected to Gilmore and Counterpart Gilmore as anything in their worlds are. If so, I cannot see how a causal difference between the two pains could be grounded in this sole, seemingly irrelevant difference in the laws between their worlds. This argument turns on a comparison of intrinsic duplicates from worlds governed by different laws of nature. This might spark concern: we don’t generally expect intrinsic duplicates to be causally alike if the laws at their worlds differ. The concern is misplaced. For one thing, I emphasize again that the difference in laws has to do with P2 states only, and these laws seem irrelevant to Gilmore and his Counterpart. For another, it is possible to recast the preceding argument in epistemic terms that eliminate entirely the comparison across worlds. In the epistemic version of the argument we imagine that we presently know that the pain realizers P1 , P101 , and P102 all causally necessitate crying. We also presently know that there is a fourth realizer, P2 . What we do not know yet is whether P2 causally necessitates crying. Given this state of knowledge, the question once again is whether Gilmore’s P1 -realized pain causes his crying. Yablo’s account entails that in order to answer this question, we need to learn that which we presently do not know: whether P2 causally necessitates crying. 
To learn that it does would be to learn that [Cry] is true and thus (according to Yablo) that Gilmore’s pain causes his crying. To learn that it does not would be to learn that [Cry] is false and thus (according to Yablo) that Gilmore’s pain does not cause his crying. In order to figure out whether there is mental causation in the Gilmore case, then, we need to set Gilmore himself aside and go study P2 states, see what they causally necessitate. This cannot be right. Given that P2 is as utterly unconnected to Gilmore as anything in the world is, I say that this cannot be the way to come to know whether there is mental causation in Gilmore’s case. If my verdict is correct then Yablo’s account must be wrong. This epistemic argument is a recognizable variation on the Counterpart Gilmore argument, but, again, avoids comparison across worlds with different laws. I return to the Counterpart Gilmore version, though, in order to frame my final points for this section. So far I have aimed to establish that Gilmore’s pain could not differ in causal status from Counterpart Gilmore’s pain. I now add the further claim that both should be regarded as having positive causal status—that is, as causing crying. For, if mental causation ever takes place, then surely it takes place with Counterpart Gilmore’s pain. Any viable account of nonreductive mental causation will agree with this conclusion, including Yablo’s. Since the two pains cannot causally differ, it thus follows that mental causation takes place with Gilmore’s pain as well. And so, despite the falsity of [Cry], we have compelling reason to hold that Gilmore’s pain causes his crying.
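The dependence that this section objects to can be made concrete with a small sketch. It is only an illustration under stated assumptions: the nearest pain-without-P1 world is taken to be a P2 world, the proportionality test is reduced to a single lookup, and the names `GILMORE_WORLD`, `COUNTERPART_WORLD`, `differing_laws`, and `proportionality_verdict` are hypothetical.

```python
# Hypothetical sketch: two worlds whose laws differ only in what P2 necessitates.
# Each dict records which pain realizers causally necessitate crying.
GILMORE_WORLD = {"P1": True, "P2": False, "P101": True, "P102": True}
COUNTERPART_WORLD = {**GILMORE_WORLD, "P2": True}


def differing_laws(world_a, world_b):
    """Realizers about which the two worlds' laws disagree."""
    return sorted(r for r in world_a if world_a[r] != world_b[r])


def proportionality_verdict(laws, actual="P1", nearest_alternative="P2"):
    """Assumed reading of the test: [Cry] is settled by the nearest alternative."""
    cry_is_true = laws[nearest_alternative]
    return "pain" if cry_is_true else actual + " state"


print(differing_laws(GILMORE_WORLD, COUNTERPART_WORLD))  # ['P2']
print(proportionality_verdict(GILMORE_WORLD))            # P1 state
print(proportionality_verdict(COUNTERPART_WORLD))        # pain
```

In both worlds the subject's own realizer is P1 and P1 necessitates crying; the verdict changes solely because the entry for P2 changes, which is the feature of the account that the Counterpart Gilmore and epistemic arguments find implausible.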
V The second argument for the causal efficacy of Gilmore’s pain requires some stage setting. There clearly is no strict, exceptionless law linking pain to crying at Gilmore’s world, given that not all pains are accompanied by crying there—P2 -realized pains are not. In principle, a Yablo defender could try to use the absence of a strict law to argue against the causal efficacy of Gilmore’s pain. This would not be a promising line to take, however. For, it is widely accepted that in the actual world there are no strict, exceptionless psychological laws.15 At best, it is held, there may be ceteris paribus laws. Now, perhaps the most influential account of ceteris paribus psychological laws is due to Jerry Fodor.16 For the sake of the argument that follows I don’t need to endorse everything Fodor says, or even commit myself to there being such things as ceteris paribus laws. Perhaps whatever true ceteris paribus psychological generalizations there are should be assigned a status short of full-blown lawhood.17 The one element of Fodor’s account I’ll be using is his view of what goes on “beneath” a ceteris paribus psychological generalization, at the underlying level of physical realizers. There is no strict law linking pain and crying at Gilmore’s world, but perhaps we can say that pained subjects cry there, ceteris paribus. That is, they cry provided that their pain is not realized by a P2 state. P2 -realized pains constitute an “absolute exception” to the ceteris paribus generalization that crying accompanies pain.18 Fodor’s view is that the ceteris paribus nature of psychological generalizations is generally to be explained at the underlying physical level in terms of realization-specific absolute exceptions of this very sort. 
It is precisely because there are such absolute exceptions that the true psychological generalizations there are hold only ceteris paribus rather than strictly.19 If Fodor is right, then there is nothing unusual or artificial about the Gilmore case we have constructed. Take any actual psychological generalization you please and there will be some absolute exception to it, relevantly like P2 -realized pains. Relating the discussion back to Yablo, I claim that if we deny mental causation in Gilmore’s case, then by parity of reasoning we might well be forced to deny that there is any mental causation here in the actual world. To make the argument explicit, let’s call a world a Fodor World if it satisfies each of the following three conditions: (i) every true psychological causal generalization there holds only ceteris paribus; (ii) every such generalization has an absolute exception, in the sense just spelled out; and (iii) for each mental state falling under a given generalization, the nearest
15 Davidson (1970) is the classic defense of this thesis, but many philosophers accept Davidson’s conclusion without accepting his reasoning.
16 Fodor (1991). See also the further discussion in Fodor (1989, 1991).
17 For arguments against there being any ceteris paribus laws see for instance Schiffer (1991) and Earman et al. (2002).
18 The term “absolute exception” is taken from the exchange between Schiffer (1991) and Fodor (1991), who use it to cover cases exactly like Gilmore’s.
19 See Fodor (1991, pp. 108–111). An attractive feature of this view is that by positing absolute exceptions, Fodor is able to explain how it could be that psychological generalizations hold only ceteris paribus while the physical laws governing realizers are all exceptionless.
counterfactual world where that mental state is alternatively realized by an absolute exception to the generalization is at least as close as the nearest counterfactual world where the mental state is alternatively realized by a non-exception to the generalization.20 Intuitively, the idea is that a Fodor World is a place where every putative instance of mental causation is relevantly like the case involving Gilmore’s pain. At a Fodor World then every counterfactual relevantly like [Cry] is false. Therefore, just as Yablo’s account entails that Gilmore’s pain does not cause his crying, it also entails that there is no mental causation at all in a Fodor World. Two points about Fodor Worlds. First, it would seem to be a wide open empirical question whether the actual world is a Fodor World. I can see no armchair reason to think it is not.21 Therefore, it is a wide open empirical possibility on Yablo’s view that there is no actual mental causation. Second, even if the actual world isn’t a Fodor World, it might at least turn out to be rather Fodorish. For instance, it might turn out that while no actual putative instance of mental causation involving a desire is Gilmore-like, every putative instance involving a belief is. Or it might turn out that half of all putative mental causation is Gilmore-like while half is not. These, again, would seem to be wide open empirical possibilities. Therefore, it is a wide open empirical possibility on Yablo’s view that there is no belief causation, or that half of all putative mental causation is bogus. These results are unacceptable. Much of the motivation behind Yablo’s account is that it promises to save mental causation as we intuitively understand it, more or less. But now this appears very much up in the air. Whether it does or not depends on the empirical question of whether the actual world is a Fodor World, a possibility that I otherwise would have thought nonreductive physicalists could be open to. 
In order to avoid these results we must reject any view that denies that Gilmore’s pain causes his crying. We must reject Yablo’s view. The preceding argument suggests that the problem the Gilmore case poses for Yablo’s account is potentially quite deep. If all or many actual instances of mental causation are Gilmore-like, then an account of mental causation which yields the wrong verdict on the Gilmore case is doomed to yield the wrong verdict on all or many actual cases of mental causation.
20 Condition (iii) is jointly entailed by (i) and (ii), given the following principle: for each mental state M and for each physical realizer Pi of M, the nearest counterfactual world where M is realized by Pi is just as close as the nearest counterfactual world where M is realized by some other physical realizer Pj. In other words, the nearest alternative physical realizations of a mental state are all equally far away. I am not sure this principle is correct, but I am not sure it is incorrect either—maybe, when it comes to assessing counterfactuals like [Cry], all alternative physical realizers should be treated equally. The Gilmore case as it has been set out violates the principle since we have assumed that the nearest world where Gilmore’s pain is realized by P2 is closer than the nearest world where it is realized by either P101 or P102. However, this assumption has been inessential to the argument: even if the nearest P2-world were merely as close as the nearest P101-world and P102-world, everything in my argument would go through given the standard analysis of counterfactuals, since this would still be enough to ensure that if Gilmore had been in pain but not in P1, he might have been in P2. See footnote 12. Even if the present principle is to be rejected and (i) and (ii) don’t jointly entail (iii), there are other scenarios on which (iii) could be true.
21 Fodor himself does not explicitly discuss condition (iii). However, he does suggest that both conditions (i) and (ii) hold here in the actual world, and so if (i) and (ii) jointly entail (iii)—as suggested in the last note—he is committed to the actual world being a Fodor World.
VI From this point on I will take it as established that the preceding arguments give us good reason to reject the Yablo account as it stands. Now, one could agree with this assessment while still thinking there is something importantly right about the appeal to proportionality. A number of philosophers influenced by Yablo have embraced something like the proportionality component of his view while rejecting or at least remaining neutral on his use of counterfactuals.22 What these philosophers take from Yablo are the ideas that (i) there is indeed causal competition between mental and physical states, and (ii) the way to sort out which side wins such a competition is by looking around at what happens with alternative physical realizations of the putative mental cause. Let me say something about each of these in turn. Regarding (i), I have been employing the notion of causal competition throughout this paper, but it will be helpful here to clarify it a bit. Let’s say that a mental state and its physical realizer causally compete just in case there is some important causal dimension along which at most one of the states can have a positive causal status. By “important causal dimension,” I mean to cover things like causal relevance, causal sufficiency, and causation itself. By “positive causal status,” I mean statuses like that of being causally relevant to a given effect (rather than irrelevant), that of being causally sufficient for a given effect (rather than insufficient), and that of being a cause of a given effect (rather than not a cause). When Yablo says that mental states and their physical realizers do not compete over causal relevancy or sufficiency, what he means is that it is possible for both states to have a positive status along these causal dimensions—it is possible for both a mental state and its physical realizer to be causally relevant to some effect, or for both to be causally sufficient for that effect. 
There is another causal dimension along which there is causal competition according to Yablo, however: that of causation itself. It is not possible for both a mental state and its physical realizer to have the positive status of causing a given effect. This is why Yablo appeals to proportionality: to sort out which state wins the causal competition, which state is awarded the positive status of cause. My root objection to Yablo is really an objection to this competitive element of his view. I believe that the proper account of nonreductive mental causation needs to be thoroughly non-competitive. It needs to say that there is no causal dimension along which mental states and their physical realizers causally compete in the sense spelled out. I will survey a few non-competitive accounts in the next, concluding section, and show that they are not susceptible to the problems I pose for competitive accounts like Yablo’s. In this section, however, my focus will be on (ii), the second idea philosophers have taken from Yablo. I regard the true core of the proportionality component of Yablo’s
22 This includes Shoemaker (2001, 2007) and Williamson (2000, 2005). Shoemaker appeals to proportionality, but explains proportionality in terms of his own subset model of realization rather than counterfactuals. Williamson (2000, p. 82) cites Yablo and relies on something like proportionality while arguing for the causal efficacy of knowledge, but, as he later (2005) makes explicit, rejects Yablo’s counterfactual-based approach to causation.
view not to be his particular use of counterfactuals, but rather his idea of settling a causal competition between a mental state and its physical realizer by looking at what happens with alternative realizations of that mental state. Counterfactuals like [Cry] enter the picture because they provide one way of trying to capture this idea for settling causal competitions. Potentially there are other ways one might try to capture the idea instead, however. Perhaps some of these other ways are not as susceptible to counterexample as Yablo’s counterfactual-based approach is. Along these lines, consider the following conveniently unsophisticated proposal. To determine whether a mental state of type M causes an effect of type E, divide M’s realizers into two groups: (1) those that do causally necessitate E effects, and (2) those that do not. The putative instance of mental causation is then genuine only if group (1) is bigger than group (2). Applying this proposal to the Gilmore case, P1 , P101 , and P102 all belong to group (1) while only P2 belongs to group (2). Therefore, according to the proposal, Gilmore’s pain causes his crying—the proper verdict. What this shows is that it is possible to get the Gilmore case right while still retaining what I’m regarding as the heart of proportionality, the idea of settling causal competitions by looking at what happens with alternative realizations. No doubt, more sophisticated proposals that accomplish this result could be developed as well. Perhaps then the proper conclusion to draw at this point is just that we need a new way of cashing out the guiding idea of proportionality, not that the idea should be jettisoned completely. This is what I now will argue against. I think that the core idea of proportionality needs to be thrown out, regardless of how we cash it out. To make my case I need a new thought experiment, so meet Mullin. Tomorrow is the big logic exam, and Mullin is feeling quite anxious about it. 
Her anxiety putatively causes her heart to race. Suppose that there are four nomologically possible physical realizers of anxiety: P10 , which realizes it in human beings (including Mullin); P20 , which realizes it in Martians; P30 , which realizes it in Venusians; and P40 , which realizes it in Jupiterians.23 Suppose also that while P10 states causally necessitate heart rate acceleration, all other physical realizers of anxiety fail to do so. Martians, it turns out, have glowing orbs instead of hearts, and so never undergo heart rate acceleration. Venusians have hearts, but hearts made of iron which always beat at the same constant rate. Jupiterians have hearts physically indistinguishable from our own, but the wiring connecting their hearts to their brains is so different from ours that Jupiterian heart rates decelerate during anxiety. At Mullin’s world then it is only when anxiety is physically realized by P10 that heart rate acceleration ensues. The Yablo-style counterfactual is thus false: if Mullin had been anxious but not in a P10 state, her heart rate would not have accelerated. More to the present point, however, it seems that no matter how one might try to cash out the idea of proportionality, of taking mental causation to depend on what happens with alternative physical realizations, any proportionality-based account will be forced to say that Mullin’s anxiety does not cause her heart rate acceleration, since heart rate acceleration does not occur when anxiety is realized by anything other than
23 The similarity of these physical realizers won’t matter for the discussion.
a P10 state. Reconsider the unsophisticated divide-and-count approach. It entails that Mullin’s anxiety does not cause her heart to race since group (2), which includes P20 , P30 , and P40 , outnumbers group (1), which includes only P10 . Intuitively, this seems like the wrong result. For all we know the real world may be exactly like Mullin’s world. Would learning tomorrow that there are aliens who lack hearts, or whose iron hearts always beat at a constant rate, or whose differently wired hearts decelerate during anxiety, put any pressure on us at all to deny what is presently a matter of commonsense, that in human beings anxiety does cause hearts to race? Surely not. We might qualify our causal claims about anxiety in light of these findings. Instead of saying that anxiety causes heart rate acceleration, full stop, it would be better to say that anxiety causes this effect in human beings. But this still would be to assign causal efficacy to those anxieties that do take place in human beings, and so it would agree with the commonsense verdict that Mullin’s anxiety causes her heart to race. Even stronger anti-proportionality intuitions are generated if we shift to a case in which rationality considerations enter in, and then pit rationality against proportionality. The fact that rationality can oppose proportionality has not been noted in the literature, to my knowledge, and so I think the following case is especially interesting. After hours of studying, Mullin begins her logic exam. The first question presents students with a proposition, Q, which deductively entails another proposition, A, the sought answer. Mullin gets the question right. Putatively, her thought of Q causes her subsequent thought of A. Suppose there are four nomologically possible physical realizers of thinking of Q: P15 , which is the realizer of Mullin’s thought, P25 , P35 , and P45 . 
Suppose next that while P15 states causally necessitate thinking of A, each of the other physical realizers causally necessitates thinking of the same wrong answer, ∼A. We can imagine the problem as a trick question, where only minds physically built in a certain way avoid being taken in by the trick. Now, some philosophers hold that there are constitutive norms of rationality such that a necessary condition on being a thinker at all is that one be fairly rational.24 There is much to be said for such a view. However, no plausible version of it could be used to rule out as impossible the scenario just described. We can suppose that those beings without P15 realizers are as rationally adept as can be but for the one exception that when they think about Q, they wrongly infer ∼A instead of A. Any version of the constitutive rationality thesis which says that even this is too much irrationality for those beings to qualify as thinkers is absurd. The scenario is metaphysically possible then. What’s more, I don’t see any obvious reason to think it is empirically improbable. To begin with, it would be surprising if facts about physical realization didn’t impose constraints on a thinker’s rational acumen, so that certain rationally proper inferences are simply impossible for thinkers built in a given physical way.25 Assuming this is so, the real question is whether such constraints are uniform across different physical realizations or whether thinkers built
24 Famously, Davidson (1970) assigns this view a central role in his account of mental causation.
25 It is worth reemphasizing here the point made back in footnote 11, that the alleged impossibility may be relative to certain background conditions. So, for instance, perhaps the inference in question is one that is sometimes possible, but not given limitations on time, attention, and so on.
one physical way are saddled with different rational limitations than thinkers built another. This is an open empirical matter. Human beings make more errors reasoning with modus tollens than they do with modus ponens. Perhaps in Martians the pattern is reversed, and perhaps in addition there is no explanation for this human/Martian difference at the psychological level, but only one at the underlying physical level. As in the anxiety case, any proportionality-based account of mental causation will entail that Mullin’s thought of Q does not cause her thought of A, and this is true regardless of how proportionality is cashed out. What the present example adds to the discussion is that here, Mullin is performing the way she rationally ought to perform. A thought of the correct answer is precisely what a thought about a logic problem should cause. This, I suggest, makes it even more counterintuitive to follow proportionality here and deny causal efficacy to Mullin’s thought. If we are forced to deny the causal efficacy of anyone’s thought in the example—and, ultimately, I don’t think we are—we should deny it for those beings that get the problem wrong, not Mullin who gets it right. After all, irrationality is the sort of thing we would expect from thinkers whose mental states are being causally excluded. Heightened rationality is not.
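The "divide-and-count" proposal considered in this section is mechanical enough to be written down directly. What follows is only an illustrative sketch: the function name and the dictionary encoding of realizers are assumptions introduced here, not anything from the text, and the check captures only the necessary condition the proposal states (group (1) must outnumber group (2)). It reproduces the verdicts discussed above for the Gilmore case and for the two Mullin cases.

```python
from typing import Dict


def passes_divide_and_count(necessitates_effect: Dict[str, bool]) -> bool:
    """Return True when realizers that causally necessitate the effect
    (group 1) outnumber those that do not (group 2).

    This is only the necessary condition stated by the proposal; the
    encoding of realizers as a name -> bool mapping is hypothetical.
    """
    group_1 = [r for r, necessitates in necessitates_effect.items() if necessitates]
    group_2 = [r for r, necessitates in necessitates_effect.items() if not necessitates]
    return len(group_1) > len(group_2)


# Realizers of Gilmore's pain and whether each necessitates crying.
gilmore_pain = {"P1": True, "P101": True, "P102": True, "P2": False}

# Realizers of anxiety and whether each necessitates heart rate acceleration.
mullin_anxiety = {"P10": True, "P20": False, "P30": False, "P40": False}

# Realizers of thinking of Q and whether each necessitates thinking of A.
mullin_thought_of_q = {"P15": True, "P25": False, "P35": False, "P45": False}

for name, case in [
    ("Gilmore's pain", gilmore_pain),
    ("Mullin's anxiety", mullin_anxiety),
    ("Mullin's thought of Q", mullin_thought_of_q),
]:
    print(name, "->", passes_divide_and_count(case))
```

The sketch makes the objection vivid: the verdict on Mullin depends entirely on entries for realizers she does not have, so merely adding alien realizers to the dictionary flips the result without changing anything about her.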
VII
Nonreductive accounts of mental causation can be divided into two camps: those that deny causal competition between mental and physical states, and those that grant some form of competition but contend mental states often win. As we have seen, proportionality-based accounts belong to the latter group. In constructing cases of disproportional mental causation, what we do is stack the deck of the supposed competition as heavily as possible in favor of the physical realizer, so that it must be the winner if there is a winner at all. For instance, if Mullin’s thought of Q causally competes with her P15 state, then her thought of Q must lose: the case was constructed so as to ensure this result. It is an embarrassment for competitive accounts of mental causation that they are unable to accommodate disproportional mental causation. To avoid this embarrassment, we need to look to noncompetitive accounts instead. One example of a noncompetitive account is Donald Davidson’s.26 Davidson’s view of mental causation is notoriously problematic, but it might still serve as a helpful reference point. According to Davidson, mental event tokens are identical to physical event tokens, while mental types are irreducible to physical types. In addition he holds that causation is a binary extensional relation that holds between events regardless of the properties those events instantiate or exemplify. Given this view of causation, Davidson thinks that an adequate account of mental causation has done its job in full once it has shown mental events are causes. This view thus qualifies as non-competitive by our standards: along the only causal dimension that matters, that of causation itself, it is possible for both mental and physical states to have a positive causal status with respect to a given effect; in fact, given the token identity, it is guaranteed that a mental token has a positive status if and only if the physical token does.
26 Davidson (1970), and then further defended and elaborated in his (1993).
Pretend for a moment Davidson’s view were satisfactory. Looking back at the examples of disproportional mental causation in this paper, we see that Davidson’s account yields the proper verdict each time. Take the logic case. On the Davidson view, Mullin’s thought of Q is token identical with her P15 state, and so there is no room for causal competition between them, not even—pace Yablo—with respect to causation itself. True, the mental type of thinking of Q is irreducible to the physical type of P15 ; but again, types are irrelevant to causation. If we assume that Mullin’s P15 state causes her thought of A, it follows by the token identity that her thought of Q causes this effect. And this, on the Davidson view, is all that matters for mental causation. On the view, whatever happens with alternative physical realizations of the thought of Q is irrelevant to the question of whether Mullin’s particular thought causes her subsequent thought of A. Even if no alternative physical realization of thinking of Q causally necessitates thinking of A, this does nothing to undermine the proposed token identity between Mullin’s thought of Q and her P15 state, and so it does nothing to undermine her thought’s claim to causing her thought of A. Davidson gets disproportional mental causation right—he is able to allow for it. Of course, Davidson’s view is not satisfactory as it stands; that it is not is the starting point of much recent work on mental causation. But this undermines our use of it as a reference point only if the problems facing his account are inherent to noncompetitive views generally. And this does not seem to be the case. Davidson’s main problem was that he was wrong to think that an adequate account could get by merely with showing that coarse-grained Davidsonian mental events are causes. In addition, any satisfactory account needs to establish the causal efficacy of mental properties. 
It needs to show that mental events are causes by virtue of their mental properties.27 A noncompetitive account that promises to do just this has been developed by David Robb.28 According to Robb, properties, insofar as they are relevant to causal relations, should be conceived as tropes, or abstract particulars. Like Davidson, Robb holds that mental events are token identical with physical events. Going beyond Davidson, he also holds that mental tropes are token identical with physical tropes. The view qualifies as nonreductive because in addition to events and tropes it also recognizes types, conceived as either universals or resemblance classes of tropes. Mental types are distinct from physical types (so understood) for familiar reasons of multiple realizability. Because types are not relevant to causation, however, this type distinctness does not open the door to causal competition. On Robb’s view, Mullin’s thought of Q is the same event as her P15 state, and so if the P15 state causes the thought of A, it follows that the thought of Q causes this effect. But in addition, on Robb’s view, the causally relevant physical property exemplified by this event, the P15 trope, is token identical with the exemplified mental property, the thought of Q trope, and so by the token identity this mental property is causally relevant. This is the sought result: Mullin’s thought of Q causes her thought of A, and it does so by virtue of its mental property (trope). The view improves on Davidson’s
27 Among the first to present this criticism were Honderich (1982), Sosa (1984) and Kim (1984). 28 Robb (1997, 2001).
in making mental properties causally relevant, but is like Davidson’s in disallowing causal competition. On Robb’s view, it is irrelevant what happens with alternative physical realizations of the thought of Q. For again, even if those alternative realizations fail to causally necessitate thoughts of A, this does not undermine the proposed token identities, and so it does not undermine the claim that Mullin’s thought of Q causes her thought of A by virtue of the former thought’s mental property. Is there any way to develop a noncompetitive account that allows for disproportional mental causation other than by embracing some sort of token identity thesis, be it at the level of events (Davidson), tropes (Robb), or whatever? The key to doing so would be to show that there is some potential mind/body relation other than identity that prevents causal competition from arising. One possibility on this front would be to develop an account that took over the determinate/determinable component of Yablo’s view but dropped entirely the proportionality component. The rough idea would be to go further than Yablo by saying that determination gives rise to no form of causal competition, not even with respect to causation. Developing such an account falls outside the scope of the present paper. I mention it here in part because I really do think it might be a promising option, and in part to remind the reader of what I find objectionable in Yablo’s account and what I don’t. The friend of disproportional mental causation and noncompetitive views can be quite sympathetic to Yablo’s comparison of realization to determination. In this closing section, I have not meant to suggest that noncompetitive accounts are completely without their own unresolved problems. Rather, what I have argued is that such accounts possess a certain virtue: they can allow for disproportional mental causation.
This, I have tried to show, is a significant advantage for such accounts over those competitive accounts, like Yablo’s, with which they are competing.
References
Bealer, G. (2007). Mental causation. Philosophical Perspectives, 21, 23–54.
Bennett, K. (2003). Why the exclusion problem seems intractable, and how, just maybe, to tract it. Noûs, 37, 471–497.
Davidson, D. (1970). Mental events. (Reprinted in Essays on actions and events, 1980, Oxford: Oxford University Press).
Davidson, D. (1993). Thinking causes. (Reprinted in Mental causation, by J. Heil & A. Mele, Eds., 1995, Oxford: Oxford University Press).
Earman, J., Roberts, J., & Smith, S. (2002). Ceteris paribus lost. Erkenntnis, 57, 281–301.
Fodor, J. (1974). Special sciences. Synthese, 28, 77–115.
Fodor, J. (1989). Making mind matter more. (Reprinted in A theory of content and other essays, 1990, Cambridge, MA: MIT Press).
Fodor, J. (1991). You can fool some of the people all of the time, everything else being equal; Hedged laws and psychological explanations. Mind, 100, 19–34.
Gillett, C., & Rives, B. (2005). The non-existence of determinables: Or, a world of absolute determinates as default hypothesis. Noûs, 39, 438–504.
Honderich, T. (1982). The argument for anomalous monism. Analysis, 42, 59–64.
Kim, J. (1984). Epiphenomenal and supervenient causation. Midwest Studies in Philosophy, 9, 257–270.
Kim, J. (1998). Mind in a physical world: An essay on the mind-body problem and mental causation. Cambridge, MA: Bradford.
LePore, E., & Loewer, B. (1987). Mind matters. The Journal of Philosophy, 84, 630–642.
Leiter, B., & Miller, A. (1994). Mind doesn’t matter yet. Australasian Journal of Philosophy, 72, 220–228.
Lewis, D. (1970). How to define theoretical terms. The Journal of Philosophy, 67, 427–446.
Loewer, B. (2001). Review of Mind in a physical world by Jaegwon Kim. The Journal of Philosophy, 98, 315–324.
Mills, E. (1996). Interactionism and overdetermination. American Philosophical Quarterly, 33, 105–117.
Robb, D. (1997). The properties of mental causation. Philosophical Quarterly, 47, 178–194.
Robb, D. (2001). Reply to Noordhof on mental causation. Philosophical Quarterly, 51, 90–94.
Schiffer, S. (1991). Ceteris paribus laws. Mind, 100, 1–17.
Shoemaker, S. (2001). Realization and mental causation. In C. Gillett & B. Loewer (Eds.), Physicalism and its discontents. Cambridge: Cambridge University Press.
Shoemaker, S. (2007). Physical realization. Oxford: Oxford University Press.
Sosa, E. (1984). Mind–body interaction and supervenient causation. Midwest Studies in Philosophy, 9, 271–281.
Williamson, T. (2000). Knowledge and its limits. Oxford: Oxford University Press.
Williamson, T. (2005). Replies to commentators. Philosophy and Phenomenological Research, 70, 468–491.
Yablo, S. (1992). Mental causation. The Philosophical Review, 101, 245–280.
Yablo, S. (1997). Wide causation. Philosophical Perspectives, 11, 251–281.
Yablo, S. (2003). Causal relevance. Philosophical Issues, 13, 316–328.
Synthese (2011) 182:393–411 DOI 10.1007/s11229-010-9748-9
Self-location is no problem for conditionalization
D. J. Bradley
Received: 3 January 2010 / Accepted: 5 May 2010 / Published online: 30 May 2010 © Springer Science+Business Media B.V. 2010
Abstract How do temporal and eternal beliefs interact? I argue that acquiring a temporal belief should have no effect on eternal beliefs for an important range of cases. Thus, I oppose the popular view that new norms of belief change must be introduced for cases where the only change is the passing of time. I defend this position from the purported counter-examples of the Prisoner and Sleeping Beauty. I distinguish two importantly different ways in which temporal beliefs can be acquired and draw some general conclusions about their impact on eternal beliefs.
Keywords Sleeping Beauty · Self-location · Conditionalization · The Prisoner
1 Introduction
Sometimes we learn what the world is like. Other times, we learn about where we are in the world. This paper is about the way these two kinds of belief interact. I will argue that self-locating beliefs have less impact on non-self-locating beliefs than has been suggested in the recent literature. Theories which say they do have an impact are motivated by particular thought experiments, notably the Prisoner (Arntzenius 2003) and Sleeping Beauty (Elga 2000). I will argue that these cases have been mishandled and do not show any puzzling connections between self-locating and non-self-locating belief.
D. J. Bradley (B) The City College of New York, 160 Convent Avenue, New York, NY 10031, USA e-mail:
[email protected]
2 Conditionalization and two types of belief change
Our most successful theory of confirmation, Bayesian confirmation theory, admits one basic rule of belief change: conditionalization. This says that an agent’s degree of certainty, or credence, in a belief after learning a piece of evidence should equal their earlier degree of certainty in the belief, conditional on the evidence. Formally, if an agent has prior probabilities $P(H_i)$ at $t_0$, and learns $E$ and nothing else between $t_0$ and $t_1$, then her $t_1$ probabilities should be $P(H_i \mid E)$, where $P(E) > 0$. More succinctly, $P_E(H_i) = P(H_i \mid E)$. Here is a useful way to think about conditionalization: A range of worlds are initially (epistemically) possible for the agent. One of these worlds is the actual world, the others are not. E is true in some worlds but not others, so the agent is uncertain about E. When the agent learns E, she eliminates all the not-E worlds and increases her probability in the E worlds. Thus conditionalization can be pictured as the agent eliminating false possibilities and zooming in on the truth. Note that learning something that was previously uncertain is an essential part of conditionalization.
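[Figure: the agent’s epistemically possible worlds, divided into a -E region and an E region, with the actual world marked.]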
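The picture of conditionalization as eliminating not-E worlds and renormalizing over what remains can be made concrete with a small worked example. This is a minimal sketch under stated assumptions: the four-world model, the prior values, and the function name `conditionalize` are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from typing import Callable, Dict, Tuple

# A "world" is encoded (hypothetically) as a pair: (hypothesis label, evidence label).
World = Tuple[str, str]


def conditionalize(prior: Dict[World, Fraction],
                   evidence: Callable[[World], bool]) -> Dict[World, Fraction]:
    """Eliminate the worlds where the evidence is false and renormalize."""
    p_e = sum((p for w, p in prior.items() if evidence(w)), Fraction(0))
    if p_e == 0:
        # Conditionalization is only defined when P(E) > 0.
        raise ValueError("cannot conditionalize on evidence with prior probability 0")
    return {w: (p / p_e if evidence(w) else Fraction(0)) for w, p in prior.items()}


# Illustrative prior over four epistemically possible worlds (values are made up).
prior = {
    ("H", "E"): Fraction(3, 10),
    ("H", "not-E"): Fraction(2, 10),
    ("not-H", "E"): Fraction(1, 10),
    ("not-H", "not-E"): Fraction(4, 10),
}

posterior = conditionalize(prior, lambda w: w[1] == "E")

# Credence in H after learning E equals the prior P(H | E) = (3/10) / (4/10).
credence_in_h = sum((p for w, p in posterior.items() if w[0] == "H"), Fraction(0))
print(credence_in_h)  # 3/4
```

Note that the update does nothing unless some world with positive prior is ruled out, which is the sense in which learning something previously uncertain is essential to conditionalization.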
Traditional confirmation theory deals only with belief contents that do not change in truth-value over time. Call these eternal beliefs.1 But many of our belief contents do change in truth-value over time.2 Call these temporal beliefs.3 Temporal beliefs are beliefs that locate the agent in time, such as ‘it is 12:00’.4 Such beliefs can be learnt in the way just described, by eliminating possibilities. For example, when someone is uncertain what time it is and looks at her watch, she acquires a new temporal belief by eliminating false possibilities. (They are false relative to her current temporal position, not absolutely.) In such cases, the temporal belief discovered does not change in truth-value over the period in which it changes from being disbelieved to being believed. Instead, the agent learns something of which they were initially uncertain. False possibilities have been eliminated, and the agent has ‘zoomed in’ on the truth, exactly as needed for conditionalization. Call this type of belief change Discovery.
Discovery: Belief change in virtue of the discovery of the truth of the content of the belief, where the truth-value did not change over the period of interest.
1 Beliefs that must be relativized to an agent will count as eternal for us, e.g. ‘I am DJB’.
2 I follow Kaplan (1989) in allowing that content can be temporal. But my arguments do not depend on this. In fact I am inclined to hold that contents are eternal (Schaffer ms). If you dislike temporal contents, replace my talk of contents with characters, roles or sentences. The sentence ‘it is Monday’ definitely changes in truth value.
3 For ease of exposition, I will assume these categories, temporal and eternal, are mutually exclusive and exhaustive. We may learn both in virtue of the same experience. All I need is that any beliefs acquired can be divided into an eternal component and a temporal component.
4 See Perry (1979), Lewis (1979). Temporal beliefs need not be explicitly about the time. For example the belief ‘The sun is shining’ will be temporal on our definition. It locates the agent at a time when the sun is shining.
But this story cannot always be applied to temporal beliefs. The reason is that temporal contents change in truth value. The content of ‘it is the 21st century’ has changed in truth value; it used to be false, but is now true. We need to change our temporal beliefs in accordance with the changing facts. Thus, we acquire temporal beliefs in virtue of the content of the belief changing from false to true.5 (We also lose them when they change from true to false.) Call this type of belief change Belief Mutation.6
Belief Mutation: Belief change in virtue of a change in the truth-value of the content of the belief.7
Belief Mutation applies only to cases where the agent learns only temporal information, such as staring at the hands of a clock in the knowledge that nothing unexpected is about to happen. By contrast, staring at the hands of a cuckoo clock as it strikes the hour may result in various eternal beliefs being learnt, for example that the cuckoo is red. Titelbaum (2008) gives the example of someone watching a film that they have seen so many times that they know every frame by heart (p. 556). Belief Mutation has been overlooked until relatively recently, but once noted it is clear that it creates a problem for conditionalization being the only basic rule of belief change.
Conditionalization says that beliefs should change only when something that was previously uncertain has been discovered. But Mutation allows belief change even when nothing that was uncertain is discovered. As I watch the hands of a clock move I may be uncertain about nothing, but my beliefs about the time should still change. And the reason they should change is that the truth value of the beliefs change. It used to be, say, 12:00 but now it is 1:00.8
5 One complication I want to bracket is that we might be mistaken, so what really matters is whether we think the belief has changed in truth-value. I will assume this possibility of error won’t affect my arguments. Note the same complication applies to Discovery and is generally ignored or assumed away. 6 The change can usefully be described as mutation due to features it shares with biological mutation. It happens naturally over time, for example, and no involvement from other beliefs is necessary, just as no interference from other organisms is needed in biology. 7 I will sometimes call this just ‘Mutation’. 8 I first saw diagrams similar to those that follow in a talk by Andy Egan in 2006.
[Figure: two panels, ‘The actual world at 12:00’ and ‘The actual world at 1:00’, each divided into -E and E regions and labelled with the time.]
Conditionalization is clearly the wrong model for this type of belief change, and quickly leads to absurdity if we try to apply it.9 So when temporal beliefs are taken into account, conditionalization is not the only basic rule of belief update; we need new rules governing Mutation. To briefly give a sense of these rules, some will be simple such as ‘If 24 hours pass, the belief [it is Monday today] should be replaced by [it was Monday yesterday]’. Similarly, ‘If it stops raining, the belief [it is raining] should be replaced by [it was raining earlier]’. If the agent loses track of time they will be more complex, perhaps resulting in a weighted sum: ‘if there is a 50% chance 24 hours have passed and a 50% chance 48 hours have passed, then replace the belief [it is Monday] by assigning a 50% chance to [it was Monday yesterday] and a 50% chance to [it was Monday two days ago]’.
Mutation and Discovery may occur together; I will assume that all belief change can be decomposed into a Mutation component and a Discovery component. For example, if I wake up and see it is 7:52, then there is Mutation (I give up my last conscious belief that it was midnight) and also Discovery (I have discovered it is 7:52 rather than 7:51).10
But what happens to confirmation theory when temporal beliefs are admitted? That is, what is the relation between evidence and hypothesis when evidence and hypothesis can be temporal? Six possibilities need to be distinguished. Let’s say (following Titelbaum) that if the evidence can shift one’s degree of belief in the hypothesis,11 the evidence is relevant.
9 Suppose an agent correctly believes it is 12:00 at 12:00. If the agent at 12:00 were to conditionalize on ‘it is 1:00’, he would have both ‘it is 12:00’ and ‘it is 1:00’ in his belief set. Such an agent would be horribly confused. So conditionalization seems to say that if at a later time the agent learns it is 1:00 then the agent should be horribly confused. But this is not the situation at all. See Titelbaum’s Sleeping In example (p. 566). 10 I am grateful to Wolfgang Schwarz for discussion of these issues. 11 More specifically the type of evidence and hypothesis. But omitting this shouldn’t lead to any confusion.
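The weighted-sum Mutation rule sketched earlier, for an agent who has lost track of how much time has passed, can be illustrated as follows. This is only one hypothetical way of formalizing the informal rule: the function name, the representation of credences as dictionaries, and the assumption that the elapsed time is independent of which day it was are all introduced here for illustration and are not taken from the text.

```python
from fractions import Fraction
from typing import Dict

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def mutate_day_belief(credence_day: Dict[str, Fraction],
                      credence_elapsed_days: Dict[int, Fraction]) -> Dict[str, Fraction]:
    """Shift a credence distribution over 'what day it is' forward in time.

    Each earlier possibility is moved forward by each candidate number of
    elapsed days, weighted by the credence that this much time has passed.
    Assumes (hypothetically) that elapsed time is independent of the day.
    """
    updated = {day: Fraction(0) for day in DAYS}
    for day, p_day in credence_day.items():
        for n_days, p_elapsed in credence_elapsed_days.items():
            new_day = DAYS[(DAYS.index(day) + n_days) % 7]
            updated[new_day] += p_day * p_elapsed
    return updated


# The agent was certain it was Monday, then lost track of time:
# 50% credence that 24 hours have passed, 50% that 48 hours have passed.
before = {"Monday": Fraction(1)}
elapsed = {1: Fraction(1, 2), 2: Fraction(1, 2)}

after = mutate_day_belief(before, elapsed)
print(after["Tuesday"], after["Wednesday"])  # 1/2 1/2
```

On this sketch nothing is eliminated and nothing eternal is touched: the probability mass is simply relocated along the temporal dimension, which is why the update is not an instance of conditionalization.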
Case   Hypothesis   Evidence                Relevant?
1      Eternal      Eternal (Discovery)     Yes
2a     Eternal      Temporal (Mutation)     No
2b     Eternal      Temporal (Discovery)    Yes
3      Temporal     Eternal (Discovery)     Yes
4a     Temporal     Temporal (Mutation)     Yes
4b     Temporal     Temporal (Discovery)    Yes
Temporal evidence can be acquired by Mutation or Discovery; eternal evidence can only be Discovered. This paper is about 2. The main focus will be defending my answer of No at 2a; I will briefly defend my answer of Yes at 2b at the end. It is worth first distinguishing the other cases I am not discussing. 1 is the traditional case where both the evidence and the hypothesis are eternal. Standard examples involving balls in urns or experimental results fit into 1. There is no doubt such evidence is relevant. 3 is concerned with cases where an eternal belief is learnt, and this changes one’s degree of belief in a temporal hypothesis. For example, suppose you know a coin flip of Tails results in you being woken at 8am while a coin flip of Heads results in you being woken at 9am. If you have just been woken and are unsure of the time, then learning that the coin landed Heads (eternal) confirms that it is 9am (temporal). So the evidence is relevant. For case 4, suppose the evidence is ‘it is 2000’ and the hypothesis is ‘it is the 21st century’. I take it to be uncontroversial that the evidence is relevant, whether it is acquired by Mutation or Discovery. The controversial case is 2, and 2a especially. We have seen that we need new rules of belief change to model Mutation, but do we need new rules of belief change governing how eternal beliefs are affected by Mutation? Question Can Belief Mutation produce a shift in credence in eternal beliefs? The need to address this issue has become apparent due to thought-experiments which appear to show that Mutation can produce a shift in the agent’s credence in eternal beliefs. If so, the answer to the Question is Yes. This answer would mean that credence in an eternal belief can change even when nothing that was uncertain is learnt (because Mutation can occur even when nothing that was uncertain is learnt). If nothing uncertain is learnt, conditionalization permits no belief change. 
So (eternal) belief change in the absence of something uncertain being learnt amounts to a violation of conditionalization for eternal beliefs.12 We would need new rules to govern such changes. A need for such new rules has been suggested by Elga (2000) and Arntzenius (2003), and new rules have been defended by Halpern (2004), Meacham (2008) and Titelbaum (ibid.). But despite its popularity, rejecting conditionalization is a radical move. Conditionalization is the heart of our best theory of confirmation (see Earman 1992; Howson and Urbach 1993 for classic Bayesian texts), encodes common sense and is supported by several important arguments (Teller 1976; Williams 1980; Van Fraassen 1999; Greaves and Wallace 2006).
12 The problems for conditionalization are also problems for Reflection (Van Fraassen 1984). But conditionalization is the more important rule, so I will focus on it. See Schervish et al. (2004) and Weisberg (2007) for relevant discussions.
Synthese (2011) 182:393–411
(Why am I not concerned about giving up conditionalization for Mutation itself then? Because Mutation applies only to beliefs that change in truth-value. The arguments cited all assume that the beliefs don’t change in truth-value. For example, someone who bets that it is Monday will not expect the bookie to wait 24 hours and say ‘It’s Tuesday—you lose!’) I will argue that in fact there is no need for new rules of belief change because the thought experiments that purport to show violations of conditionalization fail. My answer to the Question is No; Mutation should have no effect on credence in eternal beliefs (2a). However, I will argue in the final section that Discovery of temporal beliefs can shift credence in eternal beliefs (2b). I will now consider the thought-experiments purported to show that Mutation can affect credence in eternal beliefs, and argue that they fail.
3 The Prisoner
Imagine you are a prisoner (Arntzenius 2003).13 The prison guard will flip a fair coin at midnight. If the coin lands Heads he will turn off the light in your cell at midnight. If the coin lands Tails he will leave the light on (Fig. 1). You are locked in your cell at 6pm. As there is no clock in your cell, you lose track of the time. Imagine it has been a few hours since you were locked in your cell. The light is still on. You think it might be after midnight, but you’re not sure. Arntzenius claims that at this point, your degree of belief that the coin landed Tails should go up. I agree. He thinks that this is a new and puzzling way in which temporal beliefs affect eternal beliefs. I disagree. I think that an eternal belief has been Discovered and conditionalization can be applied. Let’s look more carefully at how the prisoner’s beliefs evolve over time.14
Fig. 1 The Prisoner. Boxes represent locations where the light is on (columns: 11pm, 12am, 1am; rows: Heads, Tails)
13 Arntzenius gives four other problem cases for conditionalization and reflection in this paper. One is Sleeping Beauty, another is a version of the prisoner, and another is a straightforward case of memory erasure (Shangri-La), in which the agent violates conditionalization involuntarily. The final case seems to be a combination of the earlier cases.
14 I want to point out that it is very easy to put yourself in a Prisoner type situation, and I strongly recommend the experience.
First consider a case where there is no light being switched off. What happens to an agent’s temporal beliefs as time passes? Two things happen. First of all, they shift forward in time. The belief that it is 6 pm is replaced by the belief that it is 7 pm. This is belief mutation. But when the agent is an imperfect timekeeper, something else happens; the beliefs become more spread out. That is, the agent becomes less certain about exactly what time it is. At 7 pm, the agent might assign an 80% probability to it being within 10 minutes of 7 pm. But by 11 pm, he might only assign a 50% probability to it being within 10 minutes of 11 pm (Fig. 2).
Now let’s add the extra uncertainty of the coin toss. As well as being uncertain about the time, you are also uncertain about whether the coin landed Heads or Tails. So at 7 pm, your probability distribution is spread over various times in two possible worlds, Heads and Tails. Each curve is half the area it was when there was no coin toss to be uncertain about (Fig. 3).
Now let’s add the fact that the light goes off at midnight if the coin lands Heads. Consider what happens as the right hand side of the probability distribution edges towards midnight. That is, what happens as you start to think that it may already be later than midnight? If the light remains on, then the possibility that it is later than midnight and Heads will be eliminated. This is because if the coin landed Heads, the light goes off at midnight.
Fig. 2 The passage of time (credence over clock time, shown at 7pm and at 11pm)
Fig. 3 Was it Heads, and what time is it? (credence over clock time at 7pm, one curve each for Heads and Tails)
Fig. 4 The shift to Tails. The probability space from the right hand side of the Heads curve is transferred to the uneliminated parts of the curve. The probability of Tails grows to more than 50% (credence at 11pm, curves for Heads and Tails, with 12:00 marked)
If it really is after midnight and the light is still on, then the
coin must have landed Tails. This means that the probability of Tails must go up (Fig. 4).15 Consider your credence at 11:59 pm. You will think it might be after midnight and take the light being on as evidence of Tails. This shift towards Tails is foreseeable at 6pm. As Arntzenius points out, this is an odd situation. It appears that nothing that was uncertain at 6 pm has been discovered by 11:59 pm. So there can be no change in credence that is due to conditionalization. If credences change nonetheless, conditionalization is violated and some other rule of belief change is needed.16 All that appears to have happened is that time has passed, so it looks like a case where mere Belief Mutation is relevant to an eternal belief. So the answer to the Question (can Mutation produce a shift in credence in eternal beliefs?) would be Yes. But I will argue that an eternal piece of evidence has been Discovered (on the model of Belief Discovery) between 6 pm and 11:59 pm. The shift in credence of Tails is due to conditionalization on this piece of evidence.
15 If the coin landed Heads, the light goes off at midnight. Then you know for certain that it is midnight and the coin landed Heads. Otherwise, the light stays on and your degree of belief in Tails continues to rise. Eventually, you will be confident that it is after midnight and your degree of belief in Tails will approach 1.
16 I agree with Arntzenius that Reflection is violated, however. Weisberg and Seidenfeld et al. both point out that Reflection need not hold when the agent loses track of the time.
4 Diagnosis: What the prisoner learns
I think that the prisoner Discovers a piece of evidence, and this is responsible for the confirmation of Tails. To see what this new belief is, we have to consider why the prisoner changes his degrees of belief in the first place. The shift is caused by the prisoner’s new belief that time has passed. Suppose that the prisoner’s degree of belief that it is after midnight is 0% at 6pm and greater than 30% at 11pm. Given a very
plausible introspective awareness, the prisoner knows (at 11 pm) that his credence that it is after midnight is greater than 30%.17 He also believes the light is still on. Combining these results generates a new belief that has been Discovered.
New Discovered Belief: The light is on after my credence that it is after midnight has gone up above 30%.
When the prisoner was first put in the cell at 6pm, he didn’t know if the lights would still be on by the time his credence that it is after midnight had gone up above 30%. For all he knew, the light might have been turned off before his credence that it was after midnight got that high. So when the light stays on, he Discovers a new piece of evidence on which he can conditionalize. (This new belief is eternal, but what is essential for my argument is that it is Discovered.) The probability of the New Discovered Belief being true given Tails is 1; the light will stay on all night if Tails landed. But if Heads landed, the light might have been turned off before his credence that it is after midnight had gone up above 30%. The new evidence, being more likely given Tails, confirms Tails.
P(The light is on after my credence that it is after midnight has gone up above 30% | Tails) = 1 > P(The light is on after my credence that it is after midnight has gone up above 30% | Heads)
This account shows that it is not merely the Mutation of temporal beliefs that causes the shift in the credence in Tails, but the Discovery of a new belief. One might worry that there is an inevitable change in credence. That is, one might worry that the prisoner knows at 6pm that he will learn ‘the light is on after my credence that it is after midnight has gone up above 30%’. If he knows he will learn this in the future, he should surely believe it now, and things are looking mysterious again. But the prisoner cannot be certain that he will acquire any such belief.
At 6 pm he might think it possible that the lights will go off before he comes to think there is any chance of it being after midnight. That is, at 6 pm, he may assign a non-zero credence to the eternal proposition that his 11:59 pm credence that it is after midnight will be zero. If so, he will not expect an inevitable shift. (This does not imply that he believes he is a perfect time-keeper, for he may also believe that his credence that it is after midnight could remain zero even after midnight.) I conclude that the Prisoner does not show that Belief Mutation has any effect on credence in eternal beliefs. Arntzenius’s example appears to show that the Mutation of temporal beliefs can affect eternal beliefs in unexpected ways. But I have argued that the prisoner Discovers a new eternal belief that he didn’t know at 6pm. The prisoner updates by conditionalizing on this new belief. No new norm of eternal belief change is required. I’ll now argue that in a second case, the argument that Belief Mutation leads to a violation of conditionalization is inconclusive.
17 Note that the Prisoner doesn’t need, and I haven’t assumed, perfect transparency of his own credences. Some vague idea that his credence that it is after midnight is greater than it was earlier is sufficient.
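The Prisoner's update in Sect. 4 can be illustrated with a small numerical sketch. It rests on simplifying assumptions that are not in the text: the coin and the prisoner's sense of time are treated as independent, `q_after_midnight` stands for his credence that it is after midnight before he takes the light into account, and the function name is purely illustrative. Under those assumptions, the light being on has likelihood 1 given Tails and likelihood 1 - q given Heads, and ordinary conditionalization raises the credence in Tails.

```python
from fractions import Fraction


def posterior_tails(q_after_midnight, prior_tails=Fraction(1, 2)):
    """Credence in Tails after conditionalizing on 'the light is still on'.

    Assumption (not from the text): q_after_midnight is the prisoner's
    credence that it is after midnight, held independently of the coin.
    """
    prior_heads = 1 - prior_tails
    likelihood_tails = Fraction(1)            # light stays on all night if Tails
    likelihood_heads = 1 - q_after_midnight   # if Heads, light is on only before midnight
    evidence = prior_tails * likelihood_tails + prior_heads * likelihood_heads
    return prior_tails * likelihood_tails / evidence


if __name__ == "__main__":
    for q in (Fraction(0), Fraction(3, 10), Fraction(9, 10)):
        print(f"P(after midnight) = {q}: P(Tails | light on) = {posterior_tails(q)}")
```

With a credence of 30% that it is after midnight, this toy model gives 10/17 for Tails, and the value approaches 1 as the prisoner becomes confident that midnight has passed, in line with the remark in footnote 15.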
5 Sleeping Beauty
It is Sunday night. Sleeping Beauty is about to be drugged and put to sleep. She will be woken briefly on Monday. Then she will be put back to sleep and her memory of being awoken will be erased. She might be awoken on Tuesday. Whether or not she is depends on the result of the toss of a fair coin. If it lands Heads, she will not be woken. She will sleep straight through to Wednesday, and the experiment will be over. If it lands Tails, she will be awoken on Tuesday. The Monday and Tuesday awakenings will be indistinguishable. Sleeping Beauty knows the setup of the experiment and is a paragon of probabilistic rationality (Fig. 5). There are three centred worlds where Beauty could be:
H1 = Monday and Heads
T1 = Monday and Tails
T2 = Tuesday and Tails
Obviously Beauty’s credence in Tails on Sunday should match the objective chance of 1/2. This reasoning employs a probability co-ordination principle (PCP) that connects subjective credences with the objective chances.18 But the question is: when she is woken, what credence should she have that the coin landed Tails? Some say that her credence in Heads should stay at 1/2. Call these Halfers. Some say that her credence in Heads should fall to 1/3. Call these Thirders. What exactly has changed between Sunday and waking up on Monday and Tuesday? Beauty has acquired the temporal belief that [it is now Monday or Tuesday]. This type of belief acquisition is not the type of belief acquisition modelled by conditionalization. Conditionalization models cases in which evidence that was initially uncertain is learnt.
Fig. 5 The boxes represent days when Beauty is awake (Heads: Monday, H1; Tails: Monday, T1, and Tuesday, T2)
18 The terminology is from Strevens (1995). The most well known PCP is that of Lewis (1980, 1994). But I’m not using his Principal Principle because I do not wish to give the impression that my arguments, or those of Elga (2000), depend on the details of Lewis’s account. Elga doesn’t even mention Lewis in this context, and Lewis himself coyly speaks of ‘a well-known principle which says that credences about future chance events should equal the known chances’ (2001, p. 175).
But Sleeping Beauty has not learnt anything about which she was
uncertain; this is not Discovery. Instead, she has acquired a new belief because the content of that belief (it is now Monday or Tuesday) has changed from being false to true. This is Belief Mutation. Thirders claim that such Belief Mutation should shift Beauty’s credence in the eternal belief that Heads landed. This would again be a violation of conditionalization and indicate that a new rule of belief update for eternal beliefs is needed. I will defend conditionalization. I cannot address every argument, which include Arntzenius (2002), Arntzenius (2003), Elga (2000), Dorr (2002), Draper and Pust (2008), Hitchcock (2004), Horgan (2004), Horgan (2007), Monton (2002), Titelbaum (ibid.) and Weintraub (2004). All these arguments are controversial and most have been challenged in print (see Bradley 2003, 2010, forthcoming; Bradley and Leitgeb 2006; Briggs ms; Lewis 2001; Jenkins 2005; Schwarz ms; White 2006 for a range of responses). But probably the most important thirder argument has not yet been adequately challenged. Both in print and in conversation, Elga’s (2000) original argument seems most influential in persuading people of the thirder position. I will argue in this paper that Elga’s argument is unpersuasive. It is important to understand that I am not aiming to refute the thirder position— I have no knock-down arguments that being a thirder is irrational. I am merely trying to show that those persuaded by Elga’s arguments should not have been. 6 Elga’s argument Elga’s argument requires modifying Sleeping Beauty a little. But the modifications are harmless and the result is very interesting. The first modification is based on the fact that it doesn’t matter when the coin is tossed. The experimenters could toss the coin on Sunday night, and then wake Beauty either once or twice. Or they could wait until Monday night, toss the coin, and wake her on Tuesday only if it lands Tails. So let’s assume that they do the latter. 
The coin is tossed on Monday night, and Beauty is only woken on Tuesday if it lands Tails. Assume Beauty knows this. The second modification is that after waking on each day Beauty is told what day it is. What should Beauty think after she is told that today is Monday? She knows that a coin is going to be tossed tonight. She also knows that none of her memories have been erased. If there are to be any cognitive mishaps, they lie in the future. Given this situation, Elga argues that Beauty’s credence should match the objective chances, so she should assign a probability of 1/2 to the coin landing Heads.
The argument for being a thirder then runs as follows. Let P be her subjective probabilities just after she is woken on Monday. Let P+ be her probabilities after she is told it’s Monday. Let P− be her probabilities on Sunday night. Elga claims that P+(H1) = 1/2 from the PCP. Then he argues backwards to the situation before Beauty found out it was Monday. After learning it was Monday, her credence that the coin will land Heads ought to be the same as the conditional credence P(H1 | H1 or T1).19 So P(H1 | H1 or T1) = 1/2, and hence P(H1) = P(T1).
19 This is a case where a temporal belief is learnt from a position of uncertainty i.e. Discovery. But the shift we are concerned with here, from Sunday to [Monday or Tuesday], is Mutation.
Elga then applies his Restricted
Principle of Indifference (defended in his 2004), which says that agents in subjectively indistinguishable states within the same possible world should have equal credences.20 The agents at T1 and T2 satisfy these conditions, so P(T1) = P(T2). So we have P(H1) = P(T1) = P(T2). As these are mutually exclusive and exhaustive, P(H1) = 1/3. So runs Elga’s argument for 1/3. I think this argument has been refuted by Lewis (2001). But Lewis was not as clear as usual, and his argument has not been widely accepted or even discussed.21 I will defend and expand upon Lewis’s argument. Due to the brevity of Lewis’s argument, it may be extravagant to attribute my argument to him. So I will more cautiously claim that the following is an argument that I have been led to by Lewis’s paper. I should emphasize that I am not aiming to give a positive argument for the halfer position, but merely to undermine Elga’s argument for the thirder position. Lewis rejects Elga’s premise that P+(Heads) = 1/2. Elga’s justification for the premise is a PCP. But there are some situations in which credence should diverge from objective chance. For example, suppose we have a crystal ball known to be reliable that predicts that this flip of a fair coin will land Tails. What should we believe about the outcome of the coin flip? Should we follow the crystal ball, or should we stick with the objective chance? We should follow the crystal ball. Evidence about the future can trump the objective chance. Call evidence which can justify an agent in having credences that diverge from the chances inadmissible evidence. The terminology is from Lewis (1980, 1994); I am not endorsing Lewis’s version of the PCP though, just his terminology. 7 Beauty’s inadmissible evidence Does Sleeping Beauty have inadmissible evidence (relative to the coin toss22 )? I say yes. The paradigm sources of inadmissible evidence are crystal balls, oracles, and suchlike. Sleeping Beauty has nothing as obviously inadmissible as this. 
But I will argue that Sleeping Beauty has inadmissible evidence when she is told that today is Monday. The way to see this is to consider the alternative evidence she might have received. From the state of being awake in the Sleeping Beauty setup, there are two pieces of evidence she might have found. She might have been told it’s Monday or she might have been told it’s Tuesday. So the evidence space is the following:
{Today is Monday, Today is Tuesday}
My argument that Beauty has inadmissible evidence when she learns it is Monday will proceed in two steps:
1. ‘Today is Tuesday’ is inadmissible.
2. If an agent with only admissible evidence has two possible pieces of evidence in her evidence space and one piece of evidence is inadmissible, then the other is inadmissible.
20 Equal credences of what? Specifically, equal credences of whether each of them is one of the agents rather than the other. It follows that they must have equal credence in everything, so I have left the main text unqualified.
21 Dieks (2007) is an exception.
22 All inadmissible evidence is relative to an event, and possibly to other variables too. I will leave these variables implicit in future.
Argument for 1: Suppose Beauty learns that today is Tuesday. Should her degree of belief in Heads be 50%? No. Her degree of belief in Heads should be zero, because if Heads landed, she would sleep through Tuesday. As this evidence justifies Beauty not setting her degree of belief to match the objective chances, it is inadmissible evidence.23
Argument for 2: If there are two possible pieces of evidence, E1 and E2, an agent’s prior degree of belief in hypothesis H must be a weighted average of P(H|E1) and P(H|E2). Assume that E1 is inadmissible. Then P(H|E1) need not be equal to the objective chance of H. Perhaps P(H|E1) is less than P(H). Then P(H|E2) must be more than P(H) (otherwise P(H) won’t be the weighted average). As the agent initially had only admissible evidence, P(H) should equal the objective chance of H. So P(H|E2) should be greater than the objective chance of H, which means E2 is inadmissible.
It follows from 1 and 2 that Beauty has inadmissible evidence when she is told that today is Monday. This is why she is not bound by the PCP, so Elga’s premise that P+(Heads) = 1/2 is unsupported and his argument is inconclusive. Intuitively, as ‘it is Tuesday’ would confirm Tails (absolutely), ‘it is Monday’ confirms Heads. The halfer can claim that credence in Heads after learning it’s Monday is 2/3 (the result of conditionalizing on ‘it is Monday’ from a prior probability of Heads of 1/2).
Thirders might object that Beauty has inadmissible evidence when she wakes up on Monday, so she does not satisfy the antecedent of 2 (Dieks ibid.). Thirders might continue that merely being woken gives Beauty evidence that favours Tails, and is therefore inadmissible evidence. But no reason has been offered for the halfer to accept this, and this should be the conclusion of the argument, not a premise. Halfers do not think Beauty gets inadmissible evidence on waking; thirders need to offer an argument that she does.
They cannot simply assume that she gets inadmissible evidence, for that is the very point that the argument is supposed to show. I’m not saying that no argument could possibly be produced of course; again, I’m not offering a knock-down argument that being a thirder is irrational. I’m just arguing that those persuaded by Elga’s argument should not have been. The PCP does not bear the weight that Elga’s argument places upon it. Keep in mind that Elga is arguing that we should do something radical i.e. substantially revise our best theory of confirmation, so we should require a strong argument to do so.
23 One might object that if Beauty had learnt it was Tuesday and evidence must be true, then it would be Tuesday. If on Tuesday the objective chance of Heads is zero, then Beauty’s credence would match the objective chances, so her evidence would be admissible. This objection could be blocked by denying that evidence in this context need be true or denying time-dependent chances (Hoefer 2007). I would endorse both moves. I am grateful to Wolfgang Schwarz for discussion here.
A different objection to my argument is to point out that it is only valid if the evidence doesn’t alter the agent’s beliefs regarding the objective chance. This is true: ‘Today is Monday’ and ‘Today is Tuesday’ mustn’t change Beauty’s beliefs about what the
chances are. This is satisfied—it is stipulated that Beauty knows the coin is fair. So she should only doubt that it is fair if she receives evidence connected to it being biased, which she has not. A third objection is to point out that the argument is only valid if the pieces of evidence are mutually exclusive and exhaustive.24 This can be granted as it is obviously satisfied by ‘Today is Monday’ and ‘Today is Tuesday’. We can conclude that ‘today is Monday’ is inadmissible evidence. One might still worry that there must be something wrong with saying that Beauty’s credence in a future coin flip should diverge from the objective chances. This should only happen if she has evidence about the future. I respond that Sleeping Beauty does have evidence about the future. Where did it come from? There are no prophets or crystal balls around. It came from the evidence space. One of the possible pieces of evidence is about the future—the possible evidence that today is Tuesday. This, I think, is what Lewis meant when he said that Beauty has evidence about the future, ‘namely that she is not now in it’25 (p. 175). It should not be surprising that finding out the time gives you evidence about the future. How could it not? Finding out the time involves finding out what time is present and what times are future. There are of course many other arguments for the thirder position that I cannot address here, but I hope to have shown that the most influential argument for the thirder position does not succeed. Thus the thesis that conditionalization is the only rule of update for eternal beliefs survives. Before drawing some general morals, I want to bring out the connection between Sleeping Beauty and the Prisoner. 8 (Dis)Analogy There is a striking similarity between Sleeping Beauty and the Prisoner. Both involve two possible worlds, and an extra temporal possibility in one of them (Figs. 6 and 7). 
Yet I have argued that the Prisoner gets confirmation of Tails but Beauty doesn’t. What’s the difference? One crucial difference is that the Prisoner has memories that give him some indication of the passage of time. So he has some relevant evidence about whether he is in the earlier or later stage of the experiment (Monday or Tuesday for Beauty, before midnight and after midnight for the Prisoner). His memories, combined with the fact that the light is still on, result in his Discovering the new belief ‘the light is on after my credence that it is after midnight has gone up above 30%’. Beauty has no memories, so she has Discovered no equivalent belief which she can conditionalize on. So the argument for why the Prisoner should change his credences cannot be applied to Beauty.26 Let’s now put these results in a broader context.
24 I am grateful to an anonymous Synthese referee for pointing this out.
25 As Elliott Sober pointed out to me, this is an unfortunate phrase: one always knows that one is not in the future. This is plausibly true even for those with time-machines.
26 Another disanalogy is that Sleeping Beauty has zero chance of observing Tuesday if Heads, whereas the Prisoner has a non-zero chance of observing between 12am and 1am if Heads. It may be dark, but he will be conscious. So if the Prisoner, like Beauty, has no memories, his observation that the light is on will still confirm Tails. The selection procedures by which they make their observations are different. I discuss the connection between selection procedures and self-location in my (forthcoming).
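The credences discussed in Sects. 6 and 7 can be checked with a short sketch. The two credence assignments below are assumptions made for illustration: the halfer assignment (1/2, 1/4, 1/4) is the one implied by a prior of 1/2 in Heads together with indifference between T1 and T2, and the thirder assignment is the conclusion of Elga's argument. The helper names are hypothetical. The sketch computes P(Heads | Monday) and P(Heads | Tuesday) for each assignment and confirms that the unconditional credence in Heads is the weighted average of the two, which is the fact that the Argument for 2 relies on.

```python
from fractions import Fraction

# Centred worlds from Sect. 5: H1 = Monday and Heads, T1 = Monday and Tails,
# T2 = Tuesday and Tails.
HEADS = {"H1"}
MONDAY = {"H1", "T1"}
TUESDAY = {"T2"}

# Illustrative credence assignments on waking (assumptions, see lead-in).
HALFER = {"H1": Fraction(1, 2), "T1": Fraction(1, 4), "T2": Fraction(1, 4)}
THIRDER = {"H1": Fraction(1, 3), "T1": Fraction(1, 3), "T2": Fraction(1, 3)}


def probability(credences, event):
    """Total credence assigned to the centred worlds in `event`."""
    return sum((p for world, p in credences.items() if world in event), Fraction(0))


def conditional(credences, hypothesis, evidence):
    """P(hypothesis | evidence) by the ratio formula."""
    return probability(credences, hypothesis & evidence) / probability(credences, evidence)


def summary(credences):
    """Return (P(Heads|Monday), P(Heads|Tuesday), weighted average of the two)."""
    heads_given_monday = conditional(credences, HEADS, MONDAY)
    heads_given_tuesday = conditional(credences, HEADS, TUESDAY)
    average = (heads_given_monday * probability(credences, MONDAY)
               + heads_given_tuesday * probability(credences, TUESDAY))
    return heads_given_monday, heads_given_tuesday, average


if __name__ == "__main__":
    for name, credences in (("halfer", HALFER), ("thirder", THIRDER)):
        print(name, summary(credences))
```

On the halfer assignment, learning that it is Monday raises the credence in Heads to 2/3; on the thirder assignment it yields 1/2. In both cases the weighted average recovers the credence in Heads on waking, so the disagreement lies in the starting assignment rather than in the arithmetic of conditionalization.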
Fig. 6 The boxes represent days when Beauty is awake (Heads: Monday, H1; Tails: Monday, T1, and Tuesday, T2)
Fig. 7 The boxes represent centred-world locations where the light is on (columns: 11pm, 12am, 1am; rows: Heads, Tails)
9 Comparison Let’s back up. I have distinguished belief change in virtue of learning the truth of something previously uncertain (Discovery) from belief change due to the content of the belief changing in truth-value (Mutation). Can temporal belief change of either type shift credence in eternal beliefs? If they can, we’ll say that they are relevant to eternal beliefs. Those who have addressed the question so far have dealt with Discovery and Mutation together. Titelbaum holds that both types are relevant to eternal belief27 ; Halpern (2004); Bostrom (2007); Meacham (2008)28 argue that neither type is. But Discovery and Mutation are very different and deserve separate treatment. I hold that Discovery is relevant to eternal beliefs, but Mutation is not. We can zoom in and expand Table 1 to compare my position to others. 27 I focus on Titelbaum because among thirders he is most explicit that he is defending general norms of belief change. He attacks the strong thesis that ‘it is never rational for an agent who learns only self-locating [temporal] information to respond by altering a non-self-locating [eternal] degree of belief’ (p. 556). But this is only in conflict with Halpern (ibid.), Bostrom (ibid.) and Meacham (ibid.). The more popular halfer view accepts that Discovery is relevant e.g. Lewis (2001), as Titelbaum (p. 585) notes. I discuss Titelbaum’s analysis in Bradley (forthcoming). 28 Meacham is the most explicit that he is defending general norms of belief change, but I think it natural to read Halpern and Bostrom as committed to the same conclusion. There are differences in their theories that I will largely ignore in the discussion that follows. I think they each face at least two of the problems I raise.
Position | 2a) Mutation is relevant | 2b) Discovery is relevant
Titelbaum | Yes | Yes
Bradley | No | Yes
Halpern, Bostrom, Meacham | No | No
My arguments above attempted to show that the arguments in favour of the relevance of Mutation do not succeed. I have not given arguments against the relevance of Mutation in this paper; the argument is that it commits us to a violation of conditionalization. Indeed the initial arguments for the relevance of Mutation were put forward with an explicit acknowledgement of this (see Arntzenius and Elga in particular). Although this position is popular, giving up a principle as defensible and reasonable as conditionalization is a heavy cost. One might object that I’ve already admitted we must give up conditionalization and reflection for temporal beliefs, so why not admit the relevance of Mutation? Because the issue at stake is whether we should give up conditionalization for eternal beliefs. It is not controversial that beliefs such as ‘today is Monday’ must change in ways that don’t conform to conditionalization. The issue is whether eternal beliefs such as ‘the coin landed Heads’ can change in ways that violate conditionalization. I have argued that the arguments in favour of such a violation do not succeed. I have not said anything to undermine arguments from the other side (i.e. Halpern (ibid.), Bostrom (ibid.) and Meacham (ibid.)) who seem to claim that temporal beliefs can never have any effect on eternal beliefs. That is, that Discovery of temporal beliefs is not relevant either. So I will now offer a brief rear guard action before concluding.29 The main problem with the position that temporal beliefs are never relevant is the implausibility of its commitments. These result from denying conditionalization, and, moreover, from denying it in cases where we have strong intuitions that it should hold (in contrast to cases like Sleeping Beauty and the Prisoner in which we do not have clear intuitions). 
If Discovery of temporal beliefs isn’t relevant, then when Beauty learns it is Monday, her credence in Heads should not shift, even though the conditional probability of Heads given Monday is different from the unconditional probability. So proponents are committed to the following two constraints:
Constraint a: P(Heads | Monday) = 2/3
Constraint b: P_{Monday}(Heads) = 1/2
This is bad, but things get worse. Consider a variation in which Beauty is woken 10 times given either Heads or Tails. If Heads lands she sees a red light on 9 days and a blue light on 1; if Tails lands she sees a blue light on 9 days and a red light on 1. As before, her memory is erased between each waking. So for any given day, there is a 90% chance of seeing a red light given Heads and a 10% chance of seeing a red light given Tails. Intuitively, seeing a red light should confirm Heads. But she only learns the temporal ‘There is a red light today’ when she sees the light. She doesn’t learn any eternal evidence on seeing the light, such as ‘There exists some day on which the red light is seen’—she knew that all along. So if temporal evidence is never relevant, seeing the light fails to confirm Heads. 29 I am grateful to Matt Kotzen for an exchange on the issues that follow.
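The intuition in the red-light variation can be made concrete with a minimal sketch of the Bayesian calculation. It assumes a credence of 1/2 in Heads on waking (both hypotheses involve ten wakings, so nothing here turns on the halfer and thirder dispute) and takes the per-day likelihoods of 90% and 10% from the text; the function and parameter names are illustrative only.

```python
from fractions import Fraction


def posterior_heads_given_red(prior_heads=Fraction(1, 2),
                              red_given_heads=Fraction(9, 10),
                              red_given_tails=Fraction(1, 10)):
    """Credence in Heads after conditionalizing on 'there is a red light today'."""
    prior_tails = 1 - prior_heads
    joint_heads = prior_heads * red_given_heads   # P(Heads and red today)
    joint_tails = prior_tails * red_given_tails   # P(Tails and red today)
    return joint_heads / (joint_heads + joint_tails)


if __name__ == "__main__":
    print("P(Heads | red light today) =", posterior_heads_given_red())
```

If the temporal evidence ‘There is a red light today’ is allowed to figure in conditionalization, the credence in Heads rises from 1/2 to 9/10. A view on which temporal evidence is never relevant has to leave it at 1/2.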
A second problem is that it is too easy to turn Discovery of a temporal belief into Discovery of an eternal belief. We can do this by supposing that the original Sleeping Beauty has some unique experience on each day. For example, she might see what the clouds look like while knowing that the clouds will not be identical on both days (Dorr ms) or she might wear different coloured pyjamas each day (Monton and Kierland 2005) or she might observe a differently coloured piece of paper on each day (Titelbaum). In each case, Beauty would acquire eternal evidence e.g. there exists some day on which the red paper is seen, but it is implausible to think that different rules of belief change apply to the modified cases with the unique experience than to cases without.30
A third problem is that the main theoretical justification for the irrelevance of Discovery of temporal beliefs, offered by both Bostrom and Halpern, fails. They both argue that there is a familiar difference between ‘E’ and ‘I learn E’, and that this explains how a and b can both be accepted. This would allow them to keep conditionalization after all. Halpern cites the Monty Hall problem as a case in which this distinction makes a difference. But the natural thing to do in response is to include the evidence about how E is learnt in the updating:
Constraint c: P(Heads | I learn it’s Monday today) = 2/3
Constraint d: P_{I learn it’s Monday today}(Heads) = 1/2
The conflict simply re-emerges one level up. Even if Bostrom and Halpern can block this response, I don’t think the distinction between ‘E’ and ‘I learn E’ can do the work they want it to in this case. The reason is that the distinction only makes a difference if it is epistemically possible that either E can be true without learning E, or learning E can be true without E.31 Otherwise, E and learning E are equivalent. Yet Sleeping Beauty discovers it is Monday iff it is Monday. So there isn’t the required difference between E and learning E that is needed to accept constraints a and b. Bostrom and Halpern might offer a second reply: that I have missed their point, and that the agent should not use his conditional probability of H given E when he later learns E. Bostrom suggests that doing so fails to take into account that the agent learns not just E, but also that he is at a time when he has learnt E. But this move would amount to a whole-scale rejection of conditionalization in any context. Any sufficiently reflective agent can turn ‘E’ into ‘I am at a time when I have learnt E’ just by reflecting on what she has learnt. This can happen in any context, even if E is eternal. So any new norms regarding ‘I have learnt E’ will be global, and infect the whole of confirmation theory.32 I conclude that Discovery of temporal beliefs can be relevant to eternal beliefs.
30 One might object that cases with unique experiences generate an argument via conditionalization for thirding e.g. Titelbaum. I argue in Bradley (forthcoming) that this argument fails.
31 In the Monty Hall problem it is epistemically possible, for each door, that it is empty without you discovering that it is empty.
32 I discuss the distinction between ‘E’ and ‘I learn E’ in Bradley (2010).
10 Conclusion
I have argued that there are importantly different ways in which temporal beliefs can be acquired. Sometimes they are acquired by Discovery, in the same way as has been traditionally discussed by Bayesians. I have argued that such learning can shift credence in eternal beliefs. But temporal beliefs, unlike eternal beliefs, can also be acquired by Mutation. I have argued that Mutation cannot change credence in eternal beliefs, so we need no new norms of belief change for eternal beliefs. The costs of introducing such norms are high, and the arguments so far offered are unconvincing.
Acknowledgements I am grateful to Frank Arntzenius, Paul Bartha, Nick Bostrom, Kenny Easwaran, Andy Egan, Adam Elga, Justin Fisher, Branden Fitelson, Hilary Greaves, Alan Hájek, Matt Kotzen, Chris Meacham, John Perry, Teddy Seidenfeld, Elliott Sober, Mike Titelbaum, Jonathan Weisberg and an audience at the ANU for helpful discussion and comments on this material.
References
Arntzenius, F. (2002). Reflections on Sleeping Beauty. Analysis, 62(273), 53–62.
Arntzenius, F. (2003). Some problems for conditionalization and reflection. Journal of Philosophy, 100, 356–370.
Bradley, D. J. (2003). Sleeping Beauty: A note on Dorr’s argument for 1/3. Analysis, 63, 266–268.
Bradley, D. J., & Leitgeb, H. (2006). When betting odds and credences come apart: More worries for Dutch book arguments. Analysis, 66(2), 119–127.
Bradley, D. J. (2010). Conditionalization and beliefs De Se. Dialectica.
Bradley, D. J. (forthcoming). Confirmation in a branching world: The Everett interpretation and Sleeping Beauty. British Journal for the Philosophy of Science.
Briggs (ms). Putting a value on Beauty.
Bostrom (2007). Sleeping beauty and self-location: A hybrid model. Synthese, 157(1), 59–78.
Dorr, C. (2002). Sleeping Beauty: In defence of Elga. Analysis, 62, 292–296.
Dieks, D. (2007). Reasoning about the future. Synthese, 156, 427–439.
Draper, K., & Pust, J. (2008). Diachronic Dutch books and Sleeping Beauty. Synthese, 164(2), 282–287.
Earman, J. (1992). Bayes or bust. Cambridge, MA: MIT Press.
Elga, A. (2000). Self-locating belief and the Sleeping Beauty problem. Analysis, 60, 143–147.
Elga, A. (2004). Defeating Dr. Evil with self-locating belief. Philosophy and Phenomenological Research, 69(2).
Greaves, H., & Wallace, D. (2006). Justifying conditionalization: Conditionalization maximizes expected epistemic utility. Mind, 115(459), 607–632.
Halpern, J. (2004). Sleeping Beauty reconsidered: Conditioning and reflection in asynchronous systems. In Proceedings of the Twentieth conference on uncertainty in AI, pp. 226–234.
Hitchcock, C. (2004). Beauty and the bets. Synthese, 139(3), 405–420.
Hoefer, C. (2007). The third way on objective probability: A sceptic’s guide to objective chance. Mind, 116(463), 549–596.
Horgan, T. (2004). Sleeping Beauty awakened: New odds at the dawn of the new day. Analysis, 64, 10–21.
Horgan, T. (2007). Synchronic Bayesian updating and the generalized Sleeping Beauty problem. Analysis, 67(293), 50–59.
Howson, C., & Urbach, P. (1993). Scientific reasoning: The Bayesian approach (2nd ed.). Chicago: Open Court.
Jenkins, C. (2005). Sleeping Beauty: A wake-up call. Philosophia Mathematica, 13(2), 194–201.
Kaplan, D. (1989). Demonstratives: An essay on the semantics, logic, metaphysics and epistemology of demonstratives and other indexicals. In J. Almog, J. Perry, & H. Wettstein (Eds.), Themes from Kaplan (pp. 481–566). Oxford: Oxford University Press.
Lewis, D. (1979). Attitudes De dicto and De se. In D. Lewis (Ed.), Philosophical Papers (Vol. 1). Oxford University Press (1983).
Lewis, D. (1980). A subjectivist’s guide to objective chance. Studies in Inductive Logic and Probability (Vol. 2). Berkeley, CA, USA: University of California Press.
Lewis, D. (1994). Chance and credence: Humean supervenience debugged. Mind, 103, 473–490.
Lewis, D. (2001). Sleeping Beauty: Reply to Elga. Analysis, 61, 171–176.
Meacham, C. (2008). Sleeping Beauty and the dynamics of De se belief. Philosophical Studies, 138(2), 245–269.
Monton, B. (2002). Sleeping Beauty and the forgetful Bayesian. Analysis, 62, 47–53.
Monton, B., & Kierland, B. (2005). Minimizing inaccuracy for self-locating beliefs. Philosophy and Phenomenological Research, 70(2), 384–395.
Perry, J. (1979). The problem of the essential indexical. Nous, 13, 3–21.
Schaffer, J. (ms). The Schmentencite way out: Towards an index-free semantics.
Schwarz (ms). Changing minds in a changing world.
Schervish, M. J., Seidenfeld, T., & Kadane, K. B. (2004). Stopping to reflect. Journal of Philosophy, 101(6), 315–322.
Strevens, M. (1995). A closer look at the ‘New Principle’. British Journal for the Philosophy of Science, 46(4), 545–561.
Teller, T. (1976). Conditionalization, observation, and change of preference. In William Harper & C. A. Hooker, Foundations of probability theory, statistical inference, and statistical theories of science. Dordrecht: D. Reidel.
Titelbaum, M. (2008). The relevance of self-locating beliefs. Philosophical Review, 117(4), 555–606.
Van Fraassen, B. C. (1984). Belief and the will. Journal of Philosophy, 81, 235–256.
Van Fraassen, B. C. (1999). A new argument for conditionalization. Topoi, 18, 93–96.
Weintraub, R. (2004). Sleeping Beauty: A simple solution. Analysis, 64, 8–10.
Weisberg, J. (2007). Conditionalization, reflection, and self-knowledge. Philosophical Studies, 135, 179–197.
White, R. (2006). The generalized Sleeping Beauty problem: a challenge for thirders. Analysis, 66, 114–119.
Williams, P. (1980). Bayesian conditionalisation and the principle of minimum information. The British Journal for the Philosophy of Science, 31(2), 131–144.
Synthese (2011) 182:413–432 DOI 10.1007/s11229-010-9750-2
Deterministic probability: neither chance nor credence Aidan Lyon
Received: 10 January 2010 / Accepted: 11 May 2010 / Published online: 27 August 2010 © Springer Science+Business Media B.V. 2010
Abstract Some have argued that chance and determinism are compatible in order to account for the objectivity of probabilities in theories that are compatible with determinism, like Classical Statistical Mechanics (CSM) and Evolutionary Theory (ET). Contrarily, some have argued that chance and determinism are incompatible, and so such probabilities are subjective. In this paper, I argue that both of these positions are unsatisfactory. I argue that the probabilities of theories like CSM and ET are not chances, but also that they are not subjective probabilities either. Rather, they are a third type of probability, which I call counterfactual probability. The main distinguishing feature of counterfactual probability is the role it plays in conveying important counterfactual information in explanations. This distinguishes counterfactual probability from chance as a second concept of objective probability.
Keywords Chance · Credence · Determinism · Objective probability · Probability concepts
1 Introduction
Some scientific theories are true of some deterministic worlds but nevertheless posit what appear to be objective probabilities.1 Classical Statistical Mechanics (CSM) is a paradigm example of such a theory. According to CSM, thermodynamic systems
1 A deterministic world is a world whose entire history supervenes on the world’s laws of nature with the complete state of the world at any time (see Earman 1986).
A. Lyon (B) Philosophy Program, Research School of Social Sciences, Australian National University, Canberra, Australia e-mail: [email protected]
are made up of large numbers of particles, whose behaviour is completely described by the deterministic theory of Hamiltonian mechanics. However, CSM entails that for any gas determined to freely expand (for example), there is some probability that it won’t expand. But how can that be? How can an event be determined to occur and yet have some probability of not occurring? Evolutionary Theory (ET) is another theory compatible with determinism that posits what appear to be objective probabilities. There has been some debate concerning whether ET is compatible with determinism, but there seems to be a growing consensus on this matter—in the affirmative.2 The problem here is similar to the one before: the gene frequencies of a population can be determined to evolve one particular way, but ET assigns some probability to them not evolving that particular way. In both theories (and in others), we have the same general issue: an event is determined to occur, but some probability is assigned to it not occurring. On the face of it, this seems like a strange and important problem. So let us give the problem a name: the paradox of deterministic probabilities.3 In the literature, there are two general strategies for resolving the paradox. The first strategy involves arguing that our concept of chance is in fact compatible with determinism, and so there isn’t any problem here (beyond the general problem of analysing chance). Those who pursue this first strategy include Levi (1990), Loewer (2001), Hoefer (2007), and Frigg and Hoefer (2009).4 The second strategy is to argue that, despite appearances, the probabilities in question are subjective probabilities, and so there is no problem because there is nothing problematic with a positive subjective probability assigned to an event determined not to occur. Those who pursue this second strategy include Rosenberg (1994), Graves et al. 
(1999), Schaffer (2007), and Frigg (2008).5 These two general strategies coincide with what appears to be an unspoken but often made assumption: that probabilities are only chances or credences. One often sees/hears the argument: the relevant probabilities can’t be credences (for whatever reason), so they must be chances (e.g., Loewer 2001). One also sees/hears the opposite argument: the relevant probabilities can’t be chances (for whatever reason), so they must be credences (e.g., Schaffer 2007). In contrast, the argument in this paper will be: the relevant probabilities can’t be chances (for reasons given in Sect. 2), they can’t be credences either (for reasons given in Sect. 3), so they must be something else. The upshot of the argument is that we need to identify a third concept of probability (to be explained in Sect. 4) that is distinct from chance and credence.
2 See, e.g., Weber (2001, 2005), Rosenberg (2001), Millstein (2003a,b), and Sober (1984, 2010) for discussion.
3 Loewer uses this name for the problem as it appears in CSM (Loewer 2001, p. 612). I find the problem just as vexing in ET and any other theory it appears in, so I prefer to use the name for the general problem.
4 Sober (2010) argues for the compatibility of macro objective probability and determinism. As I argue later, this is different from the compatibility of chance and determinism.
5 Strictly speaking, Rosenberg (1994) and Graves et al. (1999) address not the paradox of deterministic probabilities, but a very similar problem: how a theory can assign probability values of “1/2” and the like to events that are assigned quantum mechanical probabilities very close to 1 or 0.
This third concept of probability is what I call counterfactual probability, and is a second type of objective probability. The thesis is therefore that we have at least two distinct concepts of objective probability: chance and counterfactual probability, along with at least one concept of subjective probability: normatively constrained credence.6 The primary goal of this paper is to show that we have this third concept of probability in science, and to explain how it is distinct from chance and credence. (This is not to rule out the possibility that we have even more concepts that need identifying.) It is not a goal of the paper to defend an analysis of counterfactual probability—just as my goal is not to defend an analysis of chance either. That said, I will sketch an analysis of counterfactual probability at the end of Sect. 4 as this will help elucidate the concept. The analysis I sketch is an interpretation of probability based on similarity relations over ensembles of possibilities developed by Bigelow (1976, 1977).
2 Not chances
Those who claim that there are no chances in deterministic worlds often do not provide a supporting argument (e.g., Lewis 1986, p. 120; Popper 1982, p. 105). In fact, there seem to be only two such arguments in the literature: the so-called Laplacean demon argument due to Laplace (1814), and an argument due to Schaffer (2007). The Laplacean demon argument has been sufficiently criticised elsewhere (see Sober 1984, 2010), so that leaves Schaffer’s argument as the only tenable one in the literature. I defend (a version of) that argument here. Schaffer identifies certain platitudes we apparently have about chance, and then argues that deterministic conceptions of chance fail to satisfy enough of those platitudes to count as genuine chance. He identifies six such platitudes in the form of six different principles about chance, one of which is the Principal Principle (PP).
I’m going to focus solely on the PP here, for three reasons (other than that of brevity). First, the PP is the most famous of the principles and seems to be accepted by many authors. Second, some of the platitudes that the other principles capture are already captured by the PP.7 And third, there is some debate about some of the details of the PP in the literature that is quite pertinent to the debate about deterministic chance, and so it will be worthwhile focusing on clearing that up. Before we get to the argument from the PP, we should see what the PP is about. The PP is the following constraint on initial conditional credences, due to Lewis (1980):
$$Cr(A \mid Ch_{tw}(A) = x \wedge E) = x \tag{1}$$
6 A note on terminology: under the umbrella of “subjective probability”, I include so-called Objective Bayesians. This is because on that view, probability is degree of belief, even though that degree of belief is normatively constrained (to uniqueness or near uniqueness) by what evidence one has. On the other hand, the logical interpretation is an objective interpretation because (on the view) it represents a partial entailment relation between propositions (or sentences) that is meant to be independent of what evidence anyone has.
7 For example, one such principle is called the Future Principle, which basically says that only the future is chancy. This fact about chance is already captured by the PP with the fact that historical propositions are always admissible (see 2). (Some deny that the Future Principle is a basic truth about chance—e.g., Frigg and Hoefer (2009).)
for any A and for any admissible E—where $Cr$ is a reasonable initial credence function and $Ch_{tw}(A)$ is the chance of A, at time t, at world w. What it means, exactly, for a proposition to be admissible is a tricky issue, and Lewis has no definition of admissibility (as he tells us himself)—just a sketch. When sketching what admissibility amounts to, Lewis gives us some examples and a general job-description:
Historical propositions are admissible; so are propositions about the dependence of chance on history [e.g., the laws of nature]. Combinations of the two, of course, are also admissible. More generally, we may assume that any Boolean combination of propositions admissible at a time also is admissible at that time. Admissibility consists in keeping out of a forbidden subject matter—how the chance processes turned out—and there is no way to break into a subject matter by making Boolean combinations of propositions that lie outside it. (Lewis 1980, p. 276)
We’ll come back to the issue of how we should understand admissibility later as it is essential to understanding what platitude(s) about chance the PP is meant to capture. But for now, we have enough details to go through the argument from the PP against deterministic chance. The argument is a reductio. First, suppose $Ch_{tw}$ is the chance function of a deterministic world w, and consider a particular chance assignment, $Ch_{tw}(A) = x$, where 0 < x < 1, and A is true. Let $H_{tw}$ be the entire history of world w through to time t, and let $L_w$ be the deterministic laws of w. From the PP we have:
$$Cr(A \mid Ch_{tw}(A) = x \wedge L_w \wedge H_{tw}) = x \tag{2}$$
(We can substitute $L_w \wedge H_{tw}$ in for E because $L_w \wedge H_{tw}$ is assumed to be admissible.) Since 0 < x < 1, it follows that:
$$0 < Cr(A \mid Ch_{tw}(A) = x \wedge L_w \wedge H_{tw}) < 1 \tag{3}$$
But since w is deterministic, $L_w \wedge H_{tw} \models A$, and so from the probability axioms:
$$Cr(A \mid Ch_{tw}(A) = x \wedge L_w \wedge H_{tw}) = 1 \tag{4}$$
which contradicts (3). So it looks like deterministic chance cannot satisfy the PP. (As I mentioned earlier, Schaffer gives similar arguments to show that deterministic conceptions of chance fail to satisfy other principles that he takes to be constitutive of chance. The overall strategy to the argument, then, is: indeterministic conceptions of chance can satisfy all of the principles in question, while no deterministic conception of chance can do this, so there is no such thing as deterministic chance. I will be simply focusing on the PP here, and assuming that indeterministic conceptions of chance can satisfy it.) Hoefer gives the following response to the above argument8:
8 I have changed Hoefer’s notation slightly, to fit in with mine.
[… T]his derivation is spurious; there is a violation of the correct understanding of admissibility going on here. For if $L_w \wedge H_{tw}$ entails A, then it has a big (maximal) amount of information pertinent as to whether A, and not by containing information about A’s objective chance! So $L_w \wedge H_{tw}$, so understood, must be held inadmissible, and the derivation of a contradiction fails. (Hoefer 2007, p. 559)
To maintain that $L_w \wedge H_{tw}$ is inadmissible is to deny at least one of the following—contra Lewis: (i) $H_{tw}$ is admissible, (ii) $L_w$ is admissible, and (iii) admissibility is closed under Boolean combinations. There are some technical issues regarding whether the laws are admissible (see Lewis 1994) and whether admissibility is closed under Boolean combinations (see Lyon 2009). Fortunately, for our purposes, we can bypass these issues. It turns out that there is a similar argument against the thesis that CSM probabilities are chances that only relies on the admissibility of historical propositions. This new argument will therefore allow us to focus on whether historical propositions are admissible. For this new argument, assume everything as before, but now assume that $Ch_{tw}(A) = x$ is a probability statement of CSM and that A is entirely about the state of the world at time t. (Such probability statements exist in CSM and I will say more about this shortly.) From the PP we have:
$$Cr(A \mid Ch_{tw}(A) = x \wedge H_{tw}) = x \tag{5}$$
(We can substitute $H_{tw}$ in for E, because $H_{tw}$ is assumed to be admissible.) Since 0 < x < 1, it follows that:
$$0 < Cr(A \mid Ch_{tw}(A) = x \wedge H_{tw}) < 1 \tag{6}$$
But since A is entirely about the state of the world at time t, it follows that $H_{tw} \models A$, and so from the probability axioms:
$$Cr(A \mid Ch_{tw}(A) = x \wedge H_{tw}) = 1 \tag{7}$$
which contradicts (6). The crucial step in the above argument is the premise that CSM entails that for some A, $0 < Ch_{tw}(A) < 1$ and A is entirely about the state of the world at time t. This is true because CSM assigns probabilities at time t over ways a system could be at time t. Consider the standard CSM explanation for why ice cubes in warm water melt. For any such ice cube, it is overwhelmingly likely—but not certain—that its micro-state is one that evolves deterministically into a micro-state that corresponds to the “melted” macro-state. Even though the ice cube has micro-state $s_1$ at time t, a positive probability is assigned at time t to the ice cube having a different micro-state, $s_2$,
at time t.9 (So an example of A in the above argument would be any disjunction of micro-states compatible with the macro-state.) This feature of CSM probabilities results from the fact that we start with an initial probability distribution over possible initial conditions and conditionalise it on the macro-history of the world as history unfolds (see Loewer 2001, p. 618). This leaves all the micro-states compatible with the macro-history as open possibilities—even though only one micro-state obtains. Winsberg identifies this as the fundamental problem with understanding CSM probabilities objectively: The fundamental problem with understanding the probabilities in [CSM] to be objective is that we are meant to posit a probability distribution over a set of possible initial states, while we suppose, at the same time, that in fact only one of these initial states actually obtained. (Winsberg 2008, p. 2) For ease of reference, I will call this property of the time index of the probability function being the same as the time of some events that are assigned (non-trivial) probability the synchronicity of CSM probabilities.10 Note that in the above argument from the PP against CSM chances, we made no mention of the laws of nature.11 This means we made no assumption concerning the determinism/indeterminism of world w. So we have an argument for why CSM probabilities are not chances that has nothing to do with determinism. The issue of the compatibility of chance and determinism is therefore somewhat of a red-herring. The literature typically focuses on the determinism of CSM probabilities as a big problem for understanding CSM probabilities as chances. However, a more fundamental problem is their synchronicity. (Incidentally, the probabilities of quantum statistical mechanics also have this synchronicity, but note that the probabilities of quantum mechanics (standardly interpreted) do not.) 
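The synchronicity point can be illustrated with a toy finite model. The sketch below is not CSM itself: it assumes four hypothetical micro-states, a uniform prior over them, and a coarse-graining into two macro-states, all chosen purely for illustration. Conditionalizing on the macro-history leaves a non-trivial probability for a proposition A about the present state, whereas conditionalizing on the full history (which fixes the actual micro-state at t) forces the probability of A to 1.

```python
from fractions import Fraction

# Hypothetical micro-states and their macro-states (illustrative assumptions).
MACRO = {"s1": "ice", "s2": "ice", "s3": "ice", "s4": "melted"}
PRIOR = {s: Fraction(1, 4) for s in MACRO}  # assumed uniform prior

def conditional(prob, proposition, evidence):
    """P(proposition | evidence), with both given as sets of micro-states."""
    p_evidence = sum(prob[s] for s in evidence)
    p_both = sum(prob[s] for s in proposition & evidence)
    return p_both / p_evidence

actual = "s1"
# Macro-history: every micro-state sharing the actual macro-state stays open.
macro_history = {s for s, m in MACRO.items() if m == MACRO[actual]}
full_history = {actual}   # the full history fixes the micro-state at t
A = {"s1", "s2"}          # a disjunction of micro-states, true at t

x = conditional(PRIOR, A, macro_history)  # CSM-style probability of A
cr = conditional(PRIOR, A, full_history)  # credence in A given the history

print(x)   # 2/3
print(cr)  # 1
```

Because x lies strictly between 0 and 1 while the credence conditional on the history is 1, the two values cannot both satisfy the PP instance in (5). Note that no law of nature appears anywhere in the model, which mirrors the claim that the argument is independent of determinism.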
Some will respond to the above argument with an objection similar to Hoefer’s objection to the original argument: there is a violation of the correct understanding of admissibility here; $H_{tw}$ entails A, so it contains a maximal amount of information pertinent to A, and so must be inadmissible. We therefore need to address the issue of
9 By “positive probability” I mean infinitesimal or measure-zero probability—depending on what the right way to think about such probabilities turns out to be. This is a technical issue that isn’t important here—see e.g., Hájek (2003) for further discussion.
10 This is not to say that CSM does not assign diachronic probabilities (i.e., probabilities to events after t)—in fact it must, in order to make predictions. However, these past and future probabilities are derived from the present probabilities (all talk of past, present and future is relative to t). For example, the probability at time t that the entropy of a gas is high at time t′ (> t) is the probability that the gas has a micro-state at time t that evolves into another micro-state at t′ that corresponds to a macro-state of high entropy.
11 In personal communication, Roman Frigg has objected that the chance-statement “$Ch_{tw}(A) = x$” smuggles in the laws of nature, because CSM probabilities are defined in terms of the phase-flow, which is defined in terms of Hamiltonian mechanics. While it is true that CSM probabilities can be expressed in terms of the phase-flow, it doesn’t follow that a statement involving CSM probabilities has the laws of nature as part of its content. An analogy will help make this clearer. The probability of “heads” of a coin-flip can be expressed in terms of the mechanics that govern the coin-flip (e.g., see Diaconis 1998). However, the statement “the probability of “heads” is 1/2” doesn’t entail anything about those mechanics—one can know that the probability of “heads” is 1/2, without knowing anything about the mechanics of the coin.
how we ought to understand admissibility (but we can do so without having to worry about the admissibility of laws and the Boolean closure of admissibility). The notion of admissibility does most of the platitude-capturing work in the PP. One of the intuitions it seems we have about chance is that it “locks” credence in a very robust way. For example, when I find out that the chance of a coin-flip landing “heads” is 1/2, my credence in “heads” is 1/2 and it doesn’t change from 1/2 when I acquire new evidence. This is one of the examples Lewis considers in his chance questionnaire at the beginning of Lewis (1980), where he first introduces the PP: A certain coin is scheduled to be tossed at noon today. You are sure that this chosen coin is fair: it has a 50% chance of falling heads and a 50% chance of falling tails. [… But] you have plenty of seemingly relevant evidence tending to lead you to expect that the coin will fall heads. This coin is known to have a displaced center of mass, it has been tossed 100 times before with 86 heads, and many duplicates of it have been tossed thousands of times with about 90% heads. Yet you remain quite sure, despite all this evidence, that the chance of heads this time is 50%. To what degree should you believe the proposition that the coin falls heads […]? Answer. […] 50% […]. To the extent that uncertainty about outcomes is based on certainty about their chances, it is a stable, resilient sort of uncertainty—new evidence won’t get rid of it. (my emphasis) (Lewis 1980, pp. 264–265) The admissibility of historical propositions is meant to capture this fundamental intuition about chance. Once we know the chance is 1/2, our credence is 1/2, and getting any other information about the world up to the current time won’t affect this. As far as intuitions about chance go, they don’t come much more straightforward than this. 
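The “locking” role of known chance in Lewis’s coin example can be contrasted with ordinary learning about a bias. The sketch below assumes a two-hypothesis toy model (chance of heads either 0.5 or 0.9), which is an illustrative simplification and not Lewis’s own formalism; the function name `predictive_heads` is likewise hypothetical. An agent certain that the chance is 1/2 keeps credence 1/2 in heads after 86 heads in 100 tosses, while an agent unsure of the chance is pulled towards 0.9.

```python
from fractions import Fraction

def predictive_heads(prior, heads, tosses):
    """Credence in heads on the next toss after updating on past tosses.

    `prior` maps each candidate chance of heads to the credence in it.
    """
    tails = tosses - heads
    # Unnormalized posterior weight of each chance hypothesis (Bayes' rule);
    # the binomial coefficient is omitted because it cancels on normalizing.
    weights = {c: p * c**heads * (1 - c)**tails for c, p in prior.items()}
    total = sum(weights.values())
    return sum(c * w for c, w in weights.items()) / total

half, biased = Fraction(1, 2), Fraction(9, 10)

certain = {half: Fraction(1)}                            # sure the chance is 1/2
unsure = {half: Fraction(1, 2), biased: Fraction(1, 2)}  # chance not known

print(predictive_heads(certain, 86, 100))                # 1/2
print(float(predictive_heads(unsure, 86, 100)) > 0.89)   # True
```

In this toy setting the resilience comes entirely from certainty about the chance: the frequency evidence has nothing left to shift once all prior credence sits on a single chance hypothesis.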
Nevertheless, no doubt, some will want to deny this—e.g., Strevens (2006) denies that historical propositions are admissible for related reasons. If that is the case, then it seems that the debate cannot proceed in a fruitful way since we have different intuitions about chance. This deadlock becomes obvious when one begins to give an analysis of the probabilities in question. For example, Eagle objects to propensity analyses of chance that rule out deterministic chance, while Hájek objects to frequency analyses of chance that allow deterministic chance: Classical statistical mechanics proposes non-trivial probabilities, and yet is underlaid by a purely deterministic theory. To deny that these probabilities are “real” is simply to come into conflict with one of the starting points of any genuine inquiry into the nature of probability: that it should explain the empirical success of probabilistic theories like statistical mechanics. It is a heavy burden on the propensity theorist to explain why these ‘pseudo-probabilities’, given that they are the best fillers of the role available (as far as explanation and prediction go), should be denied the umbrella of probability. (Eagle 2004, pp. 386–387) Determinism, it would seem, is incompatible with intermediate (objective) probabilities: in a deterministic world, nothing is chancy, and so all objective chances
are 0 or 1. But determinism is no obstacle to there being relative frequencies that lie between these values. (emphasis in original) (Hájek 1997, p. 81) There is a sense in which this debate is merely terminological. Those who argue that chance and determinism are incompatible seem to intend to refer to a special kind of objective probability—the sort of probability one finds in quantum mechanics (standardly interpreted). Those who argue that chance and determinism are compatible intend to refer to any kind of objective probability, the sort of probability that science can discover—e.g., the probabilities one finds in CSM, ET, and many other theories.12 Perhaps it is best, then, to drop the word “chance” altogether, and instead speak of the probabilities in different theories or applications. This way, we can address the conceptual issues without getting caught up in merely terminological matters. For example, we can ask: “Do we have the same concept of probability in quantum mechanics as we do in statistical mechanics?” without using the term “chance”. Having said that, though, for the rest of this paper I will stick with the usage of “chance” that Lewis, Schaffer and others prefer. This is because once we lay out all the platitudes we seem to have about chance, it appears that indeterministic conceptions of chance satisfy these platitudes better than deterministic ones can (see e.g., Schaffer 2007). The important point, though, is that this doesn’t entail that the probabilities of CSM or ET are not “real” or objective. They are not chances, but they are still objective probabilities. They can constrain credence in a way similar to how chance constrains credence. For example, Loewer describes a version of the PP which he calls PPmacro which says (roughly) that one’s credence in A should be the probability that CSM assigns to A provided one has no macroscopically inadmissible information. 
The difference between chance and other objective probabilities is captured by the above quote from Lewis: new evidence will never get rid of the uncertainty set by chances, but it will for other objective probabilities. This means that we have the general concept of objective probability, under which many subconcepts fall. Chance is one such concept. How many others are there? I argue in Sect. 4 that we have at least one other such concept: counterfactual probability. To properly understand the applications of probability in our scientific theories, we may need to identify yet more concepts.13 However, I argue that at least the probabilities of CSM and ET are best understood as counterfactual probabilities. All of this, however, assumes that we should understand these probabilities objectively in the first place—i.e., that they are not subjective probabilities. This is a controversial assumption, so in the next section I will argue for it.
12 Frigg and Hoefer (2009) argue for deterministic chance but admit that “deterministic chance” seems like an oxymoron (p. 1), and that chance and determinism seem to be incompatible (p. 2). However, they give no argument for why chance and determinism are compatible—despite our intuitions—that goes beyond the usual argument that the probabilities in question cannot be subjective, therefore they must be chances. They write: “The values of these probabilities are determined by how things are, not by what we believe about them. In other words, these probabilities are chances, not credences” (p. 2).
13 For example, one referee of this article pointed out that the probabilities of coalescence theory ought be counted as objective probabilities, but may not be chances or counterfactual probabilities.
Synthese (2011) 182:413–432
3 Not credences
Here is an argument for why CSM probabilities are objective: There are certain regularities in the world that are objective facts. CSM explains those regularities, making reference to probabilities. Those probabilities must therefore be objective.14 Here are some notable instances of this argument. First, Popper:
Since the physical possibility of this event [the spontaneous reversion of a gas into a bottle] cannot be doubted, we explain the experimental fact that the process is irreversible by the extreme improbability of a spontaneous reversion into the bottle. And since the fact to be explained—the irreversibility of the process—is an objective experimental fact, the probabilities and improbabilities in question must be objective also. (emphasis in original) (Popper 1982, p. 107)
Second, Loewer (rhetorically):
Consider, for example, the statistical mechanical explanation of why an ice cube placed in warm water melts in a short amount of time. The explanation roughly proceeds by citing that the initial macro-condition of the cup of water + ice (these are objects at different temperatures that are in contact) is one in which on the micro-canonical probability distribution it is overwhelmingly likely that in a short amount of time the cup + ice cube will be near equilibrium; i.e., the ice cube melted. If the probability appealed to in the explanation is merely a subjective degree of belief then how can it account for the melting of the ice cube? What could your ignorance of the initial state of the gas have to do with an explanation of its behaviour? (Loewer 2001, p. 611)
And finally, Albert (even more rhetorically):
Can anybody seriously think that it is somehow necessary, that it is somehow a priori, that the particles that make up the material world must arrange themselves in accord with what we know, with what we happen to have looked into? Can anybody seriously think that our merely being ignorant of the exact microconditions of thermodynamic systems plays some part in bringing it about, in making it the case, that (say) milk dissolves in coffee? How could that be? (all emphasis in original) (Albert 2000, p. 64)
It will be convenient to break this argument up into premise and conclusion form:
P1 CSM gives probabilistic explanations of objective facts.
P2 The probabilities in probabilistic explanations of objective facts must be objective.
C Therefore, the probabilities in CSM probabilistic explanations are objective.
14 See Sober (2010) for another, related argument, which uses a deterministic, but probabilistic model for a coin flip. That argument relies on there being non-trivial micro-probabilities in a deterministic world, which incompatibilists would deny.
P1–C is valid, and it seems to adequately capture the argument in the above passages. So we need to examine the premises. Schaffer objects to P1:
There just is no probabilistic explanation in the offing here. What explains the melting of the ice cube is the complex deterministic process that runs from the ice cube’s entering the water to its melting, whose myriad details we can only guess. The only probability involved in the deterministic process of an ice cube melting is the measure of our ignorance of the real micro-explanation. (Schaffer 2007, p. 136)
And so does Frigg, in response to the above passage from Albert (2000):
Of course the cooling down of drinks and the boiling of kettles has nothing to do with what anybody thinks or knows about them; but they have nothing to do with the probabilities attached to these events either. Drinks cool down and kettles boil because the universe’s initial condition is such that under the dynamics of the system it evolves into a state in which this happens. All we need to explain why things happen is the initial condition and the dynamics. (Frigg 2008, p. 680)
Schaffer and Frigg seem to make two claims in response to the argument P1–C:
(i) CSM does not give probabilistic explanations of irreversible phenomena.
(ii) The initial conditions and the fundamental dynamics of the world are what really explain irreversible phenomena.
It seems they agree that if the explanations in question are probabilistic explanations, then the probabilities involved are objective. But they deny the antecedent of this conditional. In terms of our argument: P2 is not disputed, but P1 is.15
Schaffer goes on to explain why authors have been misled into thinking P1–C is a good argument. He draws a distinction between probabilistic explanation and probability of explanation (Schaffer 2007, p. 119). A probabilistic explanation is an explanation where probability plays an explanatory role. For example, an explanation of some phenomenon that involved a quantum mechanical probability as an explanatory factor is a probabilistic explanation.16 In contrast, a probability of explanation is “merely an ignorance measure over various nonchancy explanatory paths” (ibid.). (“Nonchancy” because we are focusing on deterministic settings.) The real explanation for why, say, an ice cube melted is a complex, non-chancy micro-physical explanation. Because we are ignorant of the micro-physical details, we assign subjective probabilities to various epistemically possible micro-physical explanations. According to Schaffer, it is a mistake to think that this is a genuine probabilistic explanation; all we have here are probabilities of explanations—an ignorance measure over various possible “real” explanations. Schaffer is not alone in drawing this distinction. Railton (1981), for example, draws the same distinction:
15 And so also an epistemic interpretation of explanation (e.g., Hempel’s Inductive-Statistical model) is not on the table here.
16 Assuming a suitable indeterministic interpretation of quantum mechanics.
At first blush, one might think that whenever statistics or probabilities are involved in explanatory practice one is dealing with a form of probabilistic explanation. However, this illusion is quickly shed once one recognizes the variety of ways in which statistics and probabilities figure in explanatory activities. Perhaps the commonest use of statistics and probabilities in connection with explanation is epistemic: they are used in the process of assembling and assessing evidence for causal and non-causal explanations alike. Somewhat less common, but still important, are [those cases in which] statistics and probabilities are used in providing explanatory information about causal and non-causal processes and their initial conditions. In some cases of the last sort we have genuine probabilistic explanation, specifically, in those cases where information is provided about a physically indeterministic process. (Railton 1981, p. 254) Both Schaffer and Railton claim that only in the cases where we have indeterministic processes can we have genuine probabilistic explanations, and one cannot conclude that an explanation is probabilistic merely from the fact that it involves a probability. In other words, an explanation involving probability is not automatically a probabilistic explanation—it could be a probability of explanation. And that is where those who endorse P1–C have gone wrong: they saw explanations that involved probabilities and concluded that they must be probabilistic explanations, when in fact they are only probabilities of explanations. There are two serious problems with this response to the argument P1–C. The first problem is that once we get clear on what exactly it means for the probability distribution in a CSM explanation to be a “measure of our ignorance”, it becomes clear that CSM probabilities cannot be probabilities of explanations. 
I will argue that CSM probabilities do not reflect our ignorance in a way that is appropriate for understanding them as probabilities of explanations. I therefore conclude that they must be probabilities in probabilistic explanations, and so there can be probabilistic explanations even for deterministic processes. The second problem is with the idea that the initial conditions plus the deterministic laws form “the real” explanation for why an ice cube melted. I will address the first problem in the remainder of this section and the second problem in the next section as it provides a natural spring-board to identifying some of the conceptual role counterfactual probability plays in explanations. Both Schaffer and Frigg claim that CSM probabilities are measures of our ignorance. We have already seen that Schaffer makes this claim. As for Frigg: The universe has exactly one initial microcondition, and there is nothing chancy about this condition. How, then, can we understand a probability distribution over initial conditions? The only answer seems to be that this distribution reflects our ignorance about the system’s initial microcondition; all we know is the system’s initial macrostate, and so we put a probability distribution over the microconditions compatible with that macrostate that reflects our lack of knowledge. (Frigg 2008, p. 679) But what, exactly, does it mean for a probability distribution to “reflect our ignorance”? One possible answer is that it means that the probability distribution assigns probabilities to propositions that perfectly match our own personal degrees of belief in those
propositions. This is implausible, though. It has been shown that our personal degrees of belief systematically fail to satisfy the standard probability axioms (Kahneman et al. 1982), and yet the probability distribution of CSM does satisfy them. Another possible answer, then, is: it means that the probability distribution assigns probabilities to propositions that we ought to have as our personal degrees of belief. This is a normative account, in contrast to the descriptive one just entertained. The plausibility of this answer depends on how the normative claim is fleshed out. For instance, if I somehow happen to know the precise micro-physical details of an ice cube, then my credences should not be aligned with the probabilities of CSM. In fact, if the ought-claim is merely a norm of rationality, then, arguably, I don’t even need to know the micro-physical details for the norm not to apply to me; I merely need to have certain beliefs about them. So, for this answer to have any plausibility, the normative claim must be relativised to a certain type of epistemic state. Schematically: if one believes Γ, and nothing stronger, then one’s credences should be aligned with the probabilities of CSM. The question then is: What is Γ? A seemingly plausible answer is that Γ is everything we actually know. We never actually know the micro-physical details of thermodynamic systems, we only ever get to know their macro-states. So, given this level of ignorance, it seems plausible that our credences should be equal to the CSM probabilities. This proposal seems to be closer to what Schaffer and Frigg are getting at. However, consider the standard CSM explanation for why an ice cube initially frozen at time t is melted at some later time, t′. At t, it was highly likely that the ice cube was in a micro-state that would evolve into another micro-state at t′ that corresponds to the “melted” macro-state; but there is also some probability that this is not the case.
Put another way, there is a non-trivial probability assigned to those possible micro-states that evolve into micro-states that correspond to the ice cube not being melted. But we know that the ice cube melted. We therefore know that the ice cube was not in a micro-state that would evolve into a micro-state that corresponds to the “frozen” macro-state. We are not as ignorant as the CSM probability distribution allegedly makes us out to be.17 The point is that while we never know what the micro-state of a thermodynamic system is, we know enough about its macro-states to rule out certain micro-states that the CSM distribution assigns non-trivial probability to. In personal communication, Carl Hoefer has raised the concern that this problem could easily be resolved by simply
17 A similar, but distinct point has been recognised in the literature on probability in evolutionary theory by Roberta Millstein. She writes:
[T]his “ignorance” interpretation overlooks the fact that we are aware of more causal factors than are included in the transition probability equation; for example, we know things about the predator and the color of the butterflies. Thus, we choose to ignore these causal factors, rather than being ignorant of them. (Millstein 2003b, p. 1321)
This is slightly different to my point though. The difference is that when we use the probabilities of CSM to make predictions, the probability distribution does seem to appropriately reflect the extent of our ignorance. However, even when we make predictions with the transition probability equations that Millstein writes of, we often know more causal factors than those which the models represent.
conditionalising the distribution at t—call it ρ—on the macro-history at t′, to create a new distribution, ρ′. The idea is that ρ′ assigns probability 0 to all of the problematic micro-states, and therefore adequately reflects our ignorance concerning the system. The claim that ρ′ reflects our ignorance in such a situation is much more plausible than the claim that ρ reflects our ignorance in the situation. However, in the standard CSM explanation for why an ice cube melts, it is ρ that appears in the explanation—not ρ′. And that is the crux of the matter: while we may use ρ′ in order to predict facts about the world, we use ρ to explain facts about the world, even though ρ does not reflect our ignorance. This point has also been made in the literature concerning the probabilities of ET. Sober (1984) has argued that we have two uses of probabilities in ET: for predictions and explanations, and that the probabilities we use for explanations in ET will sometimes differ from the ones we use to make predictions. Sometimes the probability distribution that we use in an explanation in ET does not account for (i.e., has not been conditionalised on) everything we know, or believe. The same goes for CSM. Also in personal communication, Jonathan Schaffer has suggested that Γ is not all that we know, but only what we know at time t—i.e., the macro-state of the system at time t. This certainly seems plausible, for the purpose of prediction—i.e., assigning credences at time t to ways the system might behave. However, this line of thought would undermine Schaffer’s original proposal regarding explanation: that CSM probabilities are probabilities of explanations, and not probabilities in probabilistic explanations. This is because according to this version of the account, CSM probabilities are no longer of a kind with other canonical examples of probabilities of explanations.
For example, consider why the dinosaurs are extinct: an asteroid probably hit the Earth 65 million years ago and killed them all. The “probably” involved in this explanation is clearly a probability of explanation. Given what we know, we can’t rule out with certainty other possible explanations—e.g., that a large increase in volcanic activity 65 million years ago killed the dinosaurs. But note that given what we know, we can rule out the possibility that nothing killed the dinosaurs (i.e., that they are alive and well today), and no probability is assigned to this possibility. In contrast, we know that the ice cube doesn’t remain frozen, but some probability is assigned to this possibility. Or consider another example: Someone was sick, they were given antibiotics, and their health improved. Why did their health improve? Answer: The antibiotics probably did their work. It could have been something else—e.g., that the bacteria were resistant to the antibiotics, the person was naturally immune to the infection and that is why they got better. Here the “probably” in “The antibiotics probably did their work” is a probability of explanation—an ignorance measure over various non-chancy explanatory paths.18 And again, in contrast to the ice cube example, no probability is assigned to the person remaining ill because we know they got better.19
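The contrast drawn above, between the distribution at the earlier time (ρ) and the distribution obtained by conditionalising on what is later known (ρ′), can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not drawn from CSM itself: the 1000 micro-states, the uniform measure, the three "stay frozen" states, and the function names are hypothetical, and the real probability of non-melting is vastly smaller than in this toy model.

```python
from fractions import Fraction

# Hypothetical toy model: 1000 micro-states compatible with the "frozen"
# macro-state at the earlier time t. The numbers are illustrative only.
MICROSTATES = range(1000)


def outcome(microstate: int) -> str:
    """Deterministic evolution from a micro-state at t to a macro-state at t'.

    Assumption: micro-states 997..999 are the rare ones that stay frozen.
    """
    return "frozen" if microstate >= 997 else "melted"


def probability(dist: dict, macrostate: str) -> Fraction:
    """Total probability of the micro-states that evolve into `macrostate`."""
    return sum((p for m, p in dist.items() if outcome(m) == macrostate),
               Fraction(0))


def conditionalise(dist: dict, macrostate: str) -> dict:
    """Conditionalise `dist` on the system ending up in `macrostate`."""
    total = probability(dist, macrostate)
    return {m: (p / total if outcome(m) == macrostate else Fraction(0))
            for m, p in dist.items()}


# rho: the uniform distribution at t, the one cited in the explanation.
rho = {m: Fraction(1, 1000) for m in MICROSTATES}

# rho_prime: rho conditionalised on the later observation that the ice melted.
rho_prime = conditionalise(rho, "melted")

print(probability(rho, "melted"))        # 997/1000
print(probability(rho, "frozen"))        # 3/1000
print(probability(rho_prime, "frozen"))  # 0
```

In this sketch ρ still assigns probability 3/1000 to an outcome we know did not occur, whereas ρ′ assigns it 0. Only ρ′ could pass as a measure of what we are ignorant of once the melting has been observed, yet it is ρ that does the explanatory work.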
18 Thanks to John Matthewson for this example.
19 This is not to deny that there may be probabilistic explanations in the ballpark. For example, the probability of the patient getting better, given that they were given antibiotics may figure in a probabilistic explanation for why the patient got better.
The reference to probability in a probability of explanation reflects the extent of our ignorance concerning what happened—we use this probability to make the “best guess” at what happened. However, on this latest proposal concerning what Γ is, the probabilities of CSM are not our credences; they are not even what our credences should be, given our knowledge about the macro-states of the system. Rather, they are what our credences should be at an earlier time.20 This is radically different to any other canonical example of a probability of explanation. In fact, CSM probabilities play a role in explanations more analogous to the role chances play in canonical examples of probabilistic explanations. Consider the quantum mechanical probability that figures in the explanation for why half of a collection of uranium-235 atoms have decayed after 704 million years (roughly the half-life of uranium-235). This probability clearly does not represent our current credences (i.e., after half of the atoms have decayed): according to the distribution, it is possible that all of the atoms decay. But we know that they don’t all decay. This is completely analogous to how probability figures in the explanation for why an ice cube melted. So while Schaffer is correct to draw a distinction between probabilistic explanations and probabilities of explanations, he is incorrect in placing CSM probabilities in the latter category. The claim that CSM probabilities are probabilities of explanations would be plausible if they could somehow be understood as our “ignorance measures”, but they are clearly not our “ignorance measures”—we are not as ignorant as they would make us out to be. Since we have only distinguished between probabilistic explanations and probabilities of explanations, the only option left is to understand CSM probabilities as probabilities that figure in probabilistic explanations.
And given that they play a role similar to the one that chances play in other canonical examples of probabilistic explanations, it seems appropriate not to worry about a third possibility.
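The uranium-235 case can be made concrete with a short calculation. As an idealising assumption that is not stated in the text, suppose the N atoms in the collection decay independently, each with probability 1/2 of having decayed after one half-life. Then:

```latex
P(\text{all } N \text{ atoms have decayed}) = \left(\tfrac{1}{2}\right)^{N} > 0,
\qquad
P(\text{exactly } k \text{ atoms have decayed}) = \binom{N}{k}\left(\tfrac{1}{2}\right)^{N}.
```

The first quantity is extremely small for any macroscopic sample, but it is never zero. The distribution cited in the explanation therefore leaves open an outcome that, once the sample has been observed, we know did not occur, just as the CSM distribution does in the ice cube case.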
4 Explanatory ecumenism and counterfactual probability
So far, I have argued that there can be probabilities in theories like CSM and ET that are not chances (Sect. 2), and not credences either (Sect. 3). These probabilities are therefore of a third type. While such a third concept of probability is typically missed by the literature, I am not alone in recognising that it exists. For example, Batterman (1992) clearly recognises that we need to identify a third concept of probability (and for similar reasons):
On the old classical view, probabilities are due entirely to our ignorance of the system’s true exact state. On the quantum theory, probabilities are irreducible, where this is understood in terms of propensities and the nonexistence of hidden variables. The simple examples discussed in the early sections, as well as the later discussion of explanation in [C]SM, indicate that probabilities may arise in some third way, via instabilities and symmetries in the dynamics of the theory. (emphasis in original) (Batterman 1992, p. 347)
20 Given, presumably, what we knew or would have known about the macro-states at that time.
Having argued that we need to identify such a third concept of probability, I now want to say something positive about the concept, by outlining some of the conceptual role this third concept of probability plays. What I think is the distinguishing feature of this third concept is its role in conveying a certain type of counterfactual information in explanations. A brief detour through the philosophy of explanation will help elucidate this role. As we saw before (Sect. 3), Schaffer claims that the real explanation of any given irreversible phenomenon is a micro-physical explanation. One reason to think this claim is true is if reductionism about explanation is true. However, there are good reasons not to be such a reductionist. For example, Putnam (1975) famously argues that the micro-physical explanation for why a square peg didn’t fit through a round hole in a board misses something very important about why the peg didn’t fit. The microphysical explanation doesn’t capture the fact that the exact micro-physical details do not matter: it is the squareness of the peg and the roundness of the hole that matter. Putnam goes further and argues that the micro-explanation is not even an explanation, or is a terrible one. Jackson and Pettit (1992) and Sterelny (1996) give similar arguments for why there can be multiple explanations of some phenomena, but they stop short of Putnam’s extra claim, adopting a more ecumenical approach to explanation.21 For them, any given phenomenon can have multiple, equally good, explanations for why it occurred. In the case of Putnam’s example, the micro and macro explanations are not competing explanations. Rather, they are quite complementary to each other. They provide two types of important information about the peg-board system. The micro-physical explanation tells us in full detail what stopped the peg from going through the hole; it tells which particles and which forces are actually responsible. 
The macro-physical explanation—the one in terms of the squareness of the peg and the roundness of the hole—tells us that it didn’t matter which particles actually stopped the peg passing through the hole; no matter how the peg was rotated, there would be some set of particles and forces stopping it from passing through the hole. Jackson and Pettit (1992) call these two types of information modally contrastive information and modally comparative information, respectively. The micro-physical explanation gives us modally contrastive information: it contrasts our world from other worlds, it identifies our world by specifying the actual micro-physical details of the peg-board system. The macro-physical explanation gives us modally comparative information: it compares our world with other worlds, it unites our world with other worlds by generalising the micro-physical details of the peg-board system. Or, more accurately, the micro-explanation gives more contrastive information than the macro-explanation does, and the macro-explanation gives more comparative information than the micro-explanation does. Both types of information can be important to an explanation. If our goal is to convey as much modally contrastive information as possible when explaining some phenomenon, then in the ideal case, we give a micro-physical explanation. In a deterministic world, such an explanation will not involve (non-trivial)
21 See also Sober (1999) for another pluralist approach to such explanations.
probabilities, for if it did, there would always be some other explanation, with more information about what the actual world is like. However, in an indeterministic world—like our quantum world—such explanations will typically involve (non-trivial) probabilities for there is no more information available to contrast the actual world with other worlds.22 These facts allow us to identify an important aspect of the conceptual role chance plays in explanations: chances are those probabilities in explanations that maximise modally contrastive information. However, often enough, our goal is not to maximise modally contrastive information. Often our goal is to convey some level of modally comparative information. When this is the case, we end up giving a macro-physical explanation. Typical examples of this in the literature are cases where the modally comparative information comes in an all-or-nothing form. For instance, in Putnam’s example, the modally comparative information is the proposition that the peg is square and the hole is round. This tells us that the micro-physical details did not matter at all. However, there are cases where modally comparative information comes in degree form. These are cases where the micro-physical details matter or do not matter to some degree. Jackson and Pettit (1992) happen to give such an example: a flask full of water cracked because the water was boiling. When discussing this example, they write: … in being made aware of the boiling-water explanation, we learn something new: we learn that in more or less all possible worlds where the relevant causal process is characterized by involving boiling water, the process will lead to the flask cracking. (my emphasis) (Jackson and Pettit 1992, p. 15) The CSM explanation for why an ice cube melted (for example) does something very similar: it tells us that the micro-physical details of the ice cube matter very little. 
This “modal robustness” is an important, contingent physical fact about thermodynamic systems. To not recognise this fact, and to focus only on the micro-physical details of the world, is to miss an important fact about the world. Railton (1981) has also pointed out this fact about CSM in connection with explanation: This illuminates a modal feature of the causal processes involved and therefore a modal feature of the relevant ideal explanatory texts: this sort of causal process is such that its macroscopic outcomes are remarkably insensitive (in the limit) to wide variations in initial microstates. The stability of an outcome of a causal process in spite of significant variation in initial conditions can be informative about an ideal causal explanatory text in the same way that it is informative to learn, regarding a given causal explanation of the First World War, that a world war would have come about (according to this explanation) even if no bomb had exploded in Sarajevo. This sort of robustness or resilience of a process is important to grasp in coming to know explanations based upon it. (Railton 1981, p. 251)
22 They will not necessarily involve probabilities as there can be non-probabilistic indeterministic worlds (see Norton 2006).
However, Railton ranks explanations according to their quality (or what he calls their “explanatoriness”) (Railton 1981, p. 240). At one end of the ranking, we have the best explanations (the most explanatory), which maximise what Railton calls explanatory information. At the other end are the worst explanations, those that convey no explanatory information. By “explanatory information”, Railton appears to intend to refer to the same concept that we have been using “modally contrastive information” to refer to. So, on this picture, CSM explanations are not as good as explanations involving initial conditions and dynamical laws. However, I prefer to distinguish between two types of explanatory information: modally contrastive information, and modally comparative information. On this picture, CSM explanations are explanatory in a way that is different from the way that the explanations involving the initial conditions and dynamical laws are explanatory, and so the former are neither better nor worse than the latter. Probability plays a similar conceptual role in the explanations of ET, for ET is also a theory that conveys modally comparative information in its explanations. As Sterelny and Kitcher write:
In principle, we could relate the biography of each organism in the population, explaining in full detail how it developed, reproduced, and survived, just as we could track the motion of each molecule of a sample of gas. But evolutionary theory, like statistical mechanics, has no use for such a fine grain of description: the aim is to make clear the central tendencies in the history of evolving populations […]. (Sterelny and Kitcher 1988, p. 345)
Both theories are in the business of abstracting away from particular details of their systems of interest and capturing and explaining the general behaviour of those systems.23 In the case of ET, it may have been that a population evolved one particular way because of some detailed sequence of organism and environment interactions, but that sequence may have been largely irrelevant to the final outcome. That is, the population would have still evolved the way it did, had the sequence been slightly different, because of (say) fitness differences in the population. So sometimes we use probabilities to express modally comparative information about a system, which is a certain kind of counterfactual information about the system. Such probabilities are objective since they express objective facts about the system in question, and they play a role in explanations that is distinct from the role chances play in explanations. For lack of a better term, I call these probabilities counterfactual probabilities. Counterfactual probability is not probability in some counterfactual situation; rather it is a measure of how robust a proposition is under a class of counterfactual situations.24 (Sterelny (1996) calls explanations that convey modally comparative information robust process explanations. In this terminology, counterfactual probability is a measure of how robust a robust process is.) A rough analogy with logical probability may make it clearer how these probabilities are counterfactual. Logical probabilities (if they exist) generalise logical entailment: “P(A|B) = x” generalises
23 Sober (1984, pp. 118–134) also makes this point.
24 One referee for this article suggested calling chances “maximally fine-grained” and counterfactual probabilities “more coarse-grained”, which is a useful way of thinking about the difference.
“B ⊨ A”. In a similar way, counterfactual probabilities generalise counterfactuals: “P(A|B) = x” generalises “B □→ A”. Now that chance and counterfactual probability are distinguished, the next question is how to analyse them.25 Answering this question in full detail requires the space of another paper. However, I would like to suggest one approach to answering this question, as I think it may help make the distinction between chance and counterfactual probability clearer. To help fix our ideas, let us assume that a propensity interpretation (e.g., Popper 1959) or a best-system analysis (e.g., Lewis 1994) is correct for chance. How, then, should we analyse counterfactual probability? The main distinguishing role that counterfactual probability plays is that it conveys important counterfactual information in explanations. So an analysis that is closely connected to the analysis of counterfactuals seems like a promising start. Fortunately, an analysis along these lines has already been developed by Bigelow (1976, 1977). Roughly speaking, Bigelow takes the similarity relation that appears in popular analyses of counterfactuals and uses it to induce a similarity metric over an ensemble of possibilities. This metric is then used to measure the size of propositions in that ensemble, and then this measure is used to build a probability function. Roughly speaking, on this account, the probability of a proposition is proportional to the size of the proposition (measured according to the similarity relation, which is taken as primitive), which is precisely the sort of thing we are after. I think this is a very promising approach, and it would be worth exploring in further detail in another paper.
5 Conclusion
In this paper I have not developed an analysis of counterfactual probability, just as I have not developed an analysis of chance.
My goal here has been primarily to show that we need to identify and distinguish these two concepts of objective probability in our scientific theories. How we should analyse these two concepts (that is, what theory we should give as to what in the world makes chance statements and counterfactual probability statements true) is another matter. It may be that propensity facts make chance statements true, and frequency facts make counterfactual probability statements true. Or it may be that frequency facts make both sorts of statements true. My suspicion is that a best-system analysis of chance (Lewis 1994) and a similarity-relation analysis of counterfactual probability (Bigelow 1976, 1977) will be the best approach to take.
Identifying more than one concept of objective probability has two significant advantages over not doing so. First, we can avoid the unacceptable conclusion that the probabilities of some of our scientific theories are subjective just because there are no deterministic chances. And second, we can avoid the unacceptable conclusion that chance and determinism are compatible just because the probabilities of our scientific theories are not subjective. In other words, by distinguishing two types of objective probabilities, we can avoid the paradox of deterministic probabilities (see Sect. 1).
I have argued that what distinguishes two particular types of objective probabilities are the roles they play in probabilistic explanations. Chances are those probabilities in explanations that maximise modally contrastive information. Counterfactual probabilities are those probabilities in explanations that give some level of modally comparative information in degree form.26 To account for probability of explanations (see Sect. 3), we need the concept of subjective probability. Therefore, to account for explanations in science that involve probabilities generally, we need to identify at least three concepts of probability: subjective probability, chance, and counterfactual probability.
Acknowledgements I’d like to especially thank Alan Hajek and Jonathan Schaffer for incredibly helpful and extensive comments on earlier versions of this paper. I’d also like to thank Marshall Abrams, Jeremy Butterfield, Fabrizio Cariani, David Chalmers, Mark Colyvan, Anthony Eagle, Kenny Easwaran, Christopher Eliot, Branden Fitelson, Roman Frigg, Carl Hoefer, Jack Justus, John Matthewson, Daniel Nolan, Mike Titelbaum, and audiences at the LSE Sigma Club, 2009 Foundations of Uncertainty Conference in Prague, 2009 Meeting of the Australasian Association for Philosophy, University of Sydney Current Projects Seminar, and the Australian National University PhilSoc Seminar.
25 The two concepts differ in other respects as well, not just by the role they play in probabilistic explanations. For example, counterfactual probabilities won’t satisfy the PP and yet chances do. Counterfactual probabilities will satisfy some of the platitudes identified by Schaffer (2007), but not all of them (e.g., the counterfactual probabilities of ET will not satisfy the Lawful Magnitude Principle if there are no laws of biology). I lack the space here for a full discussion of how counterfactual probabilities satisfy or fail to satisfy Schaffer’s platitudes. However, all I need for the purposes of this paper is that chance and counterfactual probability differ in at least some respects.
26 This is not to say that these roles are all that there is to chance and counterfactual probability. For example, both are also guides to credences.
References
Albert, D. (2000). Time and chance. Cambridge, MA: Harvard University Press.
Batterman, R. (1992). Explanatory instability. Noûs, 26, 325–348.
Bigelow, J. (1976). Possible worlds foundations for probability. Journal of Philosophical Logic, 5(3), 299–320.
Bigelow, J. (1977). Semantics of probability. Synthese, 36(4), 459–472.
Diaconis, P. (1998). A place for philosophy? The rise of modeling in statistical science. Quarterly of Applied Mathematics, 56(4), 797–805.
Eagle, A. (2004). Twenty-one arguments against propensity analyses of probability. Erkenntnis, 60, 371–416.
Earman, J. (1986). A primer on determinism. Dordrecht: Reidel.
Frigg, R. (2008). Chance in Boltzmannian statistical mechanics. Philosophy of Science, 75(5), 670–681.
Frigg, R., & Hoefer, C. (2009). Determinism and chance from a Humean perspective. In D. Dieks, W. Gonzalez, S. Hartmann, M. Weber, F. Stadler, & T. Uebel (Eds.), The present situation in the philosophy of science. Berlin: Springer.
Graves, L., Horan, B., & Rosenberg, A. (1999). Is indeterminism the source of the statistical character of evolutionary theory? Philosophy of Science, 66, 140–157.
Hájek, A. (1997). ‘Mises Redux’-redux: Fifteen arguments against finite frequentism. Erkenntnis, 45, 209–227.
Hájek, A. (2003). What conditional probability could not be. Synthese, 137(3), 273–323.
Hoefer, C. (2007). The third way on objective probability: A sceptic’s guide to objective chance. Mind, 116(463), 549–596.
Jackson, F., & Pettit, P. (1992). In defense of explanatory ecumenism. Economics and Philosophy, 8(1), 1–21.
Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgement under uncertainty: Heuristics and biases. Cambridge: Cambridge University Press.
Laplace, P. (1814). A philosophical essay on probabilities. New York: Dover (1951).
Levi, I. (1990). Chance. Philosophical Topics, 18(2), 117–149.
Lewis, D. (1980). A subjectivist’s guide to objective chance. In R. C. Jeffrey (Ed.), Studies in inductive logic and probability (Vol. II). Berkeley: University of California Press.
Lewis, D. (1986). A subjectivist’s guide to objective chance. In Philosophical papers (Vol. II, pp. 83–132). Oxford: Oxford University Press.
Lewis, D. (1994). Humean supervenience debugged. Mind, 103, 473–490.
Loewer, B. (2001). Determinism and chance. Studies in History and Philosophy of Modern Physics, 32, 609–620.
Lyon, A. (2009). Three concepts of probability. Ph.D. thesis, Australian National University.
Millstein, R. (2003a). How not to argue for the indeterminism of evolution: A look at two recent attempts to settle the issue. In A. Hüttemann (Ed.), Determinism in physics and biology (pp. 91–107). Paderborn: Mentis.
Millstein, R. L. (2003b). Interpretations of probability in evolutionary theory. Philosophy of Science, 70, 1317–1328.
Norton, J. (2006). The dome: An unexpectedly simple failure of determinism. Philsci-Archive.
Popper, K. R. (1959). The propensity interpretation of probability. The British Journal for the Philosophy of Science, 10(37), 25–42.
Popper, K. (1982). Quantum theory and the schism in physics. Totowa, NJ: Rowman and Littlefield.
Putnam, H. (1975). Philosophy and our mental life. In Readings in philosophy of psychology. Cambridge, MA: Harvard University Press.
Railton, P. (1981). Probability, explanation, and information. Synthese, 48(2), 233–256.
Rosenberg, A. (1994). Instrumental biology or the disunity of science. Chicago: University of Chicago Press.
Rosenberg, A. (2001). Discussion note: Indeterminism, probability, and randomness in evolutionary theory. Philosophy of Science, 68(4), 536–544.
Schaffer, J. (2007). Deterministic chance? The British Journal for the Philosophy of Science, 58(2), 113–140.
Sober, E. (1984). The nature of selection. Chicago: University of Chicago Press.
Sober, E. (1999). The multiple realizability argument against reductionism. Philosophy of Science, 66(4), 542–564.
Sober, E. (2010). Evolutionary theory and the reality of macro probabilities. In E. Eells & J. Fetzer (Eds.), Probability in science. La Salle, IL: Open Court.
Sterelny, K. (1996). Explanatory pluralism in evolutionary biology. Biology and Philosophy, 11, 193–214.
Sterelny, K., & Kitcher, P. (1988). The return of the gene. The Journal of Philosophy, 85(7), 339–361.
Strevens, M. (2006). Probability and chance. Encyclopedia of philosophy (2nd ed.). Detroit: Macmillan Reference USA.
Weber, M. (2001). Determinism, realism, and probability in evolutionary theory. Philosophy of Science, 68(3), S213–S224.
Weber, M. (2005). Darwinism as a theory for finite beings. In V. Hösle & C. Illies (Eds.), Darwinism and philosophy (pp. 275–297). Notre Dame: University of Notre Dame Press.
Winsberg, E. (2006). Probability and chance. Studies in History and Philosophy of Modern Physics. doi:10.1016/j.shpsb.2008.05.005.
Synthese (2011) 182:433–447 DOI 10.1007/s11229-010-9752-0
Proper function and defeating experiences Daniel M. Johnson
Received: 22 February 2010 / Accepted: 14 May 2010 / Published online: 23 July 2010 © Springer Science+Business Media B.V. 2010
Abstract Jonathan Kvanvig has argued that what he terms “doxastic” theories of epistemic justification fail to account for certain epistemic features having to do with evidence. I’m going to give an argument roughly along these lines, but I’m going to focus specifically on proper function theories of justification or warrant. In particular, I’ll focus on Michael Bergmann’s recent proper function account of justification, though the argument applies also to Alvin Plantinga’s proper function account of warrant. The epistemic features I’m concerned about are experiences that should generate a believed defeater but don’t. I’ll argue that proper functionalism as it stands cannot account for the epistemic effects of these defeating experiences, or at least that it can only do so by embracing a deeply implausible view of our cognitive faculties. I’ll conclude by arguing that the only plausible option Bergmann has for modifying his theory undercuts the consideration that motivates proper functionalism in the first place.
Keywords Proper function · Defeaters · Evidence · Justification
Jonathan Kvanvig has argued in a number of places that what he terms “doxastic” theories of epistemic justification (which include all externalist and some internalist theories) fail to account for certain epistemic features having to do with evidence. Some of those features include propositional (as opposed to doxastic) justification, defeater theory in general, defeater–defeaters, and the Quine/Duhem problem.1 I’m going to give an argument roughly along these lines, but instead of targeting doxasticism in general, I’m going to focus specifically on proper function theories of justification or warrant. The epistemic features I’m concerned about are experiences that should generate a believed defeater but don’t. I’ll argue that proper functionalism cannot account for the epistemic effects of these defeating experiences, or at least that it can only do so by embracing a deeply implausible view of our cognitive faculties.2 For the sake of simplicity, I’ll focus on Michael Bergmann’s recent proper function account of justification, though my argument can be applied as well to Alvin Plantinga’s proper function account of warrant.
In Sect. 1, I’ll explain the motivation for Bergmann’s two conditions on epistemic justification, the proper-function condition and the no-believed-defeater condition. In Sect. 2, I’ll present a dilemma designed to show that Bergmann’s account, as it stands, cannot account for the epistemic effects of defeating experiences. In Sect. 3, I’ll argue that the only way Bergmann can preserve his account unmodified is by embracing a deeply implausible view of cognitive faculties and defeater systems. Bergmann has two options for modifying his theory to take these sorts of defeaters into account. In Sects. 4 and 5, I’ll argue that one option just fails outright and that the other undercuts the consideration that motivates proper functionalism in the first place. Finally, in Sect. 6, I’ll discuss and rebut two attempts to attack my setup of the case which causes trouble for proper functionalism.
1 See Kvanvig and Menzel (1990) and Kvanvig (1992, 2000, 2003, 2006, 2007a, b).
D. M. Johnson (B) Baylor University, Waco, TX, USA e-mail: [email protected]
1 Bergmann’s analysis of justification
Bergmann argues that two conditions are severally necessary and jointly sufficient for epistemic justification.
The first and most important is the proper function condition: a belief B is justified only if “the cognitive faculties producing B are (a) functioning properly, (b) truth-aimed and (c) reliable in the environments for which they were ‘designed.’”3 To motivate this condition, he begins by provisionally accepting an evidentialist thesis: “S’s belief B is justified iff B is a fitting doxastic response to S’s evidence.”4 Evidentialists usually endorse three claims about what makes a belief a “fitting” doxastic response to evidence: first, that the fittingness of a doxastic response to evidence is not contingent on that evidence being a reliable indicator of the belief’s truth (the nonreliability claim); second, that the fittingness of a doxastic response to evidence is objective fittingness, in the sense that a blameless subjective sense that the belief fits the evidence is not sufficient for it (the objectivity claim); and third, that the fittingness of the doxastic response to evidence is an essential feature of the relation of that response to that evidence (the necessity claim). Bergmann accepts the nonreliability claim, on the basis of demon world counterexamples to reliabilism, and his proper function account handles these examples because proper function does not entail reliability. He also accepts the objectivity claim because he doesn’t think that simply thinking one’s evidence is good evidence for a belief makes that evidence good evidence. The motivation for proper functionalism, then, turns on Bergmann’s rejection of the necessity claim. His reason for rejecting the necessity claim is a kind of Reidian counterexample to it. Consider the following belief B1: there is a smallish, hard, round object in my hand. Consider also the following two experiences: ME1, a tactile sensation of the type you experience when you grab a billiard ball, and ME2, an olfactory sensation of the type you experience when you smell a meadow full of flowers. Now, for actual humans, B1 is a fitting doxastic response to ME1 but not to ME2. Bergmann insists, though, that it is conceivable that there be a species of cognizers for whom the evidential relationship is the other way around, for whom B1 is a fitting doxastic response to ME2 but not to ME1. Because we can imagine such a species, thinks Bergmann, it must be the case that there is nothing about either type of experience (ME1 or ME2) that is necessarily connected to the belief (B1), and so the fittingness of a doxastic response to evidence need not be a necessary relation. (Bergmann’s example is much more nuanced and extensive than my presentation of it, employing key distinctions between learned and unlearned doxastic responses and between primary and secondary qualities. Since I do not intend to challenge this example directly, it is unimportant for my purposes that I do justice to these nuances.) Since the necessity claim is false (since the fittingness of a belief to evidence is not a necessary truth), we need some account of justification that indexes evidential fittingness to species with different design plans to account for the Reidian-type examples like the one above.
2 These defeaters are a subset of what Jennifer Lackey calls “normative” defeaters. Someone has a “psychological” defeater, what Bergmann calls a “believed” defeater, for a belief when they take that belief to be defeated. Someone has a “normative” defeater for a belief when they should take their belief to be defeated because of a belief or an experience they have, but don’t. What I call defeating experiences are normative defeaters that are experiences. See Lackey and Sosa (2006).
3 Bergmann (2006, p. 133).
4 Bergmann (2006, p. 110).
The proper function condition, thinks Bergmann, is the most natural way to achieve this indexing while preserving both the nonreliability and objectivity claims.5
The second condition that Bergmann imposes on justification is a no-believed-defeater condition: S’s belief B is justified only if “S does not take B to be defeated.”6 A defeater is a belief or experience that renders a belief of mine unjustified that was justified or would be justified in the absence of the defeater. A believed defeater is not a defeater that has the property of being believed; instead, it is a belief that there is a defeater present: I have a believed defeater for a belief B when I believe that B is epistemically inappropriate. Bergmann appeals to intuitions on cases in order to argue that believed defeaters are defeaters, and that whenever I believe that my belief B is epistemically inappropriate, I am not justified in believing B. This no-believed-defeater condition has to be a separate condition from the proper function condition, because it is possible for there to be a species that is designed to have believed defeaters for many of their beliefs but to retain those beliefs in the face of the believed defeaters. In this case, the cognizers would be functioning properly but would intuitively still not be justified in believing as they do. Therefore, the no-believed-defeater condition cannot be indexed to species design plans as other evidential relations can, and so must be separate from the proper function condition.
This completes my summary of Bergmann’s analysis of justification and the motivations for that analysis. “S’s belief B is justified iff (i) S does not take B to be defeated and (ii) the cognitive faculties producing B are (a) functioning properly, (b) truth-aimed and (c) reliable in the environments for which they were ‘designed.’”7
5 Once he establishes the need for the proper function condition, Bergmann argues that we lose the motivation for requiring that proper inputs to belief-forming processes be “evidence” at all (mental or accessible states), since it could be that properly functioning cognitive processes could respond to non-evidential external inputs as well. This, though, is not important for the purposes of my argument.
6 Bergmann (2006, p. 133).
2 The problem: defeating experiences
The phenomenon that I want to argue causes problems for this analysis of justification is the experience that should generate a believed defeater but doesn’t. Consider the following case (modified from one of Bergmann’s examples).8 I am on a hike in the mountains, and read in my guidebook (which I justifiably believe is reliable) that you cannot see any lakes from the peak. I thereby form the justified belief that I won’t be able to see any lakes from the peak. I then, an hour later, reach the peak, look down, and see what is obviously a lake. For whatever reason, though, I fail to put two and two together and do not come to believe that there is a lake there. I continue to believe that you cannot see any lakes from the peak and do not regard my belief that you cannot see any lakes from the peak to be defeated. (There is nothing really obscure or artificial about this case; it is the sort of thing we experience often. The lake experience just doesn’t register on me, though I form many other beliefs normally while at the top of the peak. When I come down from the peak, perhaps a stranger asks me whether you can see any lakes from the peak. I blithely answer “no” because I’m still holding rather ridiculously to the testimony of the guidebook. Perhaps I have a friend who accompanied me to the peak, and when I tell the stranger that you can’t see any lakes from the peak, my friend stares at me incredulously. He asks with a laugh, “Dan, are you sure you can’t see any lakes from the peak?” and gives me a knowing look. I, confused, think back and, sure enough, remember my visual experience of a lake and rather shamefacedly abandon my no-seen-lakes belief.) While I am at the peak, I form other beliefs: I see a rock next to me, and form the belief that there is a rock at the top of the peak; I see a tree on the mountainside and form the belief that there is a tree there.
Clearly, my belief that you cannot see any lakes from the peak is no longer justified after my trip to the peak: it has been defeated by my visual experience of seeing a lake. Equally clearly, my beliefs that there is a rock and a tree at the peak are not defeated by my visual experience of a lake, and so they remain justified.9 Bergmann’s analysis, I will argue, cannot get this result: he must either affirm that the lake-belief remains justified, or he must deny that the rock- and tree-beliefs remain justified.
In my example, I have not violated Bergmann’s no-believed-defeater condition, since I do not take my belief to be defeated. Therefore, if Bergmann’s analysis of justification is to account for this should-be-believed defeater, I must have violated the proper function condition. Sure enough, this is Bergmann’s response:
But [Bergmann’s account of justification] also requires for a belief B’s justification that the cognitive faculties involved in the production of B be functioning properly. These cognitive faculties will include any defeater systems whose operation is relevant to whether B is held. They too must be functioning properly in order for B to be justified. So, even in a case where the person holding B has no believed defeater for B, it might be the case that she epistemically should have a believed defeater for B and that the reason she doesn’t is that her defeater systems involved in the production of B aren’t functioning properly.10
In the case of my trip to the peak, my belief that you can’t see any lakes from the peak, though it was originally formed by properly functioning cognitive faculties, is not sustained by properly functioning cognitive faculties, since the faculty that would generate a belief that there is a lake from the visual experience of a lake is malfunctioning. Therefore, I do not meet the proper function condition on justification.
I’m not satisfied with this response, and the remainder of this section is an attempt to refute it. Let us be clear as to what Bergmann needs in order to get the right result on this case. There are three sets of cognitive faculties operating here:
(1) The cognitive faculties which produce and sustain the lake-belief.
(2) The cognitive faculties which produce and sustain the rock- and tree-beliefs.
(3) The cognitive faculties which malfunction in failing to generate a belief that there is a lake and that you can see lakes from the peak.
7 Bergmann (2006, p. 133).
8 Bergmann (2006, pp. 155–156).
9 I’ll discuss possible strategies for denying these intuitions on the case in Sect. 6.
Since Bergmann must handle this case with his proper function condition, in order to get the right result, it must be the case that the faculties producing the lake-belief (1) are malfunctioning, while the faculties producing the rock- and tree-beliefs (2) are not malfunctioning. Therefore, it must be the case that the malfunctioning faculties (3) are the same faculties as those producing the lake-belief (1), but not the same as those producing the rock- and tree-beliefs (2). So Bergmann needs to find a way to identify or individuate faculties such that (1) and (3) count as the same faculties, while (2) and (3) do not count as the same faculties. This is what I contend the proper functionalist cannot do.
Think for a moment about the kinds of faculties involved. I get my lake-belief from reading my guidebook, and so (1) is a set of faculties that involves language abilities and the assimilation of testimony, and this set of faculties is exercised an hour before I get to the peak (and so the faculties which sustain my belief involve my memory). I get my rock- and tree-beliefs from visual experiences while at the top of the peak, so (2) involves my visual abilities. The faculties which malfunction (3) in my failing to register the existence of a lake are also visual abilities, and, like (2), are exercised at the top of the peak. Clearly, (3) is no closer to (1) than it is to (2). Therefore, this case presents Bergmann with a dilemma. Bergmann isn’t going to be able to identify or individuate the malfunctioning faculties (3) in such a way that they turn out to be the same as the faculties producing the lake-belief (1) without also counting them as the same as the faculties producing the rock- and tree-beliefs (2). On the one hand, if he individuates the malfunctioning faculties (3) broadly enough to count as the same faculties as (1), then they will also count as the same faculties as (2); on the other hand, if he individuates the malfunctioning faculties (3) narrowly enough to count as different than (2), then they will also count as different than (1). He therefore fails to get the right result on the case: either the (intuitively justified) rock- and tree-beliefs are counted as unjustified, or the (intuitively unjustified) lake-belief is counted as justified.
So there is an absurdity either way. If the cognitive faculties relevant to justification are individuated broadly enough to count the faculties that generated the testimonially-based no-seen-lakes belief as the same as the faculties that malfunctioned by failing to generate a perceptually-based lake-belief, then those faculties are individuated broadly enough to count as the same faculties that generated the other perceptual beliefs (the rock- and tree-beliefs). In this case, the justification of the other perceptual beliefs is undercut by the faculties’ malfunctioning in the case of the would-be lake-belief, which is intuitively absurd. On the other hand, if the cognitive faculties are individuated narrowly enough to differentiate the malfunctioning faculties from the faculties producing the other perceptual beliefs, then they are individuated narrowly enough to also differentiate the malfunctioning faculties from the faculties that produced the testimonially-based no-seen-lakes belief. In this case, the justification for the no-seen-lakes belief is not undercut by the visual experience of the lake, which is also absurd.
10 Bergmann (2006, p. 170).
Here is an objection: does my argument assume that if there is a malfunction that results in the production or non-production of a belief, then all the cognitive faculties involved are malfunctioning such that all the beliefs they are involved in producing and sustaining lack justification?11 It would be a mistake to assume this, because Bergmann (following Plantinga) individuates cognitive faculties finely enough that a malfunction of (for example) my visual faculties in one case doesn’t render all my visually-produced beliefs unjustified.12 My argument does not assume this, however. I grant that it is possible to individuate faculties finely enough so that, in the example, the malfunctioning of my visual systems in failing to produce a lake-belief in me doesn’t affect the justification for my other visually-based beliefs (the rock- and tree-beliefs). My argument is that, if you individuate cognitive faculties this finely, you individuate them too finely for the malfunction to defeat what it should defeat, my no-seen-lakes belief. This argument bears a certain similarity to the well-known generality problem for reliabilism, in that it asserts that the proper functionalist has a problem in rightly individuating the faculties which are relevant for the justification of a belief. The generality argument merely claims, however, that reliabilism is necessarily arbitrary in how it individuates justificationally-relevant processes. I claim, more strongly, that however the proper functionalist individuates the justificationally-relevant cognitive faculties, the theory ends up ruling justified beliefs unjustified or vice versa.
11 This objection was suggested to me by Michael Bergmann in correspondence.
12 Bergmann (2006, p. 133, note 48).
The heart of my argument is the point that we cannot account for the fact that the visual experience of a lake is relevant for the justification of the no-seen-lakes belief but not my other beliefs solely in terms of proper function. There is no greater distance in terms of cognitive faculties and processes between the visual experience of the lake and the rock- and tree-beliefs than there is between the visual experience of the lake and the no-seen-lakes belief. However you individuate cognitive faculties and processes, you aren’t going to be able to connect the visual experience of the lake and the no-seen-lakes belief without also connecting it to a bunch of other beliefs that it shouldn’t be justificationally relevant for. This suggests that the justificational connection between the visual experience of the lake and the no-seen-lakes view will have to be accounted for in some other way than properly functioning cognitive faculties or processes. At least, it means that Bergmann needs to alter his analysis of justification to account for this phenomenon. There are two ways he can do so: he can alter or expand the proper function condition or he can alter or expand the no-defeater condition. I’ll examine each (in Sects. 4 and 5), but first I’ll take a look at one way proper functionalism could escape between the horns of the dilemma argument I gave in this section.
3 The first option: individuating cognitive faculties finely
There is one way in which Bergmann could individuate cognitive faculties that would allow him to escape the above argument.13 In this section, I’ll detail that way and argue that it is deeply implausible.
In order to avoid the dilemma, the proper functionalist needs a way to individuate cognitive faculties so that (in the example case) the malfunction I experienced at the peak when having the lake-like visual experience involves the same faculties that produced and sustained my no-seen-lakes belief (from reading the guidebook), but does not involve the same faculties that produced the rock- and tree-beliefs I formed from visual experiences at the peak. Here is the only way I know to individuate cognitive faculties to get this result. Deny altogether that what malfunctioned at the peak were my “visual” faculties or any such general faculty. Suppose instead that I have a unique faculty producing and sustaining each belief I have: one belief per faculty. Then suppose that there is a defeater system for the belief built into each one of these faculties, an exhaustive specification of the conditions under which the belief should be given up. (This defeater system would be a kind of proper functionalist correlate to an evidentialist’s set of necessary truths about evidential relations.) If this were true, if there were an individual faculty for every belief with a defeater system built into each one of those faculties, the proper function theorist would have a way out of the dilemma. The proper function theorist could simply assert that what is malfunctioning in the case of my lake-like visual experience is not my visual faculties (utilized on the peak) or my testimonial faculties (utilized when reading the guidebook), but instead the defeater system built into the faculty that produces and sustains my no-seen-lakes belief (and no other belief). In this case, the malfunctioning of the defeater system only makes the faculty producing and sustaining this unique belief a malfunctioning one. So the malfunctioning of the defeater system would undercut (render unjustified) my no-seen-lakes belief and no other belief.
There is even some reason for the proper function theorist to accept this account independent of the desire to avoid my argument. It does seem that the malfunction is not the mere fact that I didn’t form the belief that you can see a lake from the peak from the visual experience of the lake; it is not a requirement of rationality that I form a belief in response to every sense experience I have. The problem is that I retained my no-seen-lakes belief in the face of this experience: I failed to see that the experience undercut my other belief in some way. So the failure seems to be of a defeater system, not of a general capacity like vision. However, the defeater system which is malfunctioning cannot be a purely general defeater system, a system which governs all of my beliefs, because then its malfunction would undercut my rock- and tree-beliefs as well as my lake belief (indeed, it would undercut every one of my beliefs). The defeater system is subject to the same individuation problems I pointed out for all the other systems; therefore, to get the case right, there must be a unique defeater system for every one of my beliefs.
It is important to get clear on what precisely this response requires. First, it is necessary that there be a unique faculty producing and sustaining each and every belief that I have, with a unique defeater system built into each faculty.
13 Bergmann in correspondence offered a suggestion that led to the broad outline of this response, but I have fleshed it out in ways to which I am sure he would be unwilling to agree.
If there is any generality of faculties at all—if there is a faculty producing two different beliefs—then the door is opened to my kind of counterexample, where the malfunctioning of a defeater system undercuts more beliefs than it should, because the conditions under which the beliefs should be given up will presumably be different for each belief. If, for instance, the same faculty that produces and sustains my no-seen-lakes belief from my reading of my guidebook also produces and sustains another belief resulting from my reading of the guidebook—say, a belief that you can see another mountain from the peak—then the malfunctioning of the defeater system in the case of the no-seen-lakes belief will, in virtue of rendering the faculty a malfunctioning one, also render the mountain-belief unjustified, which it shouldn’t. So there has to be a unique faculty for each and every belief, with a numerically different defeater system built into each faculty. Second, what is more, there will actually have to be multiple different faculties for each proposition I could believe. After all, I can come to believe that you can’t see any lakes from the peak on the basis of reading my guidebook, or on the basis of going to the peak, looking around, and not seeing any lakes, or in any number of other ways—and I don’t want to have to say that it was the same faculty in each case operative in the production of that belief. So there will have to be a unique faculty for each belief and each way that the belief can be arrived at—multiple unique faculties in the case of each individual proposition I could believe. Third, each of these faculties will have to have its own defeater system built into it, because otherwise the malfunctioning of the defeater system will render some faculties malfunctioning that it shouldn’t, and because there may be some slight difference in the conditions under which I should give up the belief for the various ways I could have arrived at it.
So there will be many duplicate or near-duplicate defeater systems repeated in my
cognitive structure for each proposition I believe, built into the individual faculties for the various ways I could arrive at that belief. I find this account of our cognitive faculties, which is necessary for the proper functionalist to avoid the dilemma over defeating experiences, deeply implausible for a number of reasons. First, this account loses out big time to evidentialist theories on parsimony considerations. The idea that I have multiple individual faculties uniquely producing each and every proposition I believe or could believe, and many defeater systems duplicated or nearly duplicated many times all over my cognitive structure, is (to say the least) ontologically extravagant. Even the many necessary truths of evidence that Chisholmian evidentialists allow for are more parsimonious than this account of our cognitive structure because of the multiple faculties per belief and the duplication of defeater systems. Second, this account needs the defeater systems built into each individual belief-faculty to themselves be designed to detect all of the conditions under which the belief should be given up. If they are dependent on other systems to detect these conditions (e.g., a problematic experience), then these conditions won’t always defeat the belief—because those other systems may malfunction and not inform the defeater systems of the obtaining of the condition, which would mean that it wouldn’t be the defeater system that would be malfunctioning. In this case, the belief would still be justified, even in the presence of the defeating condition, because it wouldn’t be the faculty producing and sustaining it which is malfunctioning. So each and every defeater system built into each and every of the infinite or near-infinite individual belief-faculties must have its “fingers” all through my cognitive system, designed to detect, on its own, a whole host of possible cognitive states that should trigger the giving up of the belief. 
This makes the parsimony situation even worse. Third, this account, by denying that it is the proper functioning or malfunctioning of more general cognitive faculties that is relevant to justification, contravenes the spirit of faculty-centered epistemology. Much of the explanatory power of faculty-centered epistemology is its ability to achieve a certain level of generality in explaining the justification of beliefs in terms of the faculties that produced them. This account, by losing basically all notion of justification-determining general faculties, loses a good deal of that explanatory power. How strong you think these last three arguments are will probably depend on how strong you think Bergmann’s Reidian argument against the necessity of evidential relations is; if you think his argument strong (along with his arguments against other externalist theories like reliabilism which also reject the necessity claim), you may be inclined to bite the bullet on these arguments. There are two other arguments that aren’t as easily dismissed, though. Fourth, this account makes the proper function analysis hostage to empirical findings. It may turn out that neuroscientists will be able to investigate and find out whether we do in fact have something like a faculty for each individual belief that produces and sustains it, and a defeater system built into each such faculty, or whether our belief-producing faculties are more general than that. This, I take it, is a mark against a theory that is supposed to be a conceptual analysis of justification. Conceptual analyses probably shouldn’t be hostage to empirical findings in this way. Here is another way to make this same (or a similar) point. It seems to me that the following scenario is possible because I can conceive of it: (1) a being exists, very much like myself, who
encounters the above scenario with the guidebook and the lake and has the same justification I do in the scenario (that is, has his lake-belief defeated while his rock- and tree-beliefs remain undefeated); (2) this being, however, does not have an individual defeater system for each one of his beliefs, but is designed with a more general defeater system. If this is true—that is, if this scenario is possible—then it constitutes a counterexample to Bergmann’s proper function analysis of justification. You can run the argument I gave on this possible world, independently of whether human beings are built this way in the actual world, and this will be relevant for Bergmann’s analysis because it is supposed to be a conceptual analysis of justification. And it seems to me that this scenario is possible, not least because we cannot rule out that it is actual, true of actual human beings. Another (ad hominem) reason to accept this scenario as possible is that it seems to be at least as conceivable as Bergmann’s Reidian counterexample to the evidentialists’ necessity claim, which is supposed to motivate his proper functionalism. Fifth, and most importantly, it looks like this account is committed to saying that I have a faculty, or rather, multiple faculties (and corresponding defeater systems), built into me for every proposition I could ever possibly believe. Propositions for which I don’t have such a faculty couldn’t ever be justifiably believed or properly defeated for me. This is not objectionable merely on grounds of parsimony. It is objectionable by virtue of the finitude of my mind—there seems to be an infinite or near-infinite number of propositions that I could possibly come to believe, and is it really plausible to think that I have multiple faculties and defeater systems for each one wired into me?
Moreover, the fact that I can gain and lose conceptual resources over the course of my lifetime strongly suggests that I don’t have individual faculties for each belief, but that I instead have more general, and more malleable, faculties which produce and sustain my beliefs. I conclude that this view of our cognitive faculties—which, recall, is necessary for the proper functionalist to escape the dilemma about defeating experiences—carries with it costs that render it deeply implausible.
4 The second option: modifying the proper function condition
If proper functionalism is to avoid the implausible view of cognitive faculties and defeater systems sketched in the previous section, Bergmann will have to alter either the proper function condition or the no-defeater condition to account for defeating experiences. I’ll discuss each in turn—the former possibility in this section, the latter in the next. The proper function condition, as it stands, cannot handle defeating experiences because it cannot connect defeating experiences with the beliefs that they are supposed to defeat without also connecting those experiences with other beliefs that they shouldn’t be able to defeat. The only option I can think of for modifying or adding to the proper function condition to get the right sort of connection between defeating experiences and the beliefs they should defeat is to add some sort of counterfactual proper function condition on justification. For instance, we could add the following condition: a belief B is justified only if, were my cognitive faculties functioning properly, I wouldn’t have a believed defeater for B. This condition would handle the
above example. If all my cognitive faculties were functioning properly, I would have a believed defeater for my no-seen-lakes belief (rendering it unjustified), but I wouldn’t have a believed defeater for my rock- and tree-beliefs (leaving them untouched). Those familiar with the history of counterfactual analyses in philosophy, though, should expect there to be an easily constructed counterexample—and there is. Consider the following modified version of the case. As before, I am hiking through the mountains, and justifiably believe that you can’t see any lakes from the peak on the basis of the testimony provided me by my guidebook, which I justifiably believe to be reliable. Before I reach the peak, I see a sign which tells hikers to enjoy their hike to the peak. I, however, malfunction and misinterpret the sign to be a warning to hikers to not proceed to the peak because of danger. Because of my misinterpretation of the sign, I do not proceed to the peak and so do not ever have the visual experience of a lake. Intuitively, I remain justified in believing that you can’t see any lakes from the peak—I never had the visual experience that would defeat that belief. According to the proposed counterfactual proper function condition, though, my belief is not justified, because if all my cognitive faculties had been working properly, I never would have misinterpreted the sign and would have continued on to the peak, had the visual experience of the lake, and would have a believed defeater for my no-seen-lakes belief. So it turns out that this counterfactual version of a proper function condition declares beliefs unjustified that aren’t. We could go on interminably proposing different and more specific counterfactuals, hoping to get one immune to counterexamples. 
Instead, though, we should learn a lesson from the history of counterfactual analyses and be rather less than optimistic as to the possibility of finding a successful counterfactual proper function condition in this case.14 The attempt to account for the effects of defeating experiences by modifying the proper function account therefore is likely doomed to failure.
14 For a seminal paper on the particular vulnerability of counterfactual analyses to counterexamples, see Shope (1978).
5 The third option: modifying the no-defeater condition
The other option is to modify or add to the no-defeater condition. Bergmann has expressed a willingness to do so. He recognizes that there may be some kinds of defeaters in addition to believed defeaters that can’t be handled by the proper function condition by itself (though he says he is not aware of any), and he is willing to add clauses to his analysis of justification that stipulate that such defeaters are absent.15 Perhaps he could do so in the case of defeating experiences. It is my contention that such a move would undercut the Reidian counterexample that Bergmann uses to motivate proper functionalism in the first place.
15 Bergmann (2006, p. 172).
First of all, it is important to notice that Bergmann can’t just add a clause that says “a belief is justified only if there are no defeaters for it.” The reason is that a defeater is simply defined (roughly) as a belief or experience that, if added to a cognitive situation, would render unjustified some other belief. Since the concept of justification is used essentially in the definition of a defeater, the concept of a defeater can’t be used irreducibly in an analysis of justification, or we get a viciously circular analysis.16 So the clauses which are added to account for should-be-believed defeaters need to refer to some feature of the threatening beliefs or experiences other than simply the fact that they are defeaters, in order to explain why they are defeaters (why they threaten justification). Recall that it is the proper function condition that allows justification to be indexed to design plans of species. The no-believed-defeater condition, since it is not covered by the proper function condition, expresses a necessary truth about justification—justification for a belief is necessarily incompatible with believing that belief is unjustified. This is necessary because it expresses a relation between the propositional contents of the defeater and the defeated belief—the propositional content of the defeater is that the defeated belief is unjustified, and this is why the defeater is a defeater. The clauses that Bergmann will have to adduce to handle should-be-believed defeaters will likewise have to express necessary truths about justification, since the proper function condition cannot handle such defeaters. They will therefore have to express relations between the propositional contents of the defeater and the defeated belief. But this undercuts the Reidian motivation for proper functionalism. Bergmann grants that some evidential relations are necessary; what he needs to motivate proper functionalism is there to be some evidential relations that are not necessary, that can be indexed to the contingent design plans of species.
To handle all the possible should-be-believed defeaters, though, he is going to have to adduce clauses that make necessary all sorts of evidential connections between the contents of experiences (like the visual lake-like experience) and the contents of beliefs that they defeat (like the belief that you can’t see lakes from the peak). Since every sort of belief to which evidence can plausibly have a contingent relation is also vulnerable to defeat, the necessity of the evidential relations that are relevant for defeat is going to undercut the idea that any evidential relations relevant to justification simpliciter are contingent.
16 It is important to notice that Bergmann does not fall into this vicious circularity in his own analysis, even though he does use the term “defeated” in his no-believed-defeater clause. The reason is that the clause can be cashed out without any mention of defeat—the clause simply says that I can’t be justified if I take my belief to be epistemically unjustified. This analyzes defeat in terms of the subject’s beliefs, not in terms of justification. See Bergmann (2006, pp. 168–169).
Consider the following example. If Bergmann adduces a clause independent of the proper function clause to account for the fact that my visual experience of a lake defeats my no-seen-lakes belief, that clause will establish a necessary connection between the content of my visual experience of a lake and the content of my beliefs involving lakes. And this clause, because it is independent of the proper function condition, will have to apply to all cognizers regardless of species or design plan. But consider species of cognizers who are designed to respond to lake-like visual experiences with a belief that there is a mountain (and who produce lake-beliefs in response to an experience of a different sort, perhaps an olfactory experience). Because the no-defeat clause expressing the evidential relation between lake-like visual experiences and lake-beliefs is not indexed to species like the proper function condition is, we are forced to conclude that this species’ lake-beliefs could be defeated by lake-like visual experiences, even though they are designed in such a way that their doxastic responses to lake-like visual experiences have nothing to do with lakes. Clearly, this is absurd. What this shows is that the evidential relations not indexed to species that Bergmann has to admit to handle should-be-believed defeaters undercut the whole notion of any sort of evidential relation being indexed to species—which is Bergmann’s stated motivation for proper functionalism. This strategy for modifying Bergmann’s analysis, therefore, fails as well.
6 The fourth option: attacking the case
The only option remaining for the proper functionalist is to somehow attack the problematic example and argue that the intuitions I have about it are mistaken. I’ll consider two ways that this could be done, and argue that neither is likely to succeed. First, if the proper functionalist could argue that sensations or experiences by themselves never provide evidence or defeat justification, then the case could be handled easily. If this were true, then if the lake-experience isn’t “registered” in a belief in the existence of a lake, the experience doesn’t in fact defeat the no-seen-lakes belief, and my argument can’t get off the ground. This would be a Reidian thing to argue—Reid thinks that belief is essential to perception—and given Bergmann’s Reidian leanings, he might be sympathetic. However, it is clear that experiences do provide evidence even in the absence of belief. Consider the following (rather complicated) case: I am wandering in New York City and see a hot dog stand. The hot dog stand is in fact there, and I do perceive it. However, I (irrationally) think it a mirage, and do not form the belief that there is a hot dog stand in front of me, because I am convinced that it is a mirage. I also believe that there are no hot dog stands in New York City.
My perception of a hot dog stand, though unaccompanied by a belief that there is a hot dog stand, does defeat my belief that there aren’t any hot dog stands in New York—it seems obvious to me that my belief that there aren’t any hot dog stands is unjustified, and it is unjustified because I am seeing one. So experiences unaccompanied by beliefs can defeat. I admit that there are some kinds of experiences unaccompanied by belief which don’t give defeating evidence. I simply insist that there are some which do. The difference, I suspect, has something to do with memory: the sorts of experiences which don’t give evidence are cases where we can’t remember what just passed through our visual field, whereas the kinds of cases I have in mind are cases where we can remember what we saw. I am inclined to tell a kind of neo-Reidian dual-component story about these two kinds of perception: experiences which can’t be remembered are cases where sensation only is present, whereas experiences which can be remembered involve sensation and some kind of cognitive state (other than belief) involved in perception, and it is the cognitive component which provides the evidence relevant for justification.17 However, I do not want to commit myself to any particular account of perception here. All I need is that there are some experiences unaccompanied by belief which provide defeating evidence, which I have already established.
17 To get a picture of the sort of non-doxastic cognitive state I am thinking of, consider the Wittgensteinian duck-rabbit. I can fix my eyes on the figure, and without changing my sensations (without moving my eyes) and without changing my beliefs (I believe all the time that it is both a duck and a rabbit), I can flip back and forth between seeing a duck and seeing a rabbit. This extra mental state, distinct from sensation and from belief, is the sort of thing I would suggest is the difference between my cases (where experiences give evidence) and cases where they don’t.
Second, the proper functionalist could try to argue that my lake-experience only defeats my no-seen-lakes belief under special circumstances, circumstances which won’t cause a problem for the proper function analysis.18 Perhaps the (unregistered in belief) lake-experience only defeats my no-seen-lakes belief if that belief actually is the reason that I don’t register the lake-experience—if my no-seen-lakes belief interferes with my formation of a lake-belief because I don’t expect to see a lake. In this case, the malfunctioning faculties may just be the faculties sustaining my no-seen-lakes belief, and the individuation problem is solved. If the reason I don’t respond doxastically to the lake-experience is something other than interference from the no-seen-lakes belief, so this line of response goes, then the lake experience in fact has no defeating power. Why, though, would the interference of my no-seen-lakes belief with my noticing my lake-experience count as a “malfunction” (which it must, to get the right result on the case) unless the lake-experience were already a defeater for the no-seen-lakes belief? It is precisely because my belief is causing me to ignore defeating evidence that the belief’s interference counts as a malfunction, and it follows from this that the experience must itself be a defeater. Therefore, there must be experiences unaccompanied by belief which defeat beliefs regardless of whether the defeated belief interferes with the doxastic response to the experience. I conclude that the case stands and that the proper function account of justification or warrant cannot account for the epistemic impact of experiences that should generate believed defeaters but don’t.
The only plausible modification of the proper function analysis of justification that can account for them undercuts the motivation for proper functionalism, the conviction that evidential relations should be indexed to the design plans of species.
References
Bergmann, M. (2006). Justification without awareness: A defense of epistemic externalism. Oxford: Oxford University Press.
Kvanvig, J. (1992). The intellectual virtues and the life of the mind: On the place of the virtues in contemporary epistemology. Savage, MD: Rowman & Littlefield.
Kvanvig, J. (2000). Zagzebski on justification. Philosophy and Phenomenological Research, 60, 191–196.
Kvanvig, J. (2003). Propositionalism and the perspectival character of justification. American Philosophical Quarterly, 40(1), 3–18.
Kvanvig, J. (2006). On denying a presupposition of Sellars’ problem: A defense of propositionalism. In C. de Almeida (Ed.), Veritas, 50(4), 173–190.
Kvanvig, J. (2007a). Two approaches to epistemic defeat. In D.-P. Baker (Ed.), Alvin Plantinga: Contemporary philosophy in focus (pp. 107–124). Cambridge: Cambridge University Press.
Kvanvig, J. (2007b). Propositionalism and the metaphysics of experience. In E. Sosa & E. Villanueva (Eds.), Philosophical issues, 17, 165–178.
Kvanvig, J., & Menzel, C. (1990). The basic notion of justification. Philosophical Studies, 59, 235–261.
Lackey, J., & Sosa, E. (Eds.). (2006). The epistemology of testimony. Oxford: Oxford University Press.
18 The outlines of the following argument were suggested to me in conversation by Evan Fales.
Plantinga, A. (1993a). Warrant: The current debate. Oxford: Oxford University Press.
Plantinga, A. (1993b). Warrant and proper function. Oxford: Oxford University Press.
Shope, R. K. (1978). The conditional fallacy in contemporary philosophy. The Journal of Philosophy, 75, 397–413.
Synthese (2011) 182:449–473 DOI 10.1007/s11229-010-9753-z
Does Kantian mental content externalism help metaphysical realists? Axel Mueller
Received: 4 March 2010 / Accepted: 26 May 2010 / Published online: 22 August 2010 © Springer Science+Business Media B.V. 2010
Abstract Standard interpretations of Kant’s transcendental idealism take it as a commitment to the view that the objects of cognition are structured or made by conditions imposed by the mind, and therefore to what Van Cleve calls “honest-to-God idealism”. Against this view, many more recent investigations of Kant’s theory of representation and cognitive significance have been able to show that Kant is committed to a certain form of Mental Content Externalism, and therefore to the realist view that the objects involved in experience and empirical knowledge are mind-independent particulars. Some of these recent interpreters have taken this result to demonstrate an internal incompatibility between Kant’s transcendental idealism and his own model of cognitive content and the environmental conditions of empirical knowledge. Against this suggestion, this article argues that, while Kant’s theory of content is indeed best construed as externalist, an adequately adjusted form of transcendental idealism is not only compatible with this externalism, but in fact supports it. More generally, the article develops the position that mental content externalism cannot force the adoption of metaphysical realism.
Keywords Externalism · Transcendental idealism · Mental content · Kant · Intuition · Appearance
A. Mueller, Northwestern University, Evanston, IL, USA
From the very moment that Kant proposed his critical method of examining the conditions and limits of empirical knowledge and his transcendental idealism as a conception of the objects of cognition fitting the conditions and limitations that this critical method identifies, interpreters have taken transcendental idealism as an expression of Kant’s commitment to the view that the objects of cognition are structured or made by conditions imposed by the mind, and therefore as Kant’s commitment to what Van Cleve calls “honest-to-God idealism”.1 Particularly one of Kant’s slogans—that we can know only appearances and cannot ever know things in themselves—served such interpretations as ample proof that Kant thinks that human cognition only reaches what things appear to be to us. In this paper, I will defend an interpretive strategy that shows against this tradition that the results of Kant’s theory of cognition and its contents are incompatible with traditional idealism, just as Kant thought. In doing so, I rely on the results of another line of Kant-scholarship, represented in the work of such scholars as Kemp-Smith, Brittan, Strawson, Guyer and others, who emphasize the anti-idealistic import of Kant’s theory of cognition. But whereas the mentioned approaches often felt forced to repudiate Kant’s transcendental idealism (TI) to the same extent that they endorse his theory of cognition, I will argue that Kant’s own TI is not only compatible with, but in fact supportive of his non-idealist account of the conditions of objective empirical knowledge.
1 “Nothing but appearances”: alleged tensions between MCE and TI
I will stake out the space for such a position by discussing one of the richest and most innovative recent readings of Kant’s critical philosophy, that of Kenneth Westphal.2 My reason for choosing this way is that Westphal’s interpretation on the one hand offers very powerful new arguments to demonstrate the commitment of Kant’s theory of cognition to realist presuppositions, but on the other follows the tradition of Kant-scholarship in which the anti-idealistic potential of Kant’s critical reconstruction of the conditions of experience, which issues in a theory of experience or an “inventory of empirical cognition”,3 is pitted against its purportedly idealistic self-understanding.
But Westphal’s proposed interpretation is more ambitious than most of the work in this tradition because he is not satisfied with presenting Kant’s critical philosophy as incoherent but pursues the strategy of an internal critique of Kant’s TI, that is, a critique that is based on the very resources of Kantian transcendental philosophy.4 As I said, TI is notorious for dismaying even sympathetic interpreters. Their dismay is precipitated by features of TI like Kant’s insistence that TI entails that ordinary objects are “nothing but appearances” and “only representations”5 because they are entities in space and time, both of which are said to be ‘transcendentally ideal’ and ‘in us’, while things in themselves are not determinately spatio-temporal and we consequently cannot know them, constrained as we are to experiencing only spatio-temporally structured entities as obtruding realities. On the assumption that these claims contrast with ordinary things’ being ‘real’, ‘actual’, or quite simply ‘things as they are’, this is indeed a view at odds with any sane—and Kant’s own—commonsense realism about objects of experience.
1 Van Cleve (1999, p. 14).
2 This reading is developed in detail in Westphal (2004). Further illuminating and relevant material can be found in Westphal (2005, 2003a,b).
3 I borrow this term from Bird (2006, pp. 28–29).
4 Westphal (2004, 4 et passim).
5 Cf., representatively, Kant (1996, A 492/B 520), in the following cited in the standard fashion as CPR.
In defense of the latter, Westphal concurs with Strawson, Stroud and many others in finding TI “repulsive”, deems TI outright “false”, and “aim[s] to dispense with” it.6 Consequently, he also endorses Guyer’s view that Kant’s most important insights do not depend on and are separable from TI.7 I want to focus on one particular way in which Westphal plays off the resources of Kant’s theory of cognition against TI to show the latter as untenable in light of the former. He extracts from Kant’s theory of representational content an irreducible commitment to the existence of and the necessary cognitive access to extra-mental particulars. Key for identifying this commitment is the view (shared by a number of recent interpreters in the wake of Sellars, like Hanna (2001, 2006a,b), Rosenberg (2005)) that Kant defends a kind of mental content externalism (MCE), i.e. the view that mental representations could not be contentful and have the content they do unless they and their users are systematically connected to extra-mental particulars. As Kant’s theory of representation constitutes an essential result of his transcendental reconstruction of the structure of empirical cognition, MCE has to count as an integral component of Kant’s transcendental philosophy. Westphal’s strategy against TI then unfolds as a defense of two claims: First, that proper attention to the method and claims of Kant’s analysis of the conditions of empirical cognition reveals, thanks to MCE, resources for “transcendental proofs for (not ‘from’)” realism.8 Second, Kant’s semantically generated realist commitments directly undermine the very repulsive doctrine of TI that Kant himself held as partly responsible for the success of his own arguments. Westphal says: “Kant proves that we perceive rather than merely imagine physical objects in space and time.
(…) [But] Kant’s proof succeeds in ways, and to an extent, that even Kant did not appreciate. (…) Indeed, parts of Kant’s proof refute his key arguments for transcendental idealism.”9 The upshot is Westphal’s general claim that the kind of realism contained in the most important parts of Kant’s analysis of cognition, MCE, is strictly incompatible with TI and empirical realism (ER) as both positions need to be construed by Kant.10 In consequence, Westphal more generally suggests that adopting MCE forces a realism that is stronger than ER, i.e. a more ‘metaphysical’ or ‘transcendental’ realism, which he calls “realism sans phrase”. In the following, I grant without criticism Westphal’s first claim that Kant’s semantics for mental representations as presented in his transcendental analysis of the conditions of cognition is a form of externalism that entails a certain form of realism (§2.1). I will defend this view with a new argument that Westphal has not made, which lends decisive support to MCE directly from Kant’s transcendental reflection 6 Westphal (2005, p. 321, fn 37). 7 Westphal (2003a, p. 157, fn 45); cf. e.g., Guyer (1987, p. 335). 8 Westphal (2006, p. 785/806). 9 Westphal (2006, p. 782). He puts the point more strongly in (2003b, p. 160): “A sound version of the
standard objection to Kant’s arguments for transcendental idealism (…) can be deduced from Kant’s own principles and analysis in the first Critique.” 10 Westphal (2006, p. 802), speaks of an “unqualified realism about molar objects in our environs (…) not
some transcendentally qualified, merely ‘empirical’ realism.”
Synthese (2011) 182:449–473
on requirements for the contentfulness of representations. This argument will help to bring out what exactly the metaphysical requirements of MCE are (§2.2). I will then argue against Westphal’s second claim by sketching a methodology-centered version of the requirements of TI, in particular of the distinction between appearances and things in themselves (§3). It will turn out that the objects satisfying the requirements of MCE can simultaneously satisfy the requirements of methodological TI (§4). This shows that Kant’s own TI is compatible with MCE, and therefore that Westphal’s second claim is incorrect. I also briefly argue for the additional claim that the possible world in which MCE and TI/ER are compatible is relevantly similar to the commonsense world of sensorily detected objects of everyday experience and scientific knowledge (§5). But then much of the warrant for the general claim that being an externalist about mental content forces being a metaphysical or more-than-empirical realist is also undermined. It thus seems to me that the import of Westphal-style arguments is more limited but nonetheless important. They show that Kant’s theory of cognition is incompatible with what Collins calls idealist readings of TI,11 i.e. interpretations that saddle Kant with traditional idealist preconceptions by ‘mentalizing’ the objects to which we are related in experience.12 In the terms of Bird’s recent study,13 such readings tend to underestimate or overlook the revolutionary character of Kant’s externalist theory of cognition and its objects, including the meta-theory, TI, the combination of which provides an alternative picture to both traditionally internalist conceptions of cognitive content and traditionally idealist conceptions of the objects of cognition.
The contention that accepting TI entails regarding the objects of experience as mind-dependent in a problematic way (which seems to be taken for granted in Westphal’s general claim, too) thus seems to be forced by traditionalist interpretive background assumptions rather than by Kant’s theory of cognition itself.14 I hope to show by my argument that, once Kant’s claims about ER(TI) are properly embedded in the context of Kant’s externalist theory of experience and representation, Kant’s own ER-conception of objects of experience (‘appearances’) is anti-epistemic (or ‘realist’) enough to adequately characterize the particulars required by Kant’s transcendental analysis of cognition and its externalist conception of content. In fact, Kant’s ER actually can then be seen as an attractive proposal for externalists who find metaphysical realism as unattractive as
11 See Collins (1996). In Graham Bird’s fitting term, this interpretive tendency can be described as ascribing a “traditionalist” project to Kant, particularly including his TI, as opposed to the “revolutionary” one that commentators like Bird and Collins see Kant as pursuing (see Bird 2006, pp. 15–18). As will become clear, I side with the latter, against Westphal’s bifurcation between ascribing a revolutionary strategy to Kant’s theory of cognition, and a ‘traditionalist’ tendency to his metatheory, TI.
12 Allais (2003) uses the term ‘mentalization’ in this apt way to describe an idealist understanding of the
objects of experience, i.e. appearances, which she rejects. Westphal, however, would say that the illicitness of mentalizing the objects we are related to in experience, hence via sensation, shows that they are not (merely) appearances but (also?) things in themselves (where Westphal assumes the standard, ‘mentalized’ reading of ‘appearance’). Both would agree that ‘mentalizing’ the objects involved in experience is illicit because of the role of extra-mental elements in cognition and thought. For a decidedly externalist interpretation of ‘appearance’, see Collins (1996). 13 See Bird (2006, pp. 15–18). 14 My proposal here has similarities with that found in the literature in Strawson’s or Bird’s interpretations,
but also in the appropriation of Kant in, e.g., the pragmatist tradition.
traditional idealism. This allows a more general lesson, namely that MCE does not require acceptance of overly ambitious metaphysical forms of realism.15

2 For MCE: the argument from cognitive reference and its place within TI

One of the arguments outlined by Westphal to the effect that Kant provides “transcendental proofs for (not ‘from’)” realism could be called the argument from cognitive reference16 (or from MCE). It proceeds from the observation that Kant’s theory of content—epitomized in the famous slogan that concepts without intuitions are empty, while intuitions without concepts are blind—essentially requires that the subjects entertaining representations be in cognitive contact with extra-mental particulars for representations to be determinable in content and to be differentiable according to relations of content (sameness and difference). This follows from Kant’s account of the referential properties of intuitions (particularly empirical intuitions, i.e. perceptions) and their pervasive cognitive functions. Differences in cognitive content, according to Kant, can be retraced to possible differences in the subject matter of judgment, and differences in subject matter ultimately require differences in intuition-based or referential relations established by demonstrative or other indexical means that involve sensations. The latter, in turn, only occur as a consequence of contacts between cognizers and extra-mental environs, so that differences in subject matter ultimately require cognitive contact via sensations with extra-mental particulars. Thus, the externalism in Kant’s theory of cognition does not follow from intuitions (means of singular reference) per se, but from the combined theses that our capacity for intuitions is essentially receptive and that their particular subject matter has to come, as Westphal puts it, ab extra.
Kant’s theory of cognition thus becomes externalism by linking a basically semantic doctrine—that all differences in content (not ‘meaning’) are to be traced back to differences in referential relations of representations to particulars other than themselves—to a doctrine of cognitive contact between cognizers and extra-mental particulars (which Westphal terms Kant’s “sensationism”17), which specifies the kind of entities that empirical intuitions refer to. According to MCE, there are no differences in cognitive content (not even among the categories, i.e. a priori concepts18) without differences in some relation of representations to extra-mental particulars. Since without differences in content, no mental state could count as a differentiable representation, and without such differences of representational
15 For the opposite view, cf. Goldberg (forthcoming).
16 Westphal (2006, pp. 783–785, continued for concepts at pp. 797–799).
17 Following George (1981).
18 The extraordinary and mostly overlooked way in which Kant claims a referential element in the determination of truth-conditions for judgments is that the very content of concepts (i.e. possible predicates) remains indeterminate unless it encompasses actual intuitional references to objects the words expressing them refer to (i.e. of parts of their ‘extension’). A passage that can count as programmatic of this, but is seldom so taken, is the following: “the object cannot be given to a concept otherwise than in intuition; and if a pure intuition is possible (…) still this pure intuition itself also can acquire its object (…) only through empirical intuition, whose mere form [as opposed to matter] the pure intuition is. Therefore all concepts, however possible they may be a priori, refer nonetheless to empirical intuitions, i.e. to data for possible experience. Without this reference, they (…) are mere play” (CPR, A239/B298).
value among mental states, there’d be no synthetic activity of cognition, and without such synthetic activity of cognition, there’d be no self-consciousness,19 the conditions of cognitive differentiability according to content among mental states (MCE) are conditions of self-conscious cognition, hence of experience, and therefore enjoy transcendental status.20 Since MCE requires cognitive contact with extra-mental particulars and is a transcendental condition, it is a consequence of Kant’s theory of cognitive representation that (a) there are not only mental entities, and that (b) we are, by virtue of being self-conscious thinkers, in cognitive contact with some such extra-mental particulars. Realism about extra-mental particulars is thereby transcendentally vindicated. More generally, it follows that, contrary to defenses of TI that infer the epistemic nature of a condition of experience from its transcendentality (like Allison’s), (c) not all transcendental conditions are purely formal, or mind-contributed or even subjective elements of cognition. Global anti-realism with regard to transcendental conditions is thereby undermined. According to this, MCE conflicts with TI insofar as the latter implies global anti-realism with regard to transcendental conditions.

2.1 MCE vs. TI: an idealist rejoinder

At this point, we face an obvious objection: if Kant indeed developed his theory of cognition assuming MCE, and if MCE indeed is incompatible with TI, why do we not find any sign of doubt about either in Kant’s work? The fact that Westphal’s critique is internal bears on this question. Most of the work in Westphal’s proofs of content externalism is done by Kant’s own insistence on the ab extra character of the matter of sensation and therefore the objects underlying perception.
This insistence also forms the backbone for his rejection of all the arguments he sees at work in favor of an idealist version of TI in Kant himself.21 As Westphal brilliantly formulates it, “all these arguments are invalid. The reason is the same in each case: If the matter of sensation is given us ab extra (this too defines Kant’s transcendental idealism), then ex hypothesi we cannot generate its content.”22 Now, we clearly get the ab extra insight from MCE, but it is also itself the result of a transcendental investigation. In being ab extra, the
19 With regard to the dependency of self-consciousness on differences in content, cf. CPR: “only because I can combine a manifold of given presentations in one consciousness is it possible for me to present the identity itself of the consciousness in these presentations” (CPR, B133). This means that we can only realize the identity through various tokenings of ‘I’ that accompany each individual awareness of each presentation as something over and above an aspect of each of these presentations themselves if the content of the latter is not continually the same, whereas the content of ‘I’ that takes them up is taken to be the same. With regard to the dependency of self-consciousness on the extra-mental conditions of differences in content, cf. CPR: “I distinguish my own existence, as that of a thinking being, from other things outside me—this is likewise an analytic proposition. (…) But from this I do not in any way know whether this consciousness of myself is possible without things outside me whereby presentations are given to me, and hence whether I can exist merely as a thinking being (i.e. without being human).” (CPR, B409, emphasis added)
20 Westphal (2006, pp. 794–796).
21 A kindred line of argument is followed by Robert Hanna in his (2006b), where objects of experience are
construed as triply constrained by conditions of sensibility, namely by space and time, as well as “affection” (cf. p. 20ff.), and the latter is seen as an additional, non-formal transcendental condition. 22 Westphal (2005, pp. 321–322).
objects sensations respond to are portrayed by Kant as clearly not mind-dependent; as Kant says, whatever sensations respond to is the “matter (or the things themselves as they appear).”23 This insistence on the centrality of sensations for differences in cognitive content, and the doctrine of receptivity according to which sensations are not mentally produced but externally stimulated representations goes some way toward forestalling an idealist re-interpretation of the indispensability of singular, intuitive reference for cognitive determinacy in the form of saying, for example, that the particulars in question could very well be independent of the representation at hand, while still remaining a (different) mental entity. For, this response now would have to reduce all sensations and the mechanism of their differentiation to inner sense, something clearly regarded as neither possible nor attractive by Kant, as particularly the clarifying Refutation of Idealism and the elements of Kant’s transcendental inventory it uses (such as the transcendental deduction, large parts of the Aesthetic) display. At any rate, it is clear that the ab extra character of the objects underlying sensations is at the same time, in being shown necessary for the determinacy of mental content and thus experience, vindicated by Kant as part of our transcendental equipment. 
Their latter status, and Kant’s answer to a possible Berkeleyan hostile takeover, is further supported by the fact that, in being required for outer sensations, they are required for the realization of outer sense, without which, according to the Analogies of Experience, there would be no subjective time order, another condition of outer and inner self-conscious experience.24 Finally, in being required for the existence of outer sensations, and because without the latter, no intuition would have any determinate empirical content, they are what representations that essentially involve sensations are about, and thus ultimately, the objects of experience, i.e. of judgments that essentially involve sensations. As Westphal’s own remark indicates, the need for ab extra referents of sensation and the indispensability of objects for outer sense (i.e., according to Kant, referents spatially distinct from the location of the mind) accruing from MCE is, for these and more reasons, one integral moment of Kant’s very own TI. Since in the ultimate instance, they cannot be characterized as other than mind-independent, MCE and idealist readings of TI—which claim that the objects of experience are conceived by Kant to be mind-constituted—are indeed prima facie incompatible. According to Westphal, Kant or idealist defenders of TI overlooked this tension due to a confusion of the trivially recognition-dependent fact that we could not recognize thought as self-conscious experience without assuming that a certain condition holds, with the possibly mind-independent nature of the circumstances satisfying that condition. By confusing the transcendentality of a given condition with its subjectivity, they illicitly but unwittingly came to lump together mind-dependent and mind-independent conditions.
23 CPR, A268/B324.
24 This point is forcefully and convincingly argued in Westphal (2004, pp. 29–31). Taken together with the
corresponding analysis of the three Analogies (ibid., pp. 146–166), this indicates that Kant’s transcendental system in the CPR allows the construal of the main premise of the Refutation of Idealism, which thus, pace Guyer, cannot be taken as a crucial but otherwise unentailed substantive addition to the transcendental system of the CPR, but should rather be seen as a crucial clarification of the whole revolutionary import of the system vis-à-vis Cartesian conceptions of the mind, the traditional mind-world dualism and all the problems associated with both.
However, it seems that an idealist defender of TI could turn the tables on Westphal and argue that one also ought not to confuse externality and mind-independence. For example, defending and stating Kant’s TI including MCE might require an idealist conception of the objects of cognition. For, Kant’s MCE as presented so far could be construed as compatible with saying that the individuals that are empirically accessed through intuitions involving sensations must be, transcendentally viewed, fully conceptually determinable in order to determine the objects of cognition that are capable of being ‘known’ and of acting on (or registered by) our senses as such individuals. Such a view would claim that Kant’s TI suggests that, while the referents of each intuitive referential act appear to us as individuals, a condition of their being individuals, or of asserting truly that they are individuals is their transcendental identifiability through concepts. The idea would be that there can be no reference to particulars unless they are recognized as the individuals they are.25 In this case, little would be won by pointing to MCE, since the referents that empirically (i.e. at the level of sensation) appear as ab extra are not entirely ab extra things from the transcendental point of view, because their ontological individuation depends on their conceptual individuation. Thus, even if MCE could be granted as part of TI, that would not show that TI does not portray the objects of cognition and those of intuitive and sensation-dependent reference as importantly mind-dependently constituted.

2.2 A Kantian response to the rejoinder: MCE’s need for mind-independently individuated particulars

Fortunately, this is not Kant’s view. Adequately placed in Kant’s specifically semantic analysis of intuitional reference, we can find a supplementary argument that excludes this rejoinder. Westphal mentions the point several times but does not attribute it to Kant or develop it.
Kant’s argument establishes that, if there is so much as determinate reference to particulars or individuals, then the objects of reference cannot be determinate in virtue of any conceptual or descriptive conditions as the individuals they are when successfully referred to, but they have to be seen as irreducible individual things. This is a transcendental reflection on the conditions of possibly determinate or successful reference to individuals, which is required by the semantics of intuitions.26 Its result is the requirement that the universe of discourse for intuitive reference must contain determinate individuals. The tendency of the idealist rejoinder is to take for granted that we have to answer the question as to what or who does the individuating of entities that it is we who individuate (either by conceptually identifying or by identifying via sortal identity),27 given Kant’s agnosticism about knowledge of 25 Such a view seems to be at work, e.g., in Strawson’s influential interpretation according to whose semantics nothing can be referred to as an individual unless it is verifiable that it is an individual that settles the question “which of all?”. 26 Very clear on this point is Rosenberg (2005, pp. 83–87). 27 Thus, I include, in the rejoinder, as much descriptionist views that require identifying knowledge of a
definite description in order for us to be in a position that warrants assuming the existence of individuals as weaker views like those inspired by Peter Geach’s or David Wiggins’ work that require knowledge of, or at least preparedness of applying a sortal concept. The main problem is the same for both versions of the
(even at the transcendental level) things in themselves and their identity conditions. This seems to invite understanding him as saying that, if it is not things in themselves that self-individuate, then it has to be us. However, according to the argument needed at this juncture, and given by Kant as I will eventually explicate it, the corresponding referents are quite simply individuals on account of what they are, no matter whether anyone could descriptively (or sortally) individuate them or, for that matter, no matter whether anyone would think they are individuals. We are confronted with a piece of the metaphysical underpinnings or background-conditions consciously taken for granted by—or even excavated through—Kant’s epistemology (his theory of experience),28 not with a further piece of his epistemology. This background-condition is indeed ‘transcendental’ insofar as it is necessary for experience and its enabling distinction between mere appearances and how things are, but it is not merely formal, since it concerns a set of material particulars as objects of sensory interaction, not, as the categories, a set of structures the sensory realization of which might have remained merely possible but not actual (but, given experience, happily can be proven to be necessarily actual). The reason Kant gives for the irreducibility of this background-condition to any exercise of our spontaneous conceptual abilities is that whatever concept-aided cognitive means we would try to make responsible for their individuality would not suffice for their actually being particular individuals because of the essential generality of concepts. But it is only actually existing things that provide the particulars required by and taken for granted in successful acts of intuitive reference. The actual existence of particulars to refer to in intuitive reference is therefore mind-independent. 
Kant simply puts the answer to the apparently damaging question where individuation comes from to one side because it can be seen, in the context of the problem of singular reference, as a red herring. He replaces it with an account of the conditions of singular reference required by the semantics of intuitions. His deflationist suggestion is that it is simply one and the same thing to put the difference between intuitions and concepts on a semantically sound basis and to assume mind-independent individuals. We could say that, according to the interpretation here proposed, the fact that intuitive reference is reference to individuals merely exploits the existence of things the individuality (i.e. availability as particulars) of which is not owed to any determinative activity by any mind. It is only given the assumption of such objects of experience that we can expect the success of individuative practices, that is, of identifying descriptive knowledge and the applicability of (often various and multiple) concepts of sortal
Footnote 27 continued rejoinder: to explain how objects the assumption of which depends on an epistemic fact like the knowledge of a description or the belief in a sortal identity can qualify as mind-independent in the sense required by externalism. If in the following I concentrate on descriptionist versions of the rejoinder, this is for reasons of perspicuity and assuming that analogous problems arise, mutatis mutandis, for sortal views as well. (I thank Quassim Cassam for indicating the need for this specifying remark.)
28 I take this term for quasi-transcendental states of affairs in the sense of Cassam (2007, pp. 40–41), while in contrast to Cassam (ibid., pp. 124–125), I attribute to Kant himself insight in the indispensability and inevitability of exploiting such ‘realist’ conditions (i.e. such that crucially involve employment of mind-independent circumstances and entities) as resources in epistemology, and thus do not use reference to such conditions as an occasion to criticize Kant’s approach.
identity. Likewise, given such (possible) individuals that our epistemic practices can exploit—that is, objects amenable to our individuative practices and intuitive references—we can explain the success of these practices. In other words, the spirit of the rejoinder gets things characteristically in reverse order.

2.3 “Thoroughgoing determination”: the irreducibility of given particulars to results of cognitive operations

The starting point of Kant’s argument is a critique of the idea that it might be possible, from the point of view of a fully complete, conceptually articulated but intuition-free, absolute and complete representation of the world (i.e. a representation that could be what it is and mean what it does irrespective of whether and how we ever might have contact with extra-representational objects), to individuate anything as a distinct, particular referent. This starting point recommends itself because if this idea can be shown to be flawed, then any less perfect, intuition-free description will not be eligible as supplying a means of successful individual reference either. According to Kant’s criticism, the mentioned idea rests on illicitly attributing properties of things, namely being ‘thoroughgoingly determined’,29 to mental representations. In his remarks on the margins of the first edition of the CPR, Kant succinctly expresses the strong point “against idealism” precisely in this way: “That which is determined in time and space is actual. […] That which exists, thus in other things outside our thoughts, is thoroughly determined.”30 Kant’s aim here is to demonstrate that if referential access to particulars, i.e. thoroughgoingly determined objects, is nonetheless possible, then it must be irreducible to intuition-free descriptive conditions because the idea of an aintuitional thoroughgoingly determinative representation does not cohere with what concepts can do (generalize, not select or uniquely pick out).
The clearest statement of this irreducibility of referential access to particulars to attributive, conceptually facilitated reference can be found in §§11–15 of Kant’s Logic (Jäsche). Here, Kant notes that (1) any description that in fact applies only to one thing can apply to more than one thing in other possible circumstances, due to the fact that concepts are essentially general means of reference, and (2) any object that is specified by some description and in fact, under some circumstances, sufficiently individuated by this description, may no longer be sufficiently individuated by this same description when other features become relevant that apply to more objects than the described one. Therefore, descriptive or otherwise concept-dependent individuation (and reference to particulars derived from it) is arbitrarily expandable and never ‘complete’. For both reasons, referring to individuals is only possible by means of direct, i.e. not conceptually mediated, means of reference. According to Kant, it is “only particular things or individuals that are thoroughgoingly determined”31 (§15),
29 Kant classifies this assumption as a transcendental material presupposition “of the matter for all possibility (…) that is to contain the data for the particular possibility of every thing.” (A573/B601)
30 Refl, E XCII, p. 36; 23:32, and Refl, E XCIV, p. 36; 23:32 (quoted according to Kant (1998, p. 322);
emphasis added). 31 Kant (1968, §15, A155).
not concepts, because “a lowest concept (…) is impossible to determine” (§11), such that “even when we have a concept that we apply to individuals immediately, it is still possible that with regard to it [the individual] there remain specific differences that we either do not notice or leave aside. It is only comparatively (…) that there are lowest concepts that, as it were, have acquired this meaning by convention” (ibid.). Therefore, “there are only thoroughgoingly determined cognitions as intuitions, but not as concepts; regarding the latter, logical determination can never be considered accomplished” (§15). These remarks are extremely consequential. For one thing, since it is only things and all existing things,32 but not concepts or conceptual cognitions that are thoroughgoingly determined, reference to individuals is importantly non-epistemic, since no descriptive or otherwise concept-dependent conditions possessed by a thinker are sufficient for the fact that her representations refer to a given individual. An example Kant uses to demonstrate the irreducibility of spatio-temporal conditions of demonstrative reference to conceptual conditions of identification can be modified to illustrate the point. When we designate the same actual raindrop as ‘this raindrop’ or ‘the raindrop left of the tree’, the referent of the latter can always be said to possibly not have been anywhere (in possible worlds where there’s no raindrop left of the tree) while the former cannot be said to possibly not have been there without a breakdown in reference.33 The truth-conditional contribution of descriptions and directly referring intuitions is thus, according to Kant’s semantics, dramatically different. In particular, this supports the further point that the truth-conditions or propositions expressed in truth-evaluable judgments about individuals cannot be specified without the things themselves.
In first-order language, this means that, similar to the views of Kaplan or Perry, for a judgment to be correctly considered to be about particulars, the things referred to, not identifying descriptions thereof, or sortal identity conditions, have to be part of what is expressed in the judgment, or of its content.34 The semantic value of the corresponding representation-types (intuitions) is the object of reference accessed in their tokenings. This means, in turn, that judgments about them, which are specific ways of representing and therefore appearances, contain the intuitional referents themselves. Accordingly, at least these appearances (propositions) are not mental entities but composite entities consisting of mind-related and non-epistemic,
32 CPR, A573/B601.
33 For this example, cf. CPR, A372/B328.
34 In associating Kant’s emphasis on the central importance, and the genuine irreducibility of conditions of (intuitional) reference to particulars with recent developments of ‘direct reference’-approaches to truth-conditional semantics, I am not only for the sake of the argument agreeing with Westphal’s own sympathies. I am also cautiously endorsing what Hanna calls “cognitive-semantic” (Hanna 2001, passim; Hanna 2006a,b, p. 7) approaches to Kant. Their strongest point seems to be the attempt to explicate the role of intuitions in Kant’s epistemology in terms of his awareness of the need for a thorough semantic analysis of the conditions of truth-aptness for propositionally structured and empirically contentful cognitions (judgments) and their anchoring in conditions of singular object-reference, a connection pioneeringly explored and related to recent developments in semantics by Thompson (1972–1973), Howell (1973), Hanna (2001, 2006a,b), as well as Willaschek (1997), and investigated in its relation to Kant since the 1960s by Hintikka, Parsons and Bird. More recently, Schönrich (2003) combines a recognition of the central role of singular reference and the importance of Kant’s semantics with a Peircean, internalist view of semantics. For an explicit rejection of attributing semantic views to Kant, cf. Waxman (2005, pp. 100–110).
extra-mental components.35 Kant calls the latter the matter of appearance and speaks of it as “the real in appearance (what corresponds to sensation)”, which he explicitly specifies as “matter (or the things themselves as they appear).”36 According to Kant, the matter for judgments (as for any other contentful presentations) is not produced or dependent on any of the mental or doxastic operations presupposed in judging, but it “must be given, for without being given it could in no way even be thought, and hence its possibility could not be presented.”37 Kant suggests not only that successfully referring to individuals (i.e. thoroughgoingly determined objects) is possible prior to conceptualizing them,38 but more importantly also that being able to so much as represent a certain individual in some circumstance of application as satisfying a description presupposes accessing (i.e. referring to) this very individual by means that are not constituted by the successful use of descriptions or any other mental or doxastic operations.39 It is important that this does not mean that intuitive access to such particulars would have to be construed by Kant as not requiring further conditions or as being, as it were, presuppositionless or background-free.40 On the contrary, Kant leaves no doubt that he thinks that, e.g., in perception,
35 In putting things like this, I side, as does Westphal (2004, 60 fn 42), with what Howell (1992) has characterized as an ‘appearing theory’ of appearances (Howell 1992, pp. 36–40; 347 fn 18, 347 fn 19). However, I disagree with Howell’s contention (ibid., p. 41) that appearing theories require a ‘two-realms’ view of appearances and things in themselves. First, because Kant is committed to the composite nature of appearances (cf. Brandt 1998, p. 85), and second, because it is far from clear that the alleged disjunction between a ‘two-realms’ and a ‘two aspect’ construal of Kant’s multiple use of the contrast between things in themselves and appearances is exhaustive, or even only whether its disjuncts are uniquely and adequately related to Kant’s varying purposes and contextual specifications of the contrast (cf. Willaschek 1998, 2001).
36 CPR, A268/B324.
37 CPR, A581/B609. On account of his semantics, Kant affirms here generally that appearances, insofar
as they are contentful representations, are not mental entities. Kant reaffirms this later: “in appearance, through which all objects are given to us, there are two components: the form of intuition (space and time) (…) and the matter (the physical) or content, which signifies a something encountered in space and time and hence a something containing an existence and corresponding to sensation” (CPR, A723/B751, emphasis added) One of the few commentators to have fully acknowledged this is Collins (1996, pp. 143–152, esp. 144). Melnick (2004) considers it as part of Kant’s theory of representation that we might find reason not to think of representations as purely mental affairs with no spatially distal components (p. 149). Similar ideas have been put forward in McDowell (1994). I will come back to this complex below, in §2.4. 38 CPR, B132. 39 Metalinguistically, Kant’s point can be summarized by saying that characterizing the range of reference
of the description through possible worlds requires referential access to the individuals in these possible worlds first, to see then, second, whether or not the satisfier in a possible world w is the same thing as satisfier in world w . In still other terms: in order to trace lines of trans-world-identity, we need standard naming devices that refer to the same thing across possible worlds, no matter what description they satisfy in these worlds, respectively. 40 In the latter formulation, I am siding with the view of Cassam (2007, pp. 40–41) of the enabling conditions of, e.g., perceptual reference to environing particulars as given cognitive background-conditions that do not determine in and of themselves any particular content but are nonetheless needed for yielding determinate results on occasions of an encounter. If, for example, the relevant background condition were to be a somewhat developed system of concepts with their rules of application to individual outputs of the sensory system, saying that the system constitutes a background condition but not a determiner of contents means that there is, given the system, for each such output some way of generating a full-fledged truth-apt claim about objects of experience, while what claim this is, and what objects will figure in it as referents is not entailed by the system and the output of the sensory system alone.
123
Synthese (2011) 182:449–473
461
certain spatio-temporal relations between the perceiver and the object need to be in place,41 as well as passing a certain threshold by the object to be noticed and a certain attention on the part of the perceiver, among others. But, and this is Kant’s point, it is not the description of space and time or a conception of the other conditions, or the perceiver’s being in cognitive command of these conditions, or even only the perceiver’s possessing the requisite concepts for the construal or determination of these conditions that could make the reference successful and the thing appear as it in fact does to the perceiver, but the (a-epistemic, non-doxastic, non-mental) fact that the thing and the perceiver are under these conditions.
2.4 MCE defended = Transcendental realism?
Before embedding the upshot of the argument in my inquiry into the compatibility of MCE and TI, it seems to me worth answering one worry that a traditional idealist reader of TI might voice at this juncture, a worry that is, ironically, exactly what Kant-interpreters like Westphal regard as proving the point that Kant’s MCE forces acceptance of a form of realism stronger than ER. The worry is that the particulars invoked in Kant’s argument seem to be postulated in quite a direct way as metaphysically necessary denizens of the universe of experience. Since they are, moreover, said to be available as particulars of experience without prior individuative cognitive activity but nonetheless necessary for self-conscious experience, while we can only know of them through application of our apparatus of individuation, this postulate seems to be a clear case of a postulate of transcendental realism.
I do not think that this worry is well-motivated. Kant’s defense of mind- and description-independent particulars in the argument developed here is derived directly from an analysis of the distinctive and fundamentally different semantic functions that intuitions and concepts perform and the corresponding requirements on a universe of discourse accruing from these semantic structures. The argument builds the case for extra-mental particulars in three steps. Since first, intuitions are not definable or substitutable by either definite descriptions or purported conceptually enriched identifying relations, and second, sensations are occasion-sensitive, not generalizable and object-dependent items within empirical intuitions, and third, more generally, reference to particulars via intuitions is not reducible to conceptual operations of any kind (i.e. the semantic phenomena reference and discursive meaningfulness are distinct), it follows that, if we are capable of cognitive operations on particulars, this is possible only because over and above the semantic, epistemological and intentional conditions mentioned in the three steps, such particulars are in fact available to thought, not from it, and we have the means of contact with them. While clearly performing a metaphysical task, Kant’s three-step argument does not need to claim special metaphysical knowledge of how objects in general are, independent of the structure of our experience. For this reconstruction of a material transcendental condition of experience only uses materials that are accessible to any user of Kant’s conceptual apparatus for the explication of the semantics of mental representations that are needed by an organism that is at the same time sensitive to changes in its environment and capable of learning from experience and of organizing the resulting information in conceptually articulate cognitive systems. That is, his argument does not leave behind the reflection on ‘our way of cognizing’ or, to put it differently, on the conditions accessible to and exploited by experience. One could say that the indispensability of sensorily available mind-independent particulars is an aspect of Kant’s semantic analysis, in particular of his clear distinction between intuitions and concepts. Kant thus converts the resolution of a metaphysical question, whether there are particulars, into one about the irreducibility of semantic mechanisms, namely the irreducibility of determinate reference to conceptual operations. From here it is a short step to endorsing the irreducibility of the referents of sensation-based experiential claims to concept-dependent constructs, and thus the rejection of traditional idealism as the theory of objects of experience.
The argument thus does seem to follow Kant’s methodological precept to develop any general philosophical claims from a reflection on the (semantic, epistemological, logical) conditions of experience, not from putative reaches beyond experience. Equipping MCE successfully with the mind-independent particulars it requires thus does not demand our conversion to transcendental realism. On the contrary, precisely against the background of this argument, typical passages in Kant’s explanation of the possibility of distinct content can be seen to express an explicit commitment to MCE. Such a commitment becomes explicit when Kant says: “our kind of intuition is dependent on the existence of the object, and hence is possible only by the object’s affecting the subject’s capacity to present”,42 and specifies the requirements of distinct mental content with the help of this as follows: “the object cannot be given to a concept otherwise than in intuition; and if a pure intuition is possible (…) still this pure intuition itself also can acquire its object (…) only through empirical intuition, whose mere form [as opposed to matter] the pure intuition is. Therefore all concepts, however possible they may be a priori, refer nonetheless to empirical intuitions, i.e. to data for possible experience. Without this reference, they (…) are mere play.”43 Passages such as these, taken together with Kant’s irreducibility claims, articulate with precision the requirements on objects flowing from the acceptance of MCE.
41 Cf. the joint product of CPR, A263/B319, where the difference in locations is presented as a ‘sufficient basis for the numerical difference’ between otherwise sensorily indistinguishable objects, plus CPR, A272/B328, where Kant presents difference and sameness of location as a necessary condition of ‘plurality and distinction’ between objects, and finally CPR, A282/B338, where he says that locations are “conditions of the intuition wherein the object (…) is given (…) although these conditions do not belong to the concept, they belong to all sensibility”. Taken together, these remarks make clear that a thing’s being at a suitable spatio-temporal location to be accessed and picked out by a human intuition is a non-conceptual transcendental condition of any object’s being given in intuition at all. There is nothing mysterious about this kind of general condition pertaining to all possible successful exercises of sensibility that is nonetheless, in spite of its generality, not of a conceptual nature or constituted by concepts. Kant here describes simply a contextual constraint on successful reference with means of singular, direct reference: that they only acquire a determinate content (= object as semantical value) in circumstances in which the thinker or perceiver and the object are adequately spatio-temporally related.
42 CPR, B72.
43 CPR, A239/B298.
According to this view of Kant regarding the possibility of distinct mental contents, there are things (as opposed to ‘mere representations’) required for the intuitional components of all mental contents to achieve being so much as contentful, and they have the following features:
(MCEa) they are mind-independently individuated,
(MCEb) extra-mental,
(MCEc) spatio-temporally accessible,
(MCEd) actual particulars.
3 Transcendental Idealism, traditional and other versions
I now want to examine whether entities with these characteristics can satisfy essential constraints that must be accepted by any form of TI. In this examination, I take Kant’s identification of TI and ER for granted. This allows me to answer two questions at once, namely whether MCE is incompatible with TI, and whether MCE requires a realism stronger than ER. Recall that I already agreed with, and gave additional arguments for, the contention that MCE is indeed incompatible with idealist readings of TI. But if there is a plausible non-idealist construal of objects that simultaneously satisfy TI and MCE, Westphal’s more ambitious (and more damaging) claim that any acceptance of TI is ruled out by MCE is false. In light of the equivalence of TI and ER, finding such a construal of objects would likewise allow us to question the warrant for his still more general third claim that accepting MCE (in Kant or elsewhere) requires a realism stronger than ER. As to the constraints that an account has to satisfy to qualify as TI, I expand a proposal recently developed by Lucy Allais44 (partly building on Langton 1998) and require with her that a position, in order to count as a minimally faithful version of TI, has to contain
(TIa) the distinction between appearances and things in themselves,
(TIb) Kant’s humility or ‘critical agnosticism’45 (that we can’t know things as they are in themselves),
(TIc) a minimal idealism (that appearances cannot be characterized entirely mind-independently).
In addition, I would add two commitments that we could call constraints of representational objectivity:
(TId) the distinction within the realm of experience between mere appearances, appearances, and things as they are,46 and
(TIe) the distinction between representation and what is represented.47
It is by imposing constraints (TId) and (TIe), not by his adherence to things in themselves, that Kant’s TI claims both to be distinguishable from (empirical) idealism48 and to qualify as a kind of (empirical) realism.49 It almost isn’t worth mentioning that they do not suffice to establish metaphysical realism in the sense of Kant’s transcendental realism, or any other ambitious sense. But we should take note that according to this, things as they are in themselves, no matter how we may describe them on an occasion, are nonetheless never out of the purview of experience, while clearly distinct from and independent of the way we happen to represent them. This qualifies them as mind-independent or at least not mind-constituted but nonetheless accessible to cognitive operations. What is excluded as objects of knowledge are only those that in some principled way are (and remain) impossible to cognitively access (i.e. noumena in the ‘positive’ sense50).
44 Allais (2004, p. 656/667), as well as Allais (2003, pp. 369–370).
45 I take the former term from Allais’ article, which borrows it from Langton; the latter is Allison’s (Allison 1983, p. 241).
46 This constraint is actually the product of superposing another crucial distinction of Kant’s transcendental philosophy, that between an empirical sense and a transcendental sense in which certain concepts or contrasts can be used (or not), with the contrast between things in themselves and appearances. Kant himself follows this procedure when he explains the distinction between the way a thing happens to appear to us and how the thing itself is as the product of applying the contrast between appearance and things in themselves under the conditions of experience. In this case, when a thing x appears in a certain way F to someone but turns out on account of other experiences to be different (say, G), the representation ‘x is F’ is a “mere appearance”, the content of the judgment ‘x is G’ is how the thing appears in experience (i.e. the appearance), and what the latter judgment represents in virtue of being true of x and one of its traits (or ‘objectively real’) is the thing x as it is. As long as we only have the former judgment at our disposal, we are under these conditions in the position of having to say that even though x appears to be F, the thing itself is not F. This is the concept ‘thing in itself’ in application to things we cognitively access under conditions of experience, i.e. in its empirical use. What Kant denies is that the intelligibility and even indispensability of this use warrants the expectation that the same concept yields truth-evaluable contents under any circumstances whatever, for example in the absence of spatio-temporal locations or in the absence of any means of accessing particulars intuitively. The latter would be the transcendental employments of the same concept, which Kant terms as “no use”, yielding “nothing” and being “empty”. The reason why I do not list the contrast between empirical and transcendental as part of TI is that I think that it belongs to the apparatus that Kant develops to investigate the semantics of certain philosophical assertions, and thus rather to MCE. But the product of applying this apparatus to the distinction that characterizes TI is, of course, an element of Kant’s own version of TI. The fruitfulness of Kant’s distinction between appearances, things as they are and mere appearance is, of course, the dominant theme in McDowell’s reading of Kant. His conclusions are, however, different from those reached here.
47 Cf. Kant’s clarification of his use of the expression “appearance” for referring to objects of experience in the empirical sense to the effect that “we must be able at least to think, even if not cognize, the same objects also as things in themselves. For otherwise an absurd proposition would follow, viz. that there is appearance without anything that appears” (CPR, Bxxvii). That this is not an occasional slip of the pen is clear from the fact that without this proviso, the contrast between appearances and things in themselves would not be applicable to objects of experience, i.e. lack significance at the empirical level. But it is precisely at the empirical level that Kant makes essential and conscious use of the distinction to separate his account from Berkeleian idealism (see also next footnote).
48 Cf. Kant’s poignant objection to Berkeley in the Aesthetic, where he insists “that the intuition of external objects and the self-intuition of the mind both present these objects and the mind in space and in time as they affect our senses, i.e. as they appear. But I do not mean by this that these objects are a mere illusion. For when we deal with appearance [at the empirical level, A.M.], the objects […] are always regarded as something actually given – except that […] we do also distinguish this object as appearance from the same object as object in itself. […] But in asserting this, I am not saying that the bodies merely seem to be outside me, or that my soul only seems to be given in my self-consciousness. It would be my own fault if I turned into mere illusion what I ought to class with appearance” (CPR, B69). This leaves no doubt that the contrast between appearance and things in themselves (a) applies at the empirical level and (b) is not compatible with a classification of appearances as object-independent, merely mental or subjective fictions or constructs. On the contrary, according to Kant, it is precisely the ability of Kant’s conceptual apparatus to draw the distinction between fact and fiction that distinguishes it from the less precise Berkeleian framework, in which we cannot draw the distinction between a straw submerged in water seeming to us to be bent and this seeming’s role as indicating an actually straight straw submerged in water. It is the latter case in which the appearance of the straw (i.e. the way it must present itself to our senses, given their structure and the circumstances) can be (and can be taken to be by us as) a reliable indicator of the straightness of the straw itself, given how straight straws, water and the laws of optics interact in such a case. The component ‘straight straw’ is only extractable from the appearance if we have the conceptual means of referring to it not as it appears, but as it functions, being what it is, and the laws of nature being what they are, in these circumstances, and correspondingly to refer to the mental representation of the situation as, taken literally and without further information about our position as perceivers in the circumstances, misleading or ‘illusory’. Both contrasts thus allow us to determine the objectivity of the testimony of the senses on the background of the properties of our conceptual and cognitive equipment. In this way, the distinction enables precisely a realist conception of the objects of experience as being as they are independent of how they may, on occasion, appear. This is what the tripartite distinction between things in themselves, appearances and illusions achieves at the empirical level. Given the understanding of the contrast at the empirical level, Kant can then propose applications of it to philosophical cases at the transcendental (second order) level in which, as is well known, the objects of experience as they actually are contrast with things as they are merely thought, on the one hand, and illusory constructs (fictional entities) on the other (cf. his discussion of ‘figments of the brain’ and ‘fictions’, i.e. empirically unconstrained yet coherent constructs, in the elucidation of the ‘Postulates of Empirical Thinking in General’, A219/B266–A226/B274). At both levels, then, the contrast does crucial work in enabling Kant to distinguish his approach regarding the objects of empirical knowledge, propositional attitudes and information encoded in simple indicative assertions from positions that in one (phenomenalist) way or another (constructivist) support traditional forms of idealism.
49 Cf. Kant’s frequent explication of objects of experience or the subject matter of judgments of experience as things that are what they are “independently of what the subject’s state is” (e.g. CPR, B142).
50 For a recent clear statement that these are the only inaccessibles postulated by Kant, cf. Hanna (2006a), Hanna (2006b, p. 21).
51 The broad type of interpretive stance towards TI that I want to use in examining whether MCE and TI must be in conflict is thus a methodological or Copernican understanding of Kant’s TI, similar to that guiding the interpretations proposed by Bird and Melnick. According to it, Kant’s point in defending TI is that we can only learn what general structural features of the world we can know from the most rational reconstruction of the basic traits of the operations and conditions under which our cognitive faculties issue empirical knowledge. This reading is inspired by Kant’s famous description of his method as similar to the hypothetico-deductive procedures of the empirical sciences (cf. Bxix, fn.). Just as we may infer lawful behavior of empirical objects from the hypothetical truth of the laws of an empirical theory, so we may, if our best empirical knowledge commits us to certain general features, take the statements expressing them as also simply true of the world. But trying to say what the world is like “anyway” or “from the view from nowhere”, i.e. irrespective of any experience, fails to generate any (further) truth claims at all. It is important to note that this reading is non-subjectivist, since it is open to the possibility that some of the conditions of knowledge might, though asserting them requires reflection on requirements of our cognitive apparatus, be of a factual, mind-independent nature. Excluding this would require confusing the epistemic conditions of arriving at an assertion with the ontological status of what is thus asserted.
3.1 A consideration: a world for methodological ER
With these criteria and distinctions in place, I now want to propose a methodology-oriented explication of the point of a central distinction of TI, that between appearances and things in themselves, in the case that Kant takes as its basic application, viz. the empirical sense. This will allow me to specify constraints that things have to satisfy to be objects according to TI. We could call the resulting picture of the world of experience methodological ER.51 A comparison as to whether the same things that
satisfy these constraints also satisfy (MCEa)–(MCEd) will enable us to know whether MCE and TI are compatible. Let me illustrate the distinction between appearances and things in themselves by one of Kant’s examples.52 According to Kant, it is one thing to say that ‘we cannot know the intrinsic character of nature’, when we describe the state of ignorance in our empirical knowledge about hidden features of the objects of experience in anticipation of future scientific progress. In this connection, we mean that, if scientific research (‘observation and dissection of appearances’, as Kant puts it) progresses, it will turn up many new insights we don’t yet possess, and therefore we cannot say now that we already know all there is to know about non-obvious traits of these empirical objects. This would be a use of ‘intrinsic nature’ in a methodological consideration about empirical knowledge and its limits. For a methodological empirical realist, saying that ‘we cannot know the intrinsic character of nature’ means that, given what we know, there is an open-ended class of things that we might not know regarding the same object of knowledge that we are already acquainted with and have some knowledge about. In this methodological perspective, Kant’s distinction between appearances and things in themselves marks the contrast between the objects of experience that we access in perception or other circumstances of intuitional reference, insofar as we (already) know them, and these same objects of experience insofar as we do not (yet) know them.53 Affirming the existence of things in themselves here comes to making the following assumptions:
(A) Whenever we have empirical knowledge regarding certain objects, we cannot, by the fact that we know what we know, assert that we know all there is to know, and
(B) We cannot exclude, by the fact that we have knowledge of some objects, that there are more objects in the humanly accessible universe that we do not know.
(C) For any thing we encounter at some time in some region in space, if it obeys natural laws and has certain properties, it is possible that this thing with these properties could have been obeying the same natural laws but have been located anywhere else at that time, or could have been at this region with the properties it has at some other time.54
52 CPR, B334ff. This example also seems to me to undermine the metaphysical, Lockean interpretation of Kant’s difference between ‘things in themselves’ and appearances in terms of ‘intrinsic natures of things’ versus ‘things as presented in space and time’, as it underlies the explanations given in, e.g., Van Cleve (1999), Allais (2004) or Langton (1998). Cf. the criticism of Langton’s relevant views in Bird (2006, pp. 547–552).
53 This way of putting the contrast is motivated by Kant’s way of drawing the distinction in the methodological part of the B-Preface, where he describes his hypothesis, TI, as that “the unconditioned is not to be met with in things insofar as we are acquainted with them (i.e. insofar as they are given to us), but is to be met with in them only insofar as we are not acquainted with them” (CPR, Bxx). The deflationary spirit I detect here in Kant and try to express in my proposal is similar to what Strawson proposed to be the “necessary, and not very advanced limit of sympathy with the metaphysics of transcendental idealism” that one should acknowledge (cf. Strawson 1999, 42). Bird (2006) also stresses the methodological character of the distinction in opposition to its received reception as ontological. Famously, Nagel criticizes this line in The View From Nowhere as not sufficiently realist. He urges the acceptance of a special class of things in themselves that is not available as the extension of one of the terms in the contrast as used on an occasion for the purposes of spelling out a stronger or absolute notion of objecthood. However, Nagel does not give stronger reasons for this urge than that there is no contradiction or countersense in constructing such objects, and that it is, given the fact that we don’t know these objects, likewise impossible to deny that they are spatio-temporally structured. Westphal (2004, pp. 52–67) in fact has an extensive and detailed argument to support the latter view, and like Nagel thinks that this establishes a ‘stronger realism’ as compatible with Kant’s theory of cognition. For present purposes, I need not decide whether this is so because my argument is directed at establishing that such stronger realisms are not required for giving conditions for the effective and cognitively significant use of Kant’s contrast between appearance and things as they are in themselves.
Assumption (A) could be called the assumption of the cognitive inexhaustibility of empirically real objects, assumption (B) could be called the assumption of the indefinite cardinality of empirical reality as such,55 and assumption (C) could be called the assumption of the non-essentiality of space-time location for the type-identity of empirically real individuals.
3.2 No news, good news: ER’s world satisfies TI
Let me now first verify that objects from a world satisfying these assumptions satisfy the criteria (TIa)–(TIe). If they do, then (A)–(C) characterize a world for TI. Given this world, we can then see whether objects in this world satisfy MCE. If they do, then there is one world of which both MCE and TI are true. The mentioned assumptions in combination go smoothly with many of the things Kant says about things in themselves, in particular, his claims that “we can never know things in themselves”, that ‘the categories don’t apply to them’, and that they are not determinately spatio-temporal.56 If those things that are empirically real are in fact cognitively inexhaustible, then, whatever the traits of them we don’t know yet, we can never claim to know them merely in virtue of what we know the objects to be. (TIb) is thus already satisfied. On the other hand, those things that we do not yet know according to (B), we cannot now know to exist, and things and sets of things insofar as we don’t know them according to (A), we cannot know to fall under the categories and behave according to general laws of nature merely because we know them to do so in respects that we do know of them. For both reasons, we cannot directly apply the categories to things as we don’t know them. At the same time, (A) satisfies a constraint Kant imposes on empirical objects, namely that they be accessible intuitionally and knowable in the sense that they are, in principle, conceptually determinable to an arbitrary degree of complexity. Thus, cognitively inexhaustible objects in a universe of unknown cardinality qualify, since nothing speaks against their accessibility, as possible components of appearances. But this doesn’t make them subject-dependent. On the contrary, we saw that Kant says that it is things, ‘the real in appearance’, that are ‘thoroughgoingly determined’ even when our cognition of them isn’t. Cognitive inexhaustibility entails that, whatever a full account of the objects of knowledge may be, indeed, whether there be such an account or not, the properties of objects that we do not yet know cannot depend on our minds. (C) expands this latter feature to those things that we have in fact accessed intuitionally by licensing the counterfactual that even though we in fact did so access them, we might not have, such that our accessing them is not a necessary condition of their existence and their being the way they are. They could have been just like that if we hadn’t accessed them. Their being in particular spatio-temporal regions so that we may access them is therefore not an essential feature of the things we perceive. There is thus a clear sense in which we can say that those things our experience deals with as we don’t know them are not necessarily spatio-temporal. We can, in hypothesizing about them, abstract from space and time. This is certainly not speaking about these objects as we know them, since we know them, with all the properties over and above their spatio-temporal locations and movements, by perceptually accessing them. But there is no reason why in so hypothesizing, we would necessarily be failing to characterize things that are like the ones we perceptually, i.e. intuitionally, access. What Kant seems to claim is that when we hypothesize about the objects that we actually access, they do not necessarily disappear from our cognitive purview when abstracting from their spatio-temporal nature. However, since we cannot access objects under the hypothetical conditions of the abstraction by way of our sensibility, we can also not be confident that we do indeed refer to anything, since our only way of referentially relating cognition to thought is by empirical intuition.
54 That is, in abstracting from the spatio-temporal location of an individual with these properties, we abstract from a particular’s being that particular thing of a type but still refer to things of that type and their regular behavior in spatio-temporal conditions. In abstracting from a particular thing’s being at certain regions at a certain time, however, we abstract also from the conditions under which it is possible to intuitionally refer to it, as opposed to all other things with the same properties.
55 With this proposal, I side with Melnick, who has called this the “sheer limiting account” of things in themselves and also considers it to be exactly what Copernicanism (i.e. the methodological view I recommend) requires. Cf. Melnick (2004, p. 162). Cf. also Hanna (2006b, p. 21).
56 These are the three tenets to be met by any account of things in themselves according to Melnick (2004).
The scenario with things that are exactly like the ones we in fact access but not under spatio-temporal conditions is thus one we can think by using the very same concepts that are true of the objects as we know them, but it cannot be determinate what the content of our thoughts regarding this world would be because the determinacy of mental content requires intuitional access to particulars under spatio-temporal conditions. Objects of experience, having the non-spatio-temporal properties they do, thus allow the development of their own counterparts that share all their non-spatio-temporal properties under spatio-temporally deprived conditions. While this shows that these counterparts are “merely thought” or, in contemporary language, mere constructs, it is also clear that these specific constructs are what the very objects of our experience become under the hypothetical suspension of their spatio-temporality. They are, in this precise sense, not extra-objects but aspects of our objects of experience: our objects of experience simply have the property of also satisfying sets of non-spatiotemporal concepts the totality of which generates mere constructs but no actual things under a-spatio-temporal conditions. Thus, objects of cognition obeying (C) satisfy a condition for strong Kantian humility (TIb ), the non-spatio-temporality of things in themselves. The objects of experience are such that what they are is not constituted or fully determined by any actual properties of our minds, neither conceptually nor intuitionally. Therefore, the objects of experience are mind-independent not only in their existence, but also with regard to their properties.57 (TIe ) is satisfied. Further, if things are the real in appearance and appearances composite items, then things in themselves and appearances cannot be identical. (TIa ) is satisfied. On the other hand, (A) and (B) also satisfy the idealism-constraint (TIc ), since appearances, i.e. things as we (can) know them to be, and the contrast between appearances and things in themselves are both mind-related because the distinction recurs to contingent facts about us. Firstly, the content of the distinction varies with how much, what and in what way we know these things, and what determinations of the real in appearance we attempt to add successively to our existing knowledge depends also on what questions we ask. (TIc ) is sustained. Secondly, which of the things in the universe of unknown cardinality we happen to encounter and to be able to intuitionally access depends, according to (C), on contextual features like our own location and the expansion of sensitivities we are able to devise. Moreover, since we cannot convert a geometrical system into a system of locations without demonstratively privileging some particular region as the origin of the geometry, the locations of things in space cannot be specified without reference to some selection of origin or other. With both these contingencies on features of our cognitive situation, (TIc ) is satisfied, because we cannot characterize the universe of objects of experience, i.e. the content of our experience, without reference to facts about our own spatio-temporal location and about our particular cognitive interactions with things. Finally, (A) and (B) also satisfy the other objectivity constraint, since what determinations we can successfully add depends on which judgments are true of these things, not on whether any of us would like the object to be so determined. (TId ) is thus also satisfied. Since all the constraints on TI are satisfied in the world of ER as characterized by assumptions (A)–(C), such a world is a world of which TI/ER is true.
57 For those prepared to protest that appearances cannot be considered mind-independent in any way, here is a quote from Kant to the contrary: “from the concept of appearance as such, too, it follows naturally that there must correspond something that is not in itself appearance. For appearance cannot be anything by itself (…) the word appearance already indicates a reference to something the direct presentation of which is indeed sensible, but which is in itself—even without the character of our sensibility (…)—must be something, i.e., an object independent of sensibility” (CPR, A251–252, emphasis added). Kant does not (always) make the mistake to conclude from the fact that appearances, objects of experience, cannot be characterized independent of our representational resources that the objects so characterized cannot be mind-independent. On the contrary, in this passage, Kant makes the fundamental semantic distinction between sign and reference, as well as the independence of one from the other as clear as we can wish.
4 Bad news for the incompatibility-claim: methodological ER’s world also satisfies MCE Although it is fairly obvious from the foregoing, let me quickly demonstrate that the world characterized by (A)–(C) also satisfies the constraints on MCE from §2.4. The key element in this move is, of course, the fact that the (A)–(C)-world satisfies all the constraints on TI, and in particular, the distinction between appearances and things in themselves. This means that this world contains a domain of things in themselves when and always when it supplies a domain of appearances. The remaining task is then to see whether these things can function as the ab extra particulars required for sensation-based intuitional reference. If they do, then this world offers a condition under which MCE can be true. Recall, MCE requires (MCEa ) mind-independently individuated,
(MCEb ) extra-mental, (MCEc ) spatio-temporally accessible (MCEd ) actual particulars. The (A)–(C) world offers, as we saw, cognitively inexhaustible individuals. I argued in §3.1 that this entails that, no matter whether there be a complete, fully determinative and doxastically accessible account of them or not, the properties of objects that we do not yet know cannot depend on our minds. Thus, (MCEa ) is satisfied. On the other hand, (A) and (C) together entail that the denizens of this world are, although cognitively inexhaustible, not cognitively inaccessible, in particular, that they are, as objects of particular experiences, spatio-temporally located and therefore possibly accessible. In case of access, they are actually referred to. (MCEc ) and (MCEd ) are satisfied. (A) and (B) together entail that, first, any accessed individual in this world is what it is not in virtue of what it is known as, since it is not fully known in all respects that can be known of it, i.e. that are truly attributable to it by some knower, and that, second, this world is assumed to contain an arbitrarily large number of things not (yet) known to any knower which, since those things accessed in this world are actually accessed, are actual as well. In other words, the world under consideration actually contains more entities than those possibly construable by the mind, which means that these denizens (past and present entities to-be-discovered) are actual particulars and extra-mental or independent in their existence and properties of the activities of the mind. Given that (C) denies the essentiality of the particular spatio-temporal location of individuals for their possession of law-like properties, and given that the concept-dependent methods of individuation are exhausted, we can see that the objects are also not taken by Kant’s semantics to be constituted or individuated by the only remaining candidate for (token-by-token) mind-dependent individuation, viz. 
actually performed intuitive access. The particulars taken for granted by MCE are just not in any way mind dependent, be it for concept-dependency or be it for dependency on forms of intuition. The argument as reconstructed here does thus also not depend on a potentially problematic identification of mind-dependence and concept-dependence, because for being the particular individuals they are, the particulars taken for granted in MCE are also not essentially dependent on being identified in a particular spatio-temporal way.58 While they have to be at some spatio-temporal location or other to be accessible, their being identified as being at a particular location by a mind equipped with the forms of intuition is not essential to their being where they are in this structure. Therefore, (MCEb ) and (MCEd ) are fully satisfied. In sum, the entities in a world characterized by assumptions (A)–(C) satisfy all the requirements of MCE. 58 This is exactly as it should be, as there are good arguments to the effect that determining which system
of locations of particular spatio-temporal entities a formal space–time geometry is intended to represent essentially depends on fixing at least one point of reference through non-conceptual, indexical reference to an environing particular before being able to locate other entities relative to this fixed reference point (an origin of sorts). It is after such fixing that the same object can then be itself explicitly spatio-temporally located in terms of relations within the system, namely relative to other, then fixed entities. This clearly lends the same priority to object-dependent reference vis-a-vis spatio-temporal locatability that Kant seems to be so adamant about in his construal of space and time as based on intuitions, i.e. successful direct singular reference, not concepts. Regarding this irreducibility of determinate locations to purely conceptually defined spatio-temporal relations in light of recent developments in physics, cf. Mittelstaedt (2003).
5 Conclusion In §3.2, we saw that the (A)–(C) world characterized in §3.1 satisfies all constraints on TI, (TIa )–(TIe ). In §4, we saw that the same world satisfies all the constraints on MCE. Therefore, the (A)–(C) world simultaneously satisfies MCE and TI. My first conclusion is thus that it is incorrect to believe that TI and MCE are incompatible. They are not, in a world characterized by assumptions (A)–(C). Now, the question might arise whether (A)–(C) are some sort of exotic metaphysical contraption to construct a counterexample to a given philosophical position, or whether they are, apart from yielding a possible interpretation of Kant’s TI, also a plausible set of assumptions to make when one engages in empirical and philosophical research. An answer to this question will crucially turn on whether we believe of the things around us that they are ‘objects we encounter’ as denizens in a universe with unknown cardinality that are capable of being actually accessed in contexts of (intuitionally achieved) direct reference and of being successively though never exhaustively conceptually determined. If we regard things around us in this way, then we also accept that what objects turn out to be like, whether they exist, and whether our classifications as we have them so far actually capture important commonalities among these denizens does not depend on facts about our mental or doxastic operations alone. But all those classifications and accesses that we successfully perform have the status of cognitive operations on actually existing mind-independent objects and therefore afford objective information. Objects thus are in the purview of our cognitive systems as constraint and as target. In my opinion, a world characterized in this way resembles that underlying scientific and everyday cognitive and practical affairs quite closely. 
In fact, the (A)–(C) world seems to me not only to satisfy TI and MCE, it actually is equivalent to a commonsense-realist conception of the world (give or take a little).59 I would therefore regard this conception of the world of ‘objects we encounter’ as a not merely possible but also very defensible version of ER, that is, of a reconstruction of the ontological assumptions required by our best objectivity-targeted cognitive practices. This is, incidentally, precisely what Kant’s transcendental philosophy, understood as a reconstructive enterprise in the epistemology of scientific and everyday knowledge of things, sets out to capture. At the same time, the methodological ER characterized by (A)–(C) is even able to perform one of the important (meta-)philosophical functions that TI is assigned in Kant’s critical philosophy. For it allows the critical use Kant makes of the notion ‘thing in itself’ by rejecting truth claims composed of categories and things as such—i.e. as we merely think them (‘noumena in the positive sense’). My treatment of the example in §3.1 should make this intuitively clear. Methodological
59 I take this to refer to a relatively unsophisticated view of objects of experience and their relation to subjects of experience, along the lines spelled out, e.g., in Strawson (1988), where he terms the view “our pre-theoretical scheme” (p. 102) and ascribes to us (the subjects of experience) the ability to normally distinguish between experiences of seeing (etc.) objects and the objects themselves, between the way our impressions represent the objects we experience and the way the objects actually are, and the ability to be, in the case of actual perception, immediately aware of the objects (where the latter does not entail, in our pre-theoretical scheme, any claim as to the infallibility of our attributions of properties to that which we are immediately aware of).
ER therefore not only satisfies all constraints on TI but also appears to have other desirable features. Interpreters sympathetic with the rough line taken here, like Westphal (2004) and Hanna (2006a,b), offer construals of similar presuppositions of MCE as a form of ‘metaphysical’ or ‘transcendental realism’ (TR). They defend the view that MCE allows the articulation of a coherent form of what is known as the “neglected alternative”, that spatio-temporal properties and categorial constitution of objects might be traits of things in themselves that our cognitive capacities ‘pick up’. Now, one of Kant’s main reasons to develop TI was its supposed incompatibility with all forms of “transcendental realism”, to undermine in one (philosophical) swoop the idea that traditional metaphysical topics like sciences of the soul, the cosmos and the divine in fact have special objects (in themselves) as their subject matter. TI as developed here, however, includes MCE and its background condition of mind-independent particulars, and thus might seem not to be entirely incompatible with all forms of TR. This may be true, but I also believe we can leave this worry to one side as long as MCE does not force such stronger forms of realism.60 My second conclusion is thus that the realism required by MCE is no stronger than ER. In sum, MCE does not require a realism exceeding the confines of ER, while it is compatible with an interpretation of TI that incorporates the conceptual adjustments precipitated by the assumption that MCE is a more adequate theory of conceptual content than those fueled by traditional idealisms. But ER also does not reduce to any other form of non-realism. It thus seems premature to toss out TI or ER on the strength of Kant’s semantics. 
I rather think that, once we enrich our understanding of the conceptual proposals and distinctions of TI with the lessons from MCE, TI as ER might yield a very fresh series of insights in the requirements of externalism and commonsense realism and, indeed, in the structure of the ways in which we succeed in representing reality.
60 Kant’s mistake would then not have been so much to claim the explanatory superiority of TI/ER over traditional forms of realism. If any, the mistake would consist in having taken the anti-idealist, empirical realism in his counterproposal as strictly incompatible with all forms of metaphysical realism, not to have overlooked the “neglected alternative”. As such a stronger form is not necessary for MCE, Kant was fully justified in neglecting an alternative, inflationary construal of the grounds of experience. We may grant that such an alternative could be compatible with the requirements of Kant’s methodological and epistemological approach, but we must also observe that it adds metaphysical burdens and forms of argument beyond need, and moreover precisely of the sort that Kant’s approach was designed to disabuse us of.
References
Allais, L. (2003). Kant’s transcendental idealism and contemporary anti-realism. International Journal of Philosophical Studies, 11(4), 369–392.
Allais, L. (2004). Kant’s one world: Interpreting ‘transcendental idealism’. British Journal for the History of Philosophy, 12(4), 655–684.
Allison, H. E. (1983). Kant’s transcendental idealism. An interpretation and defense. New Haven: Yale University Press.
Bird, G. (2006). The revolutionary Kant. Chicago: Open Court.
Brandt, R. (1998). Transzendentale Ästhetik, §§1-3. In G. Mohr & M. Willaschek (Eds.), Immanuel Kant: Kritik der reinen Vernunft (pp. 81–106). Berlin: Akademie-Verlag.
Cassam, Q. (2007). The possibility of knowledge. Oxford: Oxford University Press.
Collins, A. (1996). Possible experience. Berkeley: University of California Press.
Falkenstein, L. (1995). Kant’s intuitionism. Toronto: University of Toronto Press.
George, R. (1981). Kant’s sensationism. Synthese, 47(2), 229–255.
Goldberg, S. (forthcoming). Externalism and metaphysical realism. American Philosophical Quarterly.
Guyer, P. (1987). The claims of reason. Cambridge: Cambridge University Press.
Hanna, R. (2001). Kant and the foundations of analytic philosophy. Oxford: Oxford University Press.
Hanna, R. (2006a). Kant’s theory of judgment. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Winter 2006 Edition). http://plato.stanford.edu/entries/kant-judgment/
Hanna, R. (2006b). Kant, science, and human nature. Oxford: Oxford University Press.
Howell, R. (1973). Intuition, synthesis, and individuation in the critique of pure reason. Nous, 7, 207–232.
Howell, R. (1992). Kant’s transcendental deduction. Dordrecht: Kluwer.
Kant, I. (1968). Logik (Jäsche). In W. Weischedel (Ed.), Immanuel Kant, Werkausgabe, Bd. VI/2 (pp. 417–582). Frankfurt: Suhrkamp.
Kant, I. (1996). Critique of pure reason (W. S. Pluhar, Trans.). Indianapolis: Hackett.
Kant, I. (1998). Critique of pure reason (P. Guyer & A. Wood, Trans.). Cambridge, MA: Cambridge University Press.
Langton, R. (1998). Kantian humility. Oxford: Clarendon Press.
McDowell, J. (1994). Mind and world. Cambridge, MA: Harvard University Press.
Melnick, A. (2004). On things in themselves. In A. Melnick (Ed.), Themes in Kant’s metaphysics and ethics (pp. 147–163). Washington, DC: The Catholic University of America Press.
Mittelstaedt, P. (2003). Der Objektbegriff bei Kant und in der gegenwärtigen Physik. In D. Heidemann & K. Engelhard (Eds.), Warum Kant heute? (pp. 207–230). Berlin: De Gruyter.
Rosenberg, J. F. (2005). Accessing Kant. Oxford: Oxford University Press.
Schönrich, G. (2003). Externalisierung des Geistes? Kants usualistische Repräsentationstheorie. In D. Heidemann & K. Engelhard (Eds.), Warum Kant heute? (pp. 126–149). Berlin: De Gruyter.
Strawson, P. (1988). Perception and its objects. In J. Dancy (Ed.), Perceptual knowledge (pp. 92–112). Oxford: Oxford University Press.
Strawson, P. (1999). The bounds of sense. London, NY: Routledge (Orig. 1966).
Thompson, M. (1972–1973). Singular terms and intuitions in Kant’s epistemology. Review of Metaphysics, 26, 314–343.
Van Cleve, J. (1999). Problems from Kant. Oxford: Oxford University Press.
Waxman, W. (2005). Kant and the empiricists. Oxford: Oxford University Press.
Westphal, K. R. (2003a). Epistemic reflection and cognitive significance in Kant’s transcendental response to skepticism. Kant-Studien, 94, 135–171.
Westphal, K. R. (2003b). Can pragmatic realists argue transcendentally? In J. R. Shook (Ed.), Pragmatic naturalism and realism (pp. 151–175). Amherst, NY: Prometheus.
Westphal, K. R. (2004). Kant’s transcendental proof of realism. Cambridge/London: Cambridge University Press.
Westphal, K. R. (2005). Kant, Wittgenstein, and Transcendental Chaos. Inquiry, 28(4), 303–323.
Westphal, K. R. (2006). How does Kant prove that we perceive, and not merely imagine, physical objects? Review of Metaphysics, 59, 781–806.
Willaschek, M. (1997). Der transzendentale Idealismus und die Idealität von Raum und Zeit. Deutsche Zeitschrift für Philosophische Forschung, 51, 537–564.
Willaschek, M. (1998). Phaenomena/Noumena und die Amphibolie der Reflexionsbegriffe. In G. Mohr & M. Willaschek (Eds.), Immanuel Kant: Kritik der reinen Vernunft (pp. 325–351). Berlin: Akademie-Verlag.
Willaschek, M. (2001). Die Mehrdeutigkeit der Unterscheidung zwischen Dingen an sich und Erscheinungen. In V. Gerhard, R.-P. Horstmann, R. Schumacher (Eds.), Kant und die Berliner Aufklärung (pp. 679–690). Berlin/NY: DeGruyter.
Synthese (2011) 182:475–492 DOI 10.1007/s11229-010-9754-y
Internalist and externalist aspects of justification in scientific inquiry Kent Staley · Aaron Cobb
Received: 20 August 2009 / Accepted: 27 May 2010 / Published online: 18 June 2010 © Springer Science+Business Media B.V. 2010
Abstract While epistemic justification is a central concern for both contemporary epistemology and philosophy of science, debates in contemporary epistemology about the nature of epistemic justification have not been discussed extensively by philosophers of science. As a step toward a coherent account of scientific justification that is informed by, and sheds light on, justificatory practices in the sciences, this paper examines one of these debates—the internalist–externalist debate—from the perspective of objective accounts of scientific evidence. In particular, we focus on Deborah Mayo’s error-statistical theory of evidence because it is a paradigmatically objective theory of evidence that is strongly informed by methodological practice. We contend that from the standpoint of such an objective theory of evidence, justification in science has both externalist and internalist characteristics. In reaching this conclusion, however, we find that the terms of the contemporary debate between internalists and externalists have to be redefined to be applicable to scientific contexts. Keywords Evidence · Justification · Internalism · Externalism · Error-statistics · Security
1 Introduction Contemporary epistemologists have devoted considerable attention to conceptual analyses of the nature of epistemic justification but there is great disagreement about
K. Staley (B) Saint Louis University, St. Louis, MO, USA e-mail: [email protected]
A. Cobb Auburn University at Montgomery, Montgomery, AL, USA e-mail: [email protected]
whether the factors relevant to the justification of a person’s belief must be internally accessible to that person (Alston 1989; Fumerton 1996; Kornblith 2001; Pryor 2001; Bonjour and Sosa 2003; McGrew and McGrew 2006; Goldberg 2007; Poston 2008). This debate between internalists, who endorse the access requirement, and externalists, who reject it, has been little discussed by philosophers of science.1 Yet epistemic justification is a central concern in philosophy of science. In particular, the wide-ranging debates over evidence and confirmation seem to be concerned to a significant degree with the question of justifying conclusions from data. Theories of evidence can indeed be understood in part as attempts to explicate a concept of scientific justification. But how do such theories depict scientific justification? Do they employ an internalist or externalist notion of justification? To facilitate an inquiry into these questions we reconsider the dichotomy between internalism and externalism from the perspective of justificatory practices in the sciences. In doing so, we find that the dichotomy as traditionally formulated does not adequately capture the nature of justification in scientific inquiry. We motivate our reformulation by attending to the socially-situated nature of practices of scientific justification. As part of this reformulation, we redirect the debate away from a concern with the justification of beliefs and toward the justification of assertions of experimental inferences. Such a redirection has a further basis in the nature of our inquiry, which considers the question of justification from the perspective of objective accounts of scientific evidence. More precisely, we are concerned with theories that treat evidential relationships as obtaining in a manner that is epistemically independent of the beliefs of particular individuals or groups. 
To that end, we examine this issue in the context of Deborah Mayo’s error-statistical theory of evidence (Mayo 1996). We choose Mayo’s account because it is a paradigmatically objective theory of evidence that is strongly informed by methodological practice. Our main argument, however, insofar as it does not depend strongly on the details of Mayo’s account, would plausibly apply equally well to other objective theories such as likelihood accounts (Royall 1997; Lele 2004; Sober 2008), objective Bayesian theories (Jaynes 2003; Williamson 2008), or Peter Achinstein’s explanatory-probabilistic hybrid (Achinstein 2001). Our thesis is that, as understood from the standpoint of such an objective theory of evidence, justification in science has both externalist and internalist characteristics. Our discussion proceeds as follows. In Sect. 2, we briefly review the terms of the contemporary debate between internalists and externalists in epistemology, and reframe that debate so as to make its terms applicable to scientific inquiry. In Sect. 3, we discuss Mayo’s error-statistical theory of evidence and identify both externalist and internalist aspects of the concept of justification implicit in that account. Although error-statistical evidence is unrelativized, the justification of experimental conclusions does, we argue, depend on an epistemic situation. Section 4 introduces an epistemic notion— the security of an inference—that shares this dependence on epistemic situation and illuminates the nature of the additional epistemic work that takes us beyond de facto evidence for a hypothesis to the justification of an inference to it. We articulate an
1 See, however, Wheeler and Pereira (2008) and Roush (2005) for interesting exceptions.
ideal of justification that incorporates both an objective evidence component and a relativized security component. In Sect. 5, we return to the accessibility requirement that is at the heart of the internalist–externalist debate and present our conclusions. We find that security itself is not a strictly internalist notion, insofar as access, reformulated in terms of a community of inquirers, emerges as a necessary but not sufficient condition for securing experimental conclusions. 2 Reframing the internalist–externalist debate A full discussion of the general debate in contemporary epistemology between the proponents of internalism and externalism is beyond the scope of this paper, but a brief excursus into this territory is important for understanding the internalist and externalist aspects of justification implicit in an objective theory of evidence like Mayo’s. Generally speaking, the debate between internalists and externalists concerns whether those factors justifying a person’s belief must be cognitively accessible to the person.2 Internalists argue that accessibility is both a necessary and sufficient condition for epistemic justification.3 The intuitive ground supporting internalism is the idea that providing a compelling answer to questions or worries about the epistemic status of a particular belief requires appealing to evidence or reasons that support the belief in question. Put more explicitly, internalism is the view that Internalism: a belief b is justified for a subject S at time t if and only if that which justifies b is cognitively accessible to S.4 Internalists generally understand cognitive accessibility as a relation holding between a subject S and what S can discover on reflection alone. James Pryor notes that internalists often understand the notion of accessibility in terms of the route by which one has access: one could understand it as meaning that one can know by reflection alone whether one is in one of the relevant states. 
2 We are framing the debate in terms of accessibility although there are alternatives. Some think of the debate as one concerning whether the factors relevant to the justification of a person’s belief are either internal or external to a person’s mental life (Conee and Feldman 2001). Others think of the debate as concerning whether one ought to accept a deontological account of epistemic responsibilities (Steup 1999). Although these ways of demarcating internalism and externalism may be fruitful, they are orthogonal to our purposes for reasons we develop more fully below. 3 Several authors (Alston 1989, pp. 227–245; Comesana forthcoming; Goldman forthcoming) have been
developing hybrid accounts that combine aspects of both internalism and externalism. Such approaches emerge from a way of framing the debate that does not treat internalism and externalism as mutually exclusive and exhaustive. We regard these attempts as steps in the right direction and see the work in this paper as providing further reasons for developing hybrid accounts since justification in scientific inquiry requires both internalist and externalist aspects. 4 For the purposes of our paper, we are focused on internalism as a thesis about propositional justification. In the literature on internalism and externalism in contemporary epistemology, many distinguish between propositional and doxastic justification (Poston 2008). We think it is appropriate to focus on propositional justification in the context of scientific inquiry because we are concerned with the justificatory practices essential to justifying experimental conclusions, rather than the status of the beliefs of individual scientists.
(By ‘reflection’ I mean a priori reasoning, introspective awareness of one’s own mental states, and one’s memory of knowledge acquired in those ways…Most epistemologists understand the notion [of access] in [this way]). (Pryor 2001, pp. 103–104) The basic idea is that the justifiers j for a belief b must be the sort of thing that could be an object of conscious awareness; if j cannot be an object of conscious awareness, j cannot serve as a justifier for b. Externalists argue that cognitive accessibility is neither necessary nor sufficient for justification. It is not necessary because there are subjects (i.e., children or adults with relatively little cognitive sophistication) whose beliefs are justified even though access to the relevant justifiers for their beliefs may be impossible. Accessibility is not sufficient for epistemic justification because there is no guarantee that the information and evidence available to a subject is properly connected with the truth of the belief in question. Given the epistemic limitations of any subject, one cannot take that which is within one’s cognitive grasp to exhaust the full range of what is relevant to the justification of a belief. So, externalists want to develop an account of justification that accords with two basic intuitions. First, since there is good reason to believe that some epistemic subjects possess justified beliefs even though there is no reason to think they could have access to the relevant justifiers, an adequate account of epistemic justification must, at a minimum, show that it is possible for these subjects to possess justified beliefs. Second, there must be a strong connection between justification and truth. The primary ground of this intuition is that the epistemic significance of whatever justifies a belief lies in its truth-conduciveness. (We use “truth-conduciveness” here as an umbrella term that applies to processes that have a tendency to produce true beliefs. 
Different accounts of justification will specify such a tendency in different terms.) So, in order to facilitate an explicit contrast with the internalist thesis articulated above, let externalism be the thesis that
Externalism: a belief b is justified for a person S if and only if that which produces b is truth-conducive.
The internalist/externalist debate in contemporary epistemology is not concerned primarily with analyzing the nature of epistemic justification in the sciences. With some amendments, however, one can employ this framework schematically to clarify the nature and significance of particular justificatory principles and practices in the sciences. Our first proposed modification requires a shift from the appraisal of beliefs to the appraisal of assertions as the proper object of epistemic evaluation. Whereas beliefs are private and individually held, at least in the paradigmatic cases, scientific knowledge is best regarded as a public and collective achievement. The activity of knowledge-production in the sciences generally occurs within a social structure and this involves acts of assertion by scientists in various forums (e.g., preprints, publications, presentations, and decisions taken in collaboration meetings). In fact, one could argue that it is intrinsic to scientific knowledge not merely that the acquisition of it often requires groups of people but that one aim of the scientific enterprise is a particular kind of
rationally persuasive communication in which reasons are presented to other members of the community that will serve to underwrite, within that community, the status of particular claims as knowledge. We do not deny that knowledge of scientific matters can be ascribed to individual scientists. Rather, we are directing our attention to a distinct sense of scientific knowledge as publicly accessible content that arises from the socially organized efforts of individuals working in collaboration (cf. Kitcher 1993; Suppe 1993; Thagard 1997; Longino 2002; Wray 2002). While beliefs are certainly relevant to actions particular scientists perform, including the activity of endorsing particular experimental conclusions, the assertion and endorsement of such conclusions in these forums can be distinguished from an individual scientist’s beliefs about these assertions. Although not conclusive, these considerations suggest pursuing the idea that scientific knowledge should not be understood as essentially a species of belief.5 At any rate, whether or not this is the case, one seeking to understand epistemic justification in the sciences and the practices that produce it would be better off looking to what is asserted in the appropriate social contexts than worrying about underlying beliefs, as it is through the interaction of communicative acts that the corpus of scientific knowledge is formed.6 The recognition of this intrinsic social structure of the sciences also suggests a further modification to the internalist/externalist dichotomy in contemporary epistemology. Since the objects of epistemic evaluation are the assertions made by scientists in particular professional forums and, often, these assertions are the product of a wide-ranging set of experiments conducted in collaboration with many other scientists, the notion of accessibility that divides internalists from externalists must be understood more broadly.
Instead of being thought of as a relation holding between an individual subject S and what S can know through conscious reflection alone, accessibility must be relativized to the particular scientific community rather than to individuals within the community. We employ the term ‘community’ recognizing that there are a variety of communities in scientific inquiry and these communities can have distinct characteristics (e.g., they can be broad or small, loosely organized or tightly structured) and relations to other communities working on related questions. To make this more precise, we can borrow the notion of an epistemic situation from Achinstein’s (2001) work on evidence. On Achinstein’s analysis, an epistemic situation is an abstract type of situation in which, among other things, one knows or believes that certain propositions are true, one is not in a position to know or believe that others are, and one knows (or does not know) how to reason from the former to the hypothesis of interest, even if such a situation does not in fact obtain for any person. (Achinstein 2001, p. 20)
5 See Baird (2004) and Pitt (2005) for views that are critical of the idea of knowledge as belief. Popper’s notion of objective knowledge results from a somewhat different version of such a critical stance (Popper 1979).
6 Furthermore, the relationship between scientific knowledge and individual belief is complicated by the fact that the great majority of evidential claims are issued by groups rather than individuals, and the relationship between these claims, the beliefs of the individuals in those groups, and knowledge is itself a contested issue (see, e.g., Gilbert 1994; Staley 2007; Tollefsen 2002; Wray 2007).
We propose that an epistemic situation describes the basic epistemological framework of the relevant scientific community working on a shared problem or question and advancing an experimental conclusion grounded in evidence produced in their research. Accessibility is a relation holding between a community as a whole and the data or evidence available within its epistemic situation. We are now in the position to reframe the internalist/externalist dichotomy in terms relevant to epistemic justification in the sciences. In the remainder of this paper, we will employ the following definitions:
Internalism*: the assertion of an experimental conclusion (h) is justified relative to a particular epistemic situation (K) if and only if that which justifies h is accessible to those within K.
Externalism*: the assertion of an experimental conclusion (h) is justified if and only if that which justifies h is truth-conducive.
Our goal in the remaining sections is to show that understanding justification in scientific inquiry requires both internalist* and externalist* aspects; that is, an assertion h is justified relative to K if and only if that which justifies h is both (i) accessible to those within K and (ii) truth-conducive.
3 The error-statistical theory of evidence
The internalist–externalist debate is concerned with the question of accessibility and not primarily with the manner in which scientists seek to justify conclusions from experimental data. We regard debates over evidence and confirmation in the philosophy of science as concerned with the latter question. But philosophical theories of evidence may be implicitly committed to either internalist* or externalist* views. This can be seen clearly in Deborah Mayo’s error-statistical account.
Mayo has herself written that the central focus of her account is to “provide a way to determine the evidence that a set of data x0 supplies for making warranted inferences about the process giving rise to x0” (Mayo and Spanos 2006, p. 327).7 An error-statistical theory of evidence is, then, concerned with justification. To see just how error-statistics characterizes the justification of inferences from data, we next review Mayo’s account. According to the error-statistical (ES) account, good evidence results from the use of testing procedures that have certain good characteristics when applied to hypotheses of interest. Tests that possess such characteristics, which can be described in terms of error probabilities, enable investigators to learn from data because of their probative value with regard to the hypotheses being investigated. More specifically, “Data x0 in test T provide good evidence for inferring H (just) to the extent that H passes severely with x0” (Mayo and Spanos 2006, p. 328). The notion of severely passing a test can be schematized as follows: H passes a severe test T with x0 if
7 Here Mayo and Spanos use the term “warranted” as a synonym for justified, and not in the sense that the term is used by epistemologists, as denoting the property that, in addition to truth, qualifies a belief as knowledge (Mayo, personal communication).
ES1 x0 fit H;
ES2 with very low probability, test T would have produced a result that fits H as well as (or better than) x0 do, if H were false and some alternative to H were true.
While various measures of fit might be employed (Lele 2004), at a minimum, for x0 to fit H, x0 must not be improbable under H by comparison with competing hypotheses. The probabilistic framework supporting these criteria articulates the relevant probabilities (both the likelihoods implicated in fit assessment and the error rates reflected in the severe test requirement) in frequentist terms. That is to say that the relevant probabilities are to be construed as objective facts about the relative frequency with which certain kinds of events occur in specified (actual or hypothetical) replications of the experimental procedures followed, under the assumption of the relevant hypotheses. Note here that, in keeping with our reformulation of the internalism–externalism debate in terms that eschew reference to beliefs, the error-statistical framework is not concerned with relations among beliefs. Any reference to a severe test relates a testing procedure, a body of data, and a hypothesis. The ES account explicitly distances itself from making the status of an inference depend on the beliefs, real or imagined, of investigators (Mayo 1997). The ES account is meant to apply not only in cases where one quantitatively evaluates these probabilities using a statistical model of the data-generating process, but also in experimental settings that take a more casual or intuitive approach to statistical analysis. As the former setting highlights important conceptual features of the error-statistical account, it is worth briefly noting the requirements of a more formal statistical assessment. Any statistical inference in the ES approach will make use of a statistical model.
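To illustrate how requirement ES2 might be evaluated quantitatively, the following minimal Python sketch assumes a deliberately simple statistical model that is not taken from the text: n independent Normal observations with known standard deviation sigma, and a hypothesis H of the form "mu > mu1". The function names and all numerical values are hypothetical. Under these assumptions, a larger sample mean fits H better, and the probability of obtaining a result that fits H at least as well as the observed one, were H false, is largest at the boundary mu = mu1; the sketch therefore evaluates it there and reports one minus that probability, in the spirit of the severity assessments discussed by Mayo and Spanos (2006).

```python
import math


def normal_cdf(z: float) -> float:
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_fit_if_false(xbar_obs: float, mu1: float, sigma: float, n: int) -> float:
    """Probability of a sample mean at least as large as xbar_obs when mu = mu1.

    Assumed (hypothetical) model: n i.i.d. Normal(mu, sigma^2) observations
    with sigma known. For H: mu > mu1, mu = mu1 is the boundary of the
    alternatives under which H is false.
    """
    standard_error = sigma / math.sqrt(n)
    return 1.0 - normal_cdf((xbar_obs - mu1) / standard_error)


def severity(xbar_obs: float, mu1: float, sigma: float, n: int) -> float:
    """One minus the ES2 probability: high values mean H passes severely."""
    return 1.0 - prob_fit_if_false(xbar_obs, mu1, sigma, n)


if __name__ == "__main__":
    # Illustrative numbers only: observed mean 0.4, sigma 1, n = 100.
    for mu1 in (0.0, 0.2, 0.4, 0.5):
        value = severity(0.4, mu1, 1.0, 100)
        print(f"H: mu > {mu1:.1f}  severity = {value:.4f}")
```

With these illustrative numbers, the weaker claim "mu > 0" passes with severity close to 1, while the stronger claim "mu > 0.5" does not pass severely, which reflects the point that the same data can be good evidence for one hypothesis and poor evidence for another.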
As Cox notes, formal statistical inferences regard “the family of models as given and the objective as being to answer questions about the model in light of the data” (Cox 2006, p. 3).8 Such a model will represent the data as the outcome of repeated trials resulting in values assigned to a random variable. It will specify a test statistic, defined in terms of the data, to be used in determining whether or not a hypothesis of interest passes or fails the test by reference to critical values of that test statistic, also specified by the model. The model will also specify the space of hypotheses (parameter values) between which the data will be used to discriminate. The model will involve assumptions about the distribution of values for the random variable, about the dependence of the outcome of a trial on the outcome of previous trials, and about heterogeneity, i.e., whether or not the distribution of outcomes changes from one trial to the next. It is this model from which the probabilities used to characterize the testing procedure are derived. The model should encompass all of the hypotheses among which one is attempting to discriminate. It will be these specific alternatives to H that must be considered when evaluating probabilities in the context of requirement ES2 above,
8 Following Spanos (1999), we are using the term “statistical model” in the same sense as Cox’s “family of models”, i.e., to refer to a mathematical structure that characterizes certain aspects of the data-generating process without specifying fully the values of all parameters that describe that structure. Of course, the term “statistical model” would also be appropriate for referring to such a structure in which the values of such parameters have been specified, but context will suffice to make clear which sense is intended.
and an examination that distinguishes between alternatives against which H has and has not been severely tested is crucial to an error-statistical analysis of the exact nature of the inference to be drawn with regard to H (cf. Mayo and Spanos 2006). For our purposes, it is important to emphasize that for the investigator to judge correctly which hypotheses have and have not severely passed test T with data x0, the statistical model used in such an error-statistical analysis must be statistically adequate, in the sense that it “captures the statistical systematic information contained in the data” (Spanos 1999, p. 16). By ensuring that the actual error probabilities of the test are at least approximately equal to those assumed, statistical adequacy amounts to adequacy for the purpose of reliably drawing primary inferences (Spanos 1999; Mayo and Spanos 2004). How statistical adequacy is evaluated is a point to which we shall return, but it is important to note that statistical adequacy is an objective characteristic of statistical models. Even if one is unaware of any reason to doubt the model one employs in drawing a statistical inference (and even if any such reasons were inaccessible to the investigator), the model may fail to be statistically adequate, in which case one’s judgments about which hypotheses are evidentially supported by the data will be mistaken. Thus we note an apparent anti-internalist* methodological consequence of the error-statistical view of evidence:
AI: Error-statistical evidence claims can be rendered false by facts9 to which the investigator has no access.
We will consider later in this section a challenge to this claim that purports to show that such defeaters are, after all, accessible. We will show that accessibility can be vouchsafed only for some error-statistical evidential inferences, but not in general. For now, we wish to pursue the consequences of accepting AI.
It is anti-internalist insofar as the error-statistical account depicts scientific justification as drawing both upon those reasons that are accessible to the investigator and the reasons implicated in the objectively obtaining evidential relations, even if those are not accessible to the investigator. Thus one might, in the internalist* sense, appear to be justified in asserting a hypothesis, while in the sense of justification that requires an objective evidential relationship between x0 and H, one is not justified in asserting H. Hence, just as externalist views of justification hold that an individual’s belief can be justified by reasons to which the believer has no access, error-statistical evidence relations can be satisfied or fail to be satisfied in virtue of facts to which the investigator has no access. Moreover, there seems to be a resemblance between ES and a paradigmatically externalist account of justification in epistemology. Just as Alvin Goldman’s reliabilist theory makes justification rest on the tendency of a belief-forming process to produce true rather than false beliefs (Goldman 1986, 1999), ES links the justification of an inference to its having resulted from a testing procedure with low error probabilities (Woodward 2000). Contrary to what might be suggested by this similarity, however, there is good reason to think that the error-statistician will not hold a strictly externalist* view of justification. Seeing why requires, however,
9 We use the term “facts” here in a broad sense, to refer to anything, including states of affairs and regularities, that might render a statistical model inadequate.
looking beyond the schematization of the error-statistical view of evidence discussed above, to the methodological framework that error-statistics draws upon for making what Mayo calls “arguments from error.” That methodology, developed so as to enable the investigator to pursue evidence that meets the requirements schematized above as ES1 and ES2, emphasizes that scientific investigations must “severely probe” for error in the drawing of inferences from data. In the laboratory, this amounts to the need to engage in a wide variety of activities aimed at checking for errors in assumptions about instrumentation, about the control of confounding variables, about the nature of the data-generating process under investigation, about auxiliary theoretical assumptions, ceteris paribus factors, etc. This line of thought in Mayo’s work has intersected with the work of the econometrician Aris Spanos in the development of a methodology aimed at testing the assumptions employed in statistical inference and modeling. Because any statistical inference will rely on an assumed statistical model, such inferences must always answer to the worry that flaws in the model assumptions defeat the inference that has been drawn. Mayo and Spanos develop a “methodology of mis-specification (M–S) testing” to help researchers address this need. Such a methodology, they observe, should provide methods for uncovering and probing model assumptions, isolating sources of any anomalous results, and iterative procedures for accommodating any flawed assumptions in respecified models until arriving at a statistically adequate model—a model that is adequate for subsequent (primary) statistical inferences. (Mayo and Spanos 2004, p. 1008) Laid out in some detail by Spanos in his (1999), the methodology of misspecification testing and respecification can be characterized for our purposes as the attempt to assess a candidate statistical model considered for use in drawing a primary inference. 
Such an evaluation involves embedding the candidate model within a larger encompassing model that includes alternatives based on different assumptions, and then using the data already in hand to test for departures from the assumptions of the candidate model in the directions reflected in the alternatives included in the encompassing model. A series of such tests will be directed at testing for departures with regard to the different types of assumptions (distribution, dependence, heterogeneity) that define a statistical model. Should the candidate model fail such a test, this will not be taken as evidence for a specific alternative model, but will rather serve as the occasion for respecifying the model, and then reiterating the testing process with the new, respecified candidate model until one has a model that can be vindicated as statistically adequate. Here one can see the methodologically internalist flip-side of the externalist possibility of defeat by facts that are not accessible. For the justification of an evidence claim the mere de re satisfaction of the ES requirements is insufficient; the statistical model through which the satisfaction of those requirements is evaluated must in turn be validated as statistically adequate. By carrying out the mis-specification testing required for such validation, the investigator simultaneously acquires the ability to articulate the grounds for accepting that model.
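One probe of the kind described above can be sketched in code. The following Python example is only an illustration under stated assumptions: it uses the Wald-Wolfowitz runs test about the sample median, with a Normal approximation, as a stand-in check of the independence assumption. This particular test is chosen here purely for illustration and is not presented as the procedure laid out by Spanos (1999); the function name `runs_test_p_value` and the sample data are hypothetical. A small p-value signals a departure from independence which, in line with the methodology just described, would be an occasion for respecifying the model rather than evidence for a specific alternative.

```python
import math
import statistics


def normal_cdf(z: float) -> float:
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def runs_test_p_value(data) -> float:
    """Two-sided p-value of a runs test about the median (Normal approximation)."""
    median = statistics.median(data)
    # Mark each observation as above (True) or below (False) the median;
    # observations equal to the median are dropped.
    signs = [x > median for x in data if x != median]
    n_above = sum(signs)
    n_below = len(signs) - n_above
    n = n_above + n_below
    if n_above == 0 or n_below == 0 or n < 3:
        raise ValueError("too few observations on each side of the median")
    # A run is a maximal stretch of consecutive observations on one side.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    # Mean and variance of the number of runs if the order is random.
    product = 2.0 * n_above * n_below
    expected = product / n + 1.0
    variance = product * (product - n) / (n ** 2 * (n - 1))
    if variance <= 0:
        raise ValueError("variance of the run count is degenerate")
    z = (runs - expected) / math.sqrt(variance)
    return 2.0 * (1.0 - normal_cdf(abs(z)))


if __name__ == "__main__":
    trending = list(range(20))  # strong positive dependence: very few runs
    mixed = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
    print("trending:", runs_test_p_value(trending))
    print("mixed:   ", runs_test_p_value(mixed))
```

In a fuller treatment, checks of this sort would be run for each type of assumption (distribution, dependence, heterogeneity), and the primary inference would be drawn only from a model that survives them.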
Indeed, one might think that the possibility of mis-specification testing advances the internalist* aspect of scientific justification so far as to provide grounds for rejecting our “anti-internalist*” claim AI. Because mis-specification testing requires only the data already in hand and the use of readily-available statistical techniques, one might argue, any facts that would result in statistical inadequacy of a model used in making an error-statistical inference are accessible, whether or not the investigator avails herself of these techniques. Granting this point, we note that it shows only that mis-specification testing constitutes grounds for rejecting AI as applied to formal statistical inferences. This objection, however, does not undermine AI as a general thesis about error-statistical evidence claims, insofar as some such claims pertain not only to formal statistical inferences but to inferences to substantive scientific claims going beyond what can be secured via mis-specification testing alone (see Mayo and Spanos 2006, pp. 341–342). Often a statistical hypothesis “stands in for” or is a model of a substantive claim about causal or other relationships (e.g., “these data include decays of the Higgs boson” or “these structures are fossilized remains of pigment-producing organelles”), such that an inference to that claim involves non-statistical assumptions regarding the adequacy of the experimental set-up, or the adequacy of the statistical hypothesis as a model of the substantive claim. Mis-specification testing underwrites the statistical reliability of the procedures used to assess substantive hypotheses or claims of interest, but does not in general subvent substantive reliability.10 Mis-specification testing does not underwrite a general argument against AI.11 We regard mis-specification testing as addressing two related problems. 
First, it helps to prevent the investigator from being misled as to which hypotheses have and have not been severely tested (the problem of misleading evidence). Second, it helps the investigator to articulate the reasons that support the use of the statistical model employed (the problem of justification). Our view is that justification in science is externalist* in character insofar as the evidential relations that are of concern in addressing the problem of misleading evidence are objective (as they are on the ES view), and internalist* in character insofar as addressing the problem of justification requires the capacity to access and provide reasons that support one’s inferences from the data. A thoroughgoing externalist, of course, would not accept our identification between the problem of justification and the question of one’s ability to articulate supporting reasons, for on an externalist* account one can be justified in drawing conclusions even if one cannot access any reasons that support such a conclusion. Have we not simply assumed an internalist* point of view in the way we frame the question? In reply to this concern, we should first restate that our concern is with justification in the socially situated contexts of scientific inquiry and communication; it is the nature of these contexts, and not a prior commitment to internalism*, that grounds our understanding of the problem of justification. As we noted in Sect. 2, investigators
10 We are indebted to an anonymous referee for Synthese for this point.
11 We thank an anonymous referee for Synthese for bringing this issue to our attention.
drawing conclusions from data are responsible for vindicating their assertions and inferences in response to critical questioning from the community of investigators. In the absence of such a capacity for vindicating a conclusion, an investigator may be able to make statements that are objectively supported by evidence, but does not, thereby, contribute to the scientific pursuit of knowledge. Moreover, it is not merely mis-specification testing or other methods of model criticism that serve this dual function. Rather, statistical methods in general can be thought of as directed at both the avoidance of being misled and at providing resources for the articulation of justifying reasons. These dual purposes are hinted at as well in some of Mayo’s own work: [e]rror statisticians appeal to statistical tools as protection from the many ways they know they can be misled by data as well as by their own beliefs and desires. The value of statistical tools is that they allow one to develop strategies that capitalize on knowledge of mistakes: strategies for collecting data, for efficiently checking an assortment of errors, and for communicating results in a form that promotes their extension by others. (Mayo 1996, p. 337) The error-statistical emphasis on methodology seeks to provide strategies that enable investigators to vindicate their evidence claims by appealing to methods employed that either eliminate errors or take them into account in the final inference. Such vindication employs lines of reasoning based on the characteristics of those very same strategies. The ES conditions explain what characteristics such strategies should have (they should be reliable in the sense articulated in ES), and thus guide methodological development. But one cannot justifiably infer H on the basis of the bare satisfaction of the ES conditions alone. 
Rather, the claimant has to give reasons to show how appropriate methodological precautions against error have been taken (and hence must have access to such reasons). This aspect of internalism* finds a natural place within error-statistics. The conceptual landscape of our account appears incomplete, however. Evidential relations on the ES account are objective in a rather strong sense that they are not relativized to any epistemic situation. Whether data that enable a hypothesis to pass a particular test really do provide evidence for that hypothesis is independent of the epistemic situation of anyone seeking to draw inferences from those data. Yet the justification of any such inferences making use of the evidence is in some sense dependent on such epistemic situations. In the next section, we explicate an epistemic notion that shares this dependence on epistemic situations and illuminates the nature of the additional epistemic work that takes us beyond de facto evidence for H to the justification of an inference to H.
4 Securing experimental conclusions
A researcher presents a conclusion from data gathered during research. The decision to present a conclusion indicates that the researcher and her collaborators are convinced that they are prepared to justify their inference in response to whatever challenges they might plausibly encounter. Their confidence will result from their having already
posed many such challenges to themselves. New challenges will emerge from the community of researchers with which they communicate. Such challenges take many forms, depending on the nature of the experiment and of the conclusions: Are there biases in the sampling procedure? Have confounding variables been ruled out? Is the correct model being employed? To what extent have alternative explanations been considered? Are estimates of background reliable? Can the conclusion be reconciled with the results of other experiments? Have instruments been adequately shielded, calibrated, and maintained? To a large extent, such challenges can be thought of as presenting possible scenarios in which the experimenters have gone wrong in drawing the conclusions that they do. But such challenges are not posed arbitrarily. Being logically possible does not suffice, for example, to constitute a challenge that the experimenter is responsible for addressing. Rather, both experimenters in anticipating challenges and their audience in posing them draw upon a body of knowledge in determining the kinds of challenges that are significant (Staley 2008). Indeed, as Mayo has argued (1996, pp. 200–203), there is good reason on error-statistical grounds for not regarding the mere logical possibility of error as grounds for rejecting an inference. Such a strategy is highly unreliable in that it always prevents one from accepting a true hypothesis, and in that sense has a maximum error rate. It would be valuable to articulate some general principles for determining those challenges to which an experimenter must be able to respond in order to justify an inference from data. Here we merely propose a modest first step toward this aim. We propose, specifically, to articulate a general conceptualization of the problem that such justifying responses address. 
Our aim is to provide a heuristic that might serve to systematize the strategies that experimenters use in responding to such challenges and allow for a clearer understanding of the epistemic function of such strategies. Our discussion above highlights certain features that can guide us in formulating the concept at which we aim. Responses to the kinds of challenges we have in mind are concerned with scenarios in which the inference drawn would be invalid; they are posed not as mere logical possibilities but as scenarios judged significant by those in a certain kind of epistemic situation, incorporating relevant disciplinary knowledge; and an appropriate response needs to provide a basis for concluding that the scenario in question is not actual. We conceive of the practices of justifying an inference as the securing of that inference against scenarios under which it would be invalid. Here we explicate the concept of security as follows:
SEC: Let Ω0 be the set of all scenarios that are epistemically possible relative to an epistemic situation K. Suppose that Ω1 ⊆ Ω0. Proposition P is secure throughout Ω1 relative to K iff for every scenario ω ∈ Ω1, P is true in ω. If P is secure throughout Ω0, then P is fully secure relative to K.
Before proceeding, some explanation of terminology is in order. This definition employs the notion of epistemic possibility, which can be thought of as the modality employed in such expressions as “For all I know, there might be a third-generation leptoquark with a rest mass of 250 GeV/c²” and “For all I know, I might have left
my sunglasses on the train.” Hintikka, whose (1962) provides the origins for contemporary discussions, there takes expressions of the form “It is possible, for all that S knows, that P” to have the same meaning as “It does not follow from what S knows that not-P.”12 Borrowing Chalmers’ notion of a scenario for heuristic purposes, we use that term to refer to what might be intuitively thought of as a “maximally specific way things might be” (Chalmers forthcoming). In practice, no one ever considers scenarios as such, of course, but rather focuses on salient differences between one scenario and another. To put this notion more intuitively, then, a proposition is secure for an epistemic agent just insofar as, whatever might be the case for all that the agent knows, that proposition remains true. Applied to inferences from data, we will say that an inference from data x to a hypothesis h, based on results of test T, is secure relative to K insofar as the proposition “data x0 from test T are good evidence for h” is secure relative to K. In the context of the error statistical account, this amounts to making the security of such an inference depend on the security of the principles ES1 and ES2 as applied to the relevant evidence claim. In order to address the pressing concern that we are constructing a useless bit of conceptual apparatus without methodological applicability, let us emphasize two points. First, the notion of a fully secure inference is something we regard as an ideal to be employed only in articulating an account of justification. Second, we do not propose that investigators can or should attempt to determine some degree of security of any of their inferences. (Doing so would require, for example, that one determine just what scenarios are epistemically possible for a given epistemic situation, thus drawing us into debates over the semantics of epistemic possibility that we are eager to avoid.) 
Rather, the value of the concept of security lies in its capacity to conceptualize methods of justification encountered in scientific practice in a systematic way. Thus, although we have defined a concept that we call security, the methodologically significant notion is not security per se, but the securing of inferences, which we understand in terms of the use of methods that serve to increase the relative security of an inference, either by expanding the range of validity of an inference across a fixed space of possible scenarios, or by decreasing the range of possible scenarios in which the inference would be invalid. One can thus secure an inference without ever needing to determine its degree of security. Returning, then, to justification, we wish to relate justification to security in the following way:
JUS: An assertion of H as a conclusion inferred from data x0 on the basis of test T is fully justified relative to epistemic situation K if:
(1) on the basis of test T, data x0 are good evidence for H (in error-statistical terms, (x0, T, H) satisfy ES1 and ES2); and
12 Just how to formulate the semantics of such statements is, however, contested (see, e.g., DeRose 1991; Chalmers forthcoming). The central claims of the present proposal are independent of disputed issues regarding the semantics of epistemic possibility.
(2) the proposition “on the basis of T, data x0 are good evidence for H” is secure throughout all scenarios that are epistemically possible relative to K.13
This account articulates a notion of full justification as an epistemic ideal. The point is that methods of justification serve two epistemically distinguishable purposes. First, they aim (fallibly) to create conditions that will render (1) true for the inference at which the investigators arrive. Second, they aim to facilitate the pursuit of (2) by providing investigators with the resources to respond to the challenge of possible error-scenarios and, thus, serve to secure the inference proposed. Though full security may remain an unachieved ideal, the increase in relative security puts investigators in a better epistemic situation than they were before. Such methods therefore can be seen as underwritten by a general methodological dictum for investigators considering a potential inference:
Consider those scenarios which, for all you know, might obtain that would invalidate the evidence supporting your inference and take the measures necessary to secure your inference against those scenarios.
5 Is security a strictly internalist* concept?
In the past three sections we have argued that although scientists ultimately aim to produce evidence that objectively connects data and a hypothesis under investigation, they also seek to vindicate their experimental conclusions by reference to information and evidence available to those within their epistemic situation. The overarching picture this might suggest is one in which evidence is treated as an externalist* notion and security is treated as an internalist* notion. We believe that this is false because security itself is not a strictly internalist* notion.
In this section, we discuss the internalist*–externalist* duality of the notion of security and then conclude with some brief reflections on the problematic notion of accessibility for epistemic justification in the sciences.
There are clearly internalist* aspects to the notion of security. In particular, it is important to distinguish between satisfying the conditions for evidence and the security of an experimental conclusion. The former is an externalist* notion concerning the objective connection between experimental test procedures and the truth of the inferences formed on the basis of these results. But even if one’s test procedures are truth-conducive, this does not entail that the conclusions one derives from these experimental results are secure, since one may not have access to the information essential to making the case that one’s test procedures are reliably connected to the truth of the assertion. Even if an investigator’s tests are, as a matter of fact, reliable, an investigator who lacks access to this information and is aware of the fallibility of the various assumptions of his tests has good reason to attempt to make his evidence claim more secure from defeat.
13 An epistemic situation might be that of an individual, a research group, or a scientific community. Wray (2007) denies that communities (as opposed to research teams) can be bearers of knowledge, which may be thought to render justification relative to a community’s epistemic situation problematic. We take no stance on this and do not think that our view has any direct consequences for this issue.
Vindicating an experimental conclusion to the relevant scientific community involves showing, on the basis of available evidence produced through various tests and statistical analyses, that one’s inferences are not likely to be defeated due to a false fundamental assumption or a mis-specified model. Given the social structure of scientific inquiry and the fact that the demand for security emerges as a response to the critical scrutiny an assertion must undergo to be added to the corpus of scientific knowledge, the argumentative practice of vindicating one’s primary evidence claim requires an appeal to information that is available within the relevant epistemic situation. To secure an experimental conclusion requires that the reasons ruling out such error scenarios be made accessible to the scientific community, and this is clearly an internalist* requirement. As such, accessibility is (at least) a necessary condition for one’s inference to be secure.
But security, as we have defined it, is not a strictly internalist* notion because the attempt to secure an experimental conclusion is not sufficient for the conclusion to be secure. Access to the information and evidence available within a particular epistemic situation is not sufficient for an experimental conclusion to be secure. This follows from the fact that security is an objective notion: whether some inference is secure relative to an epistemic situation K depends objectively on the scenarios that are epistemically possible relative to K. Hence, the claim that a scientific assertion is relatively secure can be defeated by facts inaccessible to those making the assertion. As such, an investigator or collaboration, having taken steps to secure an inference, can be mistaken in thinking that the evidence claim or inference is relatively more secure than it was prior to employing these methodologies.
Thus, security depends not merely upon information accessible to those within the epistemic situation but also upon factors that are beyond the scope of what is accessible within the relevant epistemic situation. Although accessibility is necessary for security, it is not sufficient and, as such, security is not a strictly internalist* notion.
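The definitions of security and JUS from Sect. 4, together with the two ways of securing an inference, can be illustrated with a toy model. What follows is only a sketch under strong simplifying assumptions that are ours, not part of the account itself: scenarios are taken to be finitely enumerable, each scenario is represented by the set of test assumptions that fail in it, and the names used (`defeaters`, `is_secure`, `fully_justified`, `at_least_as_secure`, and the two example assumptions) are hypothetical. In keeping with the remarks above, the sketch never computes a degree of security; it only compares sets of defeating scenarios.

```python
from typing import Callable, FrozenSet, Set

# Hypothetical representation: a scenario is the set of assumptions that fail in it.
Scenario = FrozenSet[str]
# An evidence claim is modelled as a predicate: is the claim true in this scenario?
EvidenceClaim = Callable[[Scenario], bool]


def defeaters(claim: EvidenceClaim, scenarios: Set[Scenario]) -> Set[Scenario]:
    """Scenarios, among those epistemically possible, in which the claim is false."""
    return {s for s in scenarios if not claim(s)}


def is_secure(claim: EvidenceClaim, scenarios: Set[Scenario]) -> bool:
    """Secure relative to K: the claim stays true across every possible scenario."""
    return len(defeaters(claim, scenarios)) == 0


def fully_justified(is_good_evidence: bool, claim: EvidenceClaim,
                    scenarios: Set[Scenario]) -> bool:
    """JUS: condition (1) holds and condition (2), security, holds."""
    return is_good_evidence and is_secure(claim, scenarios)


def at_least_as_secure(new_claim: EvidenceClaim, new_scenarios: Set[Scenario],
                       old_claim: EvidenceClaim, old_scenarios: Set[Scenario]) -> bool:
    """Relative security as set inclusion of defeaters (no numerical degree)."""
    return defeaters(new_claim, new_scenarios) <= defeaters(old_claim, old_scenarios)


# Two illustrative (assumed) ways the assumptions behind a test might fail.
MISCAL = "detector_miscalibrated"
NONNORM = "errors_not_normal"
ALL_SCENARIOS: Set[Scenario] = {
    frozenset(), frozenset({MISCAL}), frozenset({NONNORM}),
    frozenset({MISCAL, NONNORM}),
}


def original_claim(s: Scenario) -> bool:
    # The original evidence claim holds only if no assumption fails.
    return len(s) == 0


def robust_claim(s: Scenario) -> bool:
    # Securing by expanding validity: an analysis that tolerates non-normal errors.
    return MISCAL not in s


# Securing by shrinking the scenario space: a calibration check rules out MISCAL.
remaining: Set[Scenario] = {s for s in ALL_SCENARIOS if MISCAL not in s}
```

In this toy case `robust_claim` has fewer defeating scenarios than `original_claim`, and once the calibration check has removed the miscalibration scenarios it has none, so both conditions of JUS can be met. Note that the verdicts depend entirely on the scenario set passed in: a collaboration that is mistaken about which scenarios are epistemically possible relative to K would obtain a mistaken verdict, which is the sense in which security is not strictly internalist*.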
6 Conclusion
The preceding discussion of security provides a fruitful way of connecting our earlier discussion of the intuitive grounds for both the internalist* and the externalist* theses. Recall that the primary appeal of internalism* was the ability to vindicate an empirical claim against skeptical questions by appealing to accessible evidence grounding the claim. Since the notion of security is partly an internalist* notion, it satisfies this intuition. But one of the primary intuitions supporting externalism* was the idea that justification needs to be properly connected to truth; in fact, the disconnect between what is available to one within an epistemic situation and what is relevant to the justification of an assertion is one of the primary deficiencies of the internalist* thesis. Since security is not a strictly internalist* notion, it satisfies the intuition that justification ought to be connected to truth in a strong way.
What, then, is the epistemic significance of the internalist* aspects of security? As we noted above, contemporary epistemologists tend to construe accessibility in terms of what is available to the epistemic subject on the basis of reflection alone. Our emphasis upon collaborations and the socially situated nature of scientific inquiry
(Sect. 2) precludes thinking of access in this way. Instead, we proposed that accessibility is a relation holding between the relevant scientific community and the evidence available within its epistemic situation. Given the arguments of this paper, it should be clear that accessibility concerns the availability to a relevant scientific community of reasons sufficient to vindicate an experimental conclusion in the face of legitimate questions about its justificatory status. Access thus becomes distributed, reflecting the broader distribution among the group members of the relevant epistemic tasks that must be undertaken in order to produce and secure evidence (Giere 2002).
The significance of this social structure reinforces the importance of access for scientists who seek to secure their assertions. When an experimental conclusion is advanced, the audience to which it is directed may raise questions about ways in which the underlying assumptions on which that evidence claim rests might be wrong. The claimant needs to be able to address these questions, and the security of an evidence claim might be thought of as measuring how well a collaboration can do this in principle. The securing of experimental conclusions thus is a manifestation of the capacity of the collaboration to defend its claims. This is, of course, a fallible and corrigible process, and it is clearly possible that the purported defenses or attempts at securing an assertion will fail for reasons that might not be accessible at the point at which the collaboration asserts its conclusions. But the vindication of these claims requires reference to the available information if scientists are going to proffer their assertions as contributions to scientific knowledge. Absent a defense of their assertions, scientists within the relevant epistemic situation may continue to raise legitimate concerns about the epistemic status of the proposed assertion.
Hence, epistemic justification, at least within the context of paradigmatic objectivist theories of evidence, requires the ability to defend an assertion from legitimate concerns about its epistemic status. This, in turn, requires appealing to information and evidence that is available within an epistemic situation. Clearly this information might not be connected objectively with the truth and, as such, the attempt to secure an assertion might, in fact, fail to secure the claim. There is no guarantee that the methods employed will be truth-conducive, but security itself increases the epistemic standing of the evidence claim relative to the epistemic situation at issue. So, while the appeal to available information may not be sufficient for epistemic justification in the context of objectivist theories of evidence, it is necessary. Likewise, establishing an objective connection between justification and truth is necessary for justification in an objectivist account but it is not sufficient by itself for an assertion to be justified. Justification requires both internalist* and externalist* elements.
Acknowledgments We would like to thank two anonymous referees for Synthese for their helpful comments. An earlier version of this paper was presented at the Second Meeting of the Society for the Philosophy of Science in Practice in Minneapolis, Minnesota. We are grateful for the helpful comments from audience members at our talk, especially Deborah Mayo and Aris Spanos. Aaron Cobb’s research was supported by a Saint Louis University Research Fellowship. Kent Staley’s research was supported by National Science Foundation Award SES-0750691.
References
Achinstein, P. (2001). The book of evidence. New York: Oxford University Press.
Alston, W. (1989). Epistemic justification: Essays in the theory of knowledge. Ithaca: Cornell University Press.
Baird, D. (2004). Thing knowledge: A philosophy of scientific instruments. Berkeley: University of California Press.
Bonjour, L., & Sosa, E. (2003). Epistemic justification: Internalism vs. externalism, foundations vs. virtues. Malden, MA: Blackwell.
Chalmers, D. (forthcoming). The nature of epistemic space. In A. Egan & B. Weatherson (Eds.), Epistemic modality. Oxford: Oxford University Press.
Comesana, J. (forthcoming). Evidentialist reliabilism. Retrieved August 19, 2009, from http://phiosophy.wisc.edu/comesana/.
Conee, E., & Feldman, R. (2001). Internalism defended. In H. Kornblith (Ed.), Epistemology: Internalism and externalism (pp. 231–260). Cambridge: MIT Press.
Cox, D. R. (2006). Principles of statistical inference. New York: Cambridge University Press.
DeRose, K. (1991). Epistemic possibilities. The Philosophical Review, 100, 581–605.
Fumerton, R. (1996). Metaepistemology and skepticism. Boston: Rowman and Littlefield.
Giere, R. (2002). Scientific cognition as distributed cognition. In P. Carruthers, S. Stitch, & M. Siegal (Eds.), Cognitive basis of science (pp. 285–299). Cambridge: Cambridge University Press.
Gilbert, M. (1994). Remarks on collective belief. In F. Schmitt (Ed.), Socializing epistemology: The social dimensions of knowledge (pp. 235–256). Lanham, MD: Rowman and Littlefield.
Goldberg, S. (2007). Internalism and externalism in semantics and epistemology. Oxford: Oxford University Press.
Goldman, A. (1986). Epistemology and cognition. Cambridge: Harvard University Press.
Goldman, A. (1999). Knowledge in a social world. Oxford: Oxford University Press.
Goldman, A. (forthcoming). Toward a synthesis of reliabilism and evidentialism? Or: Evidentialism’s troubles, reliabilism’s rescue package. Retrieved August 19, 2009, from http://fas-philosophy.rutgers.edu/goldman/Papers.htm.
Hintikka, J. (1962). Knowledge and belief: An introduction to the logic of the two notions. Ithaca: Cornell University Press.
Jaynes, E. T. (2003). Probability theory: The logic of science. New York: Cambridge University Press.
Kitcher, P. (1993). The advancement of science: Science without legend, objectivity without illusions. New York: Oxford University Press.
Kornblith, H. (2001). Epistemology: Internalism and externalism. Cambridge, MA: MIT Press.
Lele, S. (2004). Evidence functions and the optimality of the law of likelihood. In M. Taper & S. Lele (Eds.), The nature of scientific evidence: Statistical, philosophical, and empirical considerations (pp. 191–216). Chicago: University of Chicago Press.
Longino, H. (2002). The fate of knowledge. Princeton: Princeton University Press.
Mayo, D. (1996). Error and the growth of experimental knowledge. Chicago: University of Chicago Press.
Mayo, D. (1997). Duhem’s problem, the Bayesian way, and error statistics, or, “What’s belief got to do with it?” Philosophy of Science, 64, 222–244.
Mayo, D., & Spanos, A. (2004). Methodology in practice: Statistical misspecification testing. Philosophy of Science, 71, 1007–1025.
Mayo, D., & Spanos, A. (2006). Severe testing as a basic concept in a Neyman–Pearson philosophy of induction. British Journal for the Philosophy of Science, 57, 323–357.
McGrew, T., & McGrew, L. (2006). Internalism and epistemology: The architecture of reason. New York: Routledge Publishing.
Pitt, J. (2005). Hume and Peirce on belief, or, why belief should not be considered an epistemic category. Transactions of the Charles S. Peirce Society, 41, 343–354.
Popper, K. (1979). Objective knowledge: An evolutionary approach. Oxford: Oxford University Press.
Poston, T. (2008). Internalism and externalism in epistemology. In Internet encyclopedia of philosophy. Accessed August 19, 2009, from http://www.iep.utm.edu/int-ext/.
Pryor, J. (2001). Highlights of recent epistemology. British Journal for the Philosophy of Science, 52, 95–124.
Roush, S. (2005). Tracking truth: Knowledge, evidence, and science. New York: Oxford University Press.
Royall, R. (1997). Statistical evidence: A likelihood paradigm. Boca Raton, FL: Chapman and Hall.
Sober, E. (2008). Evidence and evolution: The logic behind the science. New York: Cambridge University Press.
Spanos, A. (1999). Probability theory and statistical inference. Cambridge: Cambridge University Press.
Staley, K. (2007). Evidential collaborations: Epistemic and pragmatic considerations in ‘group belief’. Social Epistemology, 21, 321–335.
Staley, K. (2008). Error-statistical elimination of alternative hypotheses. In K. Staley, J. Miller, & D. Mayo (Eds.), Error and methodology in practice. Special issue of Synthese, 163, 397–408.
Steup, M. (1999). A defense of internalism. In L. P. Pojman (Ed.), The theory of knowledge: Classical and contemporary readings (pp. 373–384). Belmont, CA: Wadsworth/Thomas Learning.
Suppe, F. (1993). Credentialing scientific claims. Perspectives on Science, 1, 153–201.
Thagard, P. (1997). Collaborative knowledge. Noûs, 31, 242–261.
Tollefsen, D. (2002). Challenging epistemic individualism. Protosociology, 16, 86–117.
Wheeler, G., & Pereira, L. M. (2008). Methodological naturalism and epistemic internalism. In K. Staley, J. Miller, & D. Mayo (Eds.), Error and methodology in practice. Special issue of Synthese, 163, 315–328.
Williamson, J. (2008). Objective Bayesianism with predicate languages. In K. Staley, J. Miller, & D. Mayo (Eds.), Error and methodology in practice. Special issue of Synthese, 163, 341–356.
Woodward, J. (2000). Data, phenomena, and reliability. Philosophy of Science, 67, S163–S179.
Wray, K. B. (2002). The epistemic significance of collaborative research. Philosophy of Science, 69, 150–168.
Wray, K. B. (2007). Who has scientific knowledge? Social Epistemology, 21, 337–347.